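Easily ask questions about your GitHub repository using RAG.
This project lets you interact directly with a GitHub repository and use semantic search to understand its codebase. Ask specific questions about the repository's code and receive meaningful, context-aware responses.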

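The process starts by cloning a GitHub repository locally into the /tmp directory. SimpleDirectoryReader then loads the cloned repository for indexing. Documents are split into chunks based on file type: CodeSplitter handles code files, while JSON, Markdown, and SentenceSplitters handle the other formats. After the nodes are parsed, embeddings are generated with the text-embedding-3-large model and stored in Elasticsearch. This setup enables semantic search, so you can ask meaningful questions about the code.
Clone the repository: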
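The file-type routing described above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: the function names, the labels it returns, and the list of code extensions are all assumptions made for the example.

```python
from pathlib import Path

# Hypothetical set of extensions treated as source code. The real project
# may recognize a different set; this list is an assumption.
CODE_EXTENSIONS = {".py", ".go", ".js", ".ts", ".java"}


def choose_splitter(file_path: str) -> str:
    """Return a label for the splitter family that would handle this file."""
    suffix = Path(file_path).suffix.lower()
    if suffix in CODE_EXTENSIONS:
        return "code"       # code files go to a CodeSplitter
    if suffix == ".json":
        return "json"       # JSON files get a JSON-aware parser
    if suffix in {".md", ".markdown"}:
        return "markdown"   # Markdown files get a Markdown-aware parser
    return "sentence"       # everything else falls back to sentence splitting


def group_by_splitter(paths):
    """Group file paths by the splitter label chosen for each one."""
    groups = {}
    for path in paths:
        groups.setdefault(choose_splitter(path), []).append(path)
    return groups


if __name__ == "__main__":
    files = ["cmd/main.go", "README.md", "config/settings.json", "LICENSE"]
    for label, members in group_by_splitter(files).items():
        print(label, members)
```

Falling back to sentence splitting for unknown extensions keeps every file indexable even when no specialized parser applies.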
git clone https://github.com/framsouza/github-assistant.git
cd github-assistant
Install the required libraries:
pip install -r requirements.txt
Set up the environment variables: update the .env file with your Elasticsearch credentials and the target GitHub repository details (e.g., GITHUB_TOKEN, GITHUB_OWNER, GITHUB_REPO, GITHUB_BRANCH, ELASTIC_CLOUD_ID, ELASTIC_USER, ELASTIC_PASSWORD, ELASTIC_INDEX).
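Before indexing, it can help to confirm that every setting is present. The sketch below checks the variable names listed above; `missing_vars` is a hypothetical helper, and it assumes the values from .env have already been exported into the process environment (for example by a dotenv loader), which it does not do itself.

```python
import os

# Variable names taken from the setup instructions above.
REQUIRED_VARS = [
    "GITHUB_TOKEN",
    "GITHUB_OWNER",
    "GITHUB_REPO",
    "GITHUB_BRANCH",
    "ELASTIC_CLOUD_ID",
    "ELASTIC_USER",
    "ELASTIC_PASSWORD",
    "ELASTIC_INDEX",
]


def missing_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("All settings present.")
```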
python index.py
An Elasticsearch index is created to hold the embeddings. You can then connect to your ESS deployment and run search queries against the index; the indexed documents contain a new field named embeddings.
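As a rough illustration of querying that field, the sketch below builds the body of an Elasticsearch kNN search request. It assumes the embeddings field is mapped as a dense vector and that the query vector comes from the same embedding model used at index time (text-embedding-3-large produces 3072-dimensional vectors by default). `build_knn_query` and its default values are hypothetical, not part of the project.

```python
import json


def build_knn_query(query_vector, k=5, num_candidates=50, field="embeddings"):
    """Build a search body that uses the top-level `knn` option."""
    if num_candidates < k:
        # Elasticsearch requires num_candidates to be at least k.
        raise ValueError("num_candidates must be greater than or equal to k")
    return {
        "knn": {
            "field": field,
            "query_vector": list(query_vector),
            "k": k,
            "num_candidates": num_candidates,
        },
        # Leave the large vector out of the returned hits.
        "_source": {"excludes": [field]},
    }


if __name__ == "__main__":
    # A tiny placeholder vector; a real one must match the indexed dimension.
    body = build_knn_query([0.1, 0.2, 0.3], k=2, num_candidates=10)
    print(json.dumps(body, indent=2))
```

Excluding the vector field from `_source` keeps responses small, since each stored embedding holds thousands of floats.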
python query.py
Example:
python query.py
Please enter your query: Give me a detailed list of the external dependencies being used in this repository
Based on the provided context, the following is a list of third-party dependencies used in the given Elastic Cloud on K8s project:
1. dario.cat/mergo (BSD-3-Clause, v1.0.0)
2. Masterminds/sprig (MIT, v3.2.3)
3. Masterminds/semver (MIT, v4.0.0)
4. go-spew (ISC, v1.1.2-0.20180830191138-d8f796af33cc)
5. elastic/go-ucfg (Apache-2.0, v0.8.8)
6. ghodss/yaml (MIT, v1.0.0)
7. go-logr/logr (Apache-2.0, v1.4.1)
8. go-test/deep (MIT, v1.1.0)
9. gobuffalo/flect (MIT, v1.0.2)
10. google/go-cmp (BSD-3-Clause, v0.6.0)
...
This list includes both direct and indirect dependencies as identified in the context.None
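Questions you might want to ask:
The evaluation.py code processes documents, generates evaluation questions based on their content, and then uses an LLM to evaluate the responses for relevancy (whether the response is relevant to the question) and faithfulness (whether the response is faithful to the source content). Here is a step-by-step guide on how to use the code.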
python evaluation.py --num_documents 5 --skip_documents 2 --num_questions 3 --skip_questions 1 --process_last_questions
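You can run the code without any parameters, but the example above shows how to use them. Here is a breakdown of what each parameter does:
After loading the documents, the script generates a list of questions based on the content of those documents.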
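One plausible reading of the flags in the command above can be sketched with argparse. The flag names come from that command; the defaults, the help text, and the `select` helper are assumptions for illustration and may not match what evaluation.py actually does.

```python
import argparse


def build_parser():
    """Parser for the flags shown in the example command."""
    parser = argparse.ArgumentParser(description="Evaluation options (sketch)")
    parser.add_argument("--num_documents", type=int, default=None,
                        help="assumed: how many documents to process")
    parser.add_argument("--skip_documents", type=int, default=0,
                        help="assumed: how many leading documents to skip")
    parser.add_argument("--num_questions", type=int, default=None,
                        help="assumed: how many questions to process")
    parser.add_argument("--skip_questions", type=int, default=0,
                        help="assumed: how many leading questions to skip")
    parser.add_argument("--process_last_questions", action="store_true",
                        help="assumed: take questions from the end of the list")
    return parser


def select(items, skip=0, count=None, from_end=False):
    """Drop `skip` leading items, then keep `count` from the start or the end."""
    remaining = items[skip:]
    if count is None:
        return remaining
    if count == 0:
        return []  # guard: remaining[-0:] would return everything
    return remaining[-count:] if from_end else remaining[:count]


if __name__ == "__main__":
    args = build_parser().parse_args()
    questions = [f"question {i}" for i in range(10)]  # placeholder data
    print(select(questions, args.skip_questions, args.num_questions,
                 args.process_last_questions))
```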
Number of documents loaded: 5
All available questions generated:
0. What is the purpose of chunking monitors in the updated push command as mentioned in the changelog?
1. How does the changelog describe the improvement made to the performance of the push command?
2. What new feature is added to the synthetics project when it is created via the `init` command?
3. According to the changelog, what is the file size of the CHANGELOG.md document?
4. On what date was the CHANGELOG.md file last modified?
5. What is the significance of the example lightweight monitor yaml file mentioned in the changelog?
6. How might the changes described in the changelog impact the workflow of users creating or updating monitors?
7. What is the file path where the CHANGELOG.md document is located?
8. Can you identify the issue numbers associated with the changes mentioned in the changelog?
9. What is the creation date of the CHANGELOG.md file as per the context information?
10. What type of file is the document described in the context information?
11. On what date was the CHANGELOG.md file last modified?
12. What is the file size of the CHANGELOG.md document?
13. Identify one of the bug fixes mentioned in the CHANGELOG.md file.
14. What command is referenced in the context of creating new synthetics projects?
15. How does the CHANGELOG.md file address the issue of varying NDJSON chunked response sizes?
16. What is the significance of the number #680 in the context of the document?
17. What problem is addressed by skipping the addition of empty values for locations?
18. How many bug fixes are explicitly mentioned in the provided context?
19. What is the file path of the CHANGELOG.md document?
20. What is the file path of the document being referenced in the context information?
...
Generated questions:
1. What command is referenced in relation to the bug fix in the CHANGELOG.md?
2. On what date was the CHANGELOG.md file created?
3. What is the primary purpose of the document based on the context provided?
Total number of questions generated: 3
Processing Question 1 of 3:
Evaluation Result:
+---------------------------------------------------+-------------------------------------------------+----------------------------------------------------+----------------------+----------------------+-------------------+------------------+------------------+
| Query | Response | Source | Relevancy Response | Relevancy Feedback | Relevancy Score | Faith Response | Faith Feedback |
+===================================================+=================================================+====================================================+======================+======================+===================+==================+==================+
| What command is referenced in relation to the bug | The `init` command is referenced in relation to | Bug Fixes | Pass | YES | 1 | Pass | YES |
| fix in the CHANGELOG.md? | the bug fix in the CHANGELOG.md. | | | | | | |
| | | | | | | | |
| | | - Pick the correct loader when bundling TypeScript | | | | | |
| | | or JavaScript journey files | | | | | |
| | | | | | | | |
| | | during push command #626 | | | | | |
+---------------------------------------------------+-------------------------------------------------+----------------------------------------------------+----------------------+----------------------+-------------------+------------------+------------------+
Processing Question 2 of 3:
Evaluation Result:
+-------------------------------------------------+------------------------------------------------+------------------------------+----------------------+----------------------+-------------------+------------------+------------------+
| Query | Response | Source | Relevancy Response | Relevancy Feedback | Relevancy Score | Faith Response | Faith Feedback |
+=================================================+================================================+==============================+======================+======================+===================+==================+==================+
| On what date was the CHANGELOG.md file created? | The date mentioned in the CHANGELOG.md file is | v1.0.0-beta-38 (20222-11-02) | Pass | YES | 1 | Pass | YES |
| | November 2, 2022. | | | | | | |
+-------------------------------------------------+------------------------------------------------+------------------------------+----------------------+----------------------+-------------------+------------------+------------------+
Processing Question 3 of 3:
Evaluation Result:
+---------------------------------------------------+---------------------------------------------------+------------------------------+----------------------+----------------------+-------------------+------------------+------------------+
| Query | Response | Source | Relevancy Response | Relevancy Feedback | Relevancy Score | Faith Response | Faith Feedback |
+===================================================+===================================================+==============================+======================+======================+===================+==================+==================+
| What is the primary purpose of the document based | The primary purpose of the document is to provide | v1.0.0-beta-38 (20222-11-02) | Pass | YES | 1 | Pass | YES |
| on the context provided? | a changelog detailing the features and | | | | | | |
| | improvements made in version 1.0.0-beta-38 of a | | | | | | |
| | software project. It highlights specific | | | | | | |
| | enhancements such as improved validation for | | | | | | |
| | monitor schedules and an enhanced push command | | | | | | |
| | experience. | | | | | | |
+---------------------------------------------------+---------------------------------------------------+------------------------------+----------------------+----------------------+-------------------+------------------+------------------+
Here are some ways you can use this code:
Happy RAG!