chroma db rag

Other source code

1.0.0

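Retrieval-Augmented Generation with a Vector DB, Hugging Face Embedders, and Re-Rankers

Repository Overview

This repository demonstrates how to integrate Chroma DB, a vector database, with embedding models to build a Retrieval-Augmented Generation (RAG) system.

Embedding Model Options

Ollama embedding models
Hugging Face text embedders
OpenAI embedding models

Re-Ranker Integration (HTTP, gRPC)

To improve RAG accuracy, Hugging Face re-ranker models can be integrated. These models score the similarity between the query and each result retrieved from the vector DB, then return the results ranked by index so that the most relevant, contextually accurate passages come first.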

Example:
query := "What is Deep Learning?"
retrievedResults := []string{"Tomatos are fruits...", "Deep Learning is not...", "Deep learning is..."}
Response: [{"index":2,"score":0.9987814},{"index":1,"score":0.022949383},{"index":0,"score":0.000076250595}]
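The response carries only indexes and scores, so the caller still has to map them back onto the retrieved strings. A minimal sketch of that step, assuming the response is the JSON array shown above; the type and function names (`RerankResult`, `OrderByScore`) are illustrative and not taken from the repository:

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// RerankResult mirrors one entry of the re-ranker response.
// The struct name is hypothetical; the JSON keys come from the example.
type RerankResult struct {
	Index int     `json:"index"`
	Score float64 `json:"score"`
}

// OrderByScore parses a re-ranker response and returns the retrieved
// documents ordered from highest to lowest score.
func OrderByScore(docs []string, response []byte) ([]string, error) {
	var results []RerankResult
	if err := json.Unmarshal(response, &results); err != nil {
		return nil, err
	}
	// Sort defensively in case the service does not return sorted results.
	sort.SliceStable(results, func(i, j int) bool {
		return results[i].Score > results[j].Score
	})
	ordered := make([]string, 0, len(results))
	for _, r := range results {
		if r.Index < 0 || r.Index >= len(docs) {
			return nil, fmt.Errorf("index %d out of range", r.Index)
		}
		ordered = append(ordered, docs[r.Index])
	}
	return ordered, nil
}

func main() {
	docs := []string{"Tomatos are fruits...", "Deep Learning is not...", "Deep learning is..."}
	resp := []byte(`[{"index":2,"score":0.9987814},{"index":1,"score":0.022949383},{"index":0,"score":0.000076250595}]`)

	ordered, err := OrderByScore(docs, resp)
	if err != nil {
		panic(err)
	}
	for i, d := range ordered {
		fmt.Printf("%d. %s\n", i+1, d)
	}
}
```

Only the top few entries would normally be passed on as context; where to cut off is a tuning decision the example does not make.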

This repository shows how to combine embeddings and re-ranking to build a RAG system.

Steps followed to implement this RAG system

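Set up the vector database:
- Use Chroma DB to store document embeddings.
- Ollama embedding models and Hugging Face TEI are supported.
Preprocess documents:
- Split the documents into manageable chunks.
- Generate an embedding for each chunk using an embedding model such as Ollama's "nomic-embed-text".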
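The chunking step can be sketched as a word-based splitter with overlap. This is one common strategy, not necessarily the one the repository uses; the function name and the size and overlap parameters are assumptions made for illustration:

```go
package main

import (
	"fmt"
	"strings"
)

// ChunkWords splits text into chunks of at most size words. Consecutive
// chunks share overlap words so a sentence cut at a boundary still appears
// whole in one of them. It returns nil for invalid parameters.
// (Hypothetical helper; the repository's own chunking logic is not shown.)
func ChunkWords(text string, size, overlap int) []string {
	if size <= 0 || overlap < 0 || overlap >= size {
		return nil
	}
	words := strings.Fields(text)
	step := size - overlap
	var chunks []string
	for start := 0; start < len(words); start += step {
		end := start + size
		if end > len(words) {
			end = len(words)
		}
		chunks = append(chunks, strings.Join(words[start:end], " "))
		if end == len(words) {
			break // the final chunk already reaches the end of the text
		}
	}
	return chunks
}

func main() {
	text := "mirostat_tau controls the balance between coherence and diversity of the output"
	for i, c := range ChunkWords(text, 5, 2) {
		fmt.Printf("chunk %d: %s\n", i, c)
	}
}
```

Each chunk would then be sent to the embedding model. Larger chunks keep more context per embedding, while smaller ones make retrieval more precise.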
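Store embeddings:
- Store the chunks and their embeddings in the Chroma DB vector database.
Query processing:
- When a query arrives:
  - Generate an embedding for the query.
  - Run a similarity search in the vector database to identify the most relevant chunks based on their embeddings.
  - Retrieve these chunks as context for the query.
  - Re-rank the results using a Hugging Face re-ranker.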
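Chroma DB performs the similarity search internally. To make the idea concrete, the sketch below ranks in-memory chunks by cosine similarity against a query embedding. Cosine similarity is used here as one common measure and is an assumption, not a statement about how the repository configures Chroma; the `Chunk` type and both function names are illustrative:

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Chunk pairs a piece of text with its embedding (hypothetical type).
type Chunk struct {
	Text      string
	Embedding []float64
}

// Cosine returns the cosine similarity of a and b, or 0 when the vectors
// differ in length or either one has zero magnitude.
func Cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// TopK returns the k chunks most similar to the query embedding,
// most similar first. The input slice is left unmodified.
func TopK(query []float64, chunks []Chunk, k int) []Chunk {
	ranked := make([]Chunk, len(chunks))
	copy(ranked, chunks)
	sort.SliceStable(ranked, func(i, j int) bool {
		return Cosine(query, ranked[i].Embedding) > Cosine(query, ranked[j].Embedding)
	})
	if k < 0 {
		k = 0
	}
	if k > len(ranked) {
		k = len(ranked)
	}
	return ranked[:k]
}

func main() {
	// Toy two-dimensional embeddings; real embeddings have hundreds of dimensions.
	chunks := []Chunk{
		{Text: "Tomatos are fruits...", Embedding: []float64{0, 1}},
		{Text: "Deep learning is...", Embedding: []float64{1, 0}},
		{Text: "Deep Learning is not...", Embedding: []float64{0.9, 0.1}},
	}
	for _, c := range TopK([]float64{1, 0}, chunks, 2) {
		fmt.Println(c.Text)
	}
}
```

The chunks returned by this first pass are what the re-ranker then scores against the query text.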
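Integrate with LLM providers:
- Supported LLM providers
  - Ollama
  - OpenAI
Create a prompt template:
- Design a prompt template that incorporates both the original query and the context retrieved from the vector database.
Process with the LLM:
- Send the augmented prompt, containing the query and the retrieved context, to the LLM (Large Language Model) for processing and response generation.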
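The final step can be sketched as a single HTTP call. The example below assumes Ollama as the provider and uses its `/api/generate` endpoint with streaming disabled. The function names, the model name `llama3`, and the base URL are placeholders, and a stub transport stands in for the server so the sketch runs without a live Ollama instance:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

type generateRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
	Stream bool   `json:"stream"`
}

type generateResponse struct {
	Response string `json:"response"`
}

// Generate posts the augmented prompt to baseURL + "/api/generate" and
// returns the generated text from the non-streaming response.
func Generate(client *http.Client, baseURL, model, prompt string) (string, error) {
	body, err := json.Marshal(generateRequest{Model: model, Prompt: prompt, Stream: false})
	if err != nil {
		return "", err
	}
	resp, err := client.Post(baseURL+"/api/generate", "application/json", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status: %s", resp.Status)
	}
	var out generateResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	return out.Response, nil
}

// stubTransport answers every request with a fixed reply instead of
// contacting a server. It exists only to keep this sketch self-contained.
type stubTransport struct{ reply string }

func (s stubTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	payload, err := json.Marshal(generateResponse{Response: s.reply})
	if err != nil {
		return nil, err
	}
	return &http.Response{
		StatusCode: http.StatusOK,
		Status:     "200 OK",
		Header:     make(http.Header),
		Body:       io.NopCloser(bytes.NewReader(payload)),
		Request:    req,
	}, nil
}

// NewStubClient returns an HTTP client backed by stubTransport.
func NewStubClient(reply string) *http.Client {
	return &http.Client{Transport: stubTransport{reply: reply}}
}

func main() {
	// With a real Ollama instance, pass http.DefaultClient and its URL
	// (commonly http://localhost:11434) instead of the stub.
	client := NewStubClient("stubbed answer")
	answer, err := Generate(client, "http://ollama.invalid", "llama3", "what is mirostat_tau?")
	if err != nil {
		panic(err)
	}
	fmt.Println(answer)
}
```

Disabling streaming keeps the response a single JSON object. A chat-style interface would more likely stream tokens as they arrive.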

This approach uses the vector database and embedding models to ground the LLM's answers in the stored documents.

Sample Results

<|user|> what is mirostat_tau?</s>:-
Based on the provided content, I can answer your query.

**Query Result:** Mirostat_tau Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0)

**Document Content:**

mirostat_tau Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0)
float
mirostat_tau 5.0

**Additional Information on this Topic:**

Here are three main points related to Mirostat_tau:

1. **Coherence vs Diversity:** Mirostat_tau controls the balance between coherence and diversity of the output, which means it determines how focused or creative the generated text will be.
2. **Lower Values Mean More Focus:** A lower value for mirostat_tau results in more focused and coherent text, while a higher value allows for more diverse and potentially less coherent output.
3. **Default Value:** The default value for Mirostat_tau is 5.0, which means that if no specific value is provided, the model will generate text with a balance between coherence and diversity.

Please note that these points are based solely on the provided content and do not go beyond it.

Getting Started

Prerequisites

Go (>= 1.22.0)
Docker
Docker Compose

Installation

Clone the repository

git clone https://github.com/yourusername/chroma-db.git
cd chroma-db

Install the Go packages
Build the Go project

go build -o chroma-db cmd/main.go

Set up the Docker containers

Make sure Docker and Docker Compose are installed, then use docker-compose.yaml to set up the Chroma DB service.

docker-compose up -d

Running the Project

./chroma-db
Usage
  -load
        Load and embed the data in vectordb
        Provide the path to file Eg: "test/model_params.txt"
  -query
        Query the embedded data and rerank the results
        Provide the query Eg: "what is the difference between mirostat_tau and mirostat_eta?"
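The two flags in the usage text could be declared with Go's standard `flag` package. The repository's actual flag handling is not shown, so this is a sketch: it assumes both flags take string values, and the `Options` type and `ParseArgs` function are hypothetical names:

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// Options holds the parsed command-line values (hypothetical type).
type Options struct {
	LoadPath string // path to the file to load and embed
	Query    string // query to run against the embedded data
}

// ParseArgs parses -load and -query from args (without the program name).
func ParseArgs(args []string) (Options, error) {
	var opts Options
	fs := flag.NewFlagSet("chroma-db", flag.ContinueOnError)
	fs.StringVar(&opts.LoadPath, "load", "", "Load and embed the data in vectordb; provide the path to file")
	fs.StringVar(&opts.Query, "query", "", "Query the embedded data and rerank the results; provide the query")
	if err := fs.Parse(args); err != nil {
		return Options{}, err
	}
	return opts, nil
}

func main() {
	opts, err := ParseArgs(os.Args[1:])
	if err != nil {
		os.Exit(2) // the flag package has already printed the error
	}
	switch {
	case opts.LoadPath != "":
		fmt.Println("would load and embed:", opts.LoadPath)
	case opts.Query != "":
		fmt.Println("would query and rerank:", opts.Query)
	default:
		fmt.Println("nothing to do; pass -load or -query")
	}
}
```

Using a dedicated `FlagSet` rather than the package-level flags keeps parsing testable, because the argument slice is passed in explicitly.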

Project Structure

cmd/:
- main.go: Entry point for running Chroma DB.
- chat/:
  - ollama_chat.go: Contains the logic for interacting with the Ollama chat model.
internal/constants/:
- constants.go: Holds all the constants used across the project.
docker-compose.yaml: Docker Compose configuration file for setting up the Chroma DB service.

Configuration

Adjust the configuration values in internal/constants/constants.go to suit your needs. These include settings such as:

Chroma DB URL, tenant name, database, and namespace. Ollama model type and URL.

Prompt Go Template

  <|system|>{{ .SystemPrompt }}</s>
  <|content|>{{ .Content }}</s>
  <|user|>{{ .Prompt }}</s>
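A template like this can be rendered with Go's `text/template` package. The sketch below takes the field names `SystemPrompt`, `Content`, and `Prompt` from the template itself. The `PromptData` type, the `RenderPrompt` function, and the sample values are assumptions made for illustration:

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// promptTemplate reproduces the three-part prompt layout.
const promptTemplate = `<|system|>{{ .SystemPrompt }}</s>
<|content|>{{ .Content }}</s>
<|user|>{{ .Prompt }}</s>`

// PromptData supplies the values substituted into the template
// (hypothetical type; only the field names come from the template).
type PromptData struct {
	SystemPrompt string // instructions for the model
	Content      string // context retrieved from the vector database
	Prompt       string // the user's original query
}

// RenderPrompt fills the template and returns the augmented prompt.
func RenderPrompt(data PromptData) (string, error) {
	tmpl, err := template.New("prompt").Parse(promptTemplate)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := RenderPrompt(PromptData{
		SystemPrompt: "Answer only from the provided content.",
		Content:      "mirostat_tau Controls the balance between coherence and diversity of the output.",
		Prompt:       "what is mirostat_tau?",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

`text/template` is used rather than `html/template` because the prompt is plain text, and HTML escaping would corrupt the `<|...|>` markers.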

Running the vectordb

Start the vectordb with the following command:

docker compose up

Chat with Ollama

Run the chat-related tasks:

go run ./cmd/main.go

Configuration

Default configuration values are provided in internal/constants/constants.go and can be adjusted as needed. Some of these are:

ChromaUrl , TenantName , Database , Namespace
OllamaModel and OllamaUrl
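As a rough picture of what such a constants file might contain, the sketch below declares the six names listed here. Every value is a placeholder chosen for illustration (commonly used local defaults for Chroma and Ollama, plus an invented namespace), not a value read from the repository. In the project these would live in a `constants` package; `package main` is used only so the sketch runs on its own:

```go
package main

import "fmt"

// All values below are illustrative placeholders, not the repository's
// actual defaults. Only the constant names come from the README.
const (
	ChromaUrl  = "http://localhost:8000" // assumed local Chroma endpoint
	TenantName = "default_tenant"        // assumed tenant
	Database   = "default_database"      // assumed database
	Namespace  = "example-namespace"     // hypothetical namespace

	OllamaModel = "nomic-embed-text"       // embedding model named in the steps; may differ in the repo
	OllamaUrl   = "http://localhost:11434" // assumed local Ollama endpoint
)

func main() {
	fmt.Println("Chroma:", ChromaUrl, TenantName, Database, Namespace)
	fmt.Println("Ollama:", OllamaModel, OllamaUrl)
}
```

Keeping these as compile-time constants means a rebuild is needed after every change. Environment variables or a config file would avoid that, at the cost of extra parsing code.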

License

This project is licensed under the BSD 3-Clause License. See the LICENSE file for details.

Acknowledgements

Chroma DB
Ollama

For issues or contributions, please open an issue or submit a pull request on GitHub.

Additional Information

Version: 1.0.0
Type: Other source code
Updated: 2025-05-26
Size: 218.77KB
Source: GitHub
