Auto-Ollama & Auto-Gguf ⚡️

Run inference on or quantize large language models (LLMs) locally with a single command

Overview

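Auto-Ollama is a toolkit designed to simplify inference on and quantization of large language models (LLMs) directly in your local environment. With an emphasis on ease of use and flexibility, it supports both using models directly and converting them into efficient formats for local deployment.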

For quantization, check out the new Auto-QuantLLM package. It is still under development, but it aims to provide a streamlined, user-friendly way to quantize large language models (LLMs) with different quantization methods.

Getting Started

Installation

Clone the repository to get started with Auto-Ollama:

git clone https://github.com/monk1337/auto-ollama.git
cd auto-ollama

Quick Tour

Running Auto-Ollama

Use the autollama.sh script to quickly run inference on an LLM. The script requires the model name and the quantized file name as arguments.

# Deploy Large Language Models (LLMs) locally with Auto-Ollama
# Usage:
# ./scripts/autollama.sh -m <model path> -g <gguf file name>


# Example command:
./scripts/autollama.sh -m TheBloke/MistralLite-7B-GGUF -g mistrallite.Q4_K_M.gguf
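
Before starting a download, it can help to sanity-check both arguments. The following is a minimal sketch and not part of Auto-Ollama: `check_autollama_args` is a hypothetical helper, and it assumes that the model argument is a Hugging Face repository ID of the form `owner/name` and that the file argument ends in `.gguf`, as in the example command above.

```shell
# Hypothetical helper (not part of Auto-Ollama): sanity-check the two
# arguments before passing them to ./scripts/autollama.sh.
check_autollama_args() {
  # Assumption: the model is a Hugging Face repo ID of the form owner/name.
  case "$1" in
    */*/* | /* | */ | *" "*)
      echo "model should look like owner/name, got: '$1'" >&2
      return 1 ;;
    */*) ;;
    *)
      echo "model should look like owner/name, got: '$1'" >&2
      return 1 ;;
  esac

  # Assumption: the quantized file name ends in .gguf.
  case "$2" in
    ?*.gguf) ;;
    *)
      echo "file should end in .gguf, got: '$2'" >&2
      return 1 ;;
  esac
}

# Usage: run the script only if both arguments pass the checks.
# check_autollama_args TheBloke/MistralLite-7B-GGUF mistrallite.Q4_K_M.gguf \
#   && ./scripts/autollama.sh -m TheBloke/MistralLite-7B-GGUF -g mistrallite.Q4_K_M.gguf
```

The checks only look at the shape of the strings; they do not confirm that the repository or file actually exists.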

Handling Non-Quantized Models with AutoGGUF

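If the model you want is not available in a quantized format suitable for local deployment, Auto-Ollama provides the AutoGGUF utility. This tool converts Hugging Face models to GGUF format and can upload the result to the Hugging Face model hub.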

# Convert your Hugging Face model to GGUF format for local deployment
# Usage:
# ./scripts/autogguf.sh -m <MODEL_ID> [-u USERNAME] [-t TOKEN] [-q QUANTIZATION_METHODS]

# Example command:
./scripts/autogguf.sh -m unsloth/gemma-2b
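
The usage line above lists `-u` and `-t` as optional. One way to avoid typing a token literally on the command line is to read the credentials from environment variables and add the flags only when both are set. This is a sketch under stated assumptions: `run_autogguf` is a hypothetical wrapper, and `AUTOGGUF`, `HF_USERNAME`, and `HF_TOKEN` are variable names chosen here for illustration, not names the Auto-Ollama scripts are known to read.

```shell
# Hypothetical wrapper (not part of Auto-Ollama).
# AUTOGGUF, HF_USERNAME and HF_TOKEN are illustrative variable names.
run_autogguf() {
  if [ "$#" -lt 1 ]; then
    echo "usage: run_autogguf <MODEL_ID>" >&2
    return 2
  fi

  # Path to the conversion script; overridable for dry runs.
  script=${AUTOGGUF:-./scripts/autogguf.sh}

  # Start with the required model flag.
  set -- -m "$1"

  # Add the upload flags only when both credentials are present.
  if [ -n "${HF_USERNAME:-}" ] && [ -n "${HF_TOKEN:-}" ]; then
    set -- "$@" -u "$HF_USERNAME" -t "$HF_TOKEN"
  fi

  "$script" "$@"
}

# Usage (from the repository root):
# HF_USERNAME=user_name HF_TOKEN=hf_token run_autogguf unsloth/gemma-2b
```

Requiring both variables means a half-configured environment falls back to a plain local conversion rather than passing an incomplete set of upload flags.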

More Options

# To upload the GGUF model to the Hub after conversion, provide the username and token
# Example command:
./scripts/autogguf.sh -m unsloth/gemma-2b -u user_name -t hf_token

# To specify the quantization methods, pass a comma-separated list with -q
# Example command:
./scripts/autogguf.sh -m unsloth/gemma-2b -u user_name -t hf_token -q "q4_k_m,q5_k_m"
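
When the list for `-q` is assembled from user input or a config file, stray whitespace can creep in around the commas. The sketch below strips it before the value reaches the script. `normalize_quant_methods` is a hypothetical helper, and it rests on the assumption that `autogguf.sh` expects a bare comma-separated list with no spaces.

```shell
# Hypothetical helper (not part of Auto-Ollama): remove all whitespace
# from a comma-separated list of quantization methods.
normalize_quant_methods() {
  printf '%s' "$1" | tr -d '[:space:]'
}

# Usage: pass the cleaned list to -q.
# ./scripts/autogguf.sh -m unsloth/gemma-2b -q "$(normalize_quant_methods ' q4_k_m, q5_k_m ')"
```

The helper does not validate the method names themselves; it only cleans up spacing.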