flex_ai Download - Download the flex_ai source code

flex_ai

AI source code

0.42

Full Example

from flex_ai import FlexAI
from openai import OpenAI
import time

# Initialize the Flex AI client
client = FlexAI(api_key="your_api_key_here")

# Create a dataset from a training file and an evaluation file.
# For all dataset types, see https://docs.getflex.ai/quickstart#upload-your-first-dataset
dataset = client.create_dataset(
    "API Dataset New",
    "instruction/train.jsonl",
    "instruction/eval.jsonl"
)

# Start a fine-tuning task
task = client.create_finetune(
    name="My Task New",
    dataset_id=dataset["id"],
    model="meta-llama/Llama-3.2-1B-Instruct",
    n_epochs=5,
    train_with_lora=True,
    lora_config={
        "lora_r": 64,
        "lora_alpha": 8,
        "lora_dropout": 0.1
    },
    n_checkpoints_and_evaluations_per_epoch=1,
    batch_size=4,
    learning_rate=0.0001,
    save_only_best_checkpoint=True
)

# Wait for training completion
client.wait_for_task_completion(task_id=task["id"])

# Wait for the last checkpoint to be uploaded
while True:
    checkpoints = client.get_task_checkpoints(task_id=task["id"])
    if checkpoints and checkpoints[-1]["stage"] == "FINISHED":
        last_checkpoint = checkpoints[-1]
        checkpoint_list = [{
            "id": last_checkpoint["id"],
            "name": "step_" + str(last_checkpoint["step"])
        }]
        break
    time.sleep(10)  # Wait 10 seconds before checking again

# Create an endpoint that serves the fine-tuned LoRA checkpoint
endpoint_id = client.create_multi_lora_endpoint(
    name="My Endpoint New",
    lora_checkpoints=checkpoint_list,  # the list built in the loop above
    compute="A100-40GB"
)
endpoint = client.wait_for_endpoint_ready(endpoint_id=endpoint_id)

# Use the model through the OpenAI-compatible API of the endpoint
openai_client = OpenAI(
    api_key="your_api_key_here",
    base_url=f"{endpoint['url']}/v1"
)
completion = openai_client.completions.create(
    model="meta-llama/Llama-3.2-1B-Instruct",
    prompt="Translate the following English text to French",
    max_tokens=60
)

print(completion.choices[0].text)
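
The checkpoint loop in the example above polls every 10 seconds with no upper bound, so it never returns if the last checkpoint never reaches the FINISHED stage. A minimal sketch of a bounded alternative follows. The helper names `poll_until` and `last_finished_checkpoint` and the 30-minute default timeout are assumptions made for illustration; they are not part of the flex_ai SDK.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_until(
    fetch: Callable[[], Optional[T]],
    timeout_s: float = 1800.0,  # assumed default, not an SDK value
    interval_s: float = 10.0,
) -> T:
    """Call fetch() until it returns a non-None value or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout_s} seconds")
        time.sleep(interval_s)


def last_finished_checkpoint(checkpoints):
    """Return the last checkpoint if its stage is "FINISHED", otherwise None."""
    if checkpoints and checkpoints[-1]["stage"] == "FINISHED":
        return checkpoints[-1]
    return None
```

With the `client` and `task` objects from the example, the loop could then be written as `last_checkpoint = poll_until(lambda: last_finished_checkpoint(client.get_task_checkpoints(task_id=task["id"])))`, which raises `TimeoutError` instead of waiting indefinitely.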

LLM Models Available for Fine-Tuning

This table gives an overview of the large language models (LLMs) available for fine-tuning, ordered roughly from most to least well known. It lists key details for each model: its name, family, parameter count, context length, and additional features.

Model Name	Family	Parameters (B)	Context Length	vLLM Support	LoRA Support
Nvidia-Llama-3.1-Nemotron-70B-Instruct-HF	llama3.1	70	131,072	Yes	Yes
Meta-Llama-3.2-3B-Instruct	llama3.2	3	131,072	Yes	Yes
Meta-Llama-3.2-1B-Instruct	llama3.2	1	131,072	Yes	Yes
Mistral-Small-Instruct-2409	Mistral	7.2	128,000	Yes	Yes
Ministral-8B-Instruct-2410	Mistral	8	128,000	Yes	Yes
Mathstral-7B-v0.1	Mistral	7	32,000	Yes	Yes
Qwen2.5-Coder-7B-Instruct	Qwen2.5	7	32,768	Yes	Yes
Aya-Expanse-32B	Aya	32	128,000	Yes	No
Aya-Expanse-8B	Aya	8	8,000	Yes	No
Nemotron-Mini-4B-Instruct	Nemotron	4	4,096	Yes	No
Gemma-2-2b-it	Gemma2	2	8,192	Yes	Yes
Meta-Llama-3.1-70B-Instruct	llama3.1	70	131,072	Yes	Yes
Meta-Llama-3.1-70B	llama3.1	70	131,072	Yes	Yes
Meta-Llama-3.1-8B-Instruct	llama3.1	8	131,072	Yes	Yes
Meta-Llama-3.1-8B	llama3.1	8	131,072	Yes	Yes
Meta-Llama-3-70B-Instruct	llama3	70	8,192	Yes	Yes
Meta-Llama-3-70B	llama3	70	8,192	Yes	Yes
Meta-Llama-3-8B-Instruct	llama3	8	8,192	Yes	Yes
Meta-Llama-3-8B	llama3	8	8,192	Yes	Yes
Mixtral-8x7B-Instruct-v0.1	Mixtral	46.7	32,768	Yes	Yes
Mistral-7B-Instruct-v0.3	Mistral	7.2	32,768	Yes	Yes
Mistral-Nemo-Instruct-2407	Mistral	12.2	128,000	No	No
Mistral-Nemo-Base-2407	Mistral	12.2	128,000	No	No
Gemma-2-27b-it	Gemma2	27	8,192	Yes	Yes
Gemma-2-27b	Gemma2	27	8,192	Yes	Yes
Gemma-2-9b-it	Gemma2	9	8,192	Yes	Yes
Gemma-2-9b	Gemma2	9	8,192	Yes	Yes
Phi-3-medium-128k-instruct	Phi3	14	128,000	Yes	No
Phi-3-medium-4k-instruct	Phi3	14	4,000	Yes	No
Phi-3-small-128k-instruct	Phi3	7.4	128,000	Yes	No
Phi-3-small-8k-instruct	Phi3	7.4	8,000	Yes	No
Phi-3-mini-128k-instruct	Phi3	3.8	128,000	Yes	No
Phi-3-mini-4k-instruct	Phi3	3.8	4,096	Yes	No
Qwen2-72B-Instruct	Qwen2	72	32,768	Yes	Yes
Qwen2-72B	Qwen2	72	32,768	Yes	Yes
Qwen2-57B-A14B-Instruct	Qwen2	57	32,768	Yes	Yes
Qwen2-57B-A14B	Qwen2	57	32,768	Yes	Yes
Qwen2-7B-Instruct	Qwen2	7	32,768	Yes	Yes
Qwen2-7B	Qwen2	7	32,768	Yes	Yes
Qwen2-1.5B-Instruct	Qwen2	1.5	32,768	Yes	Yes
Qwen2-1.5B	Qwen2	1.5	32,768	Yes	Yes
Qwen2-0.5B-Instruct	Qwen2	0.5	32,768	Yes	Yes
Qwen2-0.5B	Qwen2	0.5	32,768	Yes	Yes
TinyLlama_v1.1	TinyLlama	1.1	2,048	No	No
DeepSeek-Coder-V2-Lite-Base	DeepSeek-Coder-V2	16	163,840	No	No
internlm2_5-7b-chat	InternLM2.5	7.74	1,000,000	Yes	No
internlm2_5-7b	InternLM2.5	7.74	1,000,000	Yes	No
Jamba-v0.1	Jamba	51.6	256,000	Yes	Yes
Yi-1.5-34B-Chat	yi-1.5	34.4	4,000	Yes	Yes
Yi-1.5-34B	yi-1.5	34.4	4,000	Yes	Yes
Yi-1.5-34B-32K	yi-1.5	34.4	32,000	Yes	Yes
Yi-1.5-34B-Chat-16K	yi-1.5	34.4	16,000	Yes	Yes
Yi-1.5-9B-Chat	yi-1.5	8.83	4,000	Yes	Yes
Yi-1.5-9B	yi-1.5	8.83	4,000	Yes	Yes
Yi-1.5-9B-32K	yi-1.5	8.83	32,000	Yes	Yes
Yi-1.5-9B-Chat-16K	yi-1.5	8.83	16,000	Yes	Yes
Yi-1.5-6B-Chat	yi-1.5	6	4,000	Yes	Yes
Yi-1.5-6B	yi-1.5	6	4,000	Yes	Yes
C4AI-Command-R-v01	command-r	35	131,072	Yes	No

Notes:

"vLLM Support" indicates whether the model is compatible with the vLLM inference framework.
"LoRA Support" indicates whether vLLM supports inference for the model with multiple LoRA adapters.
Context length is measured in tokens. (A model's usable context may differ depending on the target inference library.)
Parameter count is given in billions (B).
Links lead to the model's page on Hugging Face or to its official website, where available.

This table covers the available models, their sizes, their capabilities, and their support for the techniques listed. When choosing a model to fine-tune, consider its size, its context length, and whether it supports vLLM inference and LoRA adapters.
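
The selection criteria above can be applied mechanically once the table is held as data. The sketch below assumes a hypothetical `ModelInfo` record and a `pick_models` helper, neither of which is part of flex_ai, and includes only a handful of rows from the table for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    name: str
    params_b: float      # parameter count in billions
    context_length: int  # in tokens
    vllm: bool           # "vLLM Support" column
    lora: bool           # "LoRA Support" column


# A few rows taken from the table; extend as needed.
MODELS = [
    ModelInfo("Meta-Llama-3.2-1B-Instruct", 1, 131_072, True, True),
    ModelInfo("Gemma-2-2b-it", 2, 8_192, True, True),
    ModelInfo("Phi-3-mini-4k-instruct", 3.8, 4_096, True, False),
    ModelInfo("Qwen2-7B-Instruct", 7, 32_768, True, True),
    ModelInfo("Mistral-Nemo-Instruct-2407", 12.2, 128_000, False, False),
]


def pick_models(models, max_params_b, min_context, need_lora=True):
    """Return models within the size budget that meet the context and
    LoRA requirements, smallest first."""
    matches = [
        m for m in models
        if m.params_b <= max_params_b
        and m.context_length >= min_context
        and (m.lora or not need_lora)
    ]
    return sorted(matches, key=lambda m: m.params_b)


# Example: at most 8B parameters, at least 32,000 tokens of context,
# and LoRA adapter support.
for m in pick_models(MODELS, max_params_b=8, min_context=32_000):
    print(m.name, m.params_b, m.context_length)
```

Sorting by parameter count puts the cheapest candidate first; a different ordering, such as longest context first, only requires changing the sort key.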
