flex_ai 0.42
The quickstart below creates a dataset, fine-tunes meta-llama/Llama-3.2-1B-Instruct with LoRA, deploys the resulting checkpoint to an endpoint, and queries it through the OpenAI-compatible API.

```python
from flex_ai import FlexAI
from openai import OpenAI
import time

# Initialize the Flex AI client
client = FlexAI(api_key="your_api_key_here")

# Create a dataset - see https://docs.getflex.ai/quickstart#upload-your-first-dataset for all dataset formats
dataset = client.create_dataset(
    "API Dataset New",
    "instruction/train.jsonl",
    "instruction/eval.jsonl"
)

# Start a fine-tuning task
task = client.create_finetune(
    name="My Task New",
    dataset_id=dataset["id"],
    model="meta-llama/Llama-3.2-1B-Instruct",
    n_epochs=5,
    train_with_lora=True,
    lora_config={
        "lora_r": 64,
        "lora_alpha": 8,
        "lora_dropout": 0.1
    },
    n_checkpoints_and_evaluations_per_epoch=1,
    batch_size=4,
    learning_rate=0.0001,
    save_only_best_checkpoint=True
)

# Wait for training completion
client.wait_for_task_completion(task_id=task["id"])

# Wait for the last checkpoint to finish uploading
while True:
    checkpoints = client.get_task_checkpoints(task_id=task["id"])
    if checkpoints and checkpoints[-1]["stage"] == "FINISHED":
        last_checkpoint = checkpoints[-1]
        checkpoint_list = [{
            "id": last_checkpoint["id"],
            "name": "step_" + str(last_checkpoint["step"])
        }]
        break
    time.sleep(10)  # Wait 10 seconds before checking again

# Create an endpoint that serves the LoRA checkpoint
endpoint_id = client.create_multi_lora_endpoint(
    name="My Endpoint New",
    lora_checkpoints=checkpoint_list,
    compute="A100-40GB"
)
endpoint = client.wait_for_endpoint_ready(endpoint_id=endpoint_id)

# Use the model through the OpenAI-compatible API
openai_client = OpenAI(
    api_key="your_api_key_here",
    base_url=f"{endpoint['url']}/v1"
)
completion = openai_client.completions.create(
    model="meta-llama/Llama-3.2-1B-Instruct",
    prompt="Translate the following English text to French",
    max_tokens=60
)
print(completion.choices[0].text)
```
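The completion call above uses the plain text-completions route. Because the endpoint is addressed through the standard OpenAI client, a chat-style request may also work; the sketch below is an assumption that the endpoint additionally serves the OpenAI `/v1/chat/completions` route, and it reuses the `openai_client` from the quickstart.

```python
# Sketch: chat-style request against the same endpoint.
# Assumes the endpoint also implements the OpenAI /v1/chat/completions route.
chat = openai_client.chat.completions.create(
    model="meta-llama/Llama-3.2-1B-Instruct",
    messages=[
        {"role": "system", "content": "You translate English text to French."},
        {"role": "user", "content": "Good morning, how are you?"}
    ],
    max_tokens=60
)
print(chat.choices[0].message.content)
```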
This table provides an overview of the Large Language Models (LLMs) available for fine-tuning, ordered approximately from most well-known to least familiar. It lists key details for each model, including its name, family, parameter count, context length, and vLLM and LoRA support.
| Model Name | Family | Parameters (B) | Context Length (tokens) | vLLM Support | LoRA Support |
|---|---|---|---|---|---|
| Nvidia-Llama-3.1-Nemotron-70B-Instruct-HF | llama3.1 | 70 | 131,072 | Yes | Yes |
| Meta-Llama-3.2-3B-Instruct | llama3.2 | 3 | 131,072 | Yes | Yes |
| Meta-Llama-3.2-1B-Instruct | llama3.2 | 1 | 131,072 | Yes | Yes |
| Mistral-Small-Instruct-2409 | mistral | 7.2 | 128,000 | Yes | Yes |
| Ministral-8B-Instruct-2410 | mistral | 8 | 128,000 | Yes | Yes |
| Mathstral-7B-v0.1 | mistral | 7 | 32,000 | Yes | Yes |
| Qwen2.5-Coder-7B-Instruct | qwen2.5 | 7 | 32,768 | Yes | Yes |
| Aya-Expanse-32b | aya | 32 | 128,000 | Yes | No |
| Aya-Expanse-8b | aya | 8 | 8,000 | Yes | No |
| Nemotron-Mini-4B-Instruct | nemotron | 4 | 4,096 | Yes | No |
| Gemma-2-2b-it | gemma2 | 2 | 8,192 | Yes | Yes |
| Meta-Llama-3.1-70B-Instruct | llama3.1 | 70 | 131,072 | Yes | Yes |
| Meta-Llama-3.1-70B | llama3.1 | 70 | 131,072 | Yes | Yes |
| Meta-Llama-3.1-8B-Instruct | llama3.1 | 8 | 131,072 | Yes | Yes |
| Meta-Llama-3.1-8B | llama3.1 | 8 | 131,072 | Yes | Yes |
| Meta-Llama-3-70B-Instruct | llama3 | 70 | 8,192 | Yes | Yes |
| Meta-Llama-3-70B | llama3 | 70 | 8,192 | Yes | Yes |
| Meta-Llama-3-8B-Instruct | llama3 | 8 | 8,192 | Yes | Yes |
| Meta-Llama-3-8B | llama3 | 8 | 8,192 | Yes | Yes |
| Mixtral-8x7B-Instruct-v0.1 | mixtral | 46.7 | 32,768 | Yes | Yes |
| Mistral-7B-Instruct-v0.3 | mistral | 7.2 | 32,768 | Yes | Yes |
| Mistral-Nemo-Instruct-2407 | mistral | 12.2 | 128,000 | No | No |
| Mistral-Nemo-Base-2407 | mistral | 12.2 | 128,000 | No | No |
| Gemma-2-27b-it | gemma2 | 27 | 8,192 | Yes | Yes |
| Gemma-2-27b | gemma2 | 27 | 8,192 | Yes | Yes |
| Gemma-2-9b-it | gemma2 | 9 | 8,192 | Yes | Yes |
| Gemma-2-9b | gemma2 | 9 | 8,192 | Yes | Yes |
| Phi-3-medium-128k-instruct | phi3 | 14 | 128,000 | Yes | No |
| Phi-3-medium-4k-instruct | phi3 | 14 | 4,000 | Yes | No |
| Phi-3-small-128k-instruct | phi3 | 7.4 | 128,000 | Yes | No |
| Phi-3-small-8k-instruct | phi3 | 7.4 | 8,000 | Yes | No |
| Phi-3-mini-128k-instruct | phi3 | 3.8 | 128,000 | Yes | No |
| Phi-3-mini-4k-instruct | phi3 | 3.8 | 4,096 | Yes | No |
| Qwen2-72B-Instruct | qwen2 | 72 | 32,768 | Yes | Yes |
| Qwen2-72B | qwen2 | 72 | 32,768 | Yes | Yes |
| Qwen2-57B-A14B-Instruct | qwen2 | 57 | 32,768 | Yes | Yes |
| Qwen2-57B-A14B | qwen2 | 57 | 32,768 | Yes | Yes |
| Qwen2-7B-Instruct | qwen2 | 7 | 32,768 | Yes | Yes |
| Qwen2-7B | qwen2 | 7 | 32,768 | Yes | Yes |
| Qwen2-1.5B-Instruct | qwen2 | 1.5 | 32,768 | Yes | Yes |
| Qwen2-1.5B | qwen2 | 1.5 | 32,768 | Yes | Yes |
| Qwen2-0.5B-Instruct | qwen2 | 0.5 | 32,768 | Yes | Yes |
| Qwen2-0.5B | qwen2 | 0.5 | 32,768 | Yes | Yes |
| TinyLlama_v1.1 | tinyllama | 1.1 | 2,048 | No | No |
| DeepSeek-Coder-V2-Lite-Base | deepseek-coder-v2 | 16 | 163,840 | No | No |
| InternLM2_5-7B-Chat | internlm2.5 | 7.74 | 1,000,000 | Yes | No |
| InternLM2_5-7B | internlm2.5 | 7.74 | 1,000,000 | Yes | No |
| Jamba-v0.1 | jamba | 51.6 | 256,000 | Yes | Yes |
| Yi-1.5-34B-Chat | yi-1.5 | 34.4 | 4,000 | Yes | Yes |
| Yi-1.5-34B | yi-1.5 | 34.4 | 4,000 | Yes | Yes |
| Yi-1.5-34B-32K | yi-1.5 | 34.4 | 32,000 | Yes | Yes |
| Yi-1.5-34B-Chat-16K | yi-1.5 | 34.4 | 16,000 | Yes | Yes |
| Yi-1.5-9B-Chat | yi-1.5 | 8.83 | 4,000 | Yes | Yes |
| Yi-1.5-9B | yi-1.5 | 8.83 | 4,000 | Yes | Yes |
| Yi-1.5-9B-32K | yi-1.5 | 8.83 | 32,000 | Yes | Yes |
| Yi-1.5-9B-Chat-16K | yi-1.5 | 8.83 | 16,000 | Yes | Yes |
| Yi-1.5-6B-Chat | yi-1.5 | 6 | 4,000 | Yes | Yes |
| Yi-1.5-6B | yi-1.5 | 6 | 4,000 | Yes | Yes |
| c4ai-command-r-v01 | command-r | 35 | 131,072 | Yes | No |
The table gives a comprehensive picture of the available models, their sizes, context lengths, and fine-tuning support. When choosing a model to fine-tune, weigh its parameter count, context length, and whether it supports vLLM serving and LoRA adapters.
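As a rough illustration of that trade-off, here is a short sketch that filters a hand-copied excerpt of the table for models that fit a LoRA workflow with a reasonably long context window. The `MODELS` list and the thresholds are hypothetical helpers for illustration only; they are not part of the flex_ai API.

```python
# Hypothetical helper: shortlist candidate models from the table above.
# The MODELS list is a hand-copied excerpt of the table, not an API response.
MODELS = [
    {"name": "Meta-Llama-3.2-1B-Instruct", "params_b": 1, "context": 131_072, "vllm": True, "lora": True},
    {"name": "Qwen2.5-Coder-7B-Instruct", "params_b": 7, "context": 32_768, "vllm": True, "lora": True},
    {"name": "Phi-3-mini-128k-instruct", "params_b": 3.8, "context": 128_000, "vllm": True, "lora": False},
]

def candidates(models, max_params_b=8, min_context=32_000, need_lora=True):
    """Return names of models small enough to train cheaply that still meet context and LoRA needs."""
    return [
        m["name"]
        for m in models
        if m["params_b"] <= max_params_b
        and m["context"] >= min_context
        and (m["lora"] or not need_lora)
    ]

print(candidates(MODELS))  # ['Meta-Llama-3.2-1B-Instruct', 'Qwen2.5-Coder-7B-Instruct']
```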