
MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple Silicon (M-series) chips. It implements OpenAI-compatible API endpoints, enabling seamless integration with existing OpenAI SDK clients while leveraging the power of local ML inference.
The server implements the following OpenAI-compatible endpoints:

- `/v1/chat/completions` - Chat completions
- `/v1/audio/speech` - Text-to-Speech
- `/v1/audio/transcriptions` - Speech-to-Text
- `/v1/models` - List models
- `/v1/models/{model}` - Retrieve or delete a model
- `/v1/images/generations` - Image generation

```bash
# Install using pip
pip install mlx-omni-server
```

```bash
# If installed via pip as a package
mlx-omni-server
```

The server listens on port 10240 by default. You can use `--port` to specify a different port, for example `mlx-omni-server --port 10240`. You can view more startup parameters with `mlx-omni-server --help`.
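Before pointing clients at the server, it can be handy to verify it is reachable. The helper below is a small sketch (not part of the package) that probes the `/v1/models` endpoint using only the Python standard library, assuming the documented default port:

```python
# Sketch: check whether a local MLX Omni Server instance is reachable.
# Assumes the default address http://localhost:10240 from the docs above.
import urllib.request
import urllib.error


def server_is_up(base_url: str = "http://localhost:10240", timeout: float = 2.0) -> bool:
    """Return True if the server answers the /v1/models endpoint."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/models", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, etc.
        return False


if __name__ == "__main__":
    print("server reachable:", server_is_up())
```

If this prints `False`, start the server with `mlx-omni-server` first.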
```python
from openai import OpenAI

# Configure the client to use the local server
client = OpenAI(
    base_url="http://localhost:10240/v1",  # Point to the local server
    api_key="not-needed"  # API key is not required for the local server
)

# Text-to-Speech example
response = client.audio.speech.create(
    model="lucasnewman/f5-tts-mlx",
    input="Hello, welcome to MLX Omni Server!"
)

# Speech-to-Text example
with open("speech.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="mlx-community/whisper-large-v3-turbo",
        file=audio_file
    )

# Chat completion example
chat_completion = client.chat.completions.create(
    model="meta-llama/Llama-3.2-3B-Instruct",
    messages=[
        {"role": "user", "content": "What can you do?"}
    ]
)

# Image generation example
image_response = client.images.generate(
    model="argmaxinc/mlx-FLUX.1-schnell",
    prompt="A serene landscape with mountains and a lake",
    n=1,
    size="512x512"
)
```

You can find more examples in the examples directory.
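Because the API is OpenAI-compatible, you can also call it over plain HTTP without the SDK. The sketch below builds a standard chat-completions request with only the Python standard library; the base URL and model name are assumptions taken from the examples above, and the request only succeeds if the server is running:

```python
# Sketch: call the chat completions endpoint over raw HTTP (no SDK).
import json
import urllib.request
import urllib.error


def build_chat_request(base_url: str, model: str, user_message: str) -> urllib.request.Request:
    """Build a POST request matching the standard OpenAI chat schema."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request(
    "http://localhost:10240", "meta-llama/Llama-3.2-3B-Instruct", "What can you do?"
)
try:
    with urllib.request.urlopen(req, timeout=10) as resp:
        reply = json.loads(resp.read())
        print(reply["choices"][0]["message"]["content"])
except (urllib.error.URLError, OSError):
    print("Server not reachable; start mlx-omni-server first.")
```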
We welcome contributions! If you're interested in contributing to MLX Omni Server, please check out our Development Guide for detailed information.
For major changes, please open an issue first to discuss what you would like to change.
This project is licensed under the MIT License - see the LICENSE file for details.
This project is not affiliated with or endorsed by OpenAI or Apple. It's an independent implementation that provides OpenAI-compatible APIs using Apple's MLX framework.