This project provides a local, OpenAI-compatible text-to-speech (TTS) API using edge-tts. It emulates the OpenAI TTS endpoint (/v1/audio/speech), enabling users to generate speech from text with various voice options and playback speeds, just like the OpenAI API.
edge-tts uses Microsoft Edge's online text-to-speech service, so it is completely free.
View this project on Docker Hub
The server emulates OpenAI's `/v1/audio/speech` endpoint with a similar request structure and behavior, and maps OpenAI voice names to `edge-tts` equivalents. Dependencies are listed in `requirements.txt`.

Clone the repository:

```bash
git clone https://github.com/travisvn/openai-edge-tts.git
cd openai-edge-tts
```

Create a `.env` file in the root directory with the following variables:

```
API_KEY=your_api_key_here
PORT=5050
DEFAULT_VOICE=en-US-AndrewNeural
DEFAULT_RESPONSE_FORMAT=mp3
DEFAULT_SPEED=1.2
DEFAULT_LANGUAGE=en-US
REQUIRE_API_KEY=True
```
Or, copy the default `.env.example` with the following:

```bash
cp .env.example .env
```

Then build and start the service:

```bash
docker compose up --build
```

(Note: `docker-compose` is not the same as `docker compose`.)

Run with `-d` to run Docker Compose in detached mode, meaning it will run in the background and free up your terminal:

```bash
docker compose up -d
```

Alternatively, run directly with Docker:

```bash
docker build -t openai-edge-tts .
docker run -p 5050:5050 --env-file .env openai-edge-tts
```

To run the container in the background, add `-d` after the `docker run` command:

```bash
docker run -d -p 5050:5050 --env-file .env openai-edge-tts
```

The server will be available at http://localhost:5050.

If you prefer to run this project directly with Python, follow these steps to set up a virtual environment, install dependencies, and start the server.
```bash
git clone https://github.com/travisvn/openai-edge-tts.git
cd openai-edge-tts
```

Create and activate a virtual environment to isolate dependencies:

```bash
# For macOS/Linux
python3 -m venv venv
source venv/bin/activate

# For Windows
python -m venv venv
venv\Scripts\activate
```

Use pip to install the required packages listed in `requirements.txt`:

```bash
pip install -r requirements.txt
```

Create a `.env` file in the root directory and set the following variables:

```
API_KEY=your_api_key_here
PORT=5050
DEFAULT_VOICE=en-US-AndrewNeural
DEFAULT_RESPONSE_FORMAT=mp3
DEFAULT_SPEED=1.2
DEFAULT_LANGUAGE=en-US
REQUIRE_API_KEY=True
```

Once configured, start the server with:

```bash
python app/server.py
```

The server will start running at http://localhost:5050.
You can now interact with the API at http://localhost:5050/v1/audio/speech and other available endpoints. See the Usage section for request examples.
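The same request can be made from Python with the `requests` library. A minimal sketch, assuming the server is running locally on the default port and the API key below matches the `API_KEY` in your `.env`:

```python
import requests

BASE_URL = "http://localhost:5050"  # adjust if you changed PORT or the host
API_KEY = "your_api_key_here"       # any string, as long as it matches .env


def synthesize(text: str, voice: str = "en-US-AndrewNeural",
               response_format: str = "mp3", speed: float = 1.2) -> bytes:
    """POST to /v1/audio/speech and return the raw audio bytes."""
    resp = requests.post(
        f"{BASE_URL}/v1/audio/speech",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "input": text,
            "voice": voice,
            "response_format": response_format,
            "speed": speed,
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.content


# Example usage (requires the server to be running):
# with open("speech.mp3", "wb") as f:
#     f.write(synthesize("Hello from the local TTS server!"))
```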
`/v1/audio/speech`

Generates audio from the input text. Available parameters:

Required Parameter:

- `input` (string): The text to convert to speech.

Optional Parameters:

- `model` (string): The model to request (default: `"tts-1"`).
- `voice` (string): An OpenAI voice name (mapped to an equivalent) or any valid `edge-tts` voice (default: `"en-US-AndrewNeural"`).
- `response_format` (string): One of `mp3`, `opus`, `aac`, `flac`, `wav`, `pcm` (default: `mp3`).
- `speed` (number): Playback speed (default: `1.2`).

Example request with curl, saving the output to an mp3 file:
```bash
curl -X POST http://localhost:5050/v1/audio/speech \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your_api_key_here" \
  -d '{
    "input": "Hello, I am your AI assistant! Just let me know how I can help bring your ideas to life.",
    "voice": "echo",
    "response_format": "mp3",
    "speed": 1.2
  }' \
  --output speech.mp3
```

Or, to be in line with the OpenAI API endpoint parameters:
```bash
curl -X POST http://localhost:5050/v1/audio/speech \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your_api_key_here" \
  -d '{
    "model": "tts-1",
    "input": "Hello, I am your AI assistant! Just let me know how I can help bring your ideas to life.",
    "voice": "alloy"
  }' \
  --output speech.mp3
```

And an example of a language other than English:
```bash
curl -X POST http://localhost:5050/v1/audio/speech \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your_api_key_here" \
  -d '{
    "model": "tts-1",
    "input": "じゃあ、行く。電車の時間、調べておくよ。",
    "voice": "ja-JP-KeitaNeural"
  }' \
  --output speech.mp3
```

See the list of `edge-tts` voices for a given language / locale, and the full list of `edge-tts` voices with language support information.

Contributions are welcome! Please fork the repository and create a pull request for any improvements.
This project is licensed under the GNU General Public License v3.0 (GPL-3.0), and its acceptable use-case is intended to be personal use. For enterprise or non-personal use of openai-edge-tts, contact me at [email protected]
Tip

Swap `localhost` for your local IP (e.g. `192.168.0.1`) if you have issues. When this endpoint is accessed from a different server / computer, or when the call is made from another source (like Open WebUI), you may need to change the URL from `localhost` to your local IP (something like `192.168.0.1`).
Open up the Admin Panel and go to Settings -> Audio
Below, you can see a screenshot of the correct configuration for using this project to substitute the OpenAI endpoint
Note
View the official docs for Open WebUI integration with OpenAI Edge TTS
In version 1.6.8, AnythingLLM added support for "generic OpenAI TTS providers", meaning you can use this project as the TTS provider in AnythingLLM.
Open up settings and go to Voice & Speech (Under AI Providers)
Below, you can see a screenshot of the correct configuration for using this project to substitute the OpenAI endpoint
`your_api_key_here` never needs to be replaced. No "real" API key is required; use whichever string you'd like.

Quick start with Docker:

```bash
docker run -d -p 5050:5050 -e API_KEY=your_api_key_here -e PORT=5050 travisvn/openai-edge-tts:latest
```

Play voice samples and see all available Edge TTS voices.