A simple HTTP service that provides Text-to-Speech functionality using Microsoft Edge's TTS engine, supporting multiple languages and voices through RESTful APIs.
git clone https://github.com/doctoroyy/edge-tts-as-a-service
cd edge-tts-as-a-service
pip install -r requirements.txt
python main.py
The service will be available at http://localhost:5000
docker build -t edge-tts-as-a-service .
docker run -d -p 5000:5000 edge-tts-as-a-service
Retrieve all supported voice options.
GET /voices
Response example:
{
"code": 200,
"message": "OK",
"data": [
{
"Name": "en-US-GuyNeural",
"ShortName": "en-US-GuyNeural",
"Gender": "Male",
"Locale": "en-US"
},
// ... more voices
]
}
Convert text to speech and download the audio file.
POST /tts
Request body:
{
"text": "Hello, World!",
"voice": "en-US-GuyNeural", // Optional, defaults to "zh-CN-YunxiNeural"
"file_name": "hello.mp3" // Optional, defaults to "test.mp3"
}
Response: the generated audio file (MP3), returned as a download.
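The optional fields in the `/tts` request body can simply be left out so that the server applies its defaults. The following is a minimal sketch using only the Python standard library; the helper name `build_tts_request` and its `base_url` parameter are illustrative assumptions and not part of the service.

```python
import json
import urllib.request


def build_tts_request(text, voice=None, file_name=None,
                      base_url="http://localhost:5000"):
    """Build a POST request for /tts, omitting optional fields that are unset."""
    body = {"text": text}
    if voice is not None:
        body["voice"] = voice          # documented default: "zh-CN-YunxiNeural"
    if file_name is not None:
        body["file_name"] = file_name  # documented default: "test.mp3"
    return urllib.request.Request(
        f"{base_url}/tts",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request; reading the response body should then yield the audio bytes.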
Convert text to speech with streaming output, suitable for real-time playback.
POST /tts/stream
Request body:
{
"text": "Hello, World!",
"voice": "en-US-GuyNeural" // Optional, defaults to "zh-CN-YunxiNeural"
}
Response: the MP3 audio, delivered as a stream.
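For real-time use, the streamed response is best consumed chunk by chunk rather than read in full. The sketch below shows one way to do that with a generic file-like source; `copy_stream` is a hypothetical helper name, and the in-memory buffers stand in for a live HTTP response so the example runs offline.

```python
import io


def copy_stream(source, sink, chunk_size=8192):
    """Copy a file-like byte stream to sink chunk by chunk; return bytes written."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read means the stream is finished
            break
        sink.write(chunk)
        total += len(chunk)
    return total


# Offline demonstration with an in-memory stand-in for the HTTP response.
fake_audio = io.BytesIO(b"\x00" * 20000)
buffer = io.BytesIO()
written = copy_stream(fake_audio, buffer)
```

With a running service, the object returned by `urllib.request.urlopen` for a POST to `/tts/stream` could be passed as `source`, and an open file or a player's input pipe as `sink`.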
import requests
# Get available voices
response = requests.get('http://localhost:5000/voices')
voices = response.json()['data']
# Text-to-Speech (Download)
data = {
"text": "Hello, World!",
"voice": "en-US-GuyNeural",
"file_name": "output.mp3"
}
response = requests.post('http://localhost:5000/tts', json=data)
with open('output.mp3', 'wb') as f:
    f.write(response.content)
# Text-to-Speech (Streaming)
response = requests.post('http://localhost:5000/tts/stream', json=data, stream=True)
with open('stream_output.mp3', 'wb') as f:
    for chunk in response.iter_content(chunk_size=8192):
        f.write(chunk)

# Get available voices
curl http://localhost:5000/voices
# Text-to-Speech (Download)
curl -X POST http://localhost:5000/tts \
  -H "Content-Type: application/json" \
  -d '{"text":"Hello, World!", "voice":"en-US-GuyNeural"}' \
  --output output.mp3
# Text-to-Speech (Streaming)
curl -X POST http://localhost:5000/tts/stream \
  -H "Content-Type: application/json" \
  -d '{"text":"Hello, World!", "voice":"en-US-GuyNeural"}' \
  --output stream_output.mp3

Looking for a ready-to-use frontend interface?
Quick Link: react-audio-stream-demo
This React demo provides a fully functional frontend for seamless TTS interaction, making it easy to demonstrate and integrate the Edge-TTS service with a user-friendly interface.
Q: How do I choose the right voice?
A: Use the /voices endpoint to get a list of all available voices. Choose based on the Locale and Gender attributes.
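Selecting by `Locale` and `Gender` can be done client-side once the list is fetched. Here is a small sketch; `filter_voices` is a hypothetical helper, and the second sample entry is invented purely for the demonstration.

```python
def filter_voices(voices, locale=None, gender=None):
    """Return voices matching the given Locale and/or Gender (None matches any)."""
    return [
        v for v in voices
        if (locale is None or v.get("Locale") == locale)
        and (gender is None or v.get("Gender") == gender)
    ]


# Entries shaped like the "data" array of the /voices response.
sample = [
    {"Name": "en-US-GuyNeural", "ShortName": "en-US-GuyNeural",
     "Gender": "Male", "Locale": "en-US"},
    {"Name": "demo-voice", "ShortName": "demo-voice",  # made-up entry
     "Gender": "Female", "Locale": "xx-XX"},
]
matches = filter_voices(sample, locale="en-US", gender="Male")
print([v["ShortName"] for v in matches])  # prints ['en-US-GuyNeural']
```

The `ShortName` of a matching entry is the value to pass as `voice` in the `/tts` and `/tts/stream` request bodies.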
Q: What languages are supported?
A: Many languages are supported, including English, Chinese, and Japanese. Query the /voices endpoint for the complete list.
Q: What is the audio file format?
A: The service generates MP3 audio files.
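Before saving a response to disk, a client may want a cheap sanity check that the bytes really look like MP3 data. The sketch below is a heuristic only: it tests for an ID3v2 tag or an MPEG audio frame sync at the start of the data. How the service reports errors is not described here, so treating a failed check as an error response is an assumption, and `looks_like_mp3` is a hypothetical helper name.

```python
def looks_like_mp3(data: bytes) -> bool:
    """Heuristic: True if data starts with an ID3v2 tag or an MPEG frame sync."""
    if data[:3] == b"ID3":
        return True
    # An MPEG audio frame header begins with 11 set bits:
    # the byte 0xFF followed by a byte whose top three bits are set.
    return len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0
```

Checking only the first few bytes keeps the test usable on the first chunk of a streamed response.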
Issues and Pull Requests are welcome. Before submitting a PR, please:
MIT License