Please refer to the following deployment method.
Fork a copy of the repository first, then deploy it to your own Vercel account. Refer to the demo video.
Railway now limits usage to 500 hours per month, and the service does not shut down automatically when idle, so it will be unavailable for part of each month. If possible, deploy with Docker instead.
Docker is required.
# Pull the image
docker pull wxxxcxx/ms-ra-forwarder:latest
# Run
docker run --name ms-ra-forwarder -d -p 3000:3000 wxxxcxx/ms-ra-forwarder
# or
docker run --name ms-ra-forwarder -d -p 3000:3000 -e TOKEN=YOUR_CUSTOM_TOKEN wxxxcxx/ms-ra-forwarder
# Open http://localhost:3000 in a browser
Create a docker-compose.yml file with the following content and save it.
version: '3'
services:
  ms-ra-forwarder:
    container_name: ms-ra-forwarder
    image: wxxxcxx/ms-ra-forwarder:latest
    restart: unless-stopped
    ports:
      - 3000:3000
    environment:
      # Optional: omit the environment variable if you do not need it
      - TOKEN=YOUR_CUSTOM_TOKEN
Run docker compose up -d in the directory that contains docker-compose.yml.
Running manually requires Git and Node.js to be installed in advance.
# Get the code
git clone https://github.com/wxxxcxx/ms-ra-forwarder.git
cd ms-ra-forwarder
# Install dependencies
npm install
# Run
npm run start
Visit the site you deployed, test it on the page, click "Generate Reading (legado) Voice Engine Link", and then import the link in Reading (legado).
The API endpoint is api/ra. The request format is:
POST /api/ra
FORMAT: audio-16khz-128kbitrate-mono-mp3
Content-Type: text/plain
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">
<voice name="zh-CN-XiaoxiaoNeural">
如果喜欢这个项目的话请点个 Star 吧。
</voice>
</speak>
The request body is in SSML format, which supports custom pronunciation and speaking styles (currently only the Azure version supports custom speaking styles). Related examples and documentation:
Text to voice
Improve synthesis with speech synthesis markup language (SSML)
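Because the request body is SSML, plain text has to be XML-escaped before it is placed inside the voice element. The following is a minimal sketch, not code from this project: the helper names escapeXml and buildSsml are hypothetical, and the default voice and language simply mirror the request example above.

```typescript
// Hypothetical helpers for illustration; they are not part of ms-ra-forwarder.

// Escapes the characters that are special in XML text and attribute values.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Wraps plain text in the same speak/voice envelope used in the request example.
// The default voice and xml:lang are assumptions copied from that example.
function buildSsml(
  text: string,
  voice: string = "zh-CN-XiaoxiaoNeural",
  lang: string = "en-US"
): string {
  return [
    `<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="${escapeXml(lang)}">`,
    `  <voice name="${escapeXml(voice)}">`,
    `    ${escapeXml(text)}`,
    `  </voice>`,
    `</speak>`,
  ].join("\n");
}
```

Escaping matters because an unescaped & or < in the text would make the SSML document malformed.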
The default audio format is webm. To get audio in another format, change the FORMAT request header (the available options are listed in ra/index.ts).
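As a rough illustration of how the FORMAT header fits into a request, the sketch below assembles the URL and request options for POST /api/ra. The function name buildRaRequest and the RaRequest type are hypothetical and exist only for this example; the header names and path come from the request format described above.

```typescript
// Hypothetical helper for illustration; it is not part of ms-ra-forwarder.
interface RaRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds a POST /api/ra request. When `format` is omitted, no FORMAT header
// is sent and the server falls back to its default (webm) output.
function buildRaRequest(baseUrl: string, ssml: string, format?: string): RaRequest {
  const headers: Record<string, string> = { "Content-Type": "text/plain" };
  if (format) {
    headers["FORMAT"] = format;
  }
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: baseUrl.replace(/\/+$/, "") + "/api/ra",
    init: { method: "POST", headers, body: ssml },
  };
}
```

The result can be handed to any HTTP client, for example fetch(url, init) in a runtime that provides fetch, and the response body read as binary audio.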
To prevent others from misusing your deployed service, add TOKEN to the application's environment variables, and then send Authorization: Bearer <TOKEN> in the request headers.
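On the client side, this amounts to adding one header when a token is configured. A minimal sketch follows; the function name withToken is hypothetical, and only the Authorization: Bearer scheme is taken from the description above.

```typescript
// Hypothetical helper for illustration; it is not part of ms-ra-forwarder.
// Returns a copy of `headers` with an Authorization header added when a
// token is provided; without a token the headers are copied unchanged.
function withToken(
  headers: Record<string, string>,
  token?: string
): Record<string, string> {
  if (!token) {
    return { ...headers };
  }
  return { ...headers, Authorization: `Bearer ${token}` };
}
```

For example, withToken({ "Content-Type": "text/plain" }, "secret") yields headers that include Authorization: Bearer secret, where "secret" stands in for whatever value the TOKEN environment variable holds on the server.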
Microsoft's official Azure TTS service currently has a certain free limit. If the free limit is enough for you, please support the official service.
If you only need to generate speech for fixed text, you can use Audio Content Creation. It provides richer features and produces more natural-sounding speech.
This project uses the interfaces of the Edge browser's "Read Aloud" feature and the Azure TTS demo page; their continued availability and stability are not guaranteed.
This project is for learning and reference only and is not for commercial use.
2023-04-19: Azure has taken the trial feature on the demo page offline, so the Azure version of the interface no longer works. Please migrate to the Edge browser interface.
2022-11-18: Add dictionary file support, refer to https://github.com/wxxxcxx/azure-tts-lexicon-cn/blob/main/lexicon.xml for dictionary file format.
2022-09-10: Changed the Docker repository address. Docker images built from now on are published to wxxxcxx/ms-ra-forwarder (older images in the original repository remain valid).
2022-09-01: The Azure TTS API seems to have been changed again. Older version users may not be able to use it normally. Please update to the latest version.
2022-07-17: Added Azure TTS API support (not tested much, so its stability is unknown). Calling the Azure TTS API requires an authorization code. With other deployment methods the code only needs to be fetched once every so often, but Vercel has to fetch it again on every API call. This is prone to timeouts and also adds load to Microsoft's servers, so it is not recommended for users who deploy on Vercel (it is not impossible to use, but what if Microsoft gets fed up and changes the interface again?).
2022-07-02: Edge API: testing shows the currently supported formats include webm-24khz-16bit-mono-opus, audio-24khz-48kbitrate-mono-mp3, and audio-24khz-96kbitrate-mono-mp3. In addition, starting this afternoon, using a voice that is not in the drop-down list produces an error such as "Unsupported voice zh-CN-YunyeNeural.", and more may be cut off in the future. Use it while it lasts!
2022-07-01: Services deployed on servers outside mainland China can currently only use audio in the webm-24khz-16bit-mono-opus format! Users who deploy on Vercel therefore need to redeploy.
2022-06-16: The interface provided by the Edge browser can no longer set the speech style. If you find that it cannot be used normally, please refer to #12 for updates.
Thanks to the following organizations/individuals for their support for this project