tts-joinery is a Python library and CLI tool to work around length limitations in text-to-speech APIs.
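Because currently popular APIs limit each request to 4096 characters, this library splits long text into chunks, sends each chunk to the TTS API, and joins the resulting audio into a single file.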
Currently only the OpenAI API is supported, with the intent to add more in the future.
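To illustrate the idea of staying under a per-request character limit, here is a minimal sketch of sentence-based chunking. It is not the library's actual implementation: the function name chunk_text is hypothetical, and a naive regular expression stands in for a real sentence tokenizer such as the NLTK one used in the Python example later in this document.

```python
import re

MAX_CHARS = 4096  # per-request character limit mentioned above


def chunk_text(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars."""
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Hard-split any single sentence that is itself longer than the limit.
        pieces = [sentence[i:i + max_chars] for i in range(0, len(sentence), max_chars)]
        for piece in pieces:
            candidate = f"{current} {piece}" if current else piece
            if len(candidate) <= max_chars:
                current = candidate
            else:
                # The current chunk is full; start a new one with this piece.
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk could then be sent to the API as its own request, with the audio results concatenated in order.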
pip install tts-joinery
or use pipx to install as a standalone tool.
Requires ffmpeg for the audio file processing.
Installation varies by system. On Linux, use your system package manager. On macOS, brew install ffmpeg should work.
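If you want to confirm that ffmpeg is reachable before processing audio, a quick check along these lines may help. The helper name ffmpeg_available is hypothetical and not part of tts-joinery; it only looks the executable up on your PATH using the standard library.

```python
import shutil


def ffmpeg_available(executable: str = "ffmpeg") -> bool:
    """Return True if the named executable can be found on PATH."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found")
    else:
        print("ffmpeg not found; install it with your package manager")
```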
The CLI expects to find an OpenAI API key in an OPENAI_API_KEY environment variable, or in a .env file.
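As a rough sketch of that lookup, the following standard-library function checks the environment first and then falls back to a simple KEY=VALUE file. The function name find_api_key is hypothetical, and the precedence shown (environment before .env) is an assumption; the CLI's real behavior may differ.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_api_key(
    env: Mapping[str, str] = os.environ,
    dotenv_path: str = ".env",
    name: str = "OPENAI_API_KEY",
) -> Optional[str]:
    """Return the key from the environment, else from a simple .env file."""
    # Assumption: an environment variable takes precedence over the file.
    if env.get(name):
        return env[name]
    path = Path(dotenv_path)
    if not path.is_file():
        return None
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == name:
            # Drop surrounding whitespace and optional quotes.
            return value.strip().strip("'\"")
    return None
```

A .env file for this sketch would contain a single line such as OPENAI_API_KEY=your-key-here.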
ttsjoin [OPTIONS] [COMMAND]
Options:
--input-file FILENAME Plaintext file to process into speech, otherwise stdin
--output-file FILENAME MP3 result, otherwise stdout
--model TEXT Slug of the text-to-speech model to be used
--service TEXT API service (currently only supports openai)
--voice TEXT Slug of the voice to be used
--no-cache BOOLEAN Disable caching
--help Show this message and exit.
Commands:
cache [clear, show]
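If you drive the CLI from a script, the value-taking options listed above can be assembled programmatically. This is only a sketch: build_ttsjoin_command is a hypothetical helper, not part of tts-joinery, and it leaves out --no-cache and the cache subcommand.

```python
import shlex
from typing import Optional


def build_ttsjoin_command(
    input_file: Optional[str] = None,
    output_file: Optional[str] = None,
    model: Optional[str] = None,
    service: Optional[str] = None,
    voice: Optional[str] = None,
) -> list[str]:
    """Assemble a ttsjoin argument list from the documented options."""
    command = ["ttsjoin"]
    options = {
        "--input-file": input_file,
        "--output-file": output_file,
        "--model": model,
        "--service": service,
        "--voice": voice,
    }
    # Only include options that were explicitly provided.
    for flag, value in options.items():
        if value is not None:
            command += [flag, value]
    return command


if __name__ == "__main__":
    args = build_ttsjoin_command("input.txt", "output.mp3", model="tts-1", voice="onyx")
    # Show the command as a shell-quoted string.
    print(shlex.join(args))
```

The resulting list can be passed to subprocess.run once ttsjoin is installed and an API key is configured.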
ttsjoin --input-file input.txt --output-file output.mp3 --model tts-1 --service openai --voice onyx
echo "Your text to be processed" | ttsjoin > output.mp3
ttsjoin --input-file input.txt --output-file output.mp3 --no-cache
ttsjoin cache clear
You can also use tts-joinery as part of your Python project:
import os
import nltk
from joinery.op import JoinOp
from joinery.api.openai import OpenAIApi
# Only need to download once, handled for you automatically in the CLI
nltk.download('punkt_tab', quiet=True)
# Read the API key from the environment rather than hard-coding it
OPENAI_API_KEY = os.environ['OPENAI_API_KEY']
tts = JoinOp(
    text='This is only a test!',
    api=OpenAIApi(
        model='tts-1-hd',
        voice='onyx',
        api_key=OPENAI_API_KEY,
    ),
)
tts.process_to_file('output.mp3')
Contributions are welcome, particularly support for other TTS APIs. Check the issues beforehand and feel free to open a PR. Code is formatted with Black.
Tests can be run manually. The suite includes end-to-end tests that make live API calls, so ensure you have an OPENAI_API_KEY set in .env.test, then run pytest. You can install development dependencies with pip install -e .[test]
Special thanks to:
This project is licensed under the MIT License.
Copyright 2024, Adrien Delessert