
Read2Me is a FastAPI application that fetches content from provided URLs, processes the text, converts it into speech using Microsoft Azure's Edge TTS or one of the local TTS models F5-TTS, StyleTTS2, or Piper TTS, and tags the resulting MP3 files with metadata. You can either turn the full text into audio or have an LLM convert the seed text into a podcast. Ollama and any OpenAI-compatible API are currently supported. You can install the provided Chromium extension in any Chromium-based browser (e.g. Chrome or Microsoft Edge) to send the current URL or any text to the server and to add sources and keywords for automatic fetching.
This is currently a beta version, but I plan to extend it to support other content types (e.g. epub) in the future and to provide more robust support for languages other than English. The default Azure Edge TTS already supports other languages and tries to autodetect the language from the text, but quality may vary depending on the language.
requirements.txt covers Edge TTS; F5-TTS and StyleTTS2 have separate requirements files.

Clone the repository:

git clone https://github.com/WismutHansen/READ2ME.git
cd READ2ME

Create and activate a virtual environment:

python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate

or, if you like to use uv for package management:

uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt (or uv pip install -r requirements.txt)

For the local StyleTTS2 text-to-speech model, please also install the additional dependencies:

pip install -r requirements_stts2.txt (or uv pip install -r requirements_stts2.txt)

For the F5-TTS model, please also install the additional dependencies:

pip install -r requirements_F5.txt (or uv pip install -r requirements_F5.txt)

Install Playwright:

playwright install

If using uv, please also install pip:

uv pip install pip

For local Piper TTS support:

python3 -m TTS.piper_tts.instalpipertts (macOS and Linux)
python -m TTS.piper_tts.instalpipertts (Windows)

Note: ffmpeg is required when using either StyleTTS2 or Piper TTS to convert WAV files to MP3. StyleTTS2 also requires espeak-ng to be installed on your system.
Set up environment variables:
Rename the .env.example file in the root directory to .env and edit the contents to your preference:

OUTPUT_DIR=Output # Directory to store output files
SOURCES_FILE=sources.json # File containing sources to retrieve articles from twice a day
IMG_PATH=front.jpg # Path to image file to use as cover
OLLAMA_BASE_URL=http://localhost:11434 # Standard port for Ollama
OPENAI_BASE_URL=http://localhost:11434/v1 # Example for Ollama's OpenAI-compatible endpoint
OPENAI_API_KEY=skxxxxxx # Your OpenAI API key if using the official OpenAI API
MODEL_NAME=llama3.2:latest
LLM_ENGINE=Ollama # Valid options: Ollama, OpenAI

You can use either Ollama or any OpenAI-compatible API for title and podcast script generation (a summary function is also coming soon).
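For illustration, here is a minimal Python sketch of reading these variables with the python-dotenv package. This is an assumption for demonstration purposes, not necessarily how Read2Me loads its configuration:

import os

from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # read key=value pairs from .env into the process environment
output_dir = os.getenv("OUTPUT_DIR", "Output")
llm_engine = os.getenv("LLM_ENGINE", "Ollama")
print(f"Saving MP3s to {output_dir}, generating scripts with {llm_engine}")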
Clone the repository and switch into it:

git clone https://github.com/WismutHansen/READ2ME.git && cd READ2ME

Copy .env.example to .env and edit the contents. Important: when using a local LLM engine, e.g. Ollama, the URL needs to follow this format: "host.docker.internal:11434" (for Ollama) or "host.docker.internal:1234" (for LM Studio).

Build the Docker container:

docker build -t read2me .

Note: the build takes a long time, be patient.

Run the Docker container:

docker run -p 7777:7777 -d read2me
Copy and rename .env.example to .env. Edit the contents of this file as you wish, specifying the output directory, the task file, the image path to use for the MP3 cover, and the sources and keywords file.
Run the FastAPI application:
uvicorn main:app --host 0.0.0.0 --port 7777

or, if you're connected to a Linux server, e.g. via SSH, and want to keep the app running after closing your session:

nohup uvicorn main:app --host 0.0.0.0 --port 7777 &

This will write all command-line output to a file called nohup.out in your current working directory.
Add URLs for processing:
Send a POST request to http://localhost:7777/v1/url/full with a JSON body containing the URL:
{
  "url": "https://example.com/article"
}

You can use curl or any API client like Postman to send this request, for example:

curl -X POST http://localhost:7777/v1/url/full \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/article", "tts-engine": "edge"}'

The repository also contains a working Chromium extension that you can install in any Chromium-based browser (e.g. Google Chrome) once developer mode is enabled.
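Equivalently, here is a minimal Python sketch using the third-party requests library (just an alternative to curl, not something Read2Me itself requires):

import requests  # pip install requests

resp = requests.post(
    "http://localhost:7777/v1/url/full",
    json={"url": "https://example.com/article", "tts-engine": "edge"},
)
resp.raise_for_status()  # raise if the server returned an error status
print(resp.json())  # e.g. {"message": "URL added to the processing list"}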
Processing URLs:
The application periodically checks the tasks.json file for new jobs to process. It fetches the content for a given URL, extracts the text, converts it to speech, and saves the resulting MP3 file with appropriate metadata.
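For illustration only, a queued job in tasks.json might look something like the sketch below; the actual schema is defined by the application, and all field names here are assumptions:

{
  "tasks": [
    {
      "url": "https://example.com/article",
      "task": "url/full",
      "tts_engine": "edge"
    }
  ]
}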
Specify sources and keywords for automatic retrieval:
Create a file called sources.json in your current working directory listing the URLs of websites that you want to monitor for new articles. You can also set global keywords and per-source keywords to be used as filters for automatic retrieval. If you set "*" as a source's keyword, all new articles from that source will be retrieved. Here is an example structure:
{
  "global_keywords": [
    "globalkeyword1",
    "globalkeyword2"
  ],
  "sources": [
    {
      "url": "https://example.com",
      "keywords": ["keyword1", "keyword2"]
    },
    {
      "url": "https://example2.com",
      "keywords": ["*"]
    }
  ]
}

The location of both files is configurable in the .env file.
To use the Next.js frontend, make sure you have Node.js installed on your system. Note: the frontend is currently at an early experimental stage, so expect lots of bugs. First, switch into the frontend directory:

cd frontend

then install the required Node dependencies:

npm install

then start the frontend:

npm run dev

You can access the frontend at http://localhost:3000.
POST /v1/url/full
Adds a URL to the processing list.
Request Body:
{
  "url": "https://example.com/article",
  "tts-engine": "edge"
}

Response:

{
  "message": "URL added to the processing list"
}

POST /v1/url/podcast
POST /v1/text/full
POST /v1/text/podcast
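The remaining endpoints follow the same pattern: the podcast variants have an LLM turn the content into a podcast script before synthesis, and the text endpoints take raw text instead of a URL. As a sketch, posting to /v1/text/full might look like the following, where the "text" field name is an assumption rather than documented behavior:

import requests

resp = requests.post(
    "http://localhost:7777/v1/text/full",
    # Hypothetical body: the "text" field name is an assumption
    json={"text": "Some seed text to convert to speech.", "tts-engine": "edge"},
)
print(resp.status_code, resp.text)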
Fork the repository.
Create a new branch:
git checkout -b feature/your-feature-name

Make your changes and commit them:

git commit -m 'Add some feature'

Push to the branch:

git push origin feature/your-feature-name

Submit a pull request.
This project is licensed under the Apache License, Version 2.0, except for the StyleTTS2 code, which is licensed under the MIT License. The F5-TTS and StyleTTS2 pre-trained models are subject to their own licenses.
StyleTTS2 Pre-Trained Models: Before using these pre-trained models, you agree to inform listeners that the speech samples are synthesized by the pre-trained models unless you have permission to use the voice you synthesize. That is, before making synthesized voices public, you agree to use only voices whose speakers have granted permission to have their voice cloned, either directly or by license; if you do not have that permission, you must publicly announce that the voices are synthesized.
I would like to thank the following repositories and authors for their inspiration and code: