An API Wrapper for Whisperx Library
This is a FastAPI application that provides an endpoint for video/audio transcription using the whisperx command. The application supports multiple audio and video formats. It performs the transcription, alignment, and diarization of the uploaded media files.
Follow the installation instructions in the official Whisperx repository.
You can install the application's Python dependencies using the requirements.txt file:
pip install -r requirements.txt

Create a .env file in the project's root directory and add the following variables:
SECRET_KEY=your_secret_key
MASTER_KEY=your_master_key
HUGGING_FACE_TOKEN=your_hugging_face_token
API_PORT=11300

SQLite is used for storing user information. The database is created automatically when the application runs.
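As an illustration of how the .env variables listed above might be read at startup, here is a minimal sketch using only the standard library. It assumes plain KEY=VALUE lines; the function name `parse_env` is hypothetical, and the actual application may well load its configuration differently (for example with a dotenv library).

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Sample content mirroring the variables this README asks for.
SAMPLE = """\
SECRET_KEY=your_secret_key
MASTER_KEY=your_master_key
HUGGING_FACE_TOKEN=your_hugging_face_token
API_PORT=11300
"""

config = parse_env(SAMPLE)
api_port = int(config["API_PORT"])  # values arrive as strings; convert explicitly
```

In practice you would pass the contents of the real .env file instead of `SAMPLE`, and keep that file out of version control since it holds secrets.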
Run the application using:
python api_whisperx.py

Replace api_whisperx.py with the name of your Python file if it differs.
/auth
Authenticate a user and return a JWT token.
username: The username of the user.
password: The password of the user.

/create_user
Create a new user.
username: Desired username.
password: Desired password.
master_key: Master key for authorized user creation.

/whisperx-transcribe/
Transcribe an uploaded audio or video file.
file: The audio or video file to transcribe.
lang: Language for transcription (default is "pt").
model: Model to use for transcription (default is "large-v2").
min_speakers: Minimum number of speakers for diarization (default is 1).
max_speakers: Maximum number of speakers for diarization (default is 2).

The application has built-in logging that informs about the steps being performed and any errors that occur.
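The endpoints described above could be called from a small client along the following lines. This is a sketch under several assumptions that this README does not confirm: that the endpoints accept POST requests, that /auth takes form-encoded fields, that /whisperx-transcribe/ takes a multipart upload, that the JWT is sent as a Bearer token, and that the server listens on localhost at API_PORT. The helper names are hypothetical. The functions only build the requests; nothing is sent until you pass one to `urllib.request.urlopen`.

```python
import urllib.parse
import urllib.request
import uuid

BASE_URL = "http://localhost:11300"  # assumption: local server on API_PORT


def build_auth_request(username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a form-encoded POST to /auth."""
    body = urllib.parse.urlencode({"username": username, "password": password})
    return urllib.request.Request(
        f"{BASE_URL}/auth",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def build_transcribe_request(
    token: str,
    filename: str,
    content: bytes,
    lang: str = "pt",
    model: str = "large-v2",
    min_speakers: int = 1,
    max_speakers: int = 2,
) -> urllib.request.Request:
    """Build (but do not send) a multipart POST to /whisperx-transcribe/."""
    boundary = uuid.uuid4().hex
    fields = {
        "lang": lang,
        "model": model,
        "min_speakers": min_speakers,
        "max_speakers": max_speakers,
    }
    parts = []
    # One multipart section per plain form field.
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode("utf-8")
        )
    # The file section carries the raw bytes of the media file.
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n".encode("utf-8")
        + content
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return urllib.request.Request(
        f"{BASE_URL}/whisperx-transcribe/",
        data=b"".join(parts),
        headers={
            "Authorization": f"Bearer {token}",  # assumption: Bearer scheme
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )
```

To actually send a request, pass it to `urllib.request.urlopen(...)` and read the response body. The name of the field that holds the token in the /auth response is not documented here, so inspect the returned JSON before relying on it.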