# BERT-MB-iSTFT-VITS

v1.0.0
## Requirements

- 16 GB RAM
- 12 GB VRAM

PyTorch install command:

```sh
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
```

CUDA 11.7 install: https://developer.nvidia.com/cuda-11-7-0-download-archive
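As a quick sanity check after installing, the following sketch confirms that the CUDA build of PyTorch can see the GPU:

```python
# Quick sanity check that the CUDA-enabled PyTorch build sees the GPU.
import torch

print(torch.__version__)          # expect 1.13.1+cu117
print(torch.cuda.is_available())  # True if the CUDA 11.7 setup is working
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```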
## Setup

```sh
conda create -n vits python=3.8
conda activate vits
git clone https://github.com/project-elnino/BERT-MB-iSTFT-VITS.git
cd BERT-MB-iSTFT-VITS
pip install -r requirements.txt
```

## Prepare datasets

Each line of the metadata file has the form:

```
path/to/audio_001.wav|<speaker_name>|<language_code>|<text_001>
```

For example:

```
../kss2/1/1_0000.wav|KR-default|KR|그는 괜찮은 척하려고 애쓰는 것 같았다.
```

(The Korean text reads "He seemed to be trying to pretend he was okay.")
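Before preprocessing, it can help to verify the metadata file. A minimal validation sketch; the function name and checks are illustrative, not part of the repo:

```python
# Illustrative check that every metadata line has the expected
# <wav_path>|<speaker_name>|<language_code>|<text> structure.
from pathlib import Path

def check_metadata(path: str) -> None:
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    for lineno, line in enumerate(lines, start=1):
        fields = line.split("|")
        if len(fields) != 4:
            print(f"line {lineno}: expected 4 fields, got {len(fields)}")
            continue
        wav_path = fields[0]
        if not Path(wav_path).is_file():
            print(f"line {lineno}: missing audio file {wav_path}")

check_metadata("./metadata.list")
```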
## Preprocess

```sh
python preprocess.py --metadata ./metadata.list --config_path ./configs/config.json
```

If your speech files are not mono PCM-16 .wav files, resample them first.
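A minimal resampling sketch, assuming `librosa` and `soundfile` are installed and a 22050 Hz target rate (match the sampling rate in your config):

```python
# Convert an arbitrary .wav file to mono, 16-bit PCM at the target rate.
import librosa
import soundfile as sf

def to_mono_pcm16(in_path: str, out_path: str, target_sr: int = 22050) -> None:
    # librosa.load resamples to target_sr and downmixes to mono float32
    audio, _ = librosa.load(in_path, sr=target_sr, mono=True)
    # subtype="PCM_16" writes 16-bit integer samples
    sf.write(out_path, audio, target_sr, subtype="PCM_16")

to_mono_pcm16("path/to/audio_001.wav", "path/to/audio_001_pcm16.wav")
```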
| Model | How to set up json file in configs | Sample of json file configuration |
|---|---|---|
| iSTFT-VITS | `"istft_vits": true, "upsample_rates": [8,8]` | ljs_istft_vits.json |
| MB-iSTFT-VITS | `"subbands": 4, "mb_istft_vits": true, "upsample_rates": [4,4]` | ljs_mb_istft_vits.json |
| MS-iSTFT-VITS | `"subbands": 4, "ms_istft_vits": true, "upsample_rates": [4,4]` | ljs_ms_istft_vits.json |
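For example, switching an existing config to the MB-iSTFT-VITS variant amounts to setting the keys above. A sketch of doing so programmatically, assuming (as in the sample configs) that these keys live under the `"model"` section:

```python
import json

# Load an existing config and switch on the multi-band iSTFT decoder.
with open("configs/config.json") as f:
    config = json.load(f)

config["model"]["subbands"] = 4           # number of sub-bands
config["model"]["mb_istft_vits"] = True   # select the MB-iSTFT-VITS variant
config["model"]["upsample_rates"] = [4, 4]

with open("configs/config_mb.json", "w") as f:
    json.dump(config, f, indent=2)
```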
## Training

Set `training_files` and `validation_files` in the config to the paths of the preprocessed manifest files, then run:

```sh
python train.py -c <config> -m <folder>
```

Training resumes automatically from the latest checkpoint.
## Inference

Check `inference.py`:

```sh
python inference.py -m ./models/kss/G_64000.pth
```

### Server Inference

Start the server:

```sh
python inference_server.py -m ./models/kss/G_64000.pth
```

Then send a synthesis request:

```sh
curl -X POST -H "Content-Type: application/json" \
  -d '{"text": "잠시 통화 괜찮으시면 전화를 끊지 말아주세요."}' \
  http://localhost:5000/synthesize
```

(The Korean text reads "If you're free for a brief call, please don't hang up.")
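The same request from Python, as a minimal client sketch; it assumes the server responds with raw audio bytes, so adjust the handling if it returns JSON instead:

```python
import requests

# POST the text to the /synthesize endpoint of the local server.
resp = requests.post(
    "http://localhost:5000/synthesize",
    json={"text": "잠시 통화 괜찮으시면 전화를 끊지 말아주세요."},
)
resp.raise_for_status()

# Assumption: the response body is the synthesized audio (e.g. WAV bytes).
with open("output.wav", "wb") as f:
    f.write(resp.content)
```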