XphoneBert_Vits2 Download - XphoneBert_Vits2 Source code download

XphoneBert_Vits2

AI Source Code

1.0.0

VITS2 extended with XPhoneBERT encoder

Credits

This repo is based on the great work of the VITS2 repo and XPhoneBERT.

Prerequisites

Python >= 3.10
Tested on PyTorch version 1.13.1 with Google Colab and LambdaLabs cloud.
Clone this repository
Install the Python requirements. Please refer to requirements.txt
Download datasets
1. Download and extract the LJ Speech dataset, then rename or create a link to the dataset folder: ln -s /path/to/LJSpeech-1.1/wavs DUMMY
2. Note: This repo does not support training on multi-speaker datasets.
Move/copy your .txt training, validation and test files to the filelists directory, then run preprocess.py (as is done for the LJSpeech dataset), for example:
- Please refer to XPhoneBERT for more information. It uses text2phonemesequence to convert raw text to a phoneme sequence.
- Initializing text2phonemesequence for each language requires its corresponding ISO 639-3 code. The ISO 639-3 codes of supported languages are available HERE.
- text2phonemesequence takes a word-segmented sequence as input. Users may also want to perform text normalization on the word-segmented sequence before feeding it into text2phonemesequence.
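
The conversion step described in the bullets above can be sketched as follows. This is a minimal illustration, not the code in preprocess.py. The class name `Text2PhonemeSequence`, its `language` and `is_cuda` arguments, and the `infer_sentence` method follow the XPhoneBERT usage example as I recall it, so please check them against your installed version. The helper names `load_converter` and `phonemize` are hypothetical.

```python
def load_converter(language, use_cuda=False):
    """Build a text-to-phoneme converter for one ISO 639-3 language code.

    Assumption: the text2phonemesequence package exposes
    Text2PhonemeSequence(language=..., is_cuda=...). The import is lazy so
    this file can be loaded without the package installed.
    """
    from text2phonemesequence import Text2PhonemeSequence
    return Text2PhonemeSequence(language=language, is_cuda=use_cuda)


def phonemize(sentences, converter):
    """Convert word-segmented sentences to phoneme sequences.

    `converter` is any object with an infer_sentence(str) -> str method.
    Empty lines are skipped; runs of whitespace are collapsed first.
    """
    results = []
    for sentence in sentences:
        normalized = " ".join(sentence.split())
        if normalized:
            results.append(converter.infer_sentence(normalized))
    return results
```

With the package installed, `phonemize(["that is a test ."], load_converter("eng-us"))` would return one phoneme string per input sentence. Loading the converter may download a pretrained model the first time.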

Note: For languages such as Chinese, Korean and Japanese (CJK languages) and some Southeast Asian languages, words are not separated by spaces. An external tokenizer must be used before feeding words into this model. In this case, write a script to normalize and segment your input before feeding it to text2phonemesequence (vie_preprocess.py in my case).
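
A normalize-and-segment script of this kind might look like the sketch below. It is not vie_preprocess.py. It assumes filelist lines of the form `wav_path|text`, and the names `normalize_text` and `clean_filelist_line` are hypothetical. The `segment` argument is where an external tokenizer for your language would be plugged in; the default simply splits on whitespace.

```python
import re
import unicodedata


def normalize_text(text):
    """Apply light normalization: Unicode NFC, spaced punctuation, single spaces."""
    text = unicodedata.normalize("NFC", text)
    # Put spaces around common punctuation so each mark becomes its own token.
    text = re.sub(r"([.,!?;:])", r" \1 ", text)
    return " ".join(text.split())


def clean_filelist_line(line, segment=str.split):
    """Normalize and word-segment the text field of one `wav_path|text` line.

    `segment` takes a string and returns a list of words. Replace it with a
    call into a real tokenizer for languages without spaces between words.
    """
    wav_path, text = line.rstrip("\n").split("|", 1)
    words = segment(normalize_text(text))
    return f"{wav_path}|{' '.join(words)}"
```

Applying `clean_filelist_line` to every line of filelists/train.txt and writing the results to filelists/train.txt.cleaned would produce the input that preprocess.py expects in the commands shown here.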

# For languages where words are not separated by spaces, such as Vietnamese.
python vie_preprocess.py --out_extension cleaned --filelists filelists/train.txt filelists/val.txt
python preprocess.py --input_file filelists/train.txt.cleaned --output_file filelists/train.list --language vie-n --batch_size 64 --cuda
python preprocess.py --input_file filelists/val.txt.cleaned --output_file filelists/val.list --language vie-n --batch_size 64 --cuda

# For English.
python preprocess.py --input_file filelists/train.txt.cleaned --output_file filelists/train.list --language eng-us --batch_size 64 --cuda
python preprocess.py --input_file filelists/val.txt.cleaned --output_file filelists/val.list --language eng-us --batch_size 64 --cuda
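
If you prefer to drive these commands from a notebook (for example on Colab), the sketch below builds the same argument list for each split. It uses only the flags shown above. The helper name `preprocess_command` is hypothetical, and whether preprocess.py falls back to CPU when `--cuda` is omitted is an assumption you should verify.

```python
import sys


def preprocess_command(split, language, batch_size=64, cuda=True,
                       filelist_dir="filelists"):
    """Return the argv list for one preprocess.py run (train, val, ...)."""
    cmd = [
        sys.executable, "preprocess.py",
        "--input_file", f"{filelist_dir}/{split}.txt.cleaned",
        "--output_file", f"{filelist_dir}/{split}.list",
        "--language", language,
        "--batch_size", str(batch_size),
    ]
    if cuda:
        # Assumption: leaving this flag out runs the conversion on CPU.
        cmd.append("--cuda")
    return cmd
```

Each list can then be passed to `subprocess.run(cmd, check=True)` from the repository root, once for "train" and once for "val".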

Build Monotonic Alignment Search and run preprocessing if you use your own datasets.

# Cython-version Monotonic Alignment Search
cd monotonic_align
python setup.py build_ext --inplace
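
For readers curious what the compiled extension is for: in VITS-style models, monotonic alignment search finds the best monotonic assignment of output frames to text tokens by dynamic programming. The pure-Python sketch below shows that idea under stated assumptions. It is not the repository's Cython code, the function name `maximum_path` and the list-of-lists input are illustrative, and it assumes there are at least as many frames as tokens.

```python
def maximum_path(value):
    """Return a 0/1 matrix marking the best monotonic alignment.

    value[y][x] is the score of assigning frame y to token x.
    The path starts at token 0, ends at the last token, and at each
    frame either stays on the same token or advances by one.
    """
    t_y, t_x = len(value), len(value[0])
    neg = -1e9  # stands in for "not reachable"
    q = [row[:] for row in value]  # cumulative best scores

    # Forward pass: best score of any valid path ending at (y, x).
    for y in range(t_y):
        for x in range(max(0, t_x + y - t_y), min(t_x, y + 1)):
            stay = neg if x == y else q[y - 1][x]
            if x == 0:
                advance = 0.0 if y == 0 else neg
            else:
                advance = q[y - 1][x - 1]
            q[y][x] += max(stay, advance)

    # Backward pass: trace the best path from the last frame and token.
    path = [[0] * t_x for _ in range(t_y)]
    index = t_x - 1
    for y in range(t_y - 1, -1, -1):
        path[y][index] = 1
        if index != 0 and (index == y or q[y - 1][index] < q[y - 1][index - 1]):
            index -= 1
    return path
```

The triple loop is slow in plain Python, which is the usual reason for shipping this step as a Cython extension.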

Training Example

For more information about the configuration, refer to configs/config.json

# LJ Speech
python train.py -c configs/config.json -m ljs_base

Additional Information

Version 1.0.0
Type AI Source Code
Update Time 2025-08-22
size 24.62MB
From Github
