Shanghainese TTS

Dartmouth LING 48 Final Project: Improving TTS for Shanghainese
Yuanhao Chen, Spring 2023

Goal

To build a text-to-speech (TTS) system for Shanghainese from scratch, seeking to improve the production of tone sandhi compared to existing models by paying special attention to preprocessing of text.

Description

See writeup/main.pdf.

Dependencies

pip install -r phonemisation/requirements.txt
pip install -r speech_synthesis/requirements.txt
pip install -r comparison_questionnaire/requirements.txt  # for analysis of questionnaire results

Usage

See speech_synthesis/README.md.

Structure

phonemisation/: contains the phonemisation module
- See explanation of output in phonemisation/__init__.py
- Usage: python -m phonemisation "text to phonemise"
- Mechanism: Chinese sentence ⟶ (word segmentation) ⟶ Chinese words ⟶ (romanisation) ⟶ Shanghainese pinyin ⟶ (phonemisation) ⟶ Shanghainese phonemes
  - jieba is used for word segmentation
  - A Shanghainese dictionary I previously made is used for romanisation
    - Uses the Qieyun module to add the tone number 1 to syllables in the 陰平 (yinping/inbin) tone; other tones are phonologically unmarked
  - The romanisation_to_ipa function in romanisation.py performs the phonemisation step
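
The three stages listed under Mechanism can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the module's actual code: the two dictionary entries and the IPA table are toy placeholders rather than data from the project's dictionary, `segment` is a greedy longest-match stand-in for jieba, and `phonemise` here is a hypothetical helper, not the function exposed by `python -m phonemisation`.

```python
# Toy stand-in for the romanisation dictionary (illustrative entries only).
ROMANISATION = {
    "上海": ["zaon", "he"],
    "闲话": ["ghe", "gho"],
}

# Hypothetical romanisation-to-IPA table covering only the syllables above.
SYLLABLE_TO_IPA = {
    "zaon": "zɑ̃",
    "he": "he",
    "ghe": "ɦe",
    "gho": "ɦo",
}


def segment(sentence, vocabulary):
    """Greedy longest-match word segmentation (a stand-in for jieba)."""
    max_len = max(map(len, vocabulary))
    words, i = [], 0
    while i < len(sentence):
        # Try the longest candidate first; fall back to a single character.
        for length in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + length]
            if length == 1 or candidate in vocabulary:
                words.append(candidate)
                i += length
                break
    return words


def phonemise(sentence):
    """Return one list of IPA syllables per segmented word."""
    result = []
    for word in segment(sentence, ROMANISATION):
        syllables = ROMANISATION.get(word)
        if syllables is None:
            result.append([word])  # unknown word: pass it through unchanged
        else:
            result.append([SYLLABLE_TO_IPA[s] for s in syllables])
    return result


print(phonemise("上海闲话"))
```

The output keeps syllables grouped by word instead of flattening them, so later stages can still see the word boundaries found during segmentation.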
make_metadata.py: uses the phonemisation module to convert transcription into IPA and generate metadata for training
- See data/ below
data/: contains the dataset used for training
- The transcriptions and audio files are adapted from this repo
  - Downsampled to 16 kHz for training
  - Currently, only shh.dict.cn/ is used for training
- The */metadata.txt files are generated by make_metadata.py
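
As a rough illustration of how a metadata file could be produced from transcriptions, the sketch below writes one line per audio clip. The pipe-delimited `clip_id|phonemes` layout is an assumption made for this example (the actual format of the */metadata.txt files is not described here), and `write_metadata` is a hypothetical helper, not a function from make_metadata.py.

```python
import io


def write_metadata(entries, phonemise, out):
    """Write one 'clip_id|phonemes' line per (clip_id, transcription) pair.

    `phonemise` is any callable mapping a transcription to a phoneme string.
    Returns the number of lines written.
    """
    count = 0
    for clip_id, transcription in entries:
        phonemes = phonemise(transcription)
        # A delimiter inside a field would corrupt the file, so fail early.
        if "|" in clip_id or "|" in phonemes:
            raise ValueError(f"field contains the delimiter: {clip_id!r}")
        out.write(f"{clip_id}|{phonemes}\n")
        count += 1
    return count


# Usage with a dummy phonemiser (str.upper) and an in-memory file.
buffer = io.StringIO()
write_metadata([("0001", "abc")], str.upper, buffer)
print(buffer.getvalue(), end="")
```

Passing the phonemiser in as an argument keeps the file-writing logic independent of the phonemisation module, which makes it easy to test with a dummy function.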
training/
- Jupyter notebook for training the model
- Intended to be uploaded and run in Google Colab environment; needs to be modified for local use
- Uses the coqui-ai/TTS repo, which contains an implementation of VITS
writeup/: the write-up
speech_synthesis/: contains the speech synthesis model
- See speech_synthesis/README.md for more details
comparison_questionnaire/: contains the questionnaire and audio files used to compare speech produced by this model, the Apple model, and a human speaker
- *-1.wav: produced by this model
- *-2.wav: produced by Apple VoiceOver (MacBook Pro 14-inch, 2021; macOS Ventura 13.0.1)
- *-3.wav: spoken by myself
- stats.ipynb: Jupyter notebook for analysing the questionnaire results
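
The file-naming convention above lends itself to a small helper that groups clips by sentence and labels each one with its source. This is a sketch, not code from stats.ipynb; it assumes the part of the filename before the final hyphen identifies the sentence, and `group_clips` is a hypothetical name.

```python
from collections import defaultdict
from pathlib import Path

# Suffix-to-source mapping, taken from the naming convention listed above.
SOURCES = {"1": "this model", "2": "Apple VoiceOver", "3": "human speaker"}


def group_clips(filenames):
    """Map each sentence ID to a {source: filename} dictionary."""
    groups = defaultdict(dict)
    for name in filenames:
        stem = Path(name).stem
        sentence, sep, suffix = stem.rpartition("-")
        # Skip files that do not follow the '<sentence>-<n>' pattern.
        if sep and suffix in SOURCES:
            groups[sentence][SOURCES[suffix]] = name
    return dict(groups)


clips = group_clips(["s1-1.wav", "s1-2.wav", "s1-3.wav", "notes.txt"])
print(clips["s1"]["Apple VoiceOver"])
```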
