AdaSpeech

AdaSpeech: Adaptive Text to Speech for Custom Voice [WIP]

An unofficial PyTorch implementation of AdaSpeech.

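Notes:

I am not considering the multi-speaker use case; this implementation focuses on a single speaker only.
I will use only the Utterance level encoder and the Phoneme level encoder, not Conditional Layer Normalization. Conditional Layer Normalization is the core of the AdaSpeech paper and determines AdaSpeech's adaptability, but my focus is on improving the acoustic generalization of FastSpeech 2 rather than on adaptation.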
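
To make the phoneme-level input more concrete: in the AdaSpeech paper, the phoneme-level encoder works on mel frames that have been averaged over each phoneme's duration. The sketch below shows only that averaging step in plain Python. The function name `average_frames_per_phoneme`, the list-of-lists mel representation, and the zero-vector convention for empty phonemes are assumptions for illustration, not code taken from this repository.

```python
from typing import List


def average_frames_per_phoneme(
    mel: List[List[float]], durations: List[int]
) -> List[List[float]]:
    """Average consecutive mel frames over each phoneme's duration (in frames)."""
    if sum(durations) != len(mel):
        raise ValueError("durations must sum to the number of mel frames")
    n_bins = len(mel[0]) if mel else 0
    averaged = []
    start = 0
    for dur in durations:
        if dur == 0:
            # Assumed convention: a phoneme with no aligned frames gets zeros.
            averaged.append([0.0] * n_bins)
            continue
        segment = mel[start:start + dur]
        averaged.append(
            [sum(frame[b] for frame in segment) / dur for b in range(n_bins)]
        )
        start += dur
    return averaged


if __name__ == "__main__":
    # Three frames with two mel bins each; the first phoneme spans two frames.
    mel = [[1.0, 1.0], [3.0, 3.0], [5.0, 5.0]]
    print(average_frames_per_phoneme(mel, [2, 1]))
```

The result has one vector per phoneme, so it lines up with the phoneme sequence rather than with the frame sequence.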

Citation

 @misc{chen2021adaspeech,
       title={AdaSpeech: Adaptive Text to Speech for Custom Voice},
       author={Mingjian Chen and Xu Tan and Bohan Li and Yanqing Liu and Tao Qin and Sheng Zhao and Tie-Yan Liu},
       year={2021},
       eprint={2103.00993},
       archivePrefix={arXiv},
       primaryClass={eess.AS}
 }

Requirements:

All code is written in Python 3.6.2.

Install PyTorch

Before installing PyTorch, check your CUDA version by running:

 nvcc --version

 pip install torch torchvision

This repository uses PyTorch 1.6.0 because it relies on torch.bucketize, which is not available in earlier versions of PyTorch.
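
For readers unfamiliar with bucketizing, here is a pure-Python sketch of the idea. It mirrors the default (`right=False`) behaviour of `torch.bucketize` using the standard `bisect` module, so it runs without PyTorch. Using it to quantize pitch values into bins, as FastSpeech 2-style models commonly do, is an assumption about how this repository applies the function; the helper `linear_boundaries` and the 0 to 400 range are hypothetical.

```python
from bisect import bisect_left
from typing import List


def bucketize(values: List[float], boundaries: List[float]) -> List[int]:
    """Return the bin index of each value for sorted boundaries.

    Index i satisfies boundaries[i-1] < value <= boundaries[i], which is
    the default (right=False) convention of torch.bucketize.
    """
    return [bisect_left(boundaries, v) for v in values]


def linear_boundaries(lo: float, hi: float, n_bins: int) -> List[float]:
    """Hypothetical helper: n_bins - 1 evenly spaced inner boundaries."""
    step = (hi - lo) / n_bins
    return [lo + step * i for i in range(1, n_bins)]


if __name__ == "__main__":
    bounds = linear_boundaries(0.0, 400.0, 4)  # [100.0, 200.0, 300.0]
    # Values above the last boundary fall into the final bin.
    print(bucketize([50.0, 100.0, 250.0, 999.0], bounds))
```

Each resulting index can then be used to look up an embedding, which is why the boundaries have to cover the real range of the data.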

Install the other requirements:

 pip install -r requirements.txt

To use TensorBoard, install tensorboard version 1.14.0 separately, together with the supported tensorflow version (1.14.0).
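
Since several versions are pinned above, a small check can catch a mismatched environment early. The following is a minimal sketch that compares version strings using only the standard library; the function names `parse_version` and `at_least` are hypothetical, and local suffixes such as `+cu101` are simply ignored.

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Extract leading numeric components, e.g. '1.6.0+cu101' -> (1, 6, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("no numeric version found in %r" % version)
    return tuple(int(part) for part in match.group(0).split("."))


def at_least(installed: str, minimum: str) -> bool:
    """True if `installed` is the same as or newer than `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.6' and '1.6.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    print(at_least("1.6.0+cu101", "1.6.0"))
    print(at_least("1.5.1", "1.6.0"))
```

In a real environment the first argument would come from the installed package, for example `torch.__version__`.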

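Preprocessing:

The filelists folder contains the LJSpeech dataset files processed with MFA (Montreal Forced Aligner), so for LJSpeech you do not need to align the text with the audio yourself to extract durations. For other datasets, follow the instructions here. For the remaining preprocessing, run the following command: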

 python nvidia_preprocessing.py -d path_of_wavs
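
The durations mentioned above come from aligner output, which is usually expressed as phoneme intervals in seconds, while the model needs frame counts. The sketch below shows one way to convert between the two. The function name `intervals_to_durations` is hypothetical, and the default `hop_length` of 256 is an assumption about the feature configuration (LJSpeech audio is sampled at 22050 Hz).

```python
from typing import List, Tuple


def intervals_to_durations(
    intervals: List[Tuple[float, float]],
    sampling_rate: int = 22050,
    hop_length: int = 256,  # assumed value, check the actual config
) -> List[int]:
    """Convert (start_sec, end_sec) phoneme intervals to frame counts.

    Each boundary is rounded to the nearest frame before subtracting, so
    for contiguous intervals no frames are lost or counted twice.
    """
    frames_per_second = sampling_rate / hop_length
    durations = []
    for start, end in intervals:
        start_frame = int(round(start * frames_per_second))
        end_frame = int(round(end * frames_per_second))
        durations.append(end_frame - start_frame)
    return durations


if __name__ == "__main__":
    # Toy rates (100 Hz audio, hop of 10) give exactly 10 frames per second.
    toy = [(0.0, 0.5), (0.5, 0.8), (0.8, 1.0)]
    print(intervals_to_durations(toy, sampling_rate=100, hop_length=10))
```

Rounding boundaries rather than individual durations keeps the total equal to the number of mel frames, which matters when durations are later used to expand phoneme features to frame length.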

Find the minimum and maximum of F0 and energy:

 python compute_statistics.py

Update the following entries in hparams.py with the minimum and maximum F0 and energy values found:

 p_min = Min F0/pitch
 p_max = Max F0
 e_min = Min energy
 e_max = Max energy
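
To illustrate what these four values represent, here is a minimal sketch that computes them from per-utterance F0 and energy tracks. It is not the logic of compute_statistics.py, which is not shown here. The function name `min_max_stats` and the choice to skip frames with F0 equal to 0 (treated as unvoiced) are assumptions.

```python
from typing import Dict, Iterable, List


def min_max_stats(
    f0_tracks: Iterable[List[float]], energy_tracks: Iterable[List[float]]
) -> Dict[str, float]:
    """Global min/max of F0 and energy across all utterances.

    Frames with F0 <= 0 are treated as unvoiced and skipped (an assumption).
    """
    voiced = [f for track in f0_tracks for f in track if f > 0.0]
    energy = [e for track in energy_tracks for e in track]
    if not voiced or not energy:
        raise ValueError("need at least one voiced F0 frame and one energy frame")
    return {
        "p_min": min(voiced),
        "p_max": max(voiced),
        "e_min": min(energy),
        "e_max": max(energy),
    }


if __name__ == "__main__":
    f0 = [[0.0, 120.0, 180.0], [95.5, 0.0]]
    energy = [[0.1, 2.5], [0.03, 1.0]]
    print(min_max_stats(f0, energy))
```

The keys of the returned dictionary match the names expected in hparams.py, so the values can be copied over directly.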

Training

 python train_fastspeech.py --outdir etc -c configs/default.yaml -n "name"

Note

For a more complete, end-to-end voice cloning or text-to-speech (TTS) toolbox, please visit DeepSync Technologies.

Additional information

Version: 1.0.0
Type: AI source code
Updated: 2025-08-21
Size: 4.13MB
Source: GitHub
