NATSpeech: A Non-Autoregressive Text-to-Speech Framework

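This repo contains the official PyTorch implementation of:

PortaSpeech: Portable and High-Quality Generative Text-to-Speech (NeurIPS 2021)
Demo page | Hugging Face demo
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (DiffSpeech) (AAAI 2022)
Demo page | Project page | Hugging Face demo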

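Key Features

We implement the following features in this framework:

Data processing for non-autoregressive text-to-speech using the Montreal Forced Aligner.
A convenient and extensible training framework.
A simple but effective random-access dataset implementation.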
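
To make the random-access dataset idea in the list above concrete, here is a minimal sketch of one common way to build such a dataset: items are serialized back to back into a data file, and a separate index file stores the byte offset of each item so that any item can be read with a single seek. This is an illustration under stated assumptions, not the framework's own code: the class names `IndexedWriter` and `IndexedReader`, the `.data`/`.idx` file suffixes, and the use of `pickle` are all hypothetical choices made for this example.

```python
import os
import pickle
import tempfile


class IndexedWriter:
    """Hypothetical writer: appends pickled items to <prefix>.data
    and records their byte offsets in <prefix>.idx."""

    def __init__(self, prefix):
        self.prefix = prefix
        self.data_file = open(prefix + ".data", "wb")
        self.offsets = [0]  # offsets[i] is the byte position where item i starts

    def add(self, item):
        payload = pickle.dumps(item)
        self.data_file.write(payload)
        self.offsets.append(self.offsets[-1] + len(payload))

    def close(self):
        self.data_file.close()
        with open(self.prefix + ".idx", "wb") as f:
            pickle.dump(self.offsets, f)


class IndexedReader:
    """Hypothetical reader: fetches item i by seeking to its offset,
    without loading the rest of the data file."""

    def __init__(self, prefix):
        with open(prefix + ".idx", "rb") as f:
            self.offsets = pickle.load(f)
        self.data_file = open(prefix + ".data", "rb")

    def __len__(self):
        return len(self.offsets) - 1

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        start, end = self.offsets[i], self.offsets[i + 1]
        self.data_file.seek(start)
        return pickle.loads(self.data_file.read(end - start))

    def close(self):
        self.data_file.close()


def demo():
    """Write three toy items, then read two of them back out of order."""
    with tempfile.TemporaryDirectory() as tmp:
        prefix = os.path.join(tmp, "train")
        writer = IndexedWriter(prefix)
        for k in range(3):
            writer.add({"id": k, "text": "utt%d" % k})
        writer.close()

        reader = IndexedReader(prefix)
        picked = [reader[2]["text"], reader[0]["text"]]
        n = len(reader)
        reader.close()
    return n, picked
```

Because each lookup touches only the bytes of the requested item, a reader like this can back a shuffled training loader without holding the full dataset in memory.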

Install Dependencies

# We tested on Linux/Ubuntu 18.04.
# Install Python 3.6+ first (Anaconda recommended).

export PYTHONPATH=.
# build a virtual env (recommended).
python -m venv venv
source venv/bin/activate
# install requirements.
pip install -U pip
pip install Cython numpy==1.19.1
pip install torch==1.9.0 # torch >= 1.9.0 recommended
pip install -r requirements.txt
sudo apt install -y sox libsox-fmt-mp3
bash mfa_usr/install_mfa.sh # install forced alignment tool
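
After running the commands above, it can be useful to confirm that the interpreter and PyTorch versions match the stated requirements (Python 3.6+, torch >= 1.9.0). The following is a small sketch of such a check; the helper names `version_tuple`, `meets_minimum`, and `check_environment` are hypothetical and not part of the repository. Versions are compared numerically rather than as strings, since a plain string comparison would wrongly rank "1.10.0" below "1.9.0".

```python
import re
import sys


def version_tuple(version):
    """Turn a version string such as '1.9.0+cu111' into (1, 9, 0).
    The local build suffix after '+' and trailing non-digits are ignored."""
    parts = []
    for piece in version.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum`, comparing numerically."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    # Pad the shorter tuple with zeros so that '1.9' equals '1.9.0'.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(min_torch="1.9.0"):
    """Collect human-readable problems; an empty list means the checks passed."""
    problems = []
    if sys.version_info < (3, 6):
        problems.append("Python 3.6+ is required")
    try:
        import torch
    except ImportError:
        problems.append("torch is not installed")
    else:
        if not meets_minimum(torch.__version__, min_torch):
            problems.append("torch %s is older than %s" % (torch.__version__, min_torch))
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Run it inside the activated virtual environment; no output means both checks passed.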

Documents

About this framework
Run PortaSpeech
Run DiffSpeech

Citation

If you find this useful for your research, please cite the following papers:

PortaSpeech

@article{ren2021portaspeech,
  title={PortaSpeech: Portable and High-Quality Generative Text-to-Speech},
  author={Ren, Yi and Liu, Jinglin and Zhao, Zhou},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  year={2021}
}

DiffSpeech

@article{liu2021diffsinger,
  title={Diffsinger: Singing voice synthesis via shallow diffusion mechanism},
  author={Liu, Jinglin and Li, Chengxi and Ren, Yi and Chen, Feiyang and Liu, Peng and Zhao, Zhou},
  journal={arXiv preprint arXiv:2105.02446},
  volume={2},
  year={2021}
}