
NATSpeech: A Non-Autoregressive Text-to-Speech Framework


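This repository contains the official PyTorch implementations of:

PortaSpeech: Portable and High-Quality Generative Text-to-Speech (NeurIPS 2021)
Demo page | Hugging Face demo
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (DiffSpeech) (AAAI 2022)
Demo page | Project page | Hugging Face demo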

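Key Features

This framework implements the following features:

Data processing for non-autoregressive text-to-speech using the Montreal Forced Aligner.
A convenient and scalable training framework.
A simple but efficient random-access dataset implementation.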
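
To illustrate what a random-access dataset can look like, here is a minimal sketch, not the repository's actual implementation. It assumes a two-file layout: one file holding pickled items back to back, and an index file holding their byte offsets. The class names `OffsetDatasetWriter` and `OffsetDataset` and the `.data`/`.idx` suffixes are hypothetical, chosen only for this example.

```python
import os
import pickle
import tempfile


class OffsetDatasetWriter:
    """Hypothetical writer: appends pickled items to `<path>.data`
    and stores their byte offsets in `<path>.idx`."""

    def __init__(self, path):
        self.path = path
        self.data_file = open(path + ".data", "wb")
        self.offsets = [0]  # offsets[i] is where item i starts

    def add_item(self, item):
        payload = pickle.dumps(item)
        self.data_file.write(payload)
        self.offsets.append(self.offsets[-1] + len(payload))

    def finalize(self):
        self.data_file.close()
        with open(self.path + ".idx", "wb") as f:
            pickle.dump(self.offsets, f)


class OffsetDataset:
    """Hypothetical reader: fetches item i with one seek and one read,
    without loading the rest of the data file."""

    def __init__(self, path):
        with open(path + ".idx", "rb") as f:
            self.offsets = pickle.load(f)
        self.data_file = open(path + ".data", "rb")

    def __len__(self):
        return len(self.offsets) - 1

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        start, end = self.offsets[i], self.offsets[i + 1]
        self.data_file.seek(start)
        return pickle.loads(self.data_file.read(end - start))

    def close(self):
        self.data_file.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        prefix = os.path.join(tmp, "train")
        writer = OffsetDatasetWriter(prefix)
        for i in range(3):
            writer.add_item({"item_name": f"utt{i}", "frames": list(range(i))})
        writer.finalize()

        dataset = OffsetDataset(prefix)
        print(len(dataset), dataset[2]["item_name"])
        dataset.close()
```

Because each lookup touches only the bytes of the requested item, a shuffled data loader can read samples in any order without holding the whole corpus in memory.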

Install Dependencies

## We tested on Linux/Ubuntu 18.04.
## Install Python 3.6+ first (Anaconda recommended).

export PYTHONPATH=.
# build a virtual env (recommended).
python -m venv venv
source venv/bin/activate
# install requirements.
pip install -U pip
pip install Cython numpy==1.19.1
pip install torch==1.9.0 # torch >= 1.9.0 recommended
pip install -r requirements.txt
sudo apt install -y sox libsox-fmt-mp3
bash mfa_usr/install_mfa.sh # install forced alignment tool
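
After running the commands above, a quick sanity check of the environment can catch a missing package early. The following is a minimal sketch, not part of the repository: the function name `check_environment` is hypothetical, and the default module list (`numpy`, `torch`) simply mirrors the packages pinned in the install steps.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 6), modules=("numpy", "torch")):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    # Compare only (major, minor) against the required minimum.
    if sys.version_info[:2] < tuple(min_python):
        required = ".".join(str(part) for part in min_python)
        found = ".".join(str(part) for part in sys.version_info[:2])
        problems.append(f"Python {required}+ required, found {found}")
    # find_spec returns None when a top-level module cannot be located.
    for name in modules:
        if importlib.util.find_spec(name) is None:
            problems.append(f"module not importable: {name}")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print(issue)
    if not issues:
        print("environment looks OK")
```

This only confirms that the interpreter version and imports resolve inside the active virtual environment; it does not verify package versions or the forced-alignment tool.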

Documentation

About the framework
Run PortaSpeech
Run DiffSpeech

Citation

If you find this useful for your research, please cite the following papers:

PortaSpeech

@article{ren2021portaspeech,
  title = {PortaSpeech: Portable and High-Quality Generative Text-to-Speech},
  author = {Ren, Yi and Liu, Jinglin and Zhao, Zhou},
  journal = {Advances in Neural Information Processing Systems},
  volume = {34},
  year = {2021}
}

DiffSpeech

@article{liu2021diffsinger,
  title = {Diffsinger: Singing voice synthesis via shallow diffusion mechanism},
  author = {Liu, Jinglin and Li, Chengxi and Ren, Yi and Chen, Feiyang and Liu, Peng and Zhao, Zhou},
  journal = {arXiv preprint arXiv:2105.02446},
  volume = {2},
  year = {2021}
}