2021-11-06: The code structure has been updated to make it easier to understand. It may contain latent bugs for now; test training will follow later.

~~2021-11-01: The code will be updated and made easier to use later.~~

VoiceFixer

VoiceFixer is a framework for general speech restoration. We aim to restore severely degraded speech and historical speech.

VoiceFixer
- Materials
- Usage
  - Environment (do this first)
  - VoiceFixer for general speech restoration
  - ResUNet for general speech restoration
  - ResUNet for single-task speech restoration
- Citation

Materials

- arXiv preprint: https://arxiv.org/abs/2109.13731
- The demo page compares single-task speech restoration, general speech restoration, and VoiceFixer.
- We wrote a pip package for VoiceFixer.
- Datasets used in this repo: training and test datasets.

Usage

Environment (do this first)

# Download the datasets and prepare the running environment
git clone https://github.com/haoheliu/voicefixer_main.git
cd voicefixer_main
source init.sh

VoiceFixer for general speech restoration

Here we take vf_unet (VoiceFixer with UNet as the analysis module) as an example.

Training

# Pass a configuration file to the training script
python3 train_gsr_voicefixer.py -c config/vctk_base_voicefixer_unet.json # you can modify the configuration file to personalize your training
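
One way to personalize a run without editing the shipped file is to copy the configuration and override a few values. The following is a minimal sketch using only the Python standard library. It assumes the configuration is a JSON object whose settings of interest are top-level keys; the helper name `write_custom_config` is hypothetical, and any key you override (for example a batch size) must be looked up in the actual configuration file, since its contents are not described here.

```python
import json
from pathlib import Path


def write_custom_config(base_path, out_path, overrides):
    """Copy a JSON config, apply top-level overrides, and save the result."""
    config = json.loads(Path(base_path).read_text(encoding="utf-8"))
    # Assumption: the settings to change are top-level keys of the JSON object.
    config.update(overrides)
    Path(out_path).write_text(json.dumps(config, indent=4), encoding="utf-8")
    return config
```

The file written to `out_path` can then be passed to the training script with `-c` in place of the original configuration.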

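You can check the logs directory for checkpoints, logs, and validation results.

Evaluation

Evaluation runs automatically on all test sets and generates .csv files.

For example, to evaluate on all test sets (the default):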

python3 eval_gsr_voicefixer.py \
    --config <path-to-the-config-file> \
    --ckpt <path-to-the-checkpoint>

For example, to evaluate only on the GSR test set:

python3 eval_gsr_voicefixer.py \
    --config <path-to-the-config-file> \
    --ckpt <path-to-the-checkpoint> \
    --testset general_speech_restoration \
    --description general_speech_restoration_eval

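There are generally seven test sets that can be passed to --testset, including:

- base: all test sets
- clip: test set of speech with clipping thresholds of 0.1, 0.25, and 0.5
- reverb: test set of reverberant speech
- general_speech_restoration: test set of speech containing all kinds of random distortions
- enhancement: test set of noisy speech
- speech_super_resolution: test set of low-resolution speech with sampling rates of 2 kHz, 4 kHz, 8 kHz, 16 kHz, and 24 kHz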
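
To make the clipping thresholds concrete: clipping is commonly modelled as hard-limiting every sample to a fixed magnitude. The sketch below assumes samples normalized to [-1, 1] and that a threshold is an absolute amplitude limit. How this repository actually generates its clipped test data is not described here, so the function `hard_clip` and the synthetic sine input are purely illustrative.

```python
import math


def hard_clip(samples, threshold):
    """Limit every sample to the range [-threshold, threshold]."""
    return [max(-threshold, min(threshold, s)) for s in samples]


# A 440 Hz sine sampled at 16 kHz (both values are arbitrary illustrative choices).
sine = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(160)]

for threshold in (0.1, 0.25, 0.5):
    clipped = hard_clip(sine, threshold)
    # Count the samples whose magnitude exceeded the threshold and were altered.
    changed = sum(1 for a, b in zip(sine, clipped) if a != b)
    print(f"threshold={threshold}: {changed} of {len(sine)} samples clipped")
```

Under this model, a lower threshold flattens more of the waveform, so 0.1 is the most severe of the three settings.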

If you only want to evaluate on a small portion of the data, for example 10 utterances, pass the number to the --limit_numbers argument.

python3 eval_gsr_voicefixer.py \
    --config <path-to-the-config-file> \
    --ckpt <path-to-the-checkpoint> \
    --limit_numbers 10
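
When running several evaluations in a row, it can help to assemble the command line programmatically. The following sketch only uses the flags shown above (`--config`, `--ckpt`, `--testset`, `--description`, `--limit_numbers`); the helper name `build_eval_command` and the example paths are hypothetical placeholders.

```python
import shlex


def build_eval_command(config, ckpt, testset=None, description=None,
                       limit_numbers=None, script="eval_gsr_voicefixer.py"):
    """Return the argument list for one evaluation run."""
    cmd = ["python3", script, "--config", config, "--ckpt", ckpt]
    # Optional flags are appended only when a value is given.
    if testset is not None:
        cmd += ["--testset", testset]
    if description is not None:
        cmd += ["--description", description]
    if limit_numbers is not None:
        cmd += ["--limit_numbers", str(limit_numbers)]
    return cmd


# Placeholder paths: replace them with your own config and checkpoint.
cmd = build_eval_command("path/to/config.json", "path/to/model.ckpt",
                         testset="general_speech_restoration", limit_numbers=10)
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the evaluation.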

The evaluation results are written to the exp_results folder.
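
The generated .csv files can be gathered with a few lines of standard-library Python. This sketch assumes only that the results are CSV files with a header row located somewhere under exp_results; the file names and column names are not specified here, so the hypothetical helper `load_results` does not depend on them.

```python
import csv
from pathlib import Path


def load_results(results_dir="exp_results"):
    """Map each .csv file under results_dir to its rows (as dictionaries)."""
    results = {}
    for csv_path in sorted(Path(results_dir).rglob("*.csv")):
        with csv_path.open(newline="", encoding="utf-8") as handle:
            results[str(csv_path)] = list(csv.DictReader(handle))
    return results


# Print a short summary if the folder exists in the current directory.
if Path("exp_results").is_dir():
    for path, rows in load_results().items():
        print(f"{path}: {len(rows)} rows")
```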

ResUNet for general speech restoration

Training

# Pass a configuration file to the training script
python3 train_gsr_voicefixer.py -c config/vctk_base_voicefixer_unet.json

You can check the logs directory for checkpoints, logs, and validation results.

Evaluation (similar to the VoiceFixer evaluation)

python3 eval_ssr_unet.py \
    --config <path-to-the-config-file> \
    --ckpt <path-to-the-checkpoint> \
    --limit_numbers <int-test-only-on-a-few-utterance> \
    --testset <the-testset-you-want-to-use> \
    --description <describe-this-test>

ResUNet for single-task speech restoration

Training

Denoising

# Pass a configuration file to the training script
python3 train_ssr_unet.py -c config/vctk_base_ssr_unet_denoising.json

Dereverberation

# Pass a configuration file to the training script
python3 train_ssr_unet.py -c config/vctk_base_ssr_unet_dereverberation.json

Super-resolution

# Pass a configuration file to the training script
python3 train_ssr_unet.py -c config/vctk_base_ssr_unet_super_resolution.json

Declipping

# Pass a configuration file to the training script
python3 train_ssr_unet.py -c config/vctk_base_ssr_unet_declipping.json

You can check the logs directory for checkpoints, logs, and validation results.

Evaluation (similar to the VoiceFixer evaluation)
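
The four single-task training commands above differ only in the configuration file, so they can be generated from a small table. This is a sketch for convenience, not part of the repository: the task keys and the helper `training_command` are hypothetical names, while the script name and config paths are the ones listed above.

```python
import shlex

# Config paths as listed above, keyed by an illustrative task name.
TASK_CONFIGS = {
    "denoising": "config/vctk_base_ssr_unet_denoising.json",
    "dereverberation": "config/vctk_base_ssr_unet_dereverberation.json",
    "super_resolution": "config/vctk_base_ssr_unet_super_resolution.json",
    "declipping": "config/vctk_base_ssr_unet_declipping.json",
}


def training_command(task):
    """Return the argument list that trains the single-task model for `task`."""
    return ["python3", "train_ssr_unet.py", "-c", TASK_CONFIGS[task]]


# Print one shell-ready command per task.
for task in TASK_CONFIGS:
    print(shlex.join(training_command(task)))
```

Each list can be run one after another with `subprocess.run(training_command(task), check=True)` if sequential training on a single machine is acceptable.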

python3 eval_ssr_unet.py \
    --config <path-to-the-config-file> \
    --ckpt <path-to-the-checkpoint> \
    --limit_numbers <int-test-only-on-a-few-utterance> \
    --testset <the-testset-you-want-to-use> \
    --description <describe-this-test>

Citation

@misc{liu2021voicefixer,
    title={VoiceFixer: Toward General Speech Restoration With Neural Vocoder},
    author={Haohe Liu and Qiuqiang Kong and Qiao Tian and Yan Zhao and DeLiang Wang and Chuanzeng Huang and Yuxuan Wang},
    year={2021},
    eprint={2109.13731},
    archivePrefix={arXiv},
    primaryClass={cs.SD}
}