
voicefixer_main


2021-11-06: I have just updated the code structure to make it easier to understand. It may contain bugs for now. I will run some test training later.

~~2021-11-01: I will update the code and make it easier to use later.~~

VoiceFixer

VoiceFixer is a framework for general speech restoration. We aim to restore severely degraded speech and historical speech.

VoiceFixer
- Materials
- Usage
  - Environment (do this first)
  - VoiceFixer for general speech restoration
  - ResUNet for general speech restoration
  - ResUNet for single-task speech restoration
- Citation

Materials

Arxiv preprint: https://arxiv.org/abs/2109.13731
The demo page contains a comparison between single-task speech restoration, general speech restoration, and VoiceFixer.
We wrote a pip package for VoiceFixer.
The datasets we use in this repo: training and test datasets

Usage

Environment (do this first)

# Download dataset and prepare running environment
git clone https://github.com/haoheliu/voicefixer_main.git
cd voicefixer_main
source init.sh

VoiceFixer for general speech restoration

Here we take VF_UNet (VoiceFixer with UNet as the analysis module) as an example.

Train

# pass in a configuration file to the training script
python3 train_gsr_voicefixer.py -c config/vctk_base_voicefixer_unet.json # you can modify the configuration file to personalize your training

You can check the logs directory for checkpoints, logging, and validation results.

Evaluation

The evaluation runs automatically and generates a .csv file for all testsets.

For example, to evaluate on all testsets (the default):

python3 eval_gsr_voicefixer.py \
                    --config <path-to-the-config-file> \
                    --ckpt <path-to-the-checkpoint>

For example, to evaluate only on the GSR testset:

python3 eval_gsr_voicefixer.py \
                    --config <path-to-the-config-file> \
                    --ckpt <path-to-the-checkpoint> \
                    --testset general_speech_restoration \
                    --description general_speech_restoration_eval

There are generally seven testsets you can pass to --testset, including:

- base: all testsets
- clip: testset with speech clipped at thresholds of 0.1, 0.25, and 0.5
- reverb: testset with reverberant speech
- general_speech_restoration: testset with speech that contains all kinds of random distortions
- enhancement: testset with noisy speech
- speech_super_resolution: testset with low-resolution speech at sampling rates of 2 kHz, 4 kHz, 8 kHz, 16 kHz, and 24 kHz
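
To make the clipping thresholds concrete, here is a minimal sketch of hard clipping in plain Python. It assumes the waveform is normalized to [-1, 1] and that the threshold is an absolute amplitude limit. The function name `hard_clip` is hypothetical, and the code is an illustration rather than the repository's data pipeline.

```python
def hard_clip(samples, threshold):
    """Limit every sample to the range [-threshold, threshold]."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return [max(-threshold, min(threshold, s)) for s in samples]


# A toy waveform, assumed to be normalized to [-1, 1].
waveform = [0.0, 0.3, 0.8, -0.6, -0.05]

for threshold in (0.1, 0.25, 0.5):  # thresholds named in the testset description
    print(threshold, hard_clip(waveform, threshold))
```

The lower the threshold, the more of the waveform is flattened, so 0.1 is the most severe of the three conditions.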

If you would like to evaluate on only a small portion of the data, for example 10 utterances, pass that number to the --limit_numbers argument.

python3 eval_gsr_voicefixer.py \
                    --config <path-to-the-config-file> \
                    --ckpt <path-to-the-checkpoint> \
                    --limit_numbers 10
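
When running several evaluations, it can be convenient to assemble the command from Python. The sketch below uses only the flags shown in this section. The helper name `build_eval_command` and the checkpoint path are hypothetical placeholders, not part of the repository.

```python
import shlex


def build_eval_command(config, ckpt, testset=None, description=None, limit_numbers=None):
    """Return the argument list for eval_gsr_voicefixer.py (hypothetical helper)."""
    cmd = ["python3", "eval_gsr_voicefixer.py", "--config", config, "--ckpt", ckpt]
    if testset is not None:
        cmd += ["--testset", testset]
    if description is not None:
        cmd += ["--description", description]
    if limit_numbers is not None:
        cmd += ["--limit_numbers", str(limit_numbers)]
    return cmd


# Placeholder paths: replace them with your own config file and checkpoint.
cmd = build_eval_command(
    config="config/vctk_base_voicefixer_unet.json",
    ckpt="path/to/checkpoint.ckpt",
    testset="general_speech_restoration",
    limit_numbers=10,
)
print(shlex.join(cmd))
# To launch the evaluation, pass the list to subprocess.run(cmd, check=True).
```

Building the command as a list avoids shell quoting problems when paths contain spaces.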

The evaluation results are written to the exp_results folder.
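
A minimal sketch for inspecting those results from Python follows. It assumes the .csv files sit somewhere under exp_results and makes no assumption about their column names. The helper `load_result_tables` is hypothetical.

```python
import csv
from pathlib import Path


def load_result_tables(results_dir="exp_results"):
    """Map each CSV path under results_dir to its rows (a list of dicts)."""
    root = Path(results_dir)
    if not root.is_dir():
        return {}  # nothing has been evaluated yet
    tables = {}
    for csv_path in sorted(root.rglob("*.csv")):
        with csv_path.open(newline="") as handle:
            tables[str(csv_path)] = list(csv.DictReader(handle))
    return tables


# Print a short summary of every result file that was found.
for path, rows in load_result_tables().items():
    columns = list(rows[0].keys()) if rows else []
    print(f"{path}: {len(rows)} rows, columns={columns}")
```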

ResUNet for general speech restoration

Train

# pass in a configuration file to the training script
python3 train_gsr_voicefixer.py -c config/vctk_base_voicefixer_unet.json

You can check the logs directory for checkpoints, logging, and validation results.

Evaluation (similar to the VoiceFixer evaluation)

python3 eval_ssr_unet.py \
                    --config <path-to-the-config-file> \
                    --ckpt <path-to-the-checkpoint> \
                    --limit_numbers <int-test-only-on-a-few-utterance> \
                    --testset <the-testset-you-want-to-use> \
                    --description <describe-this-test>

ResUNet for single-task speech restoration

Train

Denoising

# pass in a configuration file to the training script
python3 train_ssr_unet.py -c config/vctk_base_ssr_unet_denoising.json

Dereverberation

# pass in a configuration file to the training script
python3 train_ssr_unet.py -c config/vctk_base_ssr_unet_dereverberation.json

Super-resolution

# pass in a configuration file to the training script
python3 train_ssr_unet.py -c config/vctk_base_ssr_unet_super_resolution.json

Declipping

# pass in a configuration file to the training script
python3 train_ssr_unet.py -c config/vctk_base_ssr_unet_declipping.json

You can check the logs directory for checkpoints, logging, and validation results.
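
The four single-task training commands above differ only in the config file. As a rough sketch, the mapping can be kept in one place and the command built per task. The dictionary `SSR_CONFIGS`, its task keys, and the helper `build_train_command` are hypothetical names. The script and config paths are the ones listed above.

```python
# Config files named in the training commands above, keyed by task.
SSR_CONFIGS = {
    "denoising": "config/vctk_base_ssr_unet_denoising.json",
    "dereverberation": "config/vctk_base_ssr_unet_dereverberation.json",
    "super_resolution": "config/vctk_base_ssr_unet_super_resolution.json",
    "declipping": "config/vctk_base_ssr_unet_declipping.json",
}


def build_train_command(task):
    """Return the argument list that trains ResUNet on one task (hypothetical helper)."""
    if task not in SSR_CONFIGS:
        raise ValueError(f"unknown task {task!r}; choose from {sorted(SSR_CONFIGS)}")
    return ["python3", "train_ssr_unet.py", "-c", SSR_CONFIGS[task]]


# Print the command for every task; run one with subprocess.run(cmd, check=True).
for task in SSR_CONFIGS:
    print(" ".join(build_train_command(task)))
```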

Evaluation (similar to the VoiceFixer evaluation)

python3 eval_ssr_unet.py \
                    --config <path-to-the-config-file> \
                    --ckpt <path-to-the-checkpoint> \
                    --limit_numbers <int-test-only-on-a-few-utterance> \
                    --testset <the-testset-you-want-to-use> \
                    --description <describe-this-test>

Citation

@misc{liu2021voicefixer,
      title={VoiceFixer: Toward General Speech Restoration With Neural Vocoder},
      author={Haohe Liu and Qiuqiang Kong and Qiao Tian and Yan Zhao and DeLiang Wang and Chuanzeng Huang and Yuxuan Wang},
      year={2021},
      eprint={2109.13731},
      archivePrefix={arXiv},
      primaryClass={cs.SD}
}