voicefixer_main

2021-11-06: I have just updated the code structure to make it easier to understand. It may contain bugs for now. I will run test training later.

~~2021-11-01: I will update the code and make it easier to use later.~~

VoiceFixer

VoiceFixer is a framework for general speech restoration. We aim at the restoration of severely degraded speech and historical speech.

VoiceFixer
- Materials
- Usage
  - Environment (do this first)
  - VoiceFixer for general speech restoration
  - ResUNet for general speech restoration
  - ResUNet for single task speech restoration
- Citation

Materials

- arXiv preprint: https://arxiv.org/abs/2109.13731
- The demo page contains comparisons between single task speech restoration, general speech restoration, and VoiceFixer.
- We wrote a pip package for VoiceFixer.
- Datasets we use: the training and test datasets.

Usage

Environment (do this first)

# Download dataset and prepare running environment
git clone https://github.com/haoheliu/voicefixer_main.git
cd voicefixer_main
source init.sh

VoiceFixer for general speech restoration

Here we take VF_UNet (VoiceFixer with UNet as the analysis module) as an example.

Training

# pass in a configuration file to the training script
python3 train_gsr_voicefixer.py -c config/vctk_base_voicefixer_unet.json # you can modify the configuration file to personalize your training
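
One way to personalize training without touching the shipped configuration is to derive a copy of it. The sketch below is not part of this repository: it assumes only that the configuration file is a JSON object, and the helper name `derive_config` is hypothetical.

```python
import json
from pathlib import Path


def derive_config(src, dst, overrides):
    """Copy the JSON config at src to dst, replacing the top-level keys in overrides."""
    config = json.loads(Path(src).read_text(encoding="utf-8"))
    # Shallow update: only top-level keys are replaced, nested objects are not merged.
    config.update(overrides)
    Path(dst).write_text(json.dumps(config, indent=4), encoding="utf-8")
    return config
```

The derived file can then be passed to the training script with `-c`. Which keys the configuration actually contains is not stated here, so inspect the original file before choosing overrides.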

You can check out the logs directory for checkpoints, logging, and validation results.
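
To pick a checkpoint for evaluation, a small helper can locate the most recently written file under the logs directory. This is a sketch under assumptions: the `.ckpt` suffix and the helper name `latest_file` are hypothetical, and the actual layout of the logs directory is not described here.

```python
from pathlib import Path


def latest_file(root, suffix=".ckpt"):
    """Return the most recently modified file under root with the given suffix, or None."""
    # The ".ckpt" default is an assumption; adjust it to the files you actually find.
    candidates = [p for p in Path(root).rglob("*" + suffix) if p.is_file()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

For example, `latest_file("logs")` returns a `Path` that can be passed to `--ckpt`, or `None` if nothing matches.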

Evaluation

Automatically evaluate and generate a .csv file on all test sets.

For example, if you would like to evaluate on all test sets (the default):

python3 eval_gsr_voicefixer.py \
    --config <path-to-the-config-file> \
    --ckpt <path-to-the-checkpoint>

For example, if you would like to evaluate on the GSR test set:

python3 eval_gsr_voicefixer.py \
    --config <path-to-the-config-file> \
    --ckpt <path-to-the-checkpoint> \
    --testset general_speech_restoration \
    --description general_speech_restoration_eval

In general, there are seven test sets that you can pass to --testset. Six are listed here:

- base: all test sets
- clip: test set with speech clipped at thresholds of 0.1, 0.25, and 0.5
- reverb: test set with reverberant speech
- general_speech_restoration: test set with speech that contains all kinds of random distortions
- enhancement: test set with noisy speech
- speech_super_resolution: test set with low-resolution speech at sampling rates of 2 kHz, 4 kHz, 8 kHz, 16 kHz, and 24 kHz
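
To make two of these degradations concrete, the sketch below illustrates hard clipping at a threshold and the bandwidth implied by a sampling rate. It assumes that samples are normalized to [-1, 1] and that clipping means hard amplitude limiting; how the test sets were actually generated is not specified here, and both function names are hypothetical.

```python
def hard_clip(samples, threshold):
    """Limit every sample to the range [-threshold, threshold]."""
    return [max(-threshold, min(threshold, s)) for s in samples]


def nyquist_hz(sample_rate_hz):
    """Highest frequency representable at the given sampling rate (half the rate)."""
    return sample_rate_hz / 2
```

For instance, `hard_clip([0.0, 0.3, 0.8, -0.6, -0.05], 0.25)` gives `[0.0, 0.25, 0.25, -0.25, -0.05]`, and `nyquist_hz(2000)` gives `1000.0`, so speech sampled at 2 kHz carries no content above 1 kHz.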

If you would like to evaluate on a small portion of the data, e.g. 10 utterances, you can pass the number to the --limit_numbers argument:

python3 eval_gsr_voicefixer.py \
    --config <path-to-the-config-file> \
    --ckpt <path-to-the-checkpoint> \
    --limit_numbers 10
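
When running several evaluations from Python, the command lines above can be assembled programmatically. The following is a minimal sketch rather than part of the repository: `build_eval_command` is a hypothetical helper, and it uses only the flags shown in this section.

```python
def build_eval_command(config, ckpt, testset=None, description=None,
                       limit_numbers=None, script="eval_gsr_voicefixer.py"):
    """Build the argument list for the evaluation script, adding only the flags that are set."""
    cmd = ["python3", script, "--config", config, "--ckpt", ckpt]
    if testset is not None:
        cmd += ["--testset", testset]
    if description is not None:
        cmd += ["--description", description]
    if limit_numbers is not None:
        # Command-line arguments must be strings.
        cmd += ["--limit_numbers", str(limit_numbers)]
    return cmd
```

The returned list can be handed to `subprocess.run(cmd, check=True)` from the repository root. Passing a list instead of a single string avoids shell quoting problems with paths that contain spaces.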

The evaluation results are written to the exp_results folder.
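
The generated .csv files can be inspected with the standard library. This sketch assumes only that each file has a header row; the file names and column names are not documented here, so nothing specific is hard-coded, and `load_result_rows` is a hypothetical helper.

```python
import csv
from pathlib import Path


def load_result_rows(results_dir="exp_results"):
    """Map each .csv file under results_dir to its rows, read as dicts keyed by the header."""
    results = {}
    for path in sorted(Path(results_dir).rglob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            results[str(path)] = list(csv.DictReader(handle))
    return results
```

All values are returned as strings, so convert metric columns with `float()` before comparing or averaging them.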

ResUNet for general speech restoration

Training

# pass in a configuration file to the training script
python3 train_gsr_voicefixer.py -c config/vctk_base_voicefixer_unet.json

You can check out the logs directory for checkpoints, logging, and validation results.

Evaluation (similar to the VoiceFixer evaluation)

python3 eval_ssr_unet.py \
    --config <path-to-the-config-file> \
    --ckpt <path-to-the-checkpoint> \
    --limit_numbers <int-test-only-on-a-few-utterance> \
    --testset <the-testset-you-want-to-use> \
    --description <describe-this-test>

ResUNet for single task speech restoration

Training

Denoising

# pass in a configuration file to the training script
python3 train_ssr_unet.py -c config/vctk_base_ssr_unet_denoising.json

Dereverberation

# pass in a configuration file to the training script
python3 train_ssr_unet.py -c config/vctk_base_ssr_unet_dereverberation.json

Super Resolution

# pass in a configuration file to the training script
python3 train_ssr_unet.py -c config/vctk_base_ssr_unet_super_resolution.json

Declipping

# pass in a configuration file to the training script
python3 train_ssr_unet.py -c config/vctk_base_ssr_unet_declipping.json
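
The four single task training runs differ only in the configuration file, so they can be queued from one place. The sketch below simply mirrors the commands above; `SSR_CONFIGS` and `build_train_commands` are hypothetical names, not part of the repository.

```python
# Task name -> configuration file, taken from the training commands in this section.
SSR_CONFIGS = {
    "denoising": "config/vctk_base_ssr_unet_denoising.json",
    "dereverberation": "config/vctk_base_ssr_unet_dereverberation.json",
    "super_resolution": "config/vctk_base_ssr_unet_super_resolution.json",
    "declipping": "config/vctk_base_ssr_unet_declipping.json",
}


def build_train_commands(tasks=None):
    """Return one train_ssr_unet.py argument list per task (all tasks when tasks is None)."""
    selected = list(SSR_CONFIGS) if tasks is None else list(tasks)
    return [["python3", "train_ssr_unet.py", "-c", SSR_CONFIGS[task]] for task in selected]
```

Each entry can be run in turn with `subprocess.run(cmd, check=True)`. An unknown task name raises `KeyError`, which surfaces typos before any training starts.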

You can check out the logs directory for checkpoints, logging, and validation results.

Evaluation (similar to the VoiceFixer evaluation)

python3 eval_ssr_unet.py \
    --config <path-to-the-config-file> \
    --ckpt <path-to-the-checkpoint> \
    --limit_numbers <int-test-only-on-a-few-utterance> \
    --testset <the-testset-you-want-to-use> \
    --description <describe-this-test>

Citation

@misc{liu2021voicefixer,
    title={VoiceFixer: Toward General Speech Restoration With Neural Vocoder},
    author={Haohe Liu and Qiuqiang Kong and Qiao Tian and Yan Zhao and DeLiang Wang and Chuanzeng Huang and Yuxuan Wang},
    year={2021},
    eprint={2109.13731},
    archivePrefix={arXiv},
    primaryClass={cs.SD}
}