bark voice cloning HuBERT quantizer

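Bark voice cloning

Please read

This code works on Python 3.10. I have not tested it on other versions, and some older versions will run into problems.

Voice cloning with Bark in high quality?

It's possible now.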

examples_biden_example.mov

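How do I clone a voice?

For developers:

Code examples on the Hugging Face model page

For everyone:

audio-webui with Bark and voice cloning
Online Hugging Face voice cloning space
Interactive Python notebook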

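My cloned voices aren't very convincing. Why are other people's cloned voices better than mine?

Make sure these things are NOT in your voice input (in no particular order):

Noise (you can run a noise remover beforehand)
Music (unless you want music in the background)
A cut-off at the end (this will cause the model to try to continue the generation)
Under 1 second of training data (I personally suggest around 10 seconds for good potential, but I've had great results with 5 seconds as well)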

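What makes for good prompt audio? (in no particular order)

Clearly spoken
No weird background noises
Only one speaker
Audio which ends after a sentence ends
A regular/common voice (these usually have more success; complex voices can still be cloned, but not as well)
Around 10 seconds of data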
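
The length guidance above can be checked before cloning. The following is a minimal sketch, assuming the prompt is an uncompressed PCM WAV file readable by Python's standard `wave` module; the function name `check_prompt_wav` and the thresholds are illustrative choices taken from the 1, 5, and 10 second figures in the text, not part of this project. Noise, music, and cut-off endings cannot be detected this way and still need a listen.

```python
import wave

MIN_SECONDS = 1.0        # the text warns against clips under 1 second
REPORTED_GOOD_MIN = 5.0  # the text reports great results from about 5 seconds up


def check_prompt_wav(path):
    """Return a list of warnings for a PCM WAV file used as prompt audio."""
    with wave.open(path, "rb") as wav_file:
        channels = wav_file.getnchannels()
        duration = wav_file.getnframes() / wav_file.getframerate()

    warnings = []
    if duration < MIN_SECONDS:
        warnings.append(f"clip is {duration:.2f}s long; under 1 second is too short")
    elif duration < REPORTED_GOOD_MIN:
        warnings.append(f"clip is {duration:.2f}s long; 5 to 10 seconds is the suggested range")
    if channels > 1:
        warnings.append(f"clip has {channels} channels; it needs to be downmixed to mono")
    return warnings
```

Calling `check_prompt_wav('path/to/wav')` returns an empty list when neither check fires.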

Pretrained models

Official

Name	HuBERT Model	Quantizer Version	Epoch	Language	Dataset
quantifier_hubert_base_ls960.pth	HuBERT Base	0	3	ENG	GitMylo/bark-semantic-training
quantifier_hubert_base_ls960_14.pth	HuBERT Base	0	14	ENG	GitMylo/bark-semantic-training
quantifier_V1_hubert_base_ls960_23.pth	HuBERT Base	1	23	ENG	GitMylo/bark-semantic-training

Community

Author	Name	HuBERT Model	Quantizer Version	Epoch	Language	Dataset
HobisPL	polish-HuBERT-quantizer_8_epoch.pth	HuBERT Base	1	8	POL	Hobis/bark-polish-semantic-wav-training
C0untFloyd	german-HuBERT-quantizer_14_epoch.pth	HuBERT Base	1	14	GER	CountFloyd/bark-german-semantic-wav-training

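For developers: Implementing voice cloning in your Bark projects

Simply copy the files from this directory into your project.
The HuBERT manager contains methods to download HuBERT and the custom quantizer model.
Loading the CustomHubert should be pretty straightforward.
The notebook contains code that runs on either CUDA or CPU, instead of just CPU.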

from hubert.pre_kmeans_hubert import CustomHubert
import torchaudio

# Load the HuBERT model,
# checkpoint_path should work fine with data/models/hubert/hubert.pt for the default config
hubert_model = CustomHubert(checkpoint_path='path/to/checkpoint')

# Run the model to extract semantic features from an audio file, where wav is your audio file
wav, sr = torchaudio.load('path/to/wav')  # This is where you load your wav, with soundfile or torchaudio for example

if wav.shape[0] == 2:  # Stereo to mono if needed
    wav = wav.mean(0, keepdim=True)

semantic_vectors = hubert_model.forward(wav, input_sample_hz=sr)

Loading and running the custom kmeans

import torch
from hubert.customtokenizer import CustomTokenizer

# Load the CustomTokenizer model from a checkpoint
# With default config, you can use the pretrained model from huggingface
# With the default setup from HuBERTManager, this will be in data/models/hubert/tokenizer.pth
tokenizer = CustomTokenizer.load_from_checkpoint('data/models/hubert/tokenizer.pth')  # Automatically uses the right layers

# Process the semantic vectors from the previous HuBERT run (This works in batches, so you can send the entire HuBERT output)
semantic_tokens = tokenizer.get_token(semantic_vectors)

# Congratulations! You now have semantic tokens which can be used inside of a speaker prompt file.
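
To make the step from semantic vectors to semantic tokens more concrete, here is a toy sketch of what quantization means in general: each continuous feature vector is replaced by one discrete token id. This is not the repository's CustomTokenizer, and it makes no claim about how that model works internally. Nearest-centroid lookup is a stand-in technique chosen only for illustration, and every name and value below is hypothetical.

```python
import math


def nearest_centroid(vector, centroids):
    """Return the index of the centroid closest to `vector` (Euclidean distance)."""
    distances = [math.dist(vector, centroid) for centroid in centroids]
    return distances.index(min(distances))


def quantize(frames, centroids):
    """Map every feature frame to the id of its nearest centroid."""
    return [nearest_centroid(frame, centroids) for frame in frames]


# Toy 2-D data; real HuBERT features have far more dimensions per frame.
toy_centroids = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]
toy_frames = [(0.1, -0.2), (0.9, 1.2), (3.5, 0.3), (0.2, 0.1)]

tokens = quantize(toy_frames, toy_centroids)
print(tokens)  # [0, 1, 2, 0]
```

The output is one integer per input frame, which is the same shape of result that the tokenizer call above produces for a sequence of semantic vectors.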

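How do I train it myself?

Simply run the training commands.

A simple way to create semantic data and wavs for training is my script, bark-data-gen. Keep in mind that creating the wavs takes about as long as creating the semantics, if not longer, so generation can take a while.

For example, suppose you have a dataset made of zips containing audio files, one zip for the semantics and one for the wav files, inside a folder called "Literature".

You should run process.py --path Literature --mode prepare to extract all the data into one directory

You should run process.py --path Literature --mode prepare2 to create the HuBERT semantic vectors, ready for training

You should run process.py --path Literature --mode train to train the model

And once your model has trained enough, you can run process.py --path Literature --mode test to test the latest model.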
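
The first three steps above can be chained from a small driver script. This is a minimal sketch, assuming process.py sits in the current working directory and accepts exactly the `--path` and `--mode` flags shown above; the helper names `build_command` and `run_modes` and the `dry_run` switch are hypothetical conveniences, not part of the project. The test mode is left out on purpose, since the text says to run it only once the model has trained enough.

```python
import subprocess
import sys

# Modes taken from the commands above, in the order they are described.
PREP_AND_TRAIN_MODES = ["prepare", "prepare2", "train"]


def build_command(path, mode, script="process.py"):
    """Build the argument list for one process.py invocation."""
    return [sys.executable, script, "--path", path, "--mode", mode]


def run_modes(path, modes=PREP_AND_TRAIN_MODES, dry_run=True):
    """Print (dry_run=True) or execute (dry_run=False) each mode in order."""
    commands = [build_command(path, mode) for mode in modes]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops the sequence if a stage exits with an error.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    run_modes("Literature")  # dry run: only prints the three commands
```

Passing `dry_run=False` runs the stages for real, one after another, and stops at the first failing stage.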

Disclaimer

I am not responsible for audio generated using semantics created by this model. Just don't use it for illegal purposes.

Additional information

Version: 1.0.0
Type: Other source code
Updated: 2025-02-25
Size: 88.29KB
Source: GitHub
