TTS MultiLingual

1.0.0

Text-to-speech (TTS) with a Gradio plugin

Install TTS

pip install TTS

If you plan to write code against TTS or train models, clone the repository and install it locally.

git clone https://github.com/saba99/TTS-MultiLingual
cd TTS-MultiLingual
pip install -e .[all,dev,notebooks]  # Select the relevant extras

If you are on Ubuntu (Debian), you can also install with the following commands.

$ make system-deps  # intended to be used on Ubuntu (Debian). Let us know if you have a different OS.
$ make install
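
To confirm that the install worked before moving on, a quick check from Python can help. This is a minimal sketch using only the standard library. The helper name `installed_version` is illustrative and not part of TTS, and the sketch assumes the distribution is named `TTS`, as in the `pip install TTS` command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Assumption: the package was installed under the distribution name "TTS".
version = installed_version("TTS")
if version is None:
    print("TTS is not installed; run `pip install TTS` first.")
else:
    print(f"TTS {version} is installed.")
```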

Synthesizing speech with TTS

Python API

from TTS.api import TTS

# Running a multi-speaker and multi-lingual model

# List available TTS models and choose the first one
model_name = TTS.list_models()[0]
# Init TTS
tts = TTS(model_name)
# Run TTS
# Since this model is multi-speaker and multi-lingual, we must set the target speaker and the language
# Text to speech with a numpy output
wav = tts.tts("This is a test! This is also a test!!", speaker=tts.speakers[0], language=tts.languages[0])
# Text to speech to a file
tts.tts_to_file(text="Hello world!", speaker=tts.speakers[0], language=tts.languages[0], file_path="output.wav")

# Running a single speaker model

# Path of the audio file to write (placeholder; set it to any writable location)
OUTPUT_PATH = "output.wav"
# Init TTS with the target model name
tts = TTS(model_name="tts_models/de/thorsten/tacotron2-DDC", progress_bar=False, gpu=False)
# Run TTS
tts.tts_to_file(text="Ich bin eine Testnachricht.", file_path=OUTPUT_PATH)

# Example voice cloning with YourTTS in English, French and Portuguese:
tts = TTS(model_name="tts_models/multilingual/multi-dataset/your_tts", progress_bar=False, gpu=True)
tts.tts_to_file("This is voice cloning.", speaker_wav="my/cloning/audio.wav", language="en", file_path="output.wav")
tts.tts_to_file("C'est le clonage de la voix.", speaker_wav="my/cloning/audio.wav", language="fr-fr", file_path="output.wav")
tts.tts_to_file("Isso é clonagem de voz.", speaker_wav="my/cloning/audio.wav", language="pt-br", file_path="output.wav")


# Example voice conversion: convert the speaker of `source_wav` to the speaker of `target_wav`

tts = TTS(model_name="voice_conversion_models/multilingual/vctk/freevc24", progress_bar=False, gpu=True)
tts.voice_conversion_to_file(source_wav="my/source.wav", target_wav="my/target.wav", file_path="output.wav")

# Example voice cloning with a single-speaker TTS model combined with the voice conversion model.
# This way, you can clone voices by using any model in TTS.

tts = TTS("tts_models/de/thorsten/tacotron2-DDC")
tts.tts_with_vc_to_file(
    "Wie sage ich auf Italienisch, dass ich dich liebe?",
    speaker_wav="target/speaker.wav",
    file_path="output.wav"
)

# Example text to speech using [Coqui Studio](https://coqui.ai) models. You can use all of your available speakers in the studio.
# A [Coqui Studio](https://coqui.ai) API token is required. You can get it from the [account page](https://coqui.ai/account).
# Set the `COQUI_STUDIO_TOKEN` environment variable to use the API token.

# If you have a valid API token set, you will see the studio speakers as separate models in the list.
# The name format is coqui_studio/en/<studio_speaker_name>/coqui_studio
models = TTS().list_models()
# Init TTS with the target studio speaker
tts = TTS(model_name="coqui_studio/en/Torcull Diarmuid/coqui_studio", progress_bar=False, gpu=False)
# Run TTS
tts.tts_to_file(text="This is a test.", file_path=OUTPUT_PATH)
# Run TTS with emotion and speed control
tts.tts_to_file(text="This is a test.", file_path=OUTPUT_PATH, emotion="Happy", speed=1.5)
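
The model names used above all appear to follow a four-field pattern, `<type>/<language>/<dataset>/<model>`. The following is a minimal sketch, assuming that pattern holds, of how such names could be split and filtered by language. `ModelName`, `parse_model_name`, and `models_for_language` are illustrative helpers, not part of the TTS API.

```python
from typing import Iterable, List, NamedTuple


class ModelName(NamedTuple):
    model_type: str  # e.g. "tts_models" or "voice_conversion_models"
    language: str    # e.g. "de" or "multilingual"
    dataset: str     # e.g. "thorsten" or "multi-dataset"
    model: str       # e.g. "tacotron2-DDC"


def parse_model_name(name: str) -> ModelName:
    """Split a '<type>/<language>/<dataset>/<model>' string into its four fields."""
    parts = name.split("/")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected 4 non-empty '/'-separated fields, got: {name!r}")
    return ModelName(*parts)


def models_for_language(names: Iterable[str], language: str) -> List[str]:
    """Keep only the names whose language field equals `language`."""
    return [n for n in names if parse_model_name(n).language == language]


# Names taken from the examples above.
names = [
    "tts_models/de/thorsten/tacotron2-DDC",
    "tts_models/multilingual/multi-dataset/your_tts",
    "voice_conversion_models/multilingual/vctk/freevc24",
]
print(models_for_language(names, "multilingual"))
```

A list returned by `list_models()` could be passed in place of the hard-coded `names`, provided its entries are plain strings in this format.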

Output audio

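| Simple example | Simple example | Simple example |
| --- | --- | --- |
| Rainbow.mp4 | Windy.mp4 | Windy-2.mp4 |
| The rainbow is a meteorological phenomenon caused by reflection, refraction, and dispersion of light. | The driver learned his lesson. He will never ride in the wind again. | The people outside bend over. The wind makes it hard to walk. |

| Long example | Long example | Long example |
| --- | --- | --- |
| an.apple.pie.mp4 | cat.and.a.dog.mp4 | the.farmer.mp4 |
| The tree was full of red apples. The farmer was riding his brown horse. He stopped under the tree. He reached out and picked an apple off a branch. He bit into the raw apple. He enjoyed the apple. His horse turned its head to look at him. The farmer picked another apple off the tree. He gave it to the horse. The horse ate the raw apple. The horse enjoyed the apple. The farmer put a dozen apples into a bag. He rode the horse back home. He put the horse in the barn. He walked into his house. The cat rubbed up against his leg. He gave the cat a bowl of warm milk. | The black cat jumped up onto the chair. It looked down at the white dog. The dog was chewing on a bone. The cat jumped onto the dog. The dog kept chewing the bone. The cat played with the dog's tail. The dog kept chewing the bone. The cat jumped back onto the chair. It started licking its paws. The dog stood up. It looked at the cat. It licked the cat's fur. The cat licked the dog's nose. The dog went back to its bone. A boy ran through the room. He was wearing a yellow shirt. He almost ran into the chair. The cat jumped off the chair. The cat jumped onto the sofa. | The farmer drives a tractor. The tractor digs up the ground. He plants yellow corn in the ground. He plants the yellow corn in the spring. The corn grows in the summer. The rain helps the corn grow. If there is no rain, the corn dies. If there is a lot of rain, there is a lot of corn. He harvests the yellow corn in late summer. He sells the corn at his vegetable stand. He sells one ear for 25 cents. He sells four ears for $1. He sells all his corn in just one month. The neighbors love his corn. The corn is fresh. It is bright yellow. It is tasty. It is delicious. The birds love his corn, too. They don't pay for it. They eat it while it is still in the field. |

| Multilingual support: English | Multilingual support: French | Multilingual support: Dutch |
| --- | --- | --- |
| Rain.and.hail.mp4 | fr-rain.and.hail.mp4 | nl-rain.and.hail.mp4 |
| Dark clouds were in the sky. The sun went down. The weather got cold. The wind started to blow. Leaves blew off the trees. Paper flew through the air. People buttoned their jackets. The rain started to fall. At first it was quiet. Then it got louder. | Un arcoíris o arco iris es un fenómeno óptico y meteorológico que en la aparición en el cielo de un arco de luz multicolor | Een regenboog is een gekleurde cirkelboog die aan de hemel waargenomen kan worden als de, laagstaande |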

Command line `tts`

Single-speaker models

List the provided models:

```
$ tts --list_models
```

Get model info (for both tts_models and vocoder_models):

```
$ tts --model_info_by_name tts_models/tr/common-voice/glow-tts
```
```
$ tts --model_info_by_name vocoder_models/en/ljspeech/hifigan_v2
```

Run a TTS model by name, using its default vocoder model.

For example:

$ tts --text "Text for TTS" --model_name "tts_models/en/ljspeech/glow-tts" --out_path output/path/speech.wav

Multi-speaker models

Run your own multi-speaker TTS model:

$ tts --text "Text for TTS" --out_path output/path/speech.wav --model_path path/to/model.pth --config_path path/to/config.json --speakers_file_path path/to/speaker.json --speaker_idx <speaker_id>
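
When the `tts` command is driven from a script, building the argument list programmatically avoids shell-quoting mistakes. The sketch below uses only the flags that appear in the commands above. `build_tts_command` is a hypothetical helper, not something the package provides, and it performs no validation of which flag combinations the CLI accepts.

```python
import shlex
from typing import List, Optional


def build_tts_command(
    text: str,
    out_path: str,
    model_name: Optional[str] = None,
    model_path: Optional[str] = None,
    config_path: Optional[str] = None,
    speakers_file_path: Optional[str] = None,
    speaker_idx: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list for the `tts` CLI, skipping unset options."""
    cmd = ["tts", "--text", text, "--out_path", out_path]
    optional = {
        "--model_name": model_name,
        "--model_path": model_path,
        "--config_path": config_path,
        "--speakers_file_path": speakers_file_path,
        "--speaker_idx": speaker_idx,
    }
    for flag, value in optional.items():
        if value is not None:
            cmd += [flag, value]
    return cmd


cmd = build_tts_command(
    "Text for TTS",
    "output/path/speech.wav",
    model_name="tts_models/en/ljspeech/glow-tts",
)
# Print a shell-safe rendering of the command for inspection.
print(shlex.join(cmd))
```

To actually run the command, the list can be passed to `subprocess.run(cmd, check=True)`, which requires the `tts` executable to be on the `PATH`.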

Directory structure

|- notebooks/       (Jupyter Notebooks for model evaluation, parameter selection and data analysis.)
|- utils/           (common utilities.)
|- TTS
    |- bin/             (folder for all the executables.)
      |- train*.py                  (train your target model.)
      |- ...
    |- tts/             (text to speech models)
        |- layers/          (model layer definitions)
        |- models/          (model definitions)
        |- utils/           (model specific utilities.)
    |- speaker_encoder/ (Speaker Encoder models.)
        |- (same)
    |- vocoder/         (Vocoder models.)
        |- (same)
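
Since the training entry points sit under `TTS/bin/` and match `train*.py`, they can be discovered with a glob. This is a small sketch, assuming it is pointed at the root of a local clone laid out as shown above. `find_training_scripts` is an illustrative name, not part of the repository.

```python
from pathlib import Path
from typing import List


def find_training_scripts(repo_root: str) -> List[str]:
    """Return the sorted file names matching TTS/bin/train*.py under repo_root."""
    bin_dir = Path(repo_root) / "TTS" / "bin"
    # glob() yields nothing if the directory is missing, so the result is [].
    return sorted(p.name for p in bin_dir.glob("train*.py"))


# Assumption: the current directory is the root of the cloned repository.
for script in find_training_scripts("."):
    print(script)
```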
