podcast_tts

1.0.0

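Podcast TTS

podcast_tts is a Python library for generating podcasts and dialogues with text-to-speech (TTS). It supports multiple speakers, background music, and precise audio mixing for professional-quality results.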

Example Podcast

You can listen to a sample podcast below:

Example podcast_01.mp4

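Features

- Multi-speaker support: Generate dialogues with distinct speaker profiles.
- Premade voices: Use the premade speaker profiles included with the library (male1, male2, female2) or create custom profiles.
- Dynamic speaker generation: If a specified speaker does not exist, a new speaker profile is generated automatically and saved in the voices folder for future use.
- Consistent role assignment: Speaker profiles are assigned and reused by speaker name, so each role keeps the same voice.
- Channel-specific playback: Play audio on the left, right, or both channels for spatial separation.
- Text normalization: Text is normalized automatically, including contractions and special formatting cases.
- Background music integration: Add background music with fade-in/fade-out and volume control.
- MP3 and URL support: Use local MP3/WAV files, or download music from a URL with caching.
- Output formats: Save the generated audio as a WAV or MP3 file.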

Installation

# Make sure sox or ffmpeg is installed
brew install sox
# Install the package
pip install podcast_tts
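
Before installing, it can help to confirm that one of the required audio tools is actually on the PATH. The following is a minimal sketch using only the Python standard library; the helper name `find_audio_tools` is hypothetical and is not part of podcast_tts.

```python
import shutil


def find_audio_tools(candidates=("sox", "ffmpeg")):
    """Return a dict mapping each tool name to its resolved path, or None if missing."""
    return {name: shutil.which(name) for name in candidates}


if __name__ == "__main__":
    tools = find_audio_tools()
    for name, path in tools.items():
        print(f"{name}: {path or 'not found'}")
    if not any(tools.values()):
        print("Install sox or ffmpeg before using podcast_tts.")
```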

Usage

Generating audio for a single speaker

import asyncio
from podcast_tts import PodcastTTS

async def main():
    tts = PodcastTTS(speed=5)
    await tts.generate_tts(
        text="Hello! Welcome to our podcast.",
        speaker="male1",
        filename="output_audio.wav",
        channel="both"
    )

if __name__ == "__main__":
    asyncio.run(main())

Example: Generating a podcast with music

The generate_podcast method combines dialogue and background music for seamless podcast production.

import asyncio
from podcast_tts import PodcastTTS

async def main():
    tts = PodcastTTS(speed=5)

    # Define speakers and text
    texts = [
        {"male1": ["Welcome to the podcast!", "both"]},
        {"female2": ["Today, we discuss AI advancements.", "left"]},
        {"male2": ["Don't miss our exciting updates.", "right"]},
    ]

    # Define background music (local file or URL)
    music_config = ["https://example.com/background_music.mp3", 10, 3, 0.3]

    # Generate the podcast
    output_file = await tts.generate_podcast(
        texts=texts,
        music=music_config,
        filename="podcast_with_music.mp3",
        pause_duration=0.5,
        normalize=True
    )

    print(f"Podcast saved to: {output_file}")

if __name__ == "__main__":
    asyncio.run(main())
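
The `texts` argument in the example above is a list of single-key dictionaries, each mapping a speaker name to `[text, channel]`. When a script comes from another source, a small helper can build that layout and reject an invalid channel early. This is a sketch only: `build_texts` is a hypothetical helper, not part of podcast_tts, and the accepted channel names are assumed from the examples in this README ("left", "right", "both").

```python
# Channel names as they appear in this README's examples (an assumption).
VALID_CHANNELS = {"left", "right", "both"}


def build_texts(turns):
    """Convert (speaker, text, channel) tuples into the list-of-dicts layout."""
    texts = []
    for speaker, text, channel in turns:
        if channel not in VALID_CHANNELS:
            raise ValueError(f"Unknown channel {channel!r} for speaker {speaker!r}")
        texts.append({speaker: [text, channel]})
    return texts


texts = build_texts([
    ("male1", "Welcome to the podcast!", "both"),
    ("female2", "Today, we discuss AI advancements.", "left"),
])
```

The resulting list can then be passed as the `texts` argument shown above.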

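Music configuration:

[file/url, full_volume_duration, fade_duration, target_volume]
- file/url: Path to a local MP3/WAV file, or a URL to download.
- full_volume_duration: Time (in seconds) the music plays at full volume before the dialogue starts and after it ends.
- fade_duration: Time (in seconds) for the fade-in/fade-out effect.
- target_volume: Volume level during dialogue playback (0.0 to 1.0).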
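
To make the interaction of these three numbers concrete, here is a sketch of one plausible reading of the music volume over time: full volume for the intro, a linear fade down to the target volume, a constant level under the dialogue, then a fade back up and a full-volume outro. The function `music_gain`, the linear fade shape, and the exact timeline are assumptions for illustration; the library's actual mixing may differ.

```python
def music_gain(t, dialogue_duration, full_volume_duration, fade_duration, target_volume):
    """Return the assumed music gain (0.0 to 1.0) at time t, in seconds."""
    intro_end = full_volume_duration                 # end of full-volume intro
    duck_end = intro_end + fade_duration             # music has reached target_volume
    dialogue_end = duck_end + dialogue_duration      # dialogue finishes here
    restore_end = dialogue_end + fade_duration       # music is back at full volume
    total = restore_end + full_volume_duration       # end of full-volume outro

    if t < 0 or t > total:
        return 0.0
    if t < intro_end:
        return 1.0
    if t < duck_end:
        # Linear fade from 1.0 down to target_volume.
        progress = (t - intro_end) / fade_duration
        return 1.0 + (target_volume - 1.0) * progress
    if t < dialogue_end:
        return target_volume
    if t < restore_end:
        # Linear fade from target_volume back up to 1.0.
        progress = (t - dialogue_end) / fade_duration
        return target_volume + (1.0 - target_volume) * progress
    return 1.0


if __name__ == "__main__":
    # Values from the example config [url, 10, 3, 0.3], with a 20-second dialogue.
    for second in (0, 5, 11.5, 20, 34.5, 40):
        print(second, round(music_gain(second, 20, 10, 3, 0.3), 3))
```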

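Premade Voices

PodcastTTS includes the following premade speaker profiles:

male1
male2
female2

These profiles are included in the package's default_voices directory and can be used without any additional setup.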

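Dynamic Speaker Generation

When a specified speaker profile does not exist, the library automatically generates a new profile and saves it in the voices subfolder. This keeps each voice role consistent across the turns of a dialogue. For example: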

texts = [
    {"Narrator": ["Welcome to this exciting episode.", "left"]},
    {"Expert": ["Today, we'll explore AI's impact on healthcare.", "right"]},
]
# If "Narrator" or "Expert" profiles do not exist, they will be generated dynamically.

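Profiles are saved in the script's voices directory and are reused automatically whenever the same speaker name appears again, which keeps the voice consistent.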
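
The reuse behavior described here follows a common "load or create" caching pattern. The sketch below illustrates that pattern only; it is not the library's implementation. The function `load_or_create_profile` and the `generate` callback are hypothetical, and the profile content is treated as an opaque string because this README does not describe its format.

```python
from pathlib import Path


def load_or_create_profile(name, voices_dir, generate):
    """Return the stored profile for `name`, creating and saving it on first use."""
    folder = Path(voices_dir)
    folder.mkdir(parents=True, exist_ok=True)
    profile_path = folder / f"{name}.txt"

    if profile_path.exists():
        # Reuse the saved profile so the speaker keeps the same voice.
        return profile_path.read_text(encoding="utf-8")

    # First time this speaker name is seen: generate and persist a profile.
    profile = generate(name)
    profile_path.write_text(profile, encoding="utf-8")
    return profile
```

Calling the helper twice with the same name runs `generate` only once; the second call reads the saved file.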

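Loading an Existing Speaker Profile

You can load any speaker profile by specifying its filename without the .txt extension. Profiles are stored in the voices subfolder, so you do not need to give the path explicitly.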

# Assuming a speaker profile "Host.txt" exists in the voices subfolder
await tts.generate_tts("This is a test for an existing speaker.", "Host", "existing_speaker.wav")
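
To see which speaker names are available before calling generate_tts, the profile files can be listed directly. This is a minimal sketch that assumes profiles are plain .txt files in a folder named voices, as described above; `list_profiles` is a hypothetical helper, not part of podcast_tts.

```python
from pathlib import Path


def list_profiles(voices_dir="voices"):
    """Return the sorted speaker names found as .txt files in voices_dir."""
    folder = Path(voices_dir)
    if not folder.is_dir():
        return []
    # The speaker name is the filename without its .txt extension.
    return sorted(path.stem for path in folder.glob("*.txt"))


if __name__ == "__main__":
    print(list_profiles())
```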