Version 1.0.0

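Podcast TTS

podcast_tts is a Python library for generating podcasts and dialogues using text-to-speech (TTS). It supports multiple speakers, background music, and precise audio mixing for professional-quality results.

Example Podcast

You can listen to an example podcast below:

Example podcast_01.mp4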

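Features

Multi-speaker support: generate dialogues with distinct speaker profiles.
Premade voices: use the premade speaker profiles included with the library (male1, male2, female2) or create custom profiles.
Dynamic speaker generation: if a specified speaker does not exist, a new speaker profile is generated automatically and saved in the voices folder for future use.
Consistent role assignment: speaker profiles are assigned and reused by speaker name, so each character keeps the same voice.
Channel-specific playback: play audio on the left channel, the right channel, or both for spatial separation.
Text normalization: automatically normalizes text, handling contractions and special formatting cases.
Background music integration: add background music with fade-in/fade-out and volume control.
MP3 and URL support: use local MP3/WAV files, or download music from a URL with caching.
Output formats: save the generated audio as a WAV or MP3 file.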
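
To make the channel-specific playback feature concrete, here is a minimal sketch of how a mono signal could be placed on the left, right, or both stereo channels. It is an illustration only, not podcast_tts's actual implementation; the function name place_on_channel is hypothetical, and only the three channel values ("left", "right", "both") come from this README.

```python
from typing import List, Tuple


def place_on_channel(samples: List[float], channel: str) -> List[Tuple[float, float]]:
    """Turn mono samples into (left, right) stereo frames.

    Hypothetical helper: `channel` is "left", "right", or "both";
    the unused side is filled with silence (0.0).
    """
    if channel == "left":
        return [(s, 0.0) for s in samples]
    if channel == "right":
        return [(0.0, s) for s in samples]
    if channel == "both":
        return [(s, s) for s in samples]
    raise ValueError(f"unknown channel: {channel!r}")


mono = [0.1, -0.2, 0.3]
print(place_on_channel(mono, "left"))
print(place_on_channel(mono, "both"))
```

Placing speakers on different channels this way is what gives a listener the impression that the voices come from different positions.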

Installation

# Make sure sox or ffmpeg is installed (for example, with Homebrew)
brew install sox
# Install the package
pip install podcast_tts

Usage

Generating audio for a single speaker

import asyncio
from podcast_tts import PodcastTTS

async def main():
    tts = PodcastTTS(speed=5)
    await tts.generate_tts(
        text="Hello! Welcome to our podcast.",
        speaker="male1",
        filename="output_audio.wav",
        channel="both",
    )

if __name__ == "__main__":
    asyncio.run(main())

Example: generating a podcast with music

The generate_podcast method combines dialogue and background music for seamless podcast production.

import asyncio
from podcast_tts import PodcastTTS

async def main():
    tts = PodcastTTS(speed=5)

    # Define speakers and text
    texts = [
        {"male1": ["Welcome to the podcast!", "both"]},
        {"female2": ["Today, we discuss AI advancements.", "left"]},
        {"male2": ["Don't miss our exciting updates.", "right"]},
    ]

    # Define background music (local file or URL)
    music_config = ["https://example.com/background_music.mp3", 10, 3, 0.3]

    # Generate the podcast
    output_file = await tts.generate_podcast(
        texts=texts,
        music=music_config,
        filename="podcast_with_music.mp3",
        pause_duration=0.5,
        normalize=True,
    )

    print(f"Podcast saved to: {output_file}")

if __name__ == "__main__":
    asyncio.run(main())
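
In the example above, texts is a list of single-key dictionaries, each mapping a speaker name to a [text, channel] pair. The sketch below checks that shape and flattens it into (speaker, text, channel) tuples before any audio is generated. The helper flatten_turns and the VALID_CHANNELS set are hypothetical names, not part of podcast_tts, and the two-element payload is inferred only from the examples in this README.

```python
from typing import Dict, List, Tuple

# Channel names taken from the README examples.
VALID_CHANNELS = {"left", "right", "both"}


def flatten_turns(texts: List[Dict[str, List[str]]]) -> List[Tuple[str, str, str]]:
    """Validate the dialogue structure and return (speaker, text, channel) tuples."""
    turns = []
    for index, entry in enumerate(texts):
        if len(entry) != 1:
            raise ValueError(f"turn {index} must name exactly one speaker")
        speaker, payload = next(iter(entry.items()))
        if len(payload) != 2:
            raise ValueError(f"turn {index} must be [text, channel]")
        text, channel = payload
        if channel not in VALID_CHANNELS:
            raise ValueError(f"turn {index} has unknown channel {channel!r}")
        turns.append((speaker, text, channel))
    return turns


texts = [
    {"male1": ["Welcome to the podcast!", "both"]},
    {"female2": ["Today, we discuss AI advancements.", "left"]},
]
for speaker, text, channel in flatten_turns(texts):
    print(f"{speaker} [{channel}]: {text}")
```

Running a check like this first surfaces a malformed turn immediately, rather than partway through a long synthesis run.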

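Music configuration:

[file/url, full_volume_duration, fade_duration, target_volume]
- file/url: path to a local MP3/WAV file, or a URL to download.
- full_volume_duration: time (in seconds) the music plays at full volume before the dialogue starts and after it ends.
- fade_duration: duration (in seconds) of the fade-in/fade-out effect.
- target_volume: volume level during dialogue playback (0.0 to 1.0).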
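
One plausible reading of these four values is a volume envelope: the music plays at full volume, fades down to target_volume for the dialogue, then fades back up and finishes at full volume. The sketch below computes the music gain at a given time under that assumption. The function music_gain, the exact placement of the fades relative to the dialogue, and the file name music.mp3 are all hypothetical; the library's real mixing may differ.

```python
from typing import List, Union


def music_gain(t: float, dialogue_length: float,
               music_config: List[Union[str, float]]) -> float:
    """Return the assumed music gain (0.0 to 1.0) at time t, in seconds.

    Assumed timeline: full volume -> fade down -> dialogue at target
    volume -> fade up -> full volume. Outside the timeline the gain is 0.0.
    """
    _source, full, fade, target = music_config
    fade_down_end = full + fade
    dialogue_end = fade_down_end + dialogue_length
    fade_up_end = dialogue_end + fade
    total = fade_up_end + full

    if t < 0 or t > total:
        return 0.0
    if t <= full:
        return 1.0
    if t < fade_down_end:
        # Linear fade from full volume down to the target volume.
        progress = (t - full) / fade
        return 1.0 + (target - 1.0) * progress
    if t <= dialogue_end:
        return target
    if t < fade_up_end:
        # Linear fade from the target volume back up to full volume.
        progress = (t - dialogue_end) / fade
        return target + (1.0 - target) * progress
    return 1.0


config = ["music.mp3", 10, 3, 0.3]
for second in (0, 10, 11.5, 20, 34.5, 40):
    print(second, round(music_gain(second, 20, config), 3))
```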

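Premade voices

PodcastTTS includes the following premade speaker profiles:

male1
male2
female2

These profiles are included in the package's default_voices directory and can be used without any additional setup.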

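Dynamic speaker generation

When a specified speaker profile does not exist, the library automatically generates a new profile and saves it in the voices subfolder. This keeps each character's voice consistent across different turns of a conversation. For example: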

texts = [
    {"Narrator": ["Welcome to this exciting episode.", "left"]},
    {"Expert": ["Today, we'll explore AI's impact on healthcare.", "right"]},
]
# If "Narrator" or "Expert" profiles do not exist, they will be generated dynamically.

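Profiles are saved in the script's voices directory and are reused automatically whenever the same speaker name is used again, which keeps voices consistent.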
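
The generate-once, reuse-afterwards behavior described here follows a common load-or-create pattern, sketched below. This is not the library's code: load_or_create_profile and fake_generator are hypothetical, and the profile is treated as an opaque string because this README does not describe the profile file format. Only the voices folder and the .txt extension come from the README.

```python
import tempfile
from pathlib import Path
from typing import Callable


def load_or_create_profile(name: str, voices_dir: Path,
                           generate: Callable[[], str]) -> str:
    """Return the saved profile for `name`, generating and saving it if missing."""
    voices_dir.mkdir(parents=True, exist_ok=True)
    profile_path = voices_dir / f"{name}.txt"
    if profile_path.exists():
        # Reuse the stored profile so the speaker keeps the same voice.
        return profile_path.read_text(encoding="utf-8")
    profile = generate()
    profile_path.write_text(profile, encoding="utf-8")
    return profile


calls = []


def fake_generator() -> str:
    # Stand-in for real voice generation; records each invocation.
    calls.append(1)
    return "hypothetical-voice-data"


with tempfile.TemporaryDirectory() as tmp:
    voices = Path(tmp) / "voices"
    first = load_or_create_profile("Narrator", voices, fake_generator)
    second = load_or_create_profile("Narrator", voices, fake_generator)
    print(first == second, len(calls))  # True 1
```

The generator runs only on the first call; the second call reads the saved file, which is why the same name always maps to the same voice.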

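Loading existing speaker profiles

You can load any speaker profile by specifying its filename (without the .txt extension). Profiles are stored in the voices subfolder, so you do not need to specify the path explicitly.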

# Assuming a speaker profile "Host.txt" exists in the voices subfolder.
# Run this inside an async function, with tts being a PodcastTTS instance.
await tts.generate_tts("This is a test for an existing speaker.", "Host", "existing_speaker.wav")
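
To see which names can be passed as a speaker, it can help to list the profiles already saved. The sketch below is a hypothetical helper (list_profile_names is not part of podcast_tts) and assumes only what this README states: profiles are .txt files in a voices folder, and the speaker name is the filename without the extension.

```python
import tempfile
from pathlib import Path
from typing import List


def list_profile_names(voices_dir: Path) -> List[str]:
    """Return the sorted speaker names found as *.txt files in voices_dir."""
    if not voices_dir.is_dir():
        return []
    return sorted(path.stem for path in voices_dir.glob("*.txt"))


with tempfile.TemporaryDirectory() as tmp:
    voices = Path(tmp) / "voices"
    voices.mkdir()
    (voices / "Host.txt").write_text("hypothetical-voice-data", encoding="utf-8")
    (voices / "Expert.txt").write_text("hypothetical-voice-data", encoding="utf-8")
    print(list_profile_names(voices))  # ['Expert', 'Host']
```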