
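GnuspeechSA (Stand-Alone)

GnuspeechSA is a command-line articulatory synthesizer that converts text to speech.

GnuspeechSA is a C++ port of the TTS_Server from the original Gnuspeech system, which was developed by David R. Hill, Leonard Manzara, Craig Schock and contributors. It is based on the code at revision 672 of Gnuspeech's Subversion repository, downloaded on 2014-08-02. The source code was obtained from the directories: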

    nextstep/trunk/ObjectiveC/Monet.realtime
    nextstep/trunk/src/SpeechObject/postMonet/server.monet

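The software is written in multi-platform C++.

Gnuspeech

Gnuspeech is an articulatory speech synthesizer. The project implemented the first articulatory text-to-speech (TTS) software (as far as I know). It was developed in the 90's, around 30 years ago (as of 2023). The synthesizer was formerly closed-source commercial software, available only for NeXT computers. After the demise of NeXT, the software was donated to the GNU project. It uses a simple vocal tract model, because the NeXT was a very slow computer. CPUs in the 90's ran at tens of MHz (not a typo), about 100 times slower than 2023 technology. The relatively low complexity of the model allows low-latency synthesis on modern personal computers.

The original TTS system had two implementations of the vocal tract model (tube model): one executed on the 56k DSP and written in assembly, the other executed on the CPU and written in C. The DSP tube model produces better speech, with more balanced fricatives/plosives. This repository uses the C tube model.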

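Synthesis examples

The audio samples below were synthesized from the text of "The Chaos" (short version), by Gerard Nolst Trenité.

Original code (NeXT, not in this repository), using the DSP vocal tract model

English - male

GnuspeechSA 0.1.8

English - male
English - female
English - large child
English - small child
English - baby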

Status

Maintenance.

Only English is supported.

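License

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the file copying.txt for more details.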

External code

This software includes code from RapidXml. See the file SRC/Rapidxml/license.txt for details.

Using `gnuspeech_sa`

gnuspeech_sa converts the input text to speech.

./gnuspeech_sa [-v] -c config_dir -p trm_param_file.txt -o output_file.wav
        "Hello world."
    Synthesizes text from the command line.
    -v : verbose

    config_dir is the directory that stores the configuration data,
        e.g. data/en.
    trm_param_file.txt will be generated, containing the tube model
        parameters.
    output_file.wav will be generated, containing the synthesized speech.

./gnuspeech_sa [-v] -c config_dir -i input_text.txt -p trm_param_file.txt 
        -o output_file.wav
    Synthesizes text from a file.
    -v : verbose

    config_dir is the directory that stores the configuration data,
        e.g. data/en.
    input_text.txt contains the input text.
    trm_param_file.txt will be generated, containing the tube model
        parameters.
    output_file.wav will be generated, containing the synthesized speech.
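
For scripted use, the two invocations above can be assembled programmatically. The following is a minimal Python sketch and is not part of GnuspeechSA: the helper names `build_gnuspeech_sa_command` and `synthesize`, and the default executable path, are assumptions made for illustration. It uses only the options listed above.

```python
import subprocess


def build_gnuspeech_sa_command(config_dir, param_file, output_wav,
                               text=None, input_file=None, verbose=False,
                               executable="./gnuspeech_sa"):
    """Build the argument list for gnuspeech_sa (hypothetical helper).

    Exactly one of `text` (synthesize text given on the command line) or
    `input_file` (synthesize text read from a file) must be provided.
    """
    if (text is None) == (input_file is None):
        raise ValueError("provide exactly one of text or input_file")
    cmd = [executable]
    if verbose:
        cmd.append("-v")
    cmd += ["-c", config_dir]
    if input_file is not None:
        cmd += ["-i", input_file]
    cmd += ["-p", param_file, "-o", output_wav]
    if text is not None:
        cmd.append(text)
    return cmd


def synthesize(*args, **kwargs):
    """Run gnuspeech_sa; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_gnuspeech_sa_command(*args, **kwargs), check=True)
```

Passing the arguments as a list avoids shell quoting problems when the text contains spaces or punctuation.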

Using `gnuspeech_sa_trm`

gnuspeech_sa_trm executes only the tube model.

./gnuspeech_sa_trm [-v] trm_param_file.txt output_file.wav
    -v : verbose

    trm_param_file.txt is the file generated by gnuspeech_sa, containing the
        tube model parameters.
    output_file.wav will be generated, containing the synthesized speech.
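
Because gnuspeech_sa writes the tube model parameters to a file, the audio can be rendered again from that file with gnuspeech_sa_trm alone. The Python sketch below shows one way to script this; the helper names `build_trm_command` and `rerun_tube_model` and the default executable path are assumptions, not part of the software.

```python
import subprocess


def build_trm_command(param_file, output_wav, verbose=False,
                      executable="./gnuspeech_sa_trm"):
    """Build the argument list for gnuspeech_sa_trm (hypothetical helper)."""
    cmd = [executable]
    if verbose:
        cmd.append("-v")
    # Positional arguments: parameter file first, then the WAV file to create.
    cmd += [param_file, output_wav]
    return cmd


def rerun_tube_model(param_file, output_wav, **kwargs):
    """Run only the tube model on a parameter file written by gnuspeech_sa."""
    subprocess.run(build_trm_command(param_file, output_wav, **kwargs), check=True)
```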

Contents of data/en

`monet.xml`

Contains the articulatory database.

`intonation.txt`

Controls the intonation.

If random_intonation = 0 in trm_control_model.txt, only the first line of each tone group is used. If random_intonation = 1, the line is selected at random.
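
The selection rule described above can be sketched in Python as follows. This is an illustration only, not the GnuspeechSA implementation: the function name `pick_intonation_line` is hypothetical, and the lines of a tone group are assumed to have already been read into a list, since the layout of intonation.txt is not described here.

```python
import random


def pick_intonation_line(group_lines, random_intonation, rng=random):
    """Choose one line from a tone group (illustrative sketch).

    group_lines: the lines of a single tone group, in file order.
    random_intonation: 0 to always use the first line, 1 to pick at random.
    rng: any object with a choice() method, e.g. random.Random(seed).
    """
    if not group_lines:
        raise ValueError("tone group is empty")
    if random_intonation == 1:
        return rng.choice(group_lines)
    return group_lines[0]
```

Passing a seeded `random.Random` instance as `rng` makes the random choice repeatable in this sketch.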

`MainDictionary.txt`

Contains the main dictionary, which associates words with postures.

`trm.txt`

Contains the parameters of the tube model.

The interesting parameters are:

    vocal_tract_length_offset
        This value is added to the vocal tract length.
    loss_factor
        Defines the acoustic loss inside the vocal tract.

`trm_control_model.txt`

Contains the parameters of the tube model controller.

The interesting parameters are:

    voice_name
        Defines the voice used in the synthesis.
        It selects which of the voice_*.txt files will be
        loaded.
    tempo
        Values greater than 1 will speed up the speech.
    pitch_offset
        Modifies the voice pitch.

    drift_deviation
    drift_lowpass_cutoff
        Control the random perturbations in the intonation
        (requires intonation_drift = 1).

    dictionary_1_file
    dictionary_2_file
    dictionary_3_file
        Indicate the dictionaries (the dictionaries will be
        searched in the order 1, 2, 3).
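
The search order of the dictionaries can be pictured with the short Python sketch below. It is an illustration under stated assumptions: each dictionary is represented as an already loaded Python `dict`, the function name `lookup_word` is hypothetical, and returning `None` for a word found in no dictionary is a choice made for this sketch, not documented behavior.

```python
def lookup_word(word, dictionaries):
    """Return the entry from the first dictionary that contains `word`.

    dictionaries: mappings in search order, i.e. the contents of
    dictionary_1_file, dictionary_2_file, dictionary_3_file.
    """
    for entries in dictionaries:
        if word in entries:
            return entries[word]
    return None  # assumption: the fallback behavior is not described here
```

With this ordering, an entry in dictionary 1 takes precedence over an entry for the same word in dictionary 2 or 3.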

Note:

The following parameters are currently not used:

    notional_pitch
    pretonic_range
    pretonic_lift
    tonic_range
    tonic_movement
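
When experimenting with parameters such as tempo or pitch_offset, it can be convenient to change them from a script. The Python sketch below assumes that each parameter is stored on its own line as `name = value`; this format is an assumption inferred from the `random_intonation = 0` notation used in this document, and the function name `set_parameter` is hypothetical.

```python
import re


def set_parameter(config_text, name, value):
    """Return config_text with the line `name = ...` set to the new value.

    Assumes one `name = value` pair per line (an assumption, see above).
    Raises KeyError if the parameter is not present.
    """
    pattern = re.compile(
        r"^([ \t]*" + re.escape(name) + r"[ \t]*=[ \t]*).*$",
        re.MULTILINE,
    )
    # A function is used as the replacement so that backslashes in the
    # value are not interpreted as regex escapes.
    new_text, count = pattern.subn(lambda m: m.group(1) + str(value), config_text)
    if count == 0:
        raise KeyError(name)
    return new_text
```

For example, `set_parameter(text, "tempo", 1.2)` would request faster speech, since values greater than 1 speed it up. Working on a copy of the configuration directory keeps the original data intact.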

`voice_baby.txt`

`voice_female.txt`

`voice_large_child.txt`

`voice_male.txt`

`voice_small_child.txt`

Contain the voice parameters.

The interesting parameters are:

    vocal_tract_length

    glottal_pulse_tp
        Rise time, in % of the period.
    glottal_pulse_tn_min
        Fall time, in % of the period - for the highest pulse
        amplitude.
    glottal_pulse_tn_max
        Fall time, in % of the period - for the lowest pulse
        amplitude.

        These parameters modify the glottal pulse shape.

    reference_glottal_pitch
        Modifies the voice pitch.

    breathiness