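CTC Decoding Algorithms

Update 2021: installable Python package

Python implementation of some common Connectionist Temporal Classification (CTC) decoding algorithms. A minimalistic language model is provided.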

Installation

Go to the root level of the repository
Execute pip install .
Go to tests/ and execute pytest to check that the installation worked

Usage

Basic usage

Here is a minimalistic executable example:

import numpy as np
from ctc_decoder import best_path, beam_search

mat = np.array([[0.4, 0, 0.6], [0.4, 0, 0.6]])
chars = 'ab'

print(f'Best path: "{best_path(mat, chars)}"')
print(f'Beam search: "{beam_search(mat, chars)}"')

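The output mat (numpy array, softmax already applied) of the CTC-trained neural network is expected to have shape TxC and is passed as the first argument to the decoders. T is the number of time-steps, and C is the number of characters (the CTC-blank is the last element). The characters that can be predicted by the neural network are passed to the decoders as the chars string. The decoders return the decoded string.
Running the code outputs: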

 Best path: ""
Beam search: "a"
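
The difference between the two results can be checked by hand. The following is a minimal sketch, not the package's implementation: it assumes best path decoding takes the most likely label per time-step, merges repeated labels and then removes blanks, and it stands in for beam search with a brute-force sum over all paths, which is only feasible for tiny inputs. The helper name collapse is hypothetical.

```python
import itertools
from collections import defaultdict

import numpy as np

mat = np.array([[0.4, 0, 0.6], [0.4, 0, 0.6]])
chars = 'ab'
blank = len(chars)  # the CTC-blank is the last index along the character axis


def collapse(path):
    """Merge repeated labels, then drop blanks (hypothetical helper)."""
    out = []
    prev = None
    for label in path:
        label = int(label)
        if label != prev and label != blank:
            out.append(chars[label])
        prev = label
    return ''.join(out)


# Best path: pick the most likely label at every time-step, then collapse.
best = collapse(mat.argmax(axis=1))

# Brute force: add up the probabilities of all paths that collapse to the same text.
totals = defaultdict(float)
num_steps, num_labels = mat.shape
for path in itertools.product(range(num_labels), repeat=num_steps):
    prob = float(np.prod([mat[t, c] for t, c in enumerate(path)]))
    totals[collapse(path)] += prob

most_likely = max(totals, key=totals.get)
print(repr(best), repr(most_likely))
```

The single most likely path is blank-blank with probability 0.36, which collapses to the empty string. The three paths "aa", "a-" and "-a" together contribute 0.16 + 0.24 + 0.24 = 0.64 to the text "a", so a decoder that accounts for several paths prefers "a".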

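To see more examples of how to use the decoders, have a look at the scripts in the tests/ folder.

Language model and BK-tree

Beam search can optionally integrate a character-level language model. Beam search then uses text statistics (bigrams) to improve reading accuracy.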

from ctc_decoder import beam_search, LanguageModel

# create language model instance from a (large) text
lm = LanguageModel('this is some text', chars)

# and use it in the beam search decoder
res = beam_search(mat, chars, lm=lm)
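
To illustrate what character bigram statistics look like, here is a small sketch that counts character pairs in a text and turns them into conditional probabilities. It is an assumption about the general idea only; the function build_bigram_table is hypothetical and does not reflect how LanguageModel is implemented internally.

```python
from collections import Counter


def build_bigram_table(text, chars):
    """Return a function prob(a, b) estimating P(next char is b | current char is a)."""
    allowed = set(chars)
    pair_counts = Counter()
    first_counts = Counter()
    for a, b in zip(text, text[1:]):
        # only count pairs made of characters the network can predict
        if a in allowed and b in allowed:
            pair_counts[(a, b)] += 1
            first_counts[a] += 1

    def prob(a, b):
        if first_counts[a] == 0:
            return 0.0  # character never seen as the first of a pair
        return pair_counts[(a, b)] / first_counts[a]

    return prob


prob = build_bigram_table('this is some text', 'abcdefghijklmnopqrstuvwxyz ')
print(prob('i', 's'), prob('t', 'h'), prob('q', 'u'))
```

In this tiny text, "i" is always followed by "s", while "t" is followed by "h" in one of two cases. A decoder can use such probabilities to favour character sequences that are plausible in the language.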

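The lexicon search decoder computes a first approximation with best path decoding. Then, it uses a BK-tree to retrieve similar words, scores them, and finally returns the best scoring word. The BK-tree is created by providing a list of dictionary words. A tolerance parameter defines the maximum edit distance from the query word to the returned dictionary words.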

from ctc_decoder import lexicon_search, BKTree

# create BK-tree from a list of words
bk_tree = BKTree(['words', 'from', 'a', 'dictionary'])

# and use the tree in the lexicon search
res = lexicon_search(mat, chars, bk_tree, tolerance=2)
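
The retrieval step can be sketched with a small, self-contained BK-tree. This is an illustration of the data structure under the assumption that Levenshtein edit distance is the metric; SimpleBKTree and edit_distance are hypothetical names and not the package's BKTree class.

```python
def edit_distance(a, b):
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


class SimpleBKTree:
    """Each node is (word, children); children are keyed by distance to the node's word."""

    def __init__(self, words):
        self.root = None
        for word in words:
            self._add(word)

    def _add(self, word):
        if self.root is None:
            self.root = (word, {})
            return
        node = self.root
        while True:
            d = edit_distance(word, node[0])
            if d == 0:
                return  # word is already in the tree
            if d not in node[1]:
                node[1][d] = (word, {})
                return
            node = node[1][d]

    def query(self, word, tolerance):
        """Return all stored words within the given edit distance of word."""
        results = []
        stack = [self.root] if self.root is not None else []
        while stack:
            node_word, children = stack.pop()
            d = edit_distance(word, node_word)
            if d <= tolerance:
                results.append(node_word)
            # triangle inequality: only these subtrees can contain matches
            for k, child in children.items():
                if d - tolerance <= k <= d + tolerance:
                    stack.append(child)
        return sorted(results)


tree = SimpleBKTree(['words', 'from', 'a', 'dictionary'])
matches = tree.query('word', 1)
print(matches)
```

Because subtrees are keyed by distance, a query can skip every branch whose key lies outside the range d - tolerance to d + tolerance, so only a fraction of the dictionary has to be compared against the query word.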

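Usage with deep learning frameworks

Some notes:

No adapter for TensorFlow or PyTorch is provided
Apply softmax already in the model
Convert to numpy array
Usually, rnn_output has shape TxBxC, with B the batch dimension
- Decoders work on single batch elements of shape TxC
- Therefore, iterate over all batch elements and apply the decoder to each of them separately
- Example: extract the matrix of batch element 0 with mat = rnn_output[:, 0, :]
The CTC-blank is expected to be the last element along the character dimension
- TensorFlow has the CTC-blank as the last element, so there is nothing to do here
- PyTorch, however, has the CTC-blank as the first element by default, so you have to move it to the end, or change the default setting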
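
The notes above can be sketched in plain numpy. The example assumes the framework output has already been converted to a numpy array of shape TxBxC holding unnormalized scores with the CTC-blank at index 0; the helper names softmax, blank_first_to_last and split_batch are hypothetical, and random numbers stand in for a real model output.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the character axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def blank_first_to_last(probs):
    """Move the blank from index 0 to the last index of the character axis."""
    return np.roll(probs, shift=-1, axis=-1)


def split_batch(rnn_output):
    """Return one TxC matrix per batch element of a TxBxC array."""
    return [rnn_output[:, b, :] for b in range(rnn_output.shape[1])]


rng = np.random.default_rng(0)
logits = rng.normal(size=(5, 3, 4))  # T=5, B=3, C=4, blank assumed at index 0

probs = blank_first_to_last(softmax(logits))
mats = split_batch(probs)
print(len(mats), mats[0].shape)
```

Each matrix in mats then has the layout the decoders expect, so a decoder such as best_path could be applied to every element in a loop.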