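CTC Decoding Algorithms

Update 2021: installable Python package

Python implementation of some common Connectionist Temporal Classification (CTC) decoding algorithms. A minimalistic language model is provided.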

Installation

Go to the root level of the repository
Execute pip install .
Go to tests/ and execute pytest to check that the installation worked

Usage

Basic usage

Here is a minimalistic executable example:

import numpy as np
from ctc_decoder import best_path, beam_search

mat = np.array([[0.4, 0, 0.6], [0.4, 0, 0.6]])
chars = 'ab'

print(f'Best path: "{best_path(mat, chars)}"')
print(f'Beam search: "{beam_search(mat, chars)}"')

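The output mat of the CTC-trained neural network (a numpy array with softmax already applied) is expected to have shape TxC and is passed as the first argument to the decoders. T is the number of time-steps and C is the number of characters, with the CTC-blank as the last element. The characters the neural network can predict are passed to the decoder as the string chars. Each decoder returns the decoded string.
Running the code outputs: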

Best path: ""
Beam search: "a"
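
To illustrate why the two decoders disagree on this matrix, here is a minimal sketch that does not use the package. The names collapse, best_path_sketch and labeling_probs are hypothetical helpers written for this illustration, and the brute-force enumeration is only feasible for tiny T. It is not how beam_search is implemented; it only shows which labeling has the highest total probability.

```python
from itertools import groupby, product


def collapse(path, blank):
    """Merge repeated labels, then drop blanks (the CTC collapsing rule)."""
    return tuple(label for label, _ in groupby(path) if label != blank)


def best_path_sketch(mat, chars):
    """Pick the most likely label per time-step, then collapse the path."""
    blank = len(chars)  # assumption from the text: blank is the last entry
    path = [max(range(len(row)), key=lambda c: row[c]) for row in mat]
    return "".join(chars[c] for c in collapse(path, blank))


def labeling_probs(mat, chars):
    """Brute force: sum the path probabilities of every collapsed labeling."""
    blank = len(chars)
    totals = {}
    for path in product(range(blank + 1), repeat=len(mat)):
        prob = 1.0
        for t, c in enumerate(path):
            prob *= mat[t][c]
        text = "".join(chars[c] for c in collapse(path, blank))
        totals[text] = totals.get(text, 0.0) + prob
    return totals


mat = [[0.4, 0.0, 0.6], [0.4, 0.0, 0.6]]  # same values as the example above
chars = "ab"

print(repr(best_path_sketch(mat, chars)))  # ''
probs = labeling_probs(mat, chars)
print(max(probs, key=probs.get))           # a
```

The single best path is blank-blank with probability 0.36, but the three paths that collapse to "a" (a-a, a-blank, blank-a) sum to 0.64, so a decoder that scores labelings rather than single paths prefers "a".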

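To see more examples of how to use the decoders, have a look at the scripts in the tests/ folder.

Language model and BK-tree

Beam search can optionally integrate a character-level language model. It then uses text statistics (bigrams) to improve reading accuracy.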

from ctc_decoder import beam_search, LanguageModel

# create language model instance from a (large) text
lm = LanguageModel('this is some text', chars)

# and use it in the beam search decoder
res = beam_search(mat, chars, lm=lm)
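
As a rough illustration of what character bigram statistics look like, the following sketch counts adjacent character pairs in a text and turns them into conditional probabilities. The function bigram_probs and the add-one smoothing are assumptions made for this example; the package's LanguageModel class may compute and store its statistics differently.

```python
from collections import Counter


def bigram_probs(text, chars, smoothing=1.0):
    """Estimate P(next | prev) from character bigram counts.

    Hypothetical helper: add-k smoothing is an assumption of this sketch,
    chosen so that unseen pairs do not get probability zero.
    """
    allowed = set(chars)
    counts = Counter(
        (a, b) for a, b in zip(text, text[1:]) if a in allowed and b in allowed
    )
    probs = {}
    for a in chars:
        total = sum(counts[(a, b)] for b in chars) + smoothing * len(chars)
        for b in chars:
            probs[(a, b)] = (counts[(a, b)] + smoothing) / total
    return probs


chars = "ehimostx "
probs = bigram_probs("this is some text", chars)

# "is" occurs twice in the text, "ix" never, so "s" is the likelier successor
print(probs[("i", "s")] > probs[("i", "x")])  # True
```

A decoder can multiply such a probability into the score of a beam whenever it extends the beam by one character, which favours character sequences that are common in the training text.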

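The lexicon search decoder computes a first approximation with best path decoding. It then uses a BK-tree to retrieve similar words, scores them, and finally returns the best-scoring word. The BK-tree is created from a list of dictionary words. The tolerance parameter defines the maximum edit distance from the query word to the returned dictionary words.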

from ctc_decoder import lexicon_search, BKTree

# create BK-tree from a list of words
bk_tree = BKTree(['words', 'from', 'a', 'dictionary'])

# and use the tree in the lexicon search
res = lexicon_search(mat, chars, bk_tree, tolerance=2)
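
To make the role of the tolerance parameter concrete, here is a minimal BK-tree sketch built on the Levenshtein edit distance. SketchBKTree and edit_distance are hypothetical names for this illustration and are not the package's BKTree class, whose interface and internals may differ.

```python
def edit_distance(a, b):
    """Levenshtein distance via the classic row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


class SketchBKTree:
    """Hypothetical minimal BK-tree; a node is (word, {distance: child})."""

    def __init__(self, words):
        words = iter(words)
        self.root = (next(words), {})
        for word in words:
            self._add(word)

    def _add(self, word):
        node = self.root
        while True:
            dist = edit_distance(word, node[0])
            if dist == 0:
                return  # word is already stored
            if dist not in node[1]:
                node[1][dist] = (word, {})
                return
            node = node[1][dist]

    def query(self, word, tolerance):
        """Return all stored words within `tolerance` edits of `word`."""
        found, stack = [], [self.root]
        while stack:
            node_word, children = stack.pop()
            dist = edit_distance(word, node_word)
            if dist <= tolerance:
                found.append(node_word)
            # triangle inequality: only these subtrees can hold matches
            for d in range(max(dist - tolerance, 1), dist + tolerance + 1):
                if d in children:
                    stack.append(children[d])
        return sorted(found)


tree = SketchBKTree(["words", "from", "a", "dictionary"])
print(tree.query("word", 1))  # ['words']
print(tree.query("form", 2))  # ['from']
```

A larger tolerance returns more candidate words at the cost of visiting more of the tree, so it trades recall against lookup time.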

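Usage with deep learning frameworks

Some notes:

No adapter for TensorFlow or PyTorch is provided
Apply softmax in the model
Convert the output to a numpy array
Usually, the output rnn_output of the RNN layer has shape TxBxC, with B the batch dimension
- Decoders work on single batch elements of shape TxC
- Therefore, iterate over all batch elements and apply the decoder to each of them separately
- Example: extract the matrix of batch element 0 with mat = rnn_output[:, 0, :]
The CTC-blank is expected to be the last element along the character dimension
- TensorFlow has the CTC-blank as the last element, so there is nothing to do here
- PyTorch, however, has the CTC-blank as the first element by default, so you have to move it to the end or change the default setting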
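
The notes above can be sketched with plain numpy as follows. The helper names softmax and prepare_batch are hypothetical, the random array only stands in for a real network output that has already been converted to numpy, and the optional softmax is an assumption for the case where the model returns raw scores.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def prepare_batch(rnn_output, blank_first=False, apply_softmax=False):
    """Return one TxC matrix per batch element from a TxBxC array."""
    out = np.asarray(rnn_output, dtype=float)
    if apply_softmax:
        out = softmax(out, axis=-1)
    if blank_first:
        # move the blank from index 0 to the last index of the character axis
        out = np.roll(out, shift=-1, axis=-1)
    return [out[:, b, :] for b in range(out.shape[1])]


rng = np.random.default_rng(0)
rnn_output = rng.normal(size=(5, 2, 3))  # stand-in scores: T=5, B=2, C=3

mats = prepare_batch(rnn_output, blank_first=True, apply_softmax=True)
print(len(mats), mats[0].shape)  # 2 (5, 3)
```

Each matrix in mats can then be passed to a decoder on its own, for example best_path(mats[0], chars), where chars lists the C-1 non-blank characters in the order of the character axis.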