luga
v0.2.7

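Luga is the Swahili word for language. fastText provides a blazing-fast language detection tool. Unfortunately, fastText's API is unpolished and its documentation is a bit vague. It is also awkward that models have to be downloaded and loaded by hand.
This is where luga comes in. We abstract away the unnecessary steps and let you do exactly one thing: detect the language of a text.
Stand Still. Stay Silent - The relationships between Indo-European and Uralic languages, by Minna Sundberg.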

python -m pip install -U luga

from luga import language
print(language("the world ended yesterday"))
# Language(name='en', score=0.98)

With a list of texts, we can create a mask for a filtering pipeline that can be used, for example, with DataFrames:
from luga import languages
import pandas as pd

examples = ["Jeg har ikke en rød reje", "Det blæser en halv pelican", "We are not robots yet"]
languages(texts=examples, only_language=True, to_array=True) == "en"
# output
# array([False, False, True])

dataf = pd.DataFrame({"text": examples})
dataf.loc[lambda d: languages(texts=d["text"].to_list(), only_language=True, to_array=True) == "en"]
# output
# 2 We are not robots yet
# Name: text, dtype: object

Download the model (using fastText directly, without luga)
wget https://dl.fbaipublicfiles.com/fasttext/supervised-models/lid.176.bin -O /tmp/lid.176.bin

Load and use
import fasttext

PATH_TO_MODEL = '/tmp/lid.176.bin'
fmodel = fasttext.load_model(PATH_TO_MODEL)
fmodel.predict(["the world has ended yesterday"])
# ([['__label__en']], [array([0.98046654], dtype=float32)])

poetry run pre-commit install

# assumes git push is completed
git tag -l # lists tags
git tag v*.*.* # Major.Minor.Fix
git push origin tag v*.*.*
# to delete tag:
git tag -d v*.*.* && git push origin tag -d v*.*.*
# change project_toml and __init__.py to reflect new version

artifacts.py line 111: casting list[str] caused issues
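
To make concrete what gets abstracted away, here is a minimal sketch of turning the raw fastText output from the load-and-use example (a `__label__en` label plus a score array) into a tidy name/score pair. This is not luga's actual implementation: the `Language` tuple and `parse_prediction` helper below are hypothetical stand-ins written for illustration, and plain Python lists are used in place of the NumPy arrays that fastText returns.

```python
from typing import List, NamedTuple

LABEL_PREFIX = "__label__"  # prefix fastText puts in front of every label


class Language(NamedTuple):
    # Hypothetical stand-in for illustration, not luga's own class.
    name: str
    score: float


def parse_prediction(raw) -> List[Language]:
    """Convert fastText-style (labels, scores) output into Language tuples.

    Only the top label and score of each input text are kept.
    """
    labels, scores = raw
    results = []
    for text_labels, text_scores in zip(labels, scores):
        top_label = text_labels[0]
        # Strip the "__label__" prefix so only the language code remains.
        if top_label.startswith(LABEL_PREFIX):
            top_label = top_label[len(LABEL_PREFIX):]
        results.append(Language(name=top_label, score=round(float(text_scores[0]), 2)))
    return results


# Same shape as the fastText output shown earlier, with a list instead of an array.
raw_output = ([["__label__en"]], [[0.98046654]])
print(parse_prediction(raw_output))
# [Language(name='en', score=0.98)]
```

A boolean mask like the one in the DataFrame example could then be built with a list comprehension such as `[lang.name == "en" for lang in parse_prediction(raw_output)]`.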