tagged wiki2019zh

AI Source Code

v1.0.0

2019 Chinese Wikipedia Corpus with Word Segmentation Annotation

This corpus is derived from the 2019 Chinese Wikipedia corpus wiki2019zh.zip. Word segmentation was performed with the COARSE_ELECTRA_SMALL_ZH model from HanLP.

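The segmentation results are labeled character by character using the 4-tag BMES scheme (B = beginning of a word, M = middle of a word, E = end of a word, S = single-character word). The format is as follows: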

Suppose the text to be segmented is: 你好Tom。我喜欢吃羊肉串。 The labeling result is then:

你 B
好 E
T B
o M
m E
。 S
SENTENCE END
我 S
喜 B
欢 E
吃 S
羊 B
肉 M
串 E
。 S
SENTENCE END
TEXT END
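
As a rough illustration of how the listing above relates to segmented words, the following Python sketch turns lists of words into BMES lines. The function names `bmes_tags` and `format_text` are hypothetical and are not taken from process_wiki_data.py. The word lists are assumed to come from a segmenter such as the HanLP model mentioned earlier.

```python
from typing import List, Tuple


def bmes_tags(word: str) -> List[Tuple[str, str]]:
    """Return (character, tag) pairs for one already-segmented word."""
    if not word:
        return []
    if len(word) == 1:
        return [(word, "S")]  # single-character word
    middle = [(ch, "M") for ch in word[1:-1]]
    return [(word[0], "B")] + middle + [(word[-1], "E")]


def format_text(sentences: List[List[str]]) -> List[str]:
    """Render one text (a list of sentences, each a list of words) as corpus lines."""
    lines: List[str] = []
    for words in sentences:
        for word in words:
            lines.extend(f"{ch} {tag}" for ch, tag in bmes_tags(word))
        lines.append("SENTENCE END")  # marker after every sentence
    lines.append("TEXT END")  # marker after the whole text
    return lines


# Segmentation assumed for the example sentence pair.
example = [["你好", "Tom", "。"], ["我", "喜欢", "吃", "羊肉串", "。"]]
print("\n".join(format_text(example)))
```

With the segmentation assumed here, the printed lines reproduce the example listing, including the letter-by-letter tagging of Tom.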

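When using the corpus, pay attention to how embedded non-Chinese text and punctuation are handled (in the example, Tom is tagged letter by letter and each 。 is tagged S), and to the markers SENTENCE END and TEXT END, which mark the end of a sentence and the end of a text respectively.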
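
A minimal sketch of a reader that honors both markers might look like the following. It assumes every data line has the form "character, single space, tag", as in the example, and that the markers appear alone on their lines. The name `read_tagged` is hypothetical, and the actual corpus files may need extra handling that is not shown here.

```python
from typing import Iterable, Iterator, List


def read_tagged(lines: Iterable[str]) -> Iterator[List[List[str]]]:
    """Yield one text at a time as a list of sentences, each a list of words."""
    text: List[List[str]] = []
    sentence: List[str] = []
    word = ""
    for raw in lines:
        line = raw.rstrip("\r\n")  # keep inner spaces, drop only the newline
        if line == "SENTENCE END":
            if word:  # flush a word left open by an unexpected tag sequence
                sentence.append(word)
                word = ""
            text.append(sentence)
            sentence = []
        elif line == "TEXT END":
            if word:
                sentence.append(word)
                word = ""
            if sentence:  # sentence not closed by SENTENCE END
                text.append(sentence)
                sentence = []
            yield text
            text = []
        elif line:
            ch, _, tag = line.rpartition(" ")  # split on the last space
            word += ch
            if tag in ("E", "S"):  # E and S both close the current word
                sentence.append(word)
                word = ""


sample = ["你 B", "好 E", "T B", "o M", "m E", "。 S", "SENTENCE END", "TEXT END"]
for text in read_tagged(sample):
    print(text)
```

Because the reader is a generator, it can be fed an open file object and will hold only one text in memory at a time.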

The script used for word segmentation is process_wiki_data.py.

Running this script takes a long time. The run that produced this corpus had the following figures:

CPU model: Intel Xeon (Cascade Lake) Platinum 8269CY
CPU clock frequency: 2.5 GHz / 3.2 GHz
Time spent: 7 days, 11 hours, 2 minutes

Additional Information

Version: v1.0.0
Type: AI Source Code
Update time: 2025-09-11
Size: 2.55KB
Source: GitHub
