Hanzi decomposition breaks a single Chinese character down into multiple characters based on its basic components, such as strokes and structural elements.
Characters with similar glyph shapes yield similar decomposition results.
Deep learning models can therefore use the decomposition as one of a character's input features: its glyph (structural) feature.
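As a rough illustration of how decomposition results could become a model feature, the sketch below turns each character's components into a multi-hot vector. It does not call the package: the `DECOMPOSITIONS` table is a hand-written stand-in for what `query` would return, and only the entry for '名' is taken from this README. The other entries and the helper names (`build_vocab`, `multi_hot`, `shared_components`) are assumptions made for the example.

```python
# Hand-written stand-in for decomposition results (hypothetical data).
# Only the '名' entry comes from this README; the others are assumed
# for illustration and may differ from what the package returns.
DECOMPOSITIONS = {
    "名": ["夕", "口"],
    "好": ["女", "子"],
    "妈": ["女", "马"],
}


def build_vocab(decomps):
    """Map every component seen in the table to a fixed vector index."""
    components = sorted({c for parts in decomps.values() for c in parts})
    return {component: index for index, component in enumerate(components)}


def multi_hot(char, decomps, vocab):
    """Return a 0/1 vector marking which components the character contains."""
    vector = [0] * len(vocab)
    for component in decomps.get(char, []):
        vector[vocab[component]] = 1
    return vector


vocab = build_vocab(DECOMPOSITIONS)
features = {ch: multi_hot(ch, DECOMPOSITIONS, vocab) for ch in DECOMPOSITIONS}


def shared_components(a, b):
    """Count the components two characters have in common."""
    return sum(x & y for x, y in zip(features[a], features[b]))


for ch, vector in features.items():
    print(ch, vector)
```

In this toy table, '好' and '妈' overlap in one component ('女'), while '名' shares none with either, which is the kind of shape similarity such a feature is meant to expose. A real model would more likely feed component indices into an embedding layer than use raw multi-hot vectors.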
pip install hanzi_chaizi
from hanzi_chaizi import HanziChaizi
hc = HanziChaizi()
result = hc.query('名')
print(result)
Output:
['夕', '口']
Data from this project: Chinese dictionary
python dev_scripts/parse.py
@misc{kong2018hanzichaizi,
title={Hanzi Chaizi},
author={Xiaoquan Kong},
howpublished={https://github.com/howl-anderson/hanzi_chaizi},
year={2018}
}
If the package is cited in books, seminars, and academic research papers, or used in company products, you are welcome (but not required) to email me about this. I'm glad to see the package being used and valuable to everyone.