langumo-ko
v0.1.4
A collection of langumo parsers for Korean corpora
langumo-ko provides Korean corpus parsers for use with the langumo library. When building a dataset with langumo, you can easily use a variety of Korean corpus data through the parsers implemented in this library. langumo-ko provides the following parsers.
- langumo_ko.NamuWikiParser: Parses the Namuwiki dump file. You must use the original JSON file contained in the 7z-compressed dump archive.
- langumo_ko.ModuNewsParser: Parses the newspaper corpus provided by the Modu Corpus ("Everyone's Corpus").
- langumo_ko.ModuWrittenParser: Parses the written-language corpus provided by the Modu Corpus.
- langumo_ko.ModuWebParser: Parses the web corpus data provided by the Modu Corpus.

langumo-ko is distributed on PyPI. You can install it with pip as follows:
$ pip install langumo-ko

Instead of using pip, you can clone the repository and build and install it yourself:
$ git clone https://github.com/affjljoo3581/langumo-ko.git
$ cd langumo-ko
$ python setup.py install

To build the corpora listed above with langumo, modify build.yml as follows:
langumo:
  inputs:
  - path: src/NIKL_NEWSPAPER(v1.0).zip
    parser: langumo_ko.ModuNewsParser

  # other configurations...

langumo-ko is licensed under the Apache-2.0 license.
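
After installing by either method described above, a short check like the one below can confirm that the four parsers are importable before you reference them in build.yml. This is only a sketch. The helper name `missing_parsers` is hypothetical and not part of langumo-ko, and it assumes the parsers are exposed as top-level attributes of the `langumo_ko` module, as the dotted names in this README suggest.

```python
import importlib

# Parser class names as listed in this README.
PARSER_NAMES = [
    "NamuWikiParser",
    "ModuNewsParser",
    "ModuWrittenParser",
    "ModuWebParser",
]


def missing_parsers(module_name="langumo_ko", names=PARSER_NAMES):
    """Return the names that cannot be found on the given module.

    Hypothetical helper, not part of langumo-ko. If the module itself
    cannot be imported, every name is reported as missing.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return list(names)
    return [name for name in names if not hasattr(module, name)]


if __name__ == "__main__":
    missing = missing_parsers()
    if missing:
        print("Missing parsers:", ", ".join(missing))
    else:
        print("All langumo-ko parsers are importable.")
```

If any names are reported as missing, the most likely cause is that langumo-ko is not installed in the Python environment that langumo runs in.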