BERT-Attribute-Extraction

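BERT-based attribute extraction for knowledge graphs.

Using BERT for attribute extraction in a knowledge graph with two methods: fine-tuning and feature extraction.

Attributes are extracted from Baidu Encyclopedia person entries for a knowledge graph. The experiments use BERT-based fine-tuning and feature extraction.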

Prerequisites

TensorFlow >= 1.10
scikit-learn

Pre-trained models

BERT-Base, Chinese : Chinese Simplified and Traditional, 12-layer, 768-hidden, 12-heads, 110M parameters

Installing

None

Dataset

The dataset is built from Baidu Encyclopedia person entries. Sentences that do not contain entities and attributes are filtered out.

Entities and attributes are obtained through named entity recognition.

Labels are derived from the Baidu Encyclopedia infobox, and most of them were assigned manually, so some are unreliable.
For example:

黄维#1904年#1#黄维（1904年-1989年），字悟我，出生于江西贵溪一农户家庭。        
陈昂#山东省滕州市#1#邀请担任诗词嘉宾。1992年1月26日，陈昂出生于山东省滕州市一个普通的知识分子家庭，其祖父、父亲都
陈伟庆#肇庆市鼎湖区#0#长。任免信息2016年10月21日下午，肇庆市鼎湖区八届人大一次会议胜利闭幕。陈伟庆当选区人民政府副区长。
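
Judging from the samples above, each line appears to hold four `#`-separated fields: the entity, a candidate attribute value, a 0/1 label, and the source sentence. The sketch below parses one such line under that reading. The field names (`entity`, `value`, `label`, `sentence`) and the `parse_line` helper are illustrative assumptions, not names taken from the repository.

```python
from typing import NamedTuple


class Example(NamedTuple):
    entity: str    # assumed: the person the sentence is about
    value: str     # assumed: candidate attribute value found in the sentence
    label: int     # assumed: 1 if the value is the attribute for the entity, else 0
    sentence: str  # the source sentence


def parse_line(line: str) -> Example:
    """Parse one 'entity#value#label#sentence' line."""
    # Split on the first three '#' only, so a '#' inside the sentence survives.
    entity, value, label, sentence = line.rstrip("\n").split("#", 3)
    return Example(entity, value, int(label), sentence.strip())


sample = "黄维#1904年#1#黄维（1904年-1989年），字悟我，出生于江西贵溪一农户家庭。"
example = parse_line(sample)
print(example.entity, example.value, example.label)
```

Limiting the split to three separators is a defensive choice: it keeps the sentence intact even if it happens to contain a `#` character.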

Getting Started

Run strip.py to get the stripped data.
Run data_process.py to process the data into NumPy input files.
The parameters file holds the parameters needed to run the model.

Running the tests

For example, with the birthplace dataset:

fine-tuning

Run run_classifier.py to get the predicted probability outputs:

python run_classifier.py \
        --task_name=my \
        --do_train=true \
        --do_predict=true \
        --data_dir=a \
        --vocab_file=/home/tiny/zhaomeng/bertmodel/vocab.txt \
        --bert_config_file=/home/tiny/zhaomeng/bertmodel/bert_config.json \
        --init_checkpoint=/home/tiny/zhaomeng/bertmodel/bert_model.ckpt \
        --max_seq_length=80 \
        --train_batch_size=32 \
        --learning_rate=2e-5 \
        --num_train_epochs=1.0 \
        --output_dir=./output

Then run proba2metrics.py to get the final metrics together with the misclassified examples.
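
The probability-to-metrics step can be sketched roughly as follows. This is not the code of proba2metrics.py; it assumes the prediction file has one example per line with tab-separated class probabilities (the format the upstream BERT run_classifier.py writes), and the function names and toy inputs are hypothetical.

```python
import csv
import io


def load_probabilities(tsv_text):
    """Parse tab-separated class probabilities, one example per line."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return [[float(p) for p in row] for row in reader if row]


def per_class_metrics(y_true, y_pred, label):
    """Return (precision, recall, f1) for one class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def evaluate(probs, y_true, sentences):
    """Take the argmax class per example, score it, and collect the errors."""
    y_pred = [max(range(len(row)), key=row.__getitem__) for row in probs]
    metrics = {c: per_class_metrics(y_true, y_pred, c) for c in sorted(set(y_true))}
    wrong = [(s, t, p) for s, t, p in zip(sentences, y_true, y_pred) if t != p]
    return metrics, wrong


# Toy input standing in for the prediction file and the gold labels.
probs = load_probabilities("0.9\t0.1\n0.2\t0.8\n0.6\t0.4\n")
metrics, wrong = evaluate(probs, [0, 1, 1], ["s1", "s2", "s3"])
print(metrics)
print(wrong)
```

Since scikit-learn is already a prerequisite, `sklearn.metrics.classification_report` could replace the hand-written metric function; the plain-Python version is used here only to keep the sketch dependency-free.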

feature-extraction

Run extract_features.py to get the vector representation of the train and test data in JSON file format:

python extract_features.py \
        --input_file=../data/birth_place_train.txt \
        --output_file=../data/birth_place_train.jsonl \
        --vocab_file=/home/tiny/zhaomeng/bertmodel/vocab.txt \
        --bert_config_file=/home/tiny/zhaomeng/bertmodel/bert_config.json \
        --init_checkpoint=/home/tiny/zhaomeng/bertmodel/bert_model.ckpt \
        --layers=-1 \
        --max_seq_length=80 \
        --batch_size=16

Then run json2vector.py to convert the JSON file into vector representations.
Finally, run run_classifier.py to classify the vectors with machine learning methods. MLP usually performs best.
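
The JSON-to-vector step can be sketched as below. This is an assumption-laden illustration, not the contents of json2vector.py: it assumes the `.jsonl` layout written by the upstream BERT extract_features.py (one JSON object per line, with a `features` list of tokens, each carrying a `layers` list of `index`/`values` pairs), and it assumes the `[CLS]` token vector is used as the sentence representation. The repository may pool token vectors differently. The helper name `sentence_vector` is hypothetical.

```python
import json


def sentence_vector(jsonl_line, layer_index=-1):
    """Return the [CLS] vector of one extract_features.py output line."""
    record = json.loads(jsonl_line)
    cls_feature = record["features"][0]  # assumed: first token is [CLS]
    for layer in cls_feature["layers"]:
        if layer["index"] == layer_index:
            return layer["values"]
    raise KeyError("layer %d not found in record" % layer_index)


# Tiny hand-made record with 3-dimensional vectors instead of 768.
line = json.dumps({
    "features": [
        {"token": "[CLS]", "layers": [{"index": -1, "values": [0.1, 0.2, 0.3]}]},
        {"token": "黄", "layers": [{"index": -1, "values": [0.4, 0.5, 0.6]}]},
    ]
})
print(sentence_vector(line))
```

With `--layers=-1` as in the command above, each token carries only the last layer, so `layer_index=-1` is the natural default. The resulting list can be stacked into a NumPy array and passed to any scikit-learn classifier.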

Result

The predicted results and the misclassified examples are saved in the result directory.

For example, on the birthplace dataset with the fine-tuning method, the result is:

            precision    recall  f1-score   support

     0      0.963     0.967     0.965       573
     1      0.951     0.946     0.948       389
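
As a quick consistency check on the table, the F1 score is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short sketch below recomputes it from the rounded precision and recall values shown above; because the inputs are already rounded, the recomputed values can only be expected to match to about three decimals.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the table above.
table = {0: (0.963, 0.967), 1: (0.951, 0.946)}
for label, (p, r) in table.items():
    print(label, round(f1_score(p, r), 3))
```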

Authors

zhao meng

License

This project is licensed under the MIT License

Acknowledgments

etc

Additional information

Version: 1.0.0
Type: other source code
Updated: 2025-04-18
Size: 3.32MB
Source: GitHub