SpanLuke
1.0.0

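The goal of this project is to recognize entities in legal texts; see the paper for details. This repository builds on the code of "PoliToHFI at SemEval-2023 Task 6: Leveraging Entity-Aware and Hierarchical Transformers For Legal Entity Recognition and Court Judgment Prediction".
First, install the dependencies (Python 3.10 is required):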
# clone project
git clone https://github.com/lambdavi/L-NER.git
# install requirements
cd L-NER
pip install -r requirements.txt
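
Since the setup above requires Python 3.10, a quick interpreter check before training can save a failed install. The snippet below is a minimal sketch and not part of the repository; the helper name `python_version_ok` is hypothetical.

```python
import sys

# Version stated in this README (python == 3.10).
REQUIRED = (3, 10)


def python_version_ok(version_info=sys.version_info, required=REQUIRED):
    """Return True when the interpreter's major.minor matches `required`."""
    return tuple(version_info[:2]) == required


if not python_version_ok():
    found = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Warning: expected Python {REQUIRED[0]}.{REQUIRED[1]}, found {found}")
```

Running it inside the environment where `pip install -r requirements.txt` was executed prints a warning only when the interpreter version differs.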
# reproduce best results
python main.py \
    --dataset legal_ner \
    --ds_train_path data/NER_TRAIN/NER_TRAIN_ALL.json \
    --ds_valid_path data/NER_DEV/NER_DEV_ALL.json \
    --output_folder results/ \
    --batch 8 \
    --acc_step 2 \
    --num_epochs 5 \
    --lr 1e-4 \
    --weight_decay 0.01 \
    --warmup_ratio 0.06 \
    --model_path lambdavi/span-marker-luke-legal \
    --use_span
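
The flags above interact: `--batch 8` with `--acc_step 2` gives an effective batch size of 16, and `--warmup_ratio 0.06` sets how many optimizer updates are spent warming up the learning rate. The sketch below illustrates that arithmetic. It assumes the flags follow the usual gradient-accumulation and warmup-ratio conventions, which this README does not confirm, and the dataset size of 1000 examples is a hypothetical placeholder. Exact rounding differs between training frameworks.

```python
import math


def training_schedule(num_examples, batch=8, acc_step=2,
                      num_epochs=5, warmup_ratio=0.06):
    """Approximate the schedule implied by the command-line flags.

    Assumption: one optimizer update happens every `acc_step` batches,
    and warmup covers `warmup_ratio` of all optimizer updates.
    """
    effective_batch = batch * acc_step
    batches_per_epoch = math.ceil(num_examples / batch)
    updates_per_epoch = math.ceil(batches_per_epoch / acc_step)
    total_updates = updates_per_epoch * num_epochs
    warmup_steps = math.ceil(total_updates * warmup_ratio)
    return {
        "effective_batch": effective_batch,
        "updates_per_epoch": updates_per_epoch,
        "total_updates": total_updates,
        "warmup_steps": warmup_steps,
    }


# Hypothetical dataset size, used only for illustration.
schedule = training_schedule(num_examples=1000)
for name, value in schedule.items():
    print(f"{name}: {value}")
```

If memory is tight, halving `--batch` and doubling `--acc_step` keeps the effective batch size unchanged under these assumptions.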

python main.py \
    --dataset ener \
    --ds_train_path data/ener/train.json \
    --ds_valid_path data/ener/test.json \
    --output_folder results/ \
    --batch 8 \
    --acc_step 2 \
    --num_epochs 5 \
    --lr 1e-4 \
    --weight_decay 0.01 \
    --warmup_ratio 0.06 \
    --model_path lambdavi/span-marker-luke-legal \
    --use_span

@inproceedings{benedetto-etal-2023-politohfi,
title = "{P}oli{T}o{HFI} at {S}em{E}val-2023 Task 6: Leveraging Entity-Aware and Hierarchical Transformers For Legal Entity Recognition and Court Judgment Prediction",
author = "Benedetto, Irene and
Koudounas, Alkis and
Vaiani, Lorenzo and
Pastor, Eliana and
Baralis, Elena and
Cagliero, Luca and
Tarasconi, Francesco",
editor = {Ojha, Atul Kr. and
Do{\u{g}}ru{\"o}z, A. Seza and
Da San Martino, Giovanni and
Tayyar Madabushi, Harish and
Kumar, Ritesh and
Sartori, Elisa},
booktitle = "Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)",
month = jul,
year = "2023",
address = "Toronto, Canada",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.semeval-1.194",
doi = "10.18653/v1/2023.semeval-1.194",
pages = "1401--1411",
abstract = "The use of Natural Language Processing techniques in the legal domain has become established for supporting attorneys and domain experts in content retrieval and decision-making. However, understanding the legal text poses relevant challenges in the recognition of domain-specific entities and the adaptation and explanation of predictive models. This paper addresses the Legal Entity Name Recognition (L-NER) and Court judgment Prediction (CPJ) and Explanation (CJPE) tasks. The L-NER solution explores the use of various transformer-based models, including an entity-aware method attending domain-specific entities. The CJPE proposed method relies on hierarchical BERT-based classifiers combined with local input attribution explainers. We propose a broad comparison of eXplainable AI methodologies along with a novel approach based on NER. For the L-NER task, the experimental results remark on the importance of domain-specific pre-training. For CJP our lightweight solution shows performance in line with existing approaches, and our NER-boosted explanations show promising CJPE results in terms of the conciseness of the prediction explanations.",
}
@software{Aarsen_SpanMarker,
author = {Aarsen, Tom},
license = {Apache-2.0},
title = {{SpanMarker for Named Entity Recognition}},
url = {https://github.com/tomaarsen/SpanMarkerNER}
}
@misc{au2022ener,
title={E-NER -- An Annotated Named Entity Recognition Corpus of Legal Text},
author={Ting Wai Terence Au and Ingemar J. Cox and Vasileios Lampos},
year={2022},
eprint={2212.09306},
archivePrefix={arXiv},
primaryClass={cs.CL}
}