UnifiedSKG
1.0.0
Code for the EMNLP 2022 (oral) paper UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models. Please refer to our project page for up-to-date related resources (e.g., papers, code, tools, tutorials) on structured knowledge grounding. Our checkpoints can be loaded from the HuggingFace Model Hub.
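For the fine-tuned (non-prefix) checkpoints, plain Transformers is enough to load them. A minimal sketch, in which the checkpoint id and the input serialization are placeholders (pick a real checkpoint from the Model Hub and follow the corresponding task's linearization):

```python
# Minimal sketch: load a fine-tuned UnifiedSKG checkpoint from the Model Hub.
# "hkunlp/<checkpoint-name>" is a placeholder, not a real model id.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "hkunlp/<checkpoint-name>"  # placeholder; substitute a real checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# The input concatenates the user request with the linearized structured
# knowledge; the exact format is task-specific (see ./seq2seq_construction).
inputs = tokenizer(
    "How many singers do we have? ; structured knowledge: "
    "concert_singer | singer : singer_id , name , country , age",
    return_tensors="pt",
)
output_ids = model.generate(**inputs, max_length=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

The repository is organized as follows: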
.
├── configure # Config files for experiments, tasks, and settings
│ ├── META_TUNING # Config files for tasks and settings
│ └── Salesforce # Config files for experiments (see Misc)
│
├── metrics # Code for evaluation
│ └── ... # Please check the README in ./seq2seq_construction.
│
├── models # Code for models
│ ├── adapter # Code for T5 and BART with adapters (based on HuggingFace Transformers)
│ ├── prompt # Code for T5 and BART with prefix-tuning (based on HuggingFace Transformers)
│ └── unified
│ ├── base.py # Code for the base model that enables an arbitrary model to be pushed to the HuggingFace Model Hub (namely, PushToHubFriendlyModel; a sketch follows this tree)
│ ├── finetune.py # Code for finetuning
│ ├── adaptertuning.py # Code for adapter-tuning
│ ├── prefixtuning.py # Code for prefix-tuning
│ └── combined_prefixtuning.py # Code for combined prefix-tuning (not used in our paper, see Misc)
│
├── seq2seq_construction # Code for converting raw data into sequences
│ └── ... # Please check the README in this directory.
│
├── tasks # Code for loading raw data
│ └── ... # Please check the README in this directory.
│
├── third_party # Packages from third parties
│ └── ... # Please check the README in this directory.
│
├── utils # Code for some (probably) useful stuff
│ ├── processor # Adapted from TAPEX: the processor that handles table truncation and linearization
│ │ └── ...
│ ├── configure.py # Code for parsing config files in ./configure
│ ├── dataset.py # Code for converting input and output sequences into Datasets for training
│ ├── tool.py # Code for loading models, seq2seq constructors, and evaluators
│ ├── trainer.py # Code for EvaluationFriendlyTrainer. If you want to make training-specific modifications, you may want to change something here.
│ └── training_arguments.py # Code for seq2seq training arguments
│
├── .gitignore
├── .gitmodules
├── py3.7pytorch1.8.yaml # Anaconda environment config file
├── README.md # The README file you are looking at :)
└── train.py # Entry code, which controls train, eval, test, storage, and logging
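As a rough illustration of the base model mentioned at models/unified/base.py above, here is a minimal sketch (not the repository's actual code) of a hub-friendly wrapper; it assumes the huggingface_hub client package:

```python
# Minimal sketch of a "push-to-hub-friendly" model: any nn.Module subclassing
# this can save its weights locally and upload them to the Model Hub.
# Illustrative only; see models/unified/base.py for the real implementation.
import os

import torch
from huggingface_hub import HfApi
from torch import nn


class PushToHubFriendlyModel(nn.Module):
    def save_and_push(self, save_dir: str, repo_id: str) -> None:
        """Save the state dict to save_dir, then upload the folder to the Hub."""
        os.makedirs(save_dir, exist_ok=True)
        torch.save(self.state_dict(), os.path.join(save_dir, "pytorch_model.bin"))
        api = HfApi()
        api.create_repo(repo_id, exist_ok=True)  # no-op if the repo already exists
        api.upload_folder(folder_path=save_dir, repo_id=repo_id)
```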
To add a new task to UnifiedSKG (the READMEs under ./tasks, ./seq2seq_construction, ./configure, and ./metrics may also be helpful):
1. Add a "loader" of the raw data under ./tasks. You can search HuggingFace Datasets for a possibly useful script; if none exists, you can become a contributor to both this project and the HuggingFace community.
2. Add a "sequence wrapper" under ./seq2seq_construction to construct the sequence inputs (user request and structured knowledge) and sequence outputs from the raw data in a unified way (a sketch follows this list).
3. Add an "evaluator" for your task under ./metrics. If it uses a third-party repository, remember to add that repository to .gitmodules.
4. (Optional) Add a new "model" under ./models for a new learning algorithm.
5. Add a config file for your task under ./configure/META_TUNING.
6. Add a config file for each of your experiments under ./configure/Salesforce.
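To make steps 2 and 3 concrete, here is a minimal sketch of a sequence wrapper and an evaluator. The field names ("question", "table", "answer") and the linearization are hypothetical; follow the READMEs in ./seq2seq_construction and ./metrics for the real interfaces:

```python
# Illustrative only: assumes hypothetical raw examples with "question",
# "table", and "answer" fields. The real constructors and evaluators live
# in ./seq2seq_construction and ./metrics.


class SequenceWrapper:
    """Turns one raw example into a unified text-to-text pair."""

    def construct(self, raw_example: dict) -> dict:
        # Linearize the structured knowledge (here: a table as a list of rows).
        linearized = " | ".join(" , ".join(row) for row in raw_example["table"])
        return {
            "seq_in": f"{raw_example['question']} ; structured knowledge: {linearized}",
            "seq_out": raw_example["answer"],
        }


class EvaluateTool:
    """Scores string predictions against gold outputs with exact match."""

    def evaluate(self, predictions: list, golds: list) -> dict:
        correct = sum(
            pred.strip().lower() == gold["seq_out"].strip().lower()
            for pred, gold in zip(predictions, golds)
        )
        return {"exact_match": correct / max(len(golds), 1)}
```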
Misc: ./models/unified/combined_prefixtuning.py is not used in our paper. This file contains code for the interaction between multiple prefixes in a single training loop. We tried several variants of such interaction, but none of them outperformed the transfer-learning-based approach used in our paper. We nevertheless open-source these unsuccessful attempts and invite future exploration.

That's all! :D
If you find our work helpful, please cite:
@article{UnifiedSKG,
  title={UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models},
  author={Tianbao Xie and Chen Henry Wu and Peng Shi and Ruiqi Zhong and Torsten Scholak and Michihiro Yasunaga and Chien-Sheng Wu and Ming Zhong and Pengcheng Yin and Sida I. Wang and Victor Zhong and Bailin Wang and Chengzu Li and Connor Boyle and Ansong Ni and Ziyu Yao and Dragomir Radev and Caiming Xiong and Lingpeng Kong and Rui Zhang and Noah A. Smith and Luke Zettlemoyer and Tao Yu},
  journal={EMNLP},
  year={2022},
}