UnifiedSKG
1.0.0
Code for the EMNLP 2022 (Oral) paper UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models. Please refer to our project page for up-to-date related resources (e.g., papers, code, tools, tutorials) on structured knowledge grounding. Our checkpoints can be loaded from the HuggingFace Model Hub.
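As a hedged sketch (not code from this repository), loading a checkpoint from the HuggingFace Model Hub typically looks like the following; the checkpoint name here is a placeholder, so substitute a real one from the project page or the Model Hub:

```python
def load_unifiedskg_checkpoint(name: str = "hkunlp/your-checkpoint-here"):
    """Return (tokenizer, model) for a seq2seq checkpoint on the Model Hub.

    `name` is a placeholder, not a real checkpoint id.
    """
    # Imported lazily so this sketch can be defined even where the
    # `transformers` package is not installed.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSeq2SeqLM.from_pretrained(name)
    return tokenizer, model
```

The `Auto*` classes resolve the correct architecture (e.g., T5 or BART) from the checkpoint's config, so the same loader works across the backbone models used in this project.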
.
├── configure # Config files for experiments, tasks, and settings
│ ├── META_TUNING # Config files for tasks and settings
│ └── Salesforce # Config files for experiments (see Misc)
│
├── metrics # Code for evaluation
│   └── ...                  # Please check the README of ./seq2seq_construction.
├── models # Code for models
│ ├── adapter # Code for T5 and BART with adapters (based on HuggingFace Transformers)
│ ├── prompt # Code for T5 and BART with prefix-tuning (based on HuggingFace Transformers)
│ └── unified
│ ├── base.py # Code for the base model that enables an arbitrary model to be pushed to HuggingFace Model Hub (namely, PushToHubFriendlyModel)
│ ├── finetune.py # Code for finetuning
│ ├── adaptertuning.py # Code for adapter-tuning
│ ├── prefixtuning.py # Code for prefix-tuning
│ └── combined_prefixtuning.py # Code for combined prefix-tuning (not used in our paper, see Misc)
│
├── seq2seq_construction # Code for converting raw data into sequences
│ └── ... # Please check the README in this directory.
│
├── tasks # Code for loading raw data
│ └── ... # Please check the README in this directory.
│
├── third_party # Packages from third parties
│ └── ... # Please check the README in this directory.
│
├── utils # Code for some (probably) useful stuff
│   ├── processor            # Adapted from Tapex: the processor that handles table truncation and linearization
│   ├── ...
│ ├── configure.py # Code for parsing config files in ./configure
│ ├── dataset.py # Code for converting input and output sequences into Datasets for training
│ ├── tool.py # Code for loading models, seq2seq constructors, and evaluators
│   ├── trainer.py           # Code for EvaluationFriendlyTrainer. If you want to make training-specific modifications, you may want to change something here.
│ └── training_arguments.py # Code for seq2seq training arguments
│
├── .gitignore
├── .gitmodules
├── py3.7pytorch1.8.yaml # Anaconda environment config file
├── README.md # The README file you are looking at :)
└── train.py # Entry code, which controls train, eval, test, storage, and logging
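The experiment and task config files under ./configure are parsed by utils/configure.py. As an illustrative sketch only (the section and key names below are made up, and the repo's actual parser may differ), such .cfg files can be read with Python's standard configparser:

```python
import configparser

# A made-up config fragment in the style of an experiment .cfg file.
cfg_text = """
[model]
name = unified.finetune

[dataset]
upsample_temperature = 2
"""

parser = configparser.ConfigParser()
parser.read_string(cfg_text)

model_name = parser.get("model", "name")
upsample = parser.getint("dataset", "upsample_temperature")
print(model_name, upsample)  # prints: unified.finetune 2
```

Typed accessors such as getint/getboolean keep the experiment code free of ad-hoc string conversion.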
How to add a new task (the READMEs in ./tasks, ./seq2seq_construction, ./configure, and ./metrics can be useful):

1. Add a "loader" of raw data under ./tasks. You can search HuggingFace Datasets for a possibly useful loading script; if there is none, you can write one and become a contributor to both this project and the HuggingFace community.
2. Add a "sequence wrapper" under ./seq2seq_construction to construct sequence inputs (the user request and the structured knowledge) and sequence outputs from the raw data, in the unified format.
3. Add an "evaluator" for your task under ./metrics. If it uses a third-party repository, remember to add that repository to .gitmodules.
4. (Optional) Add a new "model" under ./models for a new model architecture or learning algorithm.
5. Add a config file for your task under ./configure/META_TUNING.
6. Add a config file for each experiment under ./configure/Salesforce.

Misc

./models/unified/combined_prefixtuning.py is not used in our paper. This file contains code for interaction between multiple prefixes in a single training loop. We tried several variants of such interaction, but none of them outperformed the transfer-learning-based method used in our paper. Nevertheless, we open-source these unsuccessful attempts and call for future exploration.
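To make the "sequence wrapper" step above concrete, here is a hedged toy sketch of how structured knowledge and a user request might be flattened into a single text-to-text input; the exact linearization format and helper names are illustrative, not the ones used in the paper (real table processing, e.g. the Tapex-derived processor in ./utils, also handles truncation):

```python
def linearize_table(header, rows):
    """Toy table linearization: header cells, then one 'row :' chunk per row."""
    cells = " | ".join(header)
    for row in rows:
        cells += " row : " + " | ".join(row)
    return cells

def construct_seq_in(text_in: str, struct_in: str) -> str:
    """Join the user request and the linearized structured knowledge."""
    return f"{text_in} ; structured knowledge: {struct_in}"

struct_in = linearize_table(["city", "country"], [["Tokyo", "Japan"]])
seq_in = construct_seq_in("Which country is Tokyo in?", struct_in)
print(seq_in)
```

The resulting seq_in, paired with the target answer as seq_out, is what the unified text-to-text model trains on.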
That's all for it :D
If you find our work helpful, please cite as:
@inproceedings{UnifiedSKG,
  title={UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models},
  author={Tianbao Xie and Chen Henry Wu and Peng Shi and Ruiqi Zhong and Torsten Scholak and Michihiro Yasunaga and Chien-Sheng Wu and Ming Zhong and Pengcheng Yin and Sida I. Wang and Victor Zhong and Bailin Wang and Chengzu Li and Connor Boyle and Ansong Ni and Ziyu Yao and Dragomir Radev and Caiming Xiong and Lingpeng Kong and Rui Zhang and Noah A. Smith and Luke Zettlemoyer and Tao Yu},
  booktitle={EMNLP},
  year={2022},
}