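BERT-of-Theseus

Code for the paper "BERT-of-Theseus: Compressing BERT by Progressive Module Replacing".

BERT-of-Theseus is a new compressed BERT obtained by progressively replacing the components of the original BERT.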
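To make the idea of progressive module replacing more concrete, here is a minimal sketch in plain Python. It is not the repository's implementation: the function name `theseus_forward`, the toy "layers" (simple functions on floats), and the rule that each successor module stands in for an equal-sized group of predecessor layers are all assumptions made for illustration. The only thing it tries to show is the sampling step: for every module, the successor is swapped in with probability `replacing_rate`, otherwise the predecessor layers are used.

```python
import random
from typing import Callable, Sequence

# A "layer" here is just a function on a float (hypothetical stand-in
# for a Transformer layer, chosen to keep the sketch dependency-free).
Layer = Callable[[float], float]


def theseus_forward(
    x: float,
    predecessor: Sequence[Layer],
    successor: Sequence[Layer],
    replacing_rate: float,
    rng: random.Random,
) -> float:
    """Run one forward pass, randomly swapping in successor modules.

    Assumption: successor module i replaces an equal-sized group of
    consecutive predecessor layers (e.g. 12 layers -> 6 modules of 2).
    """
    if len(predecessor) % len(successor) != 0:
        raise ValueError("predecessor depth must be a multiple of successor depth")
    group = len(predecessor) // len(successor)

    for i, succ_layer in enumerate(successor):
        # Bernoulli draw: use the successor module with probability replacing_rate.
        if rng.random() < replacing_rate:
            x = succ_layer(x)
        else:
            for pred_layer in predecessor[i * group:(i + 1) * group]:
                x = pred_layer(x)
    return x


# Toy models: 12 predecessor layers that add 1, 6 successor layers that add 10.
toy_predecessor = [lambda v: v + 1.0] * 12
toy_successor = [lambda v: v + 10.0] * 6
toy_rng = random.Random(0)

only_predecessor = theseus_forward(0.0, toy_predecessor, toy_successor, 0.0, toy_rng)  # 12.0
only_successor = theseus_forward(0.0, toy_predecessor, toy_successor, 1.0, toy_rng)    # 60.0
mixed = theseus_forward(0.0, toy_predecessor, toy_successor, 0.5, toy_rng)
```

With a replacing rate of 0 only the predecessor runs, and with a rate of 1 only the successor runs; values in between produce a random mixture on every pass.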


Citation

If you use this code in your research, please cite our paper:

@inproceedings{xu-etal-2020-bert,
    title = "{BERT}-of-Theseus: Compressing {BERT} by Progressive Module Replacing",
    author = "Xu, Canwen  and
      Zhou, Wangchunshu  and
      Ge, Tao  and
      Wei, Furu  and
      Zhou, Ming",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.emnlp-main.633",
    pages = "7859--7869"
}

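New: We have uploaded a script for making predictions on GLUE tasks and preparing a leaderboard submission. Check it out in this repository!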

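How to run BERT-of-Theseus

Requirement

Our code is built on huggingface/transformers. To use our code, you must clone and install huggingface/transformers.

Compress a BERT

If you have not done so already, fine-tune a predecessor model following the instructions from huggingface and save it to a directory.
Then run compression following the examples below: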

# For compression with a replacement scheduler
export GLUE_DIR=/path/to/glue_data
export TASK_NAME=MRPC

python ./run_glue.py \
  --model_name_or_path /path/to/saved_predecessor \
  --task_name $TASK_NAME \
  --do_train \
  --do_eval \
  --do_lower_case \
  --data_dir "$GLUE_DIR/$TASK_NAME" \
  --max_seq_length 128 \
  --per_gpu_train_batch_size 32 \
  --per_gpu_eval_batch_size 32 \
  --learning_rate 2e-5 \
  --save_steps 50 \
  --num_train_epochs 15 \
  --output_dir /path/to/save_successor/ \
  --evaluate_during_training \
  --replacing_rate 0.3 \
  --scheduler_type linear \
  --scheduler_linear_k 0.0006

# For compression with a constant replacing rate
export GLUE_DIR=/path/to/glue_data
export TASK_NAME=MRPC

python ./run_glue.py \
  --model_name_or_path /path/to/saved_predecessor \
  --task_name $TASK_NAME \
  --do_train \
  --do_eval \
  --do_lower_case \
  --data_dir "$GLUE_DIR/$TASK_NAME" \
  --max_seq_length 128 \
  --per_gpu_train_batch_size 32 \
  --per_gpu_eval_batch_size 32 \
  --learning_rate 2e-5 \
  --save_steps 50 \
  --num_train_epochs 15 \
  --output_dir /path/to/save_successor/ \
  --evaluate_during_training \
  --replacing_rate 0.5 \
  --steps_for_replacing 2500

For a detailed description of the arguments, please refer to the source code.
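As a rough guide to how the two command-line setups above differ, the sketch below computes a replacing rate per training step. It is an illustration under assumptions, not the code in `run_glue.py`: the function names are hypothetical, the linear schedule is assumed to be `min(1, k * step + base_rate)` as described in the paper, and the constant schedule is assumed to hold the base rate for `steps_for_replacing` steps and then switch to 1.0 so that only the successor is trained. Check the source code for the exact behavior.

```python
def linear_replacing_rate(step: int, base_rate: float, k: float) -> float:
    """Assumed linear schedule: the rate grows by k per step and is capped at 1.0."""
    return min(1.0, k * step + base_rate)


def constant_replacing_rate(step: int, base_rate: float, steps_for_replacing: int) -> float:
    """Assumed constant schedule: base_rate during replacing, then 1.0 afterwards."""
    return base_rate if step < steps_for_replacing else 1.0


# Values taken from the two example commands above.
for step in (0, 500, 1000, 1500, 2500):
    linear = linear_replacing_rate(step, base_rate=0.3, k=0.0006)
    constant = constant_replacing_rate(step, base_rate=0.5, steps_for_replacing=2500)
    print(f"step={step:5d}  linear={linear:.2f}  constant={constant:.2f}")
```

Under these assumptions, the linear setup (`--replacing_rate 0.3 --scheduler_linear_k 0.0006`) reaches a rate of 1.0 after roughly 1,167 steps, while the constant setup stays at 0.5 until step 2500.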

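Load Pretrained Model on MNLI

We provide a 6-layer model pretrained on MNLI as a general-purpose model. It can be transferred to other sentence classification tasks and outperforms DistilBERT (which has the same 6-layer structure) on six GLUE tasks (dev set).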

Method	MNLI	MRPC	QNLI	QQP	RTE	SST-2	STS-B
BERT-base	83.5	89.5	91.2	89.8	71.1	91.5	88.9
DistilBERT	79.0	87.5	85.3	84.9	59.9	90.7	81.2
BERT-of-Theseus	82.1	87.5	88.8	88.8	70.1	91.8	87.8
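The comparison in the table can be checked with a few lines of Python. This is only a convenience sketch over the numbers reported above; the variable names are arbitrary and nothing here comes from the repository.

```python
TASKS = ["MNLI", "MRPC", "QNLI", "QQP", "RTE", "SST-2", "STS-B"]

# Dev-set scores copied from the table above.
SCORES = {
    "BERT-base":       [83.5, 89.5, 91.2, 89.8, 71.1, 91.5, 88.9],
    "DistilBERT":      [79.0, 87.5, 85.3, 84.9, 59.9, 90.7, 81.2],
    "BERT-of-Theseus": [82.1, 87.5, 88.8, 88.8, 70.1, 91.8, 87.8],
}

pairs = list(zip(TASKS, SCORES["BERT-of-Theseus"], SCORES["DistilBERT"]))

# Tasks where BERT-of-Theseus scores strictly higher than DistilBERT, and ties.
wins = [task for task, theseus, distil in pairs if theseus > distil]
ties = [task for task, theseus, distil in pairs if theseus == distil]

# Unweighted mean over the seven tasks for each method.
means = {method: sum(values) / len(values) for method, values in SCORES.items()}

print("wins:", wins)
print("ties:", ties)
for method, mean in means.items():
    print(f"{method}: {mean:.1f}")
```

BERT-of-Theseus is ahead on six of the seven tasks and ties with DistilBERT on MRPC, which matches the "six GLUE tasks" statement.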

You can easily load our general-purpose model using huggingface/transformers.

from transformers import AutoTokenizer, AutoModel

# Load the tokenizer and the 6-layer model from the Hugging Face model hub
tokenizer = AutoTokenizer.from_pretrained("canwenxu/BERT-of-Theseus-MNLI")

model = AutoModel.from_pretrained("canwenxu/BERT-of-Theseus-MNLI")

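Bug Report and Contribution

If you would like to contribute and add more tasks (only GLUE is available at the moment), please submit a pull request and contact me. Also, if you find any problem or bug, please report it by opening an issue. Thanks!

Third-Party Implementations

We list some third-party implementations from the community here. Please feel free to add your implementation to this list: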

Tensorflow Implementation (tested on NER): https://github.com/qiufengyuyi/bert-of-theseus-tf
Keras Implementation (tested on text classification): https://github.com/bojone/bert-of-theseus
