Roberta2longformer

This repository contains several functions for converting the encoder part of a pretrained RoBERTa model into a long-sequence transformer.

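The memory consumption and runtime of language models with vanilla self-attention grow quadratically with the length of the input sequence. Various models have been proposed to ease this problem, either by using sparse or local attention patterns or by approximating the full self-attention matrix with decomposition methods. This repository contains functions that initialize some of these models with the weights of a pretrained RoBERTa checkpoint, so that new models for long-document tasks can be created efficiently.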
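To make the quadratic growth concrete, the following minimal sketch counts the attention scores computed per head for full self-attention and for a local window. The function names and the window size of 512 are illustrative assumptions and are not part of this repository.

```python
def full_attention_scores(seq_len: int) -> int:
    """Attention scores per head when every token attends to every token."""
    return seq_len * seq_len


def local_attention_scores(seq_len: int, window: int) -> int:
    """Upper bound on scores per head when each token attends to at most
    `window` neighbouring tokens."""
    return seq_len * min(window, seq_len)


for n in (512, 2048, 8192):
    full = full_attention_scores(n)
    local = local_attention_scores(n, window=512)
    print(f"{n:>5} tokens: full={full:>12,}  local={local:>12,}  ratio={full / local:.0f}x")
```

At 8192 tokens, full attention computes 16 times as many scores as a 512-token window, and the gap keeps widening linearly with the sequence length.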

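Note that initializing these models with pretrained weights does not make them directly usable, let alone competitive. In most cases, at least a few thousand steps of continued pretraining are needed to achieve satisfactory results on any downstream task.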

RoBERTa

Converting a pretrained RoBERTa model into a Longformer model

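The Longformer model (Beltagy, I., Peters, M. E., & Cohan, A. (2020)) replaces the full attention mechanism with a local attention pattern combined with task-specific global attention. Apart from that, the Longformer uses the RoBERTa (Liu, Y., Ott, M., Goyal, N., et al. (2019)) architecture. The weights of a pretrained RoBERTa model can therefore be loaded into a Longformer with little effort.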
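One part that cannot be copied one-to-one is the position-embedding table, because the Longformer accepts more positions than RoBERTa was trained on. A common approach is to tile the pretrained rows until the new length is reached. The sketch below illustrates that idea on plain lists; the function name and the `reserved` parameter are hypothetical, and whether this repository initializes the table in exactly this way is an assumption.

```python
from typing import List, Sequence


def extend_position_embeddings(
    pretrained: Sequence[Sequence[float]],
    new_positions: int,
    reserved: int = 2,
) -> List[List[float]]:
    """Build a longer position-embedding table by tiling pretrained rows.

    The first `reserved` rows are copied once (Hugging Face RoBERTa
    checkpoints start position ids after the padding index, so the first
    two rows are not used for real tokens). The remaining learned rows are
    repeated until `new_positions` usable positions exist.
    """
    learned = pretrained[reserved:]
    if not learned:
        raise ValueError("no learned position rows to copy")
    head = [list(row) for row in pretrained[:reserved]]
    tiled = [list(learned[i % len(learned)]) for i in range(new_positions)]
    return head + tiled


# Toy table: 2 reserved rows + 4 learned positions, embedding size 1.
toy = [[0.0], [0.0], [0.1], [0.2], [0.3], [0.4]]
extended = extend_position_embeddings(toy, new_positions=10)
print(len(extended))  # 2 reserved rows + 10 usable rows
```

Tiling gives every new position a sensible starting value, but the model has never seen the repeated pattern at these offsets, which is one reason continued pretraining is needed after conversion.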

from roberta2longformer import convert_roberta_to_longformer

from transformers import RobertaModel, RobertaTokenizerFast
from transformers import LongformerModel, LongformerTokenizerFast

# Load the pretrained RoBERTa checkpoint that should be converted.
roberta_model = RobertaModel.from_pretrained("uklfr/gottbert-base")
roberta_tokenizer = RobertaTokenizerFast.from_pretrained("uklfr/gottbert-base")

# Convert model and tokenizer to a Longformer with a longer maximum input length.
longformer_model, longformer_tokenizer = convert_roberta_to_longformer(
    roberta_model=roberta_model,
    roberta_tokenizer=roberta_tokenizer,
    longformer_max_length=8192
)

# Print the first encoder parameter of each model for comparison.
print(list(longformer_model.encoder.state_dict().items())[0])
print(list(roberta_model.encoder.state_dict().items())[0])

# Adjacent string literals are concatenated, so each sentence ends with a space.
inputs = longformer_tokenizer("Er sah eine irdische Zentralregierung, und er erblickte Frieden, Wohlstand und galaktische Anerkennung. "
                              "Es war eine Vision, doch er nahm sie mit vollen Sinnen in sich auf. "
                              "Im Laderaum der STARDUST begann eine rätselhafte Maschine zu summen. "
                              "Die dritte Macht nahm die Arbeit auf. "
                              "Da lächelte Perry Rhodan zum blauen Himmel empor. "
                              "Langsam löste er die Rangabzeichen von dem Schulterstück seiner Kombination.",
                              return_tensors="pt")
outputs = longformer_model(**inputs)

# Or to finetune the model on a task:
from transformers import LongformerForSequenceClassification

# Save model and tokenizer so that a task-specific head can load them.
longformer_model.save_pretrained("tmp/longformer-gottbert")
longformer_tokenizer.save_pretrained("tmp/longformer-gottbert")

seqclass_model = LongformerForSequenceClassification.from_pretrained("tmp/longformer-gottbert/")
...
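Because global attention is task-specific, the caller decides which tokens receive it. The following minimal sketch builds such a mask from an ordinary attention mask. The helper name is hypothetical, and giving global attention to the first token (position 0) is only one common choice for classification, not something prescribed by this repository.

```python
from typing import List


def build_global_attention_mask(
    attention_mask: List[List[int]],
    global_positions: List[int],
) -> List[List[int]]:
    """Return a mask with 1 for tokens that get global attention, else 0.

    Padding positions (attention_mask == 0) never receive global attention.
    """
    mask = []
    for row in attention_mask:
        global_row = [0] * len(row)
        for pos in global_positions:
            if 0 <= pos < len(row) and row[pos] == 1:
                global_row[pos] = 1
        mask.append(global_row)
    return mask


# Two sequences of length 6; the second one ends with two padding tokens.
attention_mask = [[1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 0, 0]]
global_mask = build_global_attention_mask(attention_mask, global_positions=[0])
print(global_mask)
```

With Hugging Face transformers, a mask like this can be converted to a tensor and passed to the Longformer forward call through its `global_attention_mask` argument.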

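Nyströmformer

The Nyströmformer architecture (Xiong et al. (2021)) approximates the self-attention mechanism using the Nyström matrix decomposition. There is therefore no need to deal with special attention patterns, and in theory these models are applicable to a wider variety of tasks. Compared with Longformer models, Nyströmformers appear to consume more memory.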

from roberta2nyströmformer import convert_roberta_to_nystromformer

from transformers import RobertaModel, RobertaTokenizerFast
# The tokenizer is returned by the converter, so only the model class is imported here.
from transformers import NystromformerModel

# Load the pretrained RoBERTa checkpoint that should be converted.
roberta_model = RobertaModel.from_pretrained("uklfr/gottbert-base")
roberta_tokenizer = RobertaTokenizerFast.from_pretrained("uklfr/gottbert-base")

# Convert model and tokenizer to a Nyströmformer with a longer maximum input length.
nystromformer_model, nystromformer_tokenizer = convert_roberta_to_nystromformer(
    roberta_model=roberta_model,
    roberta_tokenizer=roberta_tokenizer,
    nystromformer_max_length=8192
)

# Print the first encoder parameter of each model for comparison.
print(list(nystromformer_model.encoder.state_dict().items())[0])
print(list(roberta_model.encoder.state_dict().items())[0])

# Adjacent string literals are concatenated, so each sentence ends with a space.
inputs = nystromformer_tokenizer("Er sah eine irdische Zentralregierung, und er erblickte Frieden, Wohlstand und galaktische Anerkennung. "
                                 "Es war eine Vision, doch er nahm sie mit vollen Sinnen in sich auf. "
                                 "Im Laderaum der STARDUST begann eine rätselhafte Maschine zu summen. "
                                 "Die dritte Macht nahm die Arbeit auf. "
                                 "Da lächelte Perry Rhodan zum blauen Himmel empor. "
                                 "Langsam löste er die Rangabzeichen von dem Schulterstück seiner Kombination.",
                                 return_tensors="pt")
outputs = nystromformer_model(**inputs)

# Or to finetune the model on a task:
from transformers import NystromformerForSequenceClassification

# Save both the model and the tokenizer so that a task-specific head can load them.
nystromformer_model.save_pretrained("tmp/nystromformer-gottbert")
nystromformer_tokenizer.save_pretrained("tmp/nystromformer-gottbert")

seqclass_model = NystromformerForSequenceClassification.from_pretrained("tmp/nystromformer-gottbert/")
...
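The Nyström idea itself can be illustrated on a single attention head with NumPy. The sketch below is a simplified illustration, not the implementation used by transformers or by this repository: landmarks are taken as means over contiguous segments, and the pseudo-inverse is computed exactly with `np.linalg.pinv` instead of an iterative approximation. All function names are hypothetical.

```python
import numpy as np


def softmax(x: np.ndarray) -> np.ndarray:
    """Row-wise softmax with max-subtraction for numerical stability."""
    shifted = x - x.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def exact_attention(q: np.ndarray, k: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Full softmax attention; builds the complete (n, n) score matrix."""
    return softmax(q @ k.T / np.sqrt(q.shape[-1])) @ v


def nystrom_attention(q: np.ndarray, k: np.ndarray, v: np.ndarray,
                      num_landmarks: int) -> np.ndarray:
    """Approximate attention using three small softmax matrices."""
    n, d = q.shape
    if n % num_landmarks != 0:
        raise ValueError("sequence length must be divisible by num_landmarks")
    # Landmarks: mean of each contiguous segment of queries and keys.
    q_land = q.reshape(num_landmarks, n // num_landmarks, d).mean(axis=1)
    k_land = k.reshape(num_landmarks, n // num_landmarks, d).mean(axis=1)
    scale = np.sqrt(d)
    left = softmax(q @ k_land.T / scale)         # shape (n, m)
    middle = softmax(q_land @ k_land.T / scale)  # shape (m, m)
    right = softmax(q_land @ k.T / scale)        # shape (m, n)
    # Multiply right-to-left so that no (n, n) matrix is ever materialized.
    return left @ np.linalg.pinv(middle) @ (right @ v)


rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((64, 16)) for _ in range(3))
approx = nystrom_attention(q, k, v, num_landmarks=8)
max_error = np.abs(approx - exact_attention(q, k, v)).max()
print(approx.shape, max_error)
```

With `m` landmarks the largest intermediate matrices have shape `(n, m)`, so the cost grows linearly with the sequence length `n` for a fixed `m`. When `num_landmarks` equals the sequence length, every token is its own landmark and the result matches exact attention up to numerical error.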

Additional information

Version 1.0.0
Type AI source code
Updated 2025-09-06
Size 12.55 KB
Source GitHub
