selective-peft-toolkit
1.0.0
Welcome to selective-peft-toolkit, the official implementation of "Step-by-Step Unmasking for Parameter-Efficient Fine-tuning of Large Language Models". The toolkit provides a flexible framework for selectively fine-tuning large language models with different selective parameter-efficient fine-tuning (PEFT) methods.
The toolkit includes the following PEFT methods (these are the values accepted as peft_to_use in the example below):
- id3 (the step-by-step unmasking method introduced in the paper)
- bitfit
- pafi
- fft (full fine-tuning)
These methods are exposed through a package called selective_optimizers, which can be installed via pip (see below). Key features:
- Selective optimizers: wrappers around standard optimizers (subclasses of torch.optim.Optimizer) that selectively update a budgeted number of parameters in the model.
- Heuristic-based selection: selective optimizers update parameters according to various heuristics and selection strategies.
- Integration with Transformers: compatible with Hugging Face Transformers.
- Efficient storage: modified weights are stored in a summary object that occupies only O(B) space, where B is the budget (see the sketch below).
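The actual summary format is internal to the package; the following is only a hypothetical sketch of why O(B) space suffices: for each touched tensor, it is enough to record the flat indices and new values of the modified entries.

import torch

# Hypothetical O(B) summary layout (illustration only, not the package's real format):
# for each modified parameter tensor, store the flat indices of the updated
# entries together with their fine-tuned values.
summary = {
    "encoder.layer.0.attention.query.weight": {
        "indices": torch.tensor([13, 512, 7040]),        # positions of modified entries
        "values": torch.tensor([0.021, -0.004, 0.113]),  # their fine-tuned values
    },
}

def apply_sparse_summary(model, summary):
    # Write the stored values back into the matching parameters in place.
    params = dict(model.named_parameters())
    with torch.no_grad():
        for name, entry in summary.items():
            params[name].view(-1)[entry["indices"]] = entry["values"]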
To install the selective_optimizers package, simply run:

pip install selective-optimizers

Here is the basic workflow for training with a selective optimizer:
from selective_optimizers.wrap import get_selective_optimizer
from selective_optimizers.load_store import write_summary_to_disk
from torch.optim import AdamW
# Choose your base optimizer
opt = AdamW
# Specify the PEFT method to use (can be one of "fft", "id3", "bitfit", or "pafi")
peft_to_use = "id3"
# Get the selective optimizer class
optimizer_class = get_selective_optimizer(opt, peft_to_use)
# Initialize the optimizer with additional selective parameters
optimizer = optimizer_class(
    params=model.parameters(),
    lr=0.0001,
    budget=100000,
    exp=0,
    eps=1e-3,
    max_steps=1000
)
# Usual training loop
...
...
# Optional post-training work for validation
optimizer.post_train_work()
print("Budget used:", optimizer.get_budget_used())
# Save the summary of modified weights
summary = optimizer.get_summary(model)
write_summary_to_disk("path/to/summary.pth", summary)

To load the fine-tuned weights for inference, read the summary back and apply it to the model:

from selective_optimizers.load_store import load_summary_from_disk, load_weights_from_summary
# Load your model as usual
...
model = ...
...
# Load the summary from disk
summary = load_summary_from_disk("path/to/summary.pth")
# Apply the modified weights from the summary to the model
load_weights_from_summary(model, summary)
# Usual inference code
...
...

The transformers.Trainer class accepts external optimizers, which makes it easy to integrate selective optimizers into your workflow:
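A minimal sketch, assuming a model, train_dataset, and TrainingArguments values from your own setup (all placeholders here); Trainer's optimizers argument takes an (optimizer, lr_scheduler) tuple, and passing None for the scheduler lets Trainer create its default one:

from transformers import Trainer, TrainingArguments

# Placeholder arguments; substitute your own configuration.
training_args = TrainingArguments(output_dir="out", max_steps=1000)

trainer = Trainer(
    model=model,                   # model prepared as above
    args=training_args,
    train_dataset=train_dataset,   # your training dataset
    optimizers=(optimizer, None),  # the selective optimizer; None -> default scheduler
)
trainer.train()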
We welcome contributions to the selective_optimizers package! If you would like to add a new selective optimizer, please follow these steps:
This project is licensed under the MIT License. See the LICENSE file for details.
If you use this toolkit in your research, please cite our paper:
@article{Agarwal2024_step_by_step,
  title   = {Step-by-Step Unmasking for Parameter-Efficient Fine-tuning of Large Language Models},
  author  = {Aradhye Agarwal and Suhas Kamasetty Ramesh and Ayan Sengupta and Tanmoy Chakraborty},
  journal = {arXiv preprint arXiv:2408.14470},
  year    = {2024},
}
For any questions or issues, please feel free to open an issue on the GitHub repository or contact us directly.