AmoebaLLM

AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment

Yonggan Fu, Zhongzhi Yu, Junwei Li, Jiayi Qian, Yongan Zhang, Xiangchi Yuan, Dachuan Shi, Roman Yakunin, and Yingyan (Celine) Lin

Accepted at NeurIPS 2024 [Paper | Slide].

AmoebaLLM: Overview

How can we train once and obtain many efficient LLMs? We introduce AmoebaLLM, a novel framework designed to instantly derive LLM subnets of arbitrary shapes. These subnets reach the accuracy-efficiency frontier and can be extracted immediately after a single one-time fine-tuning. In this way, AmoebaLLM facilitates rapid deployment tailored to different platforms and application-driven requirements. Specifically, AmoebaLLM achieves this goal by strategically selecting high-performing subnets and training them jointly in a way that avoids conflicts among them.

Experimental results: AmoebaLLM not only sets new standards in LLM adaptability but also delivers subnets that achieve state-of-the-art (SOTA) trade-offs between accuracy and efficiency.

Code Usage

Environment Setup

Use conda to set up the environment from the provided env.yml:

 conda env create -f env.yml

Stage 1: Knowledge-Preserving Subnet Selection

Step 1: Derive the layer (depth) selection strategy using dynamic programming:

 CUDA_VISIBLE_DEVICES=0 python main.py --model_name_or_path meta-llama/Llama-2-7b-hf --fp16 --output_dir ./output/calib_dp --do_train False --do_eval False --no_eval_orig --layer_calib_dp --calib_dataset mmlu --enable_shrinking --num_calib_sample 40 --calib_metric acc --min_num_layer 20 --dp_keep_last_layer 1
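
The README does not spell out the recurrence behind --layer_calib_dp, so the following is only a toy sketch of how a dynamic program over layers could choose which layers to drop. The function name dp_layer_selection, the score callback, and the additive toy importances are all hypothetical and are not part of the AmoebaLLM code. In the real pipeline the score would presumably be the calibration metric (--calib_metric acc on the MMLU calibration samples) measured on the shrunken model.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple


def dp_layer_selection(
    num_layers: int,
    max_removed: int,
    score: Callable[[FrozenSet[int]], float],
) -> Dict[int, List[int]]:
    """Toy DP: best[(n, m)] is the set of m removed layers among the first n.

    Each state is built from two candidates: keep layer n-1 (reuse best[(n-1, m)])
    or remove it (extend best[(n-1, m-1)]). The higher-scoring candidate wins.
    """
    best: Dict[Tuple[int, int], FrozenSet[int]] = {}
    for n in range(num_layers + 1):
        best[(n, 0)] = frozenset()  # removing nothing is always feasible

    for n in range(1, num_layers + 1):
        for m in range(1, min(n, max_removed) + 1):
            candidates = [best[(n - 1, m - 1)] | {n - 1}]  # remove layer n-1
            if (n - 1, m) in best:                          # keep layer n-1
                candidates.append(best[(n - 1, m)])
            best[(n, m)] = max(candidates, key=score)

    return {m: sorted(best[(num_layers, m)]) for m in range(max_removed + 1)}


# Hypothetical per-layer importances; a real score would evaluate the model.
toy_importance = [5, 1, 4, 2, 3]
strategies = dp_layer_selection(
    num_layers=5,
    max_removed=2,
    score=lambda removed: -sum(toy_importance[i] for i in removed),
)
print(strategies)
```

If --min_num_layer 20 means that at least 20 of the 32 layers in Llama-2-7B are kept, the analogous bound here would be max_removed = 12; that reading of the flag is an assumption.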

Step 2: Derive the neuron (width) selection strategy using the importance metric from FLAP:

 CUDA_VISIBLE_DEVICES=0 python main.py --model_name_or_path meta-llama/Llama-2-7b-hf --fp16 --output_dir ./output/width_calib --do_train False --do_eval False --use_auth_token --no_eval_orig --width_calib --num_calib_sample 512 --prune_width_method flap
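
To give a feel for what a width-calibration pass might compute, here is a minimal pure-Python sketch of a fluctuation-style importance score in the spirit of FLAP: each input channel is scored by the sample variance of its calibration activations times the squared norm of the matching weight column. This reading of the metric is an assumption, the helper names fluctuation_scores and keep_top_ratio are hypothetical, and the repository's --prune_width_method flap implementation may differ (for example in normalization or bias compensation).

```python
from statistics import variance
from typing import List


def fluctuation_scores(
    activations: List[List[float]], weight: List[List[float]]
) -> List[float]:
    """Score input channel j as Var(activations[:, j]) * ||weight[:, j]||^2.

    activations: N calibration samples, each with C_in values.
    weight: C_out rows, each with C_in values.
    """
    num_inputs = len(weight[0])
    scores = []
    for j in range(num_inputs):
        channel_var = variance([sample[j] for sample in activations])
        column_norm_sq = sum(row[j] ** 2 for row in weight)
        scores.append(channel_var * column_norm_sq)
    return scores


def keep_top_ratio(scores: List[float], ratio: float) -> List[int]:
    """Return the indices of the highest-scoring channels, in ascending order."""
    num_keep = max(1, round(len(scores) * ratio))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:num_keep])


# Tiny hypothetical calibration batch: 3 samples, 4 input channels.
toy_activations = [
    [1.0, 0.0, 2.0, 5.0],
    [3.0, 0.0, 2.0, 1.0],
    [5.0, 0.0, 2.0, 3.0],
]
toy_weight = [[1.0, 1.0, 1.0, 2.0], [1.0, 1.0, 1.0, 2.0]]

toy_scores = fluctuation_scores(toy_activations, toy_weight)
kept_channels = keep_top_ratio(toy_scores, ratio=0.5)
print(toy_scores, kept_channels)
```

Channels whose activations barely change across calibration samples score low and are the first candidates for removal.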

Step 3: Merge the layer and neuron selection strategies into a single file, dp_selection_strategy.npy (we provide this file for LLaMA2-7B in the repo):

 python utils/merge_depth_width.py

Stage 2: One-for-All Fine-Tuning

Enable one-for-all fine-tuning with --do_train True and --enable_shrinking, and pass the subnet selection strategy produced by Stage 1 with --shrinking_file dp_selection_strategy.npy:

 CUDA_VISIBLE_DEVICES=0 python main.py --model_name_or_path meta-llama/Llama-2-7b-hf --output_dir ./output/ft --dataset alpaca-gpt4 --use_auth_token --do_train True --do_eval True --do_mmlu_eval True --do_eval_wikitext2 True --lora_modules all --fp16 --source_max_len 384 --target_max_len 128 --gradient_accumulation_steps 4 --logging_steps 10 --max_steps 10000 --save_strategy steps --data_seed 42 --save_steps 1000 --save_total_limit 1 --evaluation_strategy steps --eval_dataset_size 1024  --max_eval_samples 1000 --eval_steps 1000 --optim paged_adamw_32bit --ddp_find_unused_parameters --enable_shrinking --kd_weight 1 --min_num_layer 20 --random_sample_num_layer 2 --distill_method sp --shrinking_method calib_dp --shrinking_file dp_selection_strategy.npy --shrinkable_width --width_choice [1,7/8,3/4,5/8] --prune_width_method flap --use_moe_lora --moe_num_expert 5 --moe_topk 2
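
The flags --min_num_layer 20, --random_sample_num_layer 2, and --width_choice [1,7/8,3/4,5/8] suggest that several subnet shapes are drawn during training. The sketch below is only a guess at what such sampling could look like: the helper names are hypothetical, the upper bound of 32 layers is the depth of Llama-2-7B, and the actual sampler in main.py may follow a different scheme.

```python
import random
from fractions import Fraction
from typing import List, Tuple


def parse_width_choice(text: str) -> List[Fraction]:
    """Parse a string such as "[1,7/8,3/4,5/8]" into exact fractions."""
    return [Fraction(item.strip()) for item in text.strip("[]").split(",")]


def sample_subnet_shapes(
    min_layers: int,
    max_layers: int,
    widths: List[Fraction],
    num_samples: int,
    rng: random.Random,
) -> List[Tuple[int, Fraction]]:
    """Draw (num_layers, width_ratio) pairs uniformly at random (an assumption)."""
    return [
        (rng.randint(min_layers, max_layers), rng.choice(widths))
        for _ in range(num_samples)
    ]


widths = parse_width_choice("[1,7/8,3/4,5/8]")
shapes = sample_subnet_shapes(
    min_layers=20, max_layers=32, widths=widths, num_samples=2, rng=random.Random(42)
)
print(widths)
print(shapes)
```

Keeping the ratios as Fraction values avoids floating-point drift when they are later multiplied by integer layer or neuron counts.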

Evaluation

In addition to your own fine-tuned models produced by the two-stage process described above, we also provide our AmoebaLLM fine-tuned LLaMA2-7B model, amoeba_llama2. You can download and unzip it with the following commands:

 pip install gdown
 gdown 1lwOiQa-UOYOXn72wo5gvzUvFat_PTg6b
 unzip amoeba_llama2.zip

Set --output_dir to the path of the fine-tuned model, and specify the target depth and width ratio with --eval_num_layer and --eval_num_width, respectively:

 CUDA_VISIBLE_DEVICES=0 python main.py --model_name_or_path meta-llama/Llama-2-7b-hf --output_dir amoeba_llama2 --do_train False --do_eval True --do_mmlu_eval True --bits 8 --bf16 --enable_shrinking --min_num_layer 20 --shrinking_method calib_dp --shrinking_file dp_selection_strategy.npy --shrinkable_width --width_choice [1,7/8,3/4,5/8] --prune_width_method flap --use_moe_lora --moe_num_expert 5 --moe_topk 2  --eval_num_layer 24 --eval_num_width 0.875 --do_lm_eval True --do_lm_eval_task arc_easy,piqa,hellaswag
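
As a rough back-of-the-envelope check on what --eval_num_layer 24 --eval_num_width 0.875 selects, the sketch below converts the two values into approximate dimensions for Llama-2-7B (32 layers, 32 attention heads, MLP intermediate size 11008). It assumes the width ratio applies uniformly to the heads and MLP neurons of every kept layer, which the repository may not do, and it ignores embeddings and the LM head. The function name subnet_shape is hypothetical.

```python
from fractions import Fraction
from typing import Dict, Union

# Published Llama-2-7B dimensions.
FULL_LAYERS = 32
FULL_HEADS = 32
FULL_MLP_NEURONS = 11008


def subnet_shape(num_layer: int, width: float) -> Dict[str, Union[int, Fraction]]:
    """Estimate kept dimensions under a uniform-width assumption."""
    ratio = Fraction(width).limit_denominator(64)  # 0.875 -> 7/8
    return {
        "layers": num_layer,
        "heads": int(FULL_HEADS * ratio),
        "mlp_neurons": int(FULL_MLP_NEURONS * ratio),
        # Share of transformer-block parameters that remain.
        "block_param_fraction": Fraction(num_layer, FULL_LAYERS) * ratio,
    }


shape = subnet_shape(num_layer=24, width=0.875)
print(shape, float(shape["block_param_fraction"]))
```

Under these assumptions the evaluated subnet keeps roughly two thirds of the transformer-block parameters of the full model.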

Acknowledgment

Our code builds on the implementation in QLoRA.

Citation

 @inproceedings{fuamoeballm,
  title={AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment},
  author={Fu, Yonggan and Yu, Zhongzhi and Li, Junwei and Qian, Jiayi and Zhang, Yongan and Yuan, Xiangchi and Shi, Dachuan and Yakunin, Roman and Lin, Yingyan Celine},
  booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
  year={2024}
}
