Use QLoRA to fine-tune Chinese large language models.

For fine-tuning baichuan-7b with QLoRA (more concise code), see: https://github.com/taishan1994/baichuan-Qlora-Tuning
Dependencies:

```
numpy==1.24.2
pandas==2.0.0
nltk==3.7
transformers==4.30.0.dev0
accelerate==0.20.0.dev0
deepspeed==0.9.2
peft==0.4.0.dev0
datasets==2.12.0
evaluate==0.2.2
sentencepiece==0.1.97
scipy==1.10.1
icetk
cpm_kernels
mpi4py==3.1.4
```

Directory structure:

```
--output                                # LoRA weights saved during training
----chatglm
----alpaca
----bloom
--data
----msra
------instruct_data
--------train.json                      # instruction data
--model_hub
----BELLE-7B-2M                         # Bloom weights
----chatglm-6b                          # ChatGLM weights
----7B                                  # original English LLaMA weights
----7B-hf                               # English weights converted to Hugging Face format
----chinese-llama-plus-lora-7b          # LoRA weights for Chinese LLaMA-7B
----chinese-alpaca-plus-lora-7b         # LoRA weights for Chinese Alpaca-7B
----chinese-alpaca-7b                   # final model after merging the LoRA weights
----tokenizer.model                     # tokenizer file from the original LLaMA 7B
----convert_llama_weights_to_hf.py      # convert LLaMA weights to Hugging Face format
----merge_llama_with_chinese_lora.py    # merge the LoRA weights into the pretrained model
--tools
----get_version.py                      # print Python package versions
----get_used_gpus.py                    # print GPU usage in a loop
--chat.py                               # casual chat / inference
--qlora.py                              # 4-bit training
--process.py                            # test data processing
```

ChatGLM-6B download address: Tsinghua University Cloud Disk (tsinghua.edu.cn)
Fine-tune ChatGLM-6B:

```shell
python qlora.py --model_name="chatglm" --model_name_or_path="./model_hub/chatglm-6b" --trust_remote_code=True --dataset="msra" --source_max_len=128 --target_max_len=64 --do_train --save_total_limit=1 --padding_side="left" --per_device_train_batch_size=8 --do_eval --bits=4 --save_steps=10 --gradient_accumulation_steps=1 --learning_rate=1e-5 --output_dir="./output/chatglm/" --lora_r=8 --lora_alpha=32
```
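The 4-bit quantization and LoRA configuration are handled inside qlora.py. As a rough sketch of what `--bits=4`, `--lora_r=8` and `--lora_alpha=32` correspond to (the actual script's argument parsing, quantization defaults and target modules may differ; `target_modules=["query_key_value"]` and the NF4 settings below are assumptions), the setup looks something like this:

```python
# Sketch only: roughly what the --bits/--lora_r/--lora_alpha flags map to.
# The real qlora.py also handles padding side, prompting, evaluation, saving, etc.
import torch
from transformers import AutoModel, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_path = "./model_hub/chatglm-6b"

# --bits=4: load the frozen base model in 4-bit (NF4 + double quantization assumed here)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_path,
    quantization_config=bnb_config,
    trust_remote_code=True,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# --lora_r=8, --lora_alpha=32: small trainable LoRA adapters on the attention projections
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["query_key_value"],  # assumed module name for ChatGLM-6B attention
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

Only the LoRA matrices are trained; the 4-bit base weights stay frozen, which is what keeps the memory footprint small enough for a single consumer GPU.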
The LLaMA model released by Facebook is not licensed for commercial use, and the official model weights were never openly published (although many third-party download links exist online). To respect the license, the complete model weights cannot be redistributed here; please understand (this is also the case abroad) and search for a download source yourself.

Convert the original 7B weights to the Hugging Face format:

```shell
python convert_llama_weights_to_hf.py --input_dir ./ --model_size 7B --output_dir ./7B-hf
```

If you see the error "If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0", run `pip install --upgrade protobuf==3.20.1` and then:

```shell
python convert_llama_weights_to_hf.py --input_dir ./ --model_size tokenizer_only --output_dir ./7B-hf
```

This produces 7B-hf. Next, merge the Chinese LoRA weights into the base model:

```shell
python merge_llama_with_chinese_lora.py --base_model "./7B-hf" --lora_model "./chinese-llama-plus-lora-7b,chinese-alpaca-plus-lora-7b" --output_type "huggingface" --output_dir "./chinese-alpaca-7b"
```

This produces chinese-alpaca-7b.
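Before launching 4-bit training on the merged model, it can be worth a quick sanity check that chinese-alpaca-7b loads and generates (a minimal sketch, assuming the merge produced a standard Hugging Face LLaMA checkpoint; the prompt is just an example):

```python
# Quick check that ./model_hub/chinese-alpaca-7b is a usable Hugging Face LLaMA checkpoint.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

path = "./model_hub/chinese-alpaca-7b"
tokenizer = LlamaTokenizer.from_pretrained(path)
model = LlamaForCausalLM.from_pretrained(path, torch_dtype=torch.float16, device_map="auto")

inputs = tokenizer("你好,请介绍一下你自己。", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```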
Fine-tune the merged Chinese-Alpaca-7B:

```shell
python qlora.py --model_name="chinese_alpaca" --model_name_or_path="./model_hub/chinese-alpaca-7b" --trust_remote_code=False --dataset="msra" --source_max_len=128 --target_max_len=64 --do_train --save_total_limit=1 --padding_side="right" --per_device_train_batch_size=8 --do_eval --bits=4 --save_steps=10 --gradient_accumulation_steps=1 --learning_rate=1e-5 --output_dir="./output/alpaca/" --lora_r=8 --lora_alpha=32
```

BELLE-7B-2M download address: BelleGroup/BELLE-7B-2M at main (huggingface.co)

Fine-tune BELLE-7B-2M (Bloom):

```shell
python qlora.py --model_name="chinese_bloom" --model_name_or_path="./model_hub/BELLE-7B-2M" --trust_remote_code=False --dataset="msra" --source_max_len=128 --target_max_len=64 --do_train --save_total_limit=1 --padding_side="left" --per_device_train_batch_size=8 --do_eval --bits=4 --save_steps=10 --gradient_accumulation_steps=1 --learning_rate=1e-5 --output_dir="./output/bloom/" --lora_r=8 --lora_alpha=32
```

Chat with the fine-tuned model:

```shell
python chat.py --model_name "chatglm" --base_model "./model_hub/chatglm-6b" --tokenizer_path "./model_hub/chatglm-6b" --lora_model "./output/chatglm/adapter_model" --with_prompt --interactive
```
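chat.py adds prompting and an interactive loop on top of this; the core of loading a trained adapter looks roughly like the following sketch (not the actual script, and it assumes a single GPU with enough memory for the fp16 base model):

```python
# Sketch: load the base ChatGLM-6B model and attach the LoRA adapter trained above.
import torch
from transformers import AutoModel, AutoTokenizer
from peft import PeftModel

base = "./model_hub/chatglm-6b"
adapter = "./output/chatglm/adapter_model"

tokenizer = AutoTokenizer.from_pretrained(base, trust_remote_code=True)
model = AutoModel.from_pretrained(base, trust_remote_code=True).half().cuda()
model = PeftModel.from_pretrained(model, adapter)  # attach the trained LoRA weights
model.eval()

inputs = tokenizer("你是谁?", return_tensors="pt").to("cuda")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```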
How to train on your own data? The data format is:

```json
{
    "data": [
        {"instruction": "", "input": "", "output": ""},
        {"instruction": "", "input": "", "output": ""},
        ...
    ]
}
```

Then register your own dataset where the datasets are defined in qlora.py, and set the relevant arguments yourself when running the training command.
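For reference, reading a file in this format and turning each record into source/target text could look like the sketch below (the `mytask` path and the field mapping are illustrative assumptions; check how qlora.py actually registers and formats its datasets):

```python
# Sketch: load a train.json in the format above (hypothetical ./data/mytask/... path)
# and map each record to the source/target texts used for instruction tuning.
import json
from datasets import Dataset

with open("./data/mytask/instruct_data/train.json", "r", encoding="utf-8") as f:
    records = json.load(f)["data"]

def to_source_target(example):
    # Concatenate instruction and input as the source; keep output as the target.
    return {
        "input": example["instruction"] + example["input"],
        "output": example["output"],
    }

dataset = Dataset.from_list(records).map(to_source_target)
print(dataset[0])
```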
References:

liucongg/ChatGLM-Finetuning: fine-tuning the ChatGLM-6B model on downstream tasks with Freeze, LoRA, P-Tuning, etc. (github.com)
THUDM/ChatGLM-6B: an open bilingual dialogue language model (github.com)
huggingface/peft: 🤗 PEFT: state-of-the-art parameter-efficient fine-tuning (github.com)
ymcui/Chinese-LLaMA-Alpaca: Chinese LLaMA & Alpaca large language models + local CPU/GPU training and deployment (github.com)
LianjiaTech/BELLE: Be Everyone's Large Language model Engine (open-source Chinese dialogue model) (github.com)
artidoro/qlora: QLoRA: Efficient Finetuning of Quantized LLMs (github.com)
If any kind-hearted reader with plenty of GPUs could try out the final results and share them, please do; renting machines on AutoDL is already more than I can afford.