Llama2 LoRA fine-tuning

AI source code

1.0.0

Fine-tune Llama2-Chat with LoRA and DeepSpeed.

Fine-tune the Llama-2-7b-chat model on two P100 (16 GB) GPUs.

The data uses the Alpaca format and is split into two sets: train and validation.
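
As a rough illustration of the Alpaca format, each record carries an `instruction`, an optional `input`, and an `output` field. The sketch below writes a tiny train file and validation file. The file names `train.json` and `validation.json`, the `data_example` directory, the helper `write_split`, and the sample records are all assumptions made for illustration and are not taken from the repository.

```python
import json
from pathlib import Path

# Alpaca-style records: "input" may be empty when the instruction needs no extra context.
records = [
    {"instruction": "Translate to English.", "input": "你好", "output": "Hello"},
    {"instruction": "Name the capital of France.", "input": "", "output": "Paris"},
    {"instruction": "Add the two numbers.", "input": "2 and 3", "output": "5"},
]


def write_split(records, out_dir, n_validation=1):
    """Write the last n_validation records (n_validation >= 1) to validation.json
    and the remaining records to train.json. File names are hypothetical."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    splits = {
        "train.json": records[:-n_validation],
        "validation.json": records[-n_validation:],
    }
    for name, rows in splits.items():
        # ensure_ascii=False keeps Chinese text readable in the file.
        text = json.dumps(rows, ensure_ascii=False, indent=2)
        (out_dir / name).write_text(text, encoding="utf-8")
    return {name: len(rows) for name, rows in splits.items()}


print(write_split(records, "data_example"))
```

Check the repository's own sample data for the exact file names and locations the training script expects.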

1. GPU requirements

At least 16 GB of video memory (P100, T4, or better), on one or more cards.

2. Clone the source code

git clone https://github.com/git-cloner/llama2-lora-fine-tuning
cd llama2-lora-fine-tuning

3. Install the dependencies

# Create a virtual environment
conda create -n llama2 python=3.9 -y
conda activate llama2
# Install the dependencies hosted on github.com (these often need several retries to succeed, so they are installed separately)
export GIT_TRACE=1
export GIT_CURL_VERBOSE=1
pip install git+https://github.com/PanQiWei/AutoGPTQ.git -i https://pypi.mirrors.ustc.edu.cn/simple --trusted-host=pypi.mirrors.ustc.edu.cn
pip install git+https://github.com/huggingface/peft -i https://pypi.mirrors.ustc.edu.cn/simple
pip install git+https://github.com/huggingface/transformers -i https://pypi.mirrors.ustc.edu.cn/simple
# Install the remaining dependencies
pip install -r requirements.txt -i https://pypi.mirrors.ustc.edu.cn/simple
# Verify the bitsandbytes installation
python -m bitsandbytes

4. Download the base model

python model_download.py --repo_id daryl149/llama-2-7b-chat-hf

5. Extend the Chinese vocabulary

# The Chinese vocabulary is extended using the method from https://github.com/ymcui/Chinese-LLaMA-Alpaca.git
# The extended vocabularies are written to merged_tokenizes_sp (full precision) and merged_tokenizer_hf (half precision)
# During fine-tuning, pass the parameter --tokenizer_name ./merged_tokenizer_hf
python merge_tokenizers.py \
  --llama_tokenizer_dir ./models/daryl149/llama-2-7b-chat-hf \
  --chinese_sp_model_file ./chinese_sp.model
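
Conceptually, extending the vocabulary means appending the Chinese tokens that the original tokenizer lacks while leaving the existing token ids unchanged. The toy sketch below illustrates that idea with plain Python lists. It is not the code in merge_tokenizers.py: the function `merge_vocab` and the sample tokens are hypothetical, and a real merge operates on SentencePiece models rather than lists.

```python
def merge_vocab(base_tokens, extra_tokens):
    """Append tokens from extra_tokens that base_tokens lacks.

    Existing tokens keep their positions (ids); new tokens are added at the
    end in first-seen order, and duplicates are skipped.
    """
    seen = set(base_tokens)
    merged = list(base_tokens)
    for token in extra_tokens:
        if token not in seen:
            merged.append(token)
            seen.add(token)
    return merged


# Hypothetical sample vocabularies, for illustration only.
base = ["<s>", "</s>", "▁the", "▁model"]
chinese = ["▁the", "模型", "微调", "模型"]

merged = merge_vocab(base, chinese)
print(len(base), "->", len(merged))
```

Because the vocabulary grows, the tokenizer used for fine-tuning has to be the merged one, which is why the step above points to --tokenizer_name ./merged_tokenizer_hf.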

6. Fine-tuning parameters

Several parameters can be adjusted:

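Parameter	Description	Recommended value
load_in_bits	Model precision	Prefer 8, as long as video memory does not overflow.
block_size	Maximum token length	Prefer 2048; if memory overflows, try 1024, 512, and so on.
per_device_train_batch_size	Batch size loaded per card at each training step	As large as possible without overflowing memory.
per_device_eval_batch_size	Batch size loaded per card at each evaluation step	As large as possible without overflowing memory.
include	GPU indices to use	For example, two cards: localhost:1,2 (the order is not necessarily the same as the one nvidia-smi shows)
num_train_epochs	Number of training epochs	At least 3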
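
To give a feel for how load_in_bits and include interact with a 16 GB card, the sketch below estimates the memory taken by the weights alone at different precisions and counts the GPUs named in an include string. It is back-of-the-envelope arithmetic under stated assumptions: roughly 7 billion parameters, weights only (activations, LoRA adapters, and optimizer state are ignored), and a single-host include string. The helper names and the batch size value are hypothetical.

```python
def weight_memory_gib(n_params, bits):
    """Memory for the model weights alone, in GiB.

    Ignores activations, LoRA adapters, and optimizer state, so the real
    footprint during training is higher.
    """
    return n_params * bits / 8 / 2**30


def parse_include(include):
    """Split a single-host include string such as 'localhost:1,2'
    into (host, [gpu indices])."""
    host, _, slots = include.partition(":")
    return host, [int(s) for s in slots.split(",") if s]


n_params = 7e9  # approximate parameter count of a 7B model (assumption)
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_memory_gib(n_params, bits):.1f} GiB")

host, gpus = parse_include("localhost:1,2")
per_device_train_batch_size = 4  # example value, not a recommendation from the repository
# Samples processed per forward/backward pass across all selected GPUs.
print(host, gpus, "samples per pass:", per_device_train_batch_size * len(gpus))
```

At 16-bit the weights alone come close to filling a 16 GB card, which is consistent with the table's advice to load the model in 8-bit and to shrink block_size when memory overflows.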

7. Fine-tuning

chmod +x finetune-lora.sh
# Fine-tune
./finetune-lora.sh
# Fine-tune (run in the background)
pkill -9 -f finetune-lora
nohup ./finetune-lora.sh > train.log  2>&1 &
tail -f train.log

8. Testing

CUDA_VISIBLE_DEVICES=0 python generate.py \
    --base_model './models/daryl149/llama-2-7b-chat-hf' \
    --lora_weights 'output/checkpoint-2000' \
    --load_8bit  # without this flag, the model is loaded in 4-bit
