pip install -r requirements.txt

export KAGGLE_USERNAME="your_kaggle_username"
export KAGGLE_KEY="your_api_key"
export HF_TOKEN="your_hf_token"

sudo apt install unzip

kaggle datasets download -d lizhecheng/lmsys-datasets
unzip lmsys-datasets.zip
kaggle datasets download -d lizhecheng/lmsys-lora
unzip lmsys-lora.zip

cd src
cd team gemma   # or: cd team llama
python train_xxx.py

Click full-training-code.
Check our code at LMSYS GitHub.
We employ instruction tuning, making the input format crucial. After experimenting with various formats, we identified the optimal approach:
First, we define a maximum length. Then we concatenate multiple turns of prompt-response pairs within this limit; whenever appending the next prompt-response pair would push a row past the maximum length, that pair is placed in a separate row. For example, consider prompts [P1, P2, P3] with corresponding responses [A1, A2, A3] from model A and [B1, B2, B3] from model B. If appending (P2, A2, B2) to (P1, A1, B1) would exceed the maximum length, this method generates two rows: (P1, A1, B1) and (P2, A2, B2, P3, A3, B3). However, for training, we only use the row containing the last prompt-response turn for each ID.
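A minimal sketch of this packing step, assuming each conversation is a list of (prompt, response_a, response_b) turns; `pack_turns` and `count_tokens` are hypothetical names for illustration, not functions from our repository:

```python
def pack_turns(turns, max_len, count_tokens):
    """Greedily concatenate (prompt, resp_a, resp_b) turns into rows that stay
    within max_len; a turn that would overflow the current row starts a new row."""
    rows, current = [], []
    for turn in turns:
        candidate = current + [turn]
        text = " ".join(part for t in candidate for part in t)
        if current and count_tokens(text) > max_len:
            rows.append(current)   # close the full row
            current = [turn]       # the new pair goes to a separate row
        else:
            current = candidate
    if current:
        rows.append(current)
    return rows

# Example with prompts [P1, P2, P3] and responses from models A and B:
turns = [("P1", "A1", "B1"), ("P2", "A2", "B2"), ("P3", "A3", "B3")]
rows = pack_turns(turns, max_len=512, count_tokens=lambda s: len(s.split()))
# For training, only the row containing the last turn of each ID is kept.
```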
This approach offers two key advantages. The final input follows this template:
<start_of_turn>user
Here are two question-answering dialogues. Compare two models' performance on answering questions, determine which is better.
#Prompt1
xxxxx
#Response
##Model A
xxxxx
##Model B
xxxx
#Prompt2
xxxxx
#Response
............
###options
A. Model A
B. Model B
C. Tie
<end_of_turn>
<start_of_turn>model
A<eos>
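As an illustration only (the exact formatting code is in the repository), here is a hedged sketch that renders one packed row into the Gemma-style template above; the function name and argument layout are assumptions:

```python
def build_prompt(turns):
    """Render packed (prompt, resp_a, resp_b) turns into the
    <start_of_turn> template shown above (Gemma chat format)."""
    body = ["Here are two question-answering dialogues. Compare two models' "
            "performance on answering questions, determine which is better."]
    for i, (prompt, resp_a, resp_b) in enumerate(turns, start=1):
        body += [f"#Prompt{i}", prompt, "#Response",
                 "##Model A", resp_a, "##Model B", resp_b]
    body += ["###options", "A. Model A", "B. Model B", "C. Tie"]
    return ("<start_of_turn>user\n" + "\n".join(body) +
            "\n<end_of_turn>\n<start_of_turn>model\n")

# The target completion is then just the single option letter, e.g. "A<eos>".
```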
4-bit QLoRA on gemma-2-9b-it and Meta-Llama-3.1-8B-Instruct; LoRA parameters: r = 32, target modules = ["q_proj", "k_proj", "v_proj", "o_proj"] (a configuration sketch follows this list).
Instruction-tuning instead of classification.
No gradient_checkpointing_enable(), which shortens training time (at the cost of extra activation memory).
Used an additional 33k samples for fine-tuning and sampled 10k samples for TTA (test-time augmentation).
A careful CV split (80% / 20%) to avoid duplicates between train and validation.
GPU: multiple 80GB A100 GPUs + multiple A40 GPUs.
Set temperature = 1.03 for inference (see the inference sketch after this list).
Submission 1: gemma-2-9b-it + llama-3.1-8b-it + gemma-2-2b-it; Submission 2: gemma-2-9b-it + llama-3.1-8b-it + TTA.
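A hedged configuration sketch of the 4-bit QLoRA setup above, using Hugging Face transformers, peft, and bitsandbytes; lora_alpha and lora_dropout are placeholders, since the write-up only states r = 32 and the target modules:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit QLoRA base weights
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-9b-it",                 # or meta-llama/Meta-Llama-3.1-8B-Instruct
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=32,                                    # rank stated in the write-up
    lora_alpha=64,                           # placeholder, not from the write-up
    lora_dropout=0.05,                       # placeholder, not from the write-up
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```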
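Because the target is a single option letter, inference can score the next-token logits of "A", "B", and "C" and apply the temperature of 1.03 there. The sketch below only illustrates that idea; it is not the original inference code, and the function name is an assumption:

```python
import torch

@torch.no_grad()
def option_probs(model, tokenizer, prompt, temperature=1.03):
    """Probabilities for 'A' (model A wins), 'B' (model B wins), 'C' (tie),
    read from the next-token logits after the prompt."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    logits = model(**inputs).logits[0, -1]          # logits for the next token
    option_ids = [tokenizer.convert_tokens_to_ids(t) for t in ["A", "B", "C"]]
    return torch.softmax(logits[option_ids] / temperature, dim=-1)
```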
Because of some malicious people, what was once a very stable and meaningful competition has turned into one of the worst in Kaggle's history.
Thanks to the entire team for their hard work. Let's keep moving forward!