Bitune: Bidirectional Instruction-Tuning
[Paper] [Website]
This source code contains the implementation of Bitune and is sufficient to reproduce the results from the paper. Please note that the code was also used to explore other ideas, so many components have names that differ from the paper or refer to concepts not mentioned there.
We plan to release a clean repo for Bitune in the near future.
The `lm-evaluation-harness` directory contains the repository from EleutherAI/lm-evaluation-harness, adapted to our method. You can install it with the following command:
`pip install -e lm-evaluation-harness`

Common settings are defined in the `common_0.sh` file. We use `wandb` for logging; update line 57 of `eval.py` with your wandb username (a sketch of a typical `wandb.init` call is shown below).

To reproduce the instruction-tuning experiments, run the `instruct.sh` script.

To reproduce the downstream-task experiments, run the `downstream.sh` script. Make sure to set the correct number of update steps (based on the values provided in the appendix), and to uncomment the appropriate lines for the dataset name, the evaluations (at the very bottom), and the method name.

To reproduce the ablation studies, uncomment the appropriate lines in `ablations.sh` and run the script.
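For reference, a minimal sketch of what the logging setup around that line might look like, assuming `eval.py` initializes the run with `wandb.init`; the `project` name and the config values here are placeholders, not the actual contents of the file:

```python
import wandb

# Hypothetical sketch -- the actual call around line 57 of eval.py may differ.
# Replace "your-username" with your own wandb entity (username or team).
run = wandb.init(
    project="bitune",                          # placeholder project name
    entity="your-username",                    # your wandb username goes here
    config={"model": "gemma-2b", "lr": 1e-4},  # placeholder hyperparameters
)
run.log({"accuracy": 0.0})  # example metric logging
run.finish()
```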
The method is implemented in the `models` directory:

- The instruction is processed in two passes, a causal one and a bidirectional one, and the resulting features are combined with trainable scaling coefficients (`pass_scale_k`, `pass_scale_v`).
- Bidirectional attention can be enforced via the `enforce_bidir` parameter of the `forward()` function.
- Each modeling file contains a modified `forward()` function responsible for calling the Bitune wrapper (`_pass_fn()` in the `passes.py` file):
  - the wrapper performs the two passes with separate PEFT adapters and mixes the resulting features using the scaling coefficients (`pass_scale_k`, `pass_scale_v`);
  - note that the `peft` library sets inactive adapters as non-trainable.
- The mixing is implemented as a separate module (`PassScale` defined in `models/think_gemma.py`):
  - its `forward()` function applies the mixing operation based on the variant specified in the config (`config.pass_type`). Our final method is defined by the variant `607` (the one used for experiments), and its simplified version `801`.
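To illustrate the idea behind the mixing, below is a minimal, self-contained sketch of a learnable interpolation between the features of the two passes. This is an assumption-level illustration, not the actual `PassScale` implementation: the exact operation of variants `607` and `801`, the parameter shapes, and the initialization are guesses.

```python
import torch
import torch.nn as nn


class NaivePassMixing(nn.Module):
    """Toy sketch of mixing features from a causal and a bidirectional pass.

    The actual PassScale module in models/think_gemma.py supports multiple
    variants selected via config.pass_type; this only shows the general idea
    of a learnable interpolation between the two sets of features.
    """

    def __init__(self, num_heads: int, head_dim: int):
        super().__init__()
        # Trainable mixing coefficients, analogous in spirit to
        # pass_scale_k / pass_scale_v (shapes and init are assumptions).
        self.pass_scale_k = nn.Parameter(torch.zeros(num_heads, 1, head_dim))
        self.pass_scale_v = nn.Parameter(torch.zeros(num_heads, 1, head_dim))

    def forward(self, k_causal, v_causal, k_bidir, v_bidir):
        # Sigmoid keeps the interpolation weights in (0, 1); at init (zeros)
        # both passes contribute equally.
        alpha_k = torch.sigmoid(self.pass_scale_k)
        alpha_v = torch.sigmoid(self.pass_scale_v)
        k = alpha_k * k_bidir + (1.0 - alpha_k) * k_causal
        v = alpha_v * v_bidir + (1.0 - alpha_v) * v_causal
        return k, v


# Usage on dummy tensors shaped [batch, heads, seq_len, head_dim]:
mix = NaivePassMixing(num_heads=8, head_dim=64)
k_c, v_c, k_b, v_b = (torch.randn(1, 8, 16, 64) for _ in range(4))
k, v = mix(k_c, v_c, k_b, v_b)
print(k.shape, v.shape)  # torch.Size([1, 8, 16, 64]) twice
```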
The following library versions were used:

- `transformers==4.38.2`
- `peft==0.11.1`
- `datasets==2.18.0`
- `evaluate==0.4.0`

Citation:

@misc{kopiczko2024bitune,
title={Bitune: Bidirectional Instruction-Tuning},
author={Dawid J. Kopiczko and Tijmen Blankevoort and Yuki M. Asano},
year={2024},
eprint={2405.14862},
archivePrefix={arXiv},
primaryClass={cs.CL}
}