Chinese | English | Docs | ❓Issues | Discussions | ⚔️Arena

This project open-sources the Chinese LLaMA model and the instruction-tuned Alpaca model to further promote open research on large language models in the Chinese NLP community. These models expand the Chinese vocabulary on top of the original LLaMA and undergo secondary pre-training on Chinese data, further improving basic Chinese semantic understanding. In addition, the Chinese Alpaca model is fine-tuned on Chinese instruction data, significantly improving its ability to understand and follow instructions.
Technical Report (V2): [Cui, Yang, and Yao] Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca
Main content of this project:
The figure below shows the actual speed and output of the Chinese-Alpaca-Plus-7B model after local quantized deployment on a CPU.

Chinese LLaMA-2 & Alpaca-2 large models | Multimodal Chinese LLaMA & Alpaca large models | Multimodal VLE | Chinese MiniRBT | Chinese LERT | Chinese-English PERT | Chinese MacBERT | Chinese ELECTRA | Chinese XLNet | Chinese BERT | Knowledge distillation tool TextBrewer | Model pruning tool TextPruner
[2024/04/30] Chinese-LLaMA-Alpaca-3 has been officially released, open-sourcing Llama-3-Chinese-8B and Llama-3-Chinese-8B-Instruct based on Llama-3. All users of the first- and second-generation projects are recommended to upgrade to the third-generation models. For details, see: https://github.com/ymcui/Chinese-LLaMA-Alpaca-3
[2024/03/27] This project has been listed on the Heart of Machine SOTA! model platform; follow it here: https://sota.jiqizhixin.com/project/chinese-llama-alpaca
[2023/08/14] Chinese-LLaMA-Alpaca-2 v2.0 has been officially released, open-sourcing Chinese-LLaMA-2-13B and Chinese-Alpaca-2-13B. All first-generation users are recommended to upgrade to the second-generation models. For details, see: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2
[2023/07/31] Chinese-LLaMA-Alpaca-2 v1.0 has been officially released. For details, see: https://github.com/ymcui/Chinese-LLaMA-Alpaca-2
[2023/07/19] v5.0: Released the Alpaca-Pro series models, which significantly improve reply length and quality; also released the Plus-33B series models.
[2023/07/19] Launched the Chinese LLaMA-2 & Alpaca-2 open-source large model project; follow it for the latest updates.
[2023/07/10] Beta preview: learn about upcoming updates in advance; see the Discussions area for details.
[2023/07/07] The Chinese-LLaMA-Alpaca family has a new member: a multimodal Chinese LLaMA & Alpaca model for visual question answering and dialogue, with a 7B test version released.
[2023/06/30] 8K context support under llama.cpp (no model modification required); see the Discussions area for methods and discussion. For code supporting 4K+ context under transformers, see PR#705.
[2023/06/16] v4.1: Released a new version of the technical report, added C-Eval decoding scripts, low-resource model-merging scripts, etc.
[2023/06/08] v4.0: Released Chinese LLaMA/Alpaca-33B, added a privateGPT usage example, C-Eval results, etc.
| Chapter | Description |
|---|---|
| ⏬Model Download | Download links for the Chinese LLaMA and Alpaca large models |
| Model Merging | (Important) How to merge the downloaded LoRA models with the original LLaMA |
| Local Inference and Quick Deployment | How to quantize the models and deploy and experience them on a personal computer |
| System Performance | Model output on selected scenarios and tasks |
| Training Details | Training details of the Chinese LLaMA and Alpaca models |
| ❓FAQ | Answers to frequently asked questions |
| Limitations | Limitations of the models involved in this project |
The LLaMA model officially released by Facebook prohibits commercial use, and the official model weights have not been open-sourced (although many third-party download links can be found online). To comply with the corresponding license, the LoRA weights released here can be understood as a "patch" on the original LLaMA model; combining the two yields the complete model weights. The Chinese LLaMA/Alpaca LoRA models below cannot be used on their own and must be paired with the original LLaMA model. Please follow the model-merging steps given in this project to reconstruct the full models.
The following figure shows the relationship among all the large models released by this project and the second-generation project.

Below is a basic comparison of Chinese LLaMA and Alpaca models and recommended usage scenarios (including but not limited to). For more information, see the training details.
| Comparison item | Chinese LLaMA | Chinese Alpaca |
|---|---|---|
| Training method | Traditional CLM | Instruction fine-tuning |
| Model type | Base model | Instruction-following model (ChatGPT-like) |
| Training data | Unlabeled general corpus | Labeled instruction data |
| Vocabulary size [3] | 49953 | 49954 = 49953 + 1 (pad token) |
| Input template | Not required | Must follow the template requirements [1] |
| Suitable scenarios ✔️ | Text continuation: given the preceding context, let the model generate the following text | Instruction following (Q&A, writing, suggestions, etc.); multi-turn context understanding (chat, etc.) |
| Unsuitable scenarios | Instruction following, multi-turn chat, etc. | Unrestricted free-form text generation |
| llama.cpp | Use the -p parameter to specify the context | Use the -ins parameter to start instruction-following + chat mode |
| text-generation-webui | Not suitable for chat mode | Use --cpu to run without a GPU |
| LlamaChat | Select "LLaMA" when loading the model | Select "Alpaca" when loading the model |
| HF inference code | No extra startup parameters required | Add the --with_prompt parameter at startup |
| web-demo code | Not applicable | Just point it at the Alpaca model location; supports multi-turn conversations |
| LangChain example / privateGPT | Not applicable | Just point it at the Alpaca model location |
| Known issues | Without a stopping control, it keeps writing until the maximum output length is reached. [2] | Use the Pro version to avoid the overly short replies of the Plus version. |
[1] Templates are already built into llama.cpp/LlamaChat/the HF inference code/the web-demo code/the LangChain examples, etc.; there is no need to add them manually (a sketch of the template format follows these notes).
[2] If the reply quality is particularly low, nonsensical, or unrelated to the question, check whether you are using the correct model and startup parameters.
[3] The instruction-tuned Alpaca has one more pad token than LLaMA, so do not mix the LLaMA and Alpaca vocabularies.
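For illustration, the sketch below shows the kind of single-turn instruction template the Alpaca models expect (footnote [1]). It assumes the first-generation Chinese-Alpaca follows the Stanford Alpaca format; verify the exact string against the project wiki, since the tools in footnote [1] apply the template automatically anyway.

```python
# A minimal sketch of the Alpaca-style instruction template (an assumption based on the
# Stanford Alpaca format; check the project wiki for the exact string used by this project).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

prompt = ALPACA_TEMPLATE.format(instruction="请列举三个北京的著名景点。")
print(prompt)
```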
The following are the models recommended by this project; they generally use more training data and better-optimized training methods and parameters, so please use them first (the remaining models are listed under other models). If you want a ChatGPT-like conversational experience, use the Alpaca models rather than the LLaMA models. Among the Alpaca models, the Pro series fixes the overly short replies and significantly improves response quality; if you prefer shorter replies, choose the Plus series.
| Model name | Type | Training data | Base model required [1] | Size [2] | LoRA download [3] |
|---|---|---|---|---|---|
| Chinese-LLaMA-Plus-7B | Base model | General 120G | Original LLaMA-7B | 790M | [Baidu] [Google] [HF] [ModelScope] |
| Chinese-LLaMA-Plus-13B | Base model | General 120G | Original LLaMA-13B | 1.0G | [Baidu] [Google] [HF] [ModelScope] |
| Chinese-LLaMA-Plus-33B | Base model | General 120G | Original LLaMA-33B | 1.3G [6] | [Baidu] [Google] [HF] [ModelScope] |
| Chinese-Alpaca-Pro-7B | Instruction model | Instruction 4.3M | Original LLaMA-7B & LLaMA-Plus-7B [4] | 1.1G | [Baidu] [Google] [HF] [ModelScope] |
| Chinese-Alpaca-Pro-13B | Instruction model | Instruction 4.3M | Original LLaMA-13B & LLaMA-Plus-13B [4] | 1.3G | [Baidu] [Google] [HF] [ModelScope] |
| Chinese-Alpaca-Pro-33B | Instruction model | Instruction 4.3M | Original LLaMA-33B & LLaMA-Plus-33B [4] | 2.1G | [Baidu] [Google] [HF] [ModelScope] |
[1] Reconstructing the full model requires the original LLaMA weights; apply for access through the LLaMA project or refer to the PR mentioned in this project. Due to copyright issues, this project cannot provide download links.
[2] The reconstructed model is larger than the original LLaMA of the same scale (mainly because of the expanded vocabulary).
[3] After downloading, be sure to check that the SHA256 of the model files in the archive matches; see SHA256.md (a small verification sketch follows these notes).
[4] The Alpaca-Plus models also require downloading the corresponding LLaMA-Plus model; see the merging tutorial.
[5] Some places call it 30B; Facebook actually mislabeled it when releasing the model, while the paper still says 33B.
[6] Stored in FP16, hence the smaller file size.
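As a convenience for footnote [3], here is a minimal sketch for computing SHA256 checksums of the downloaded files so they can be compared against SHA256.md; the directory name is just the example archive layout shown in the next section.

```python
# Minimal SHA256 verification sketch: compare the printed digests with SHA256.md.
import hashlib
from pathlib import Path

def sha256sum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large model files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Example: verify every file in an unpacked LoRA archive (path is a placeholder).
for file in sorted(Path("chinese_llama_lora_7b").glob("*")):
    print(file.name, sha256sum(file))
```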
The directory structure inside the archive is as follows (taking Chinese-LLaMA-7B as an example):
```
chinese_llama_lora_7b/
  - adapter_config.json       # LoRA weight configuration file
  - adapter_model.bin         # LoRA weight file
  - special_tokens_map.json   # special_tokens_map file
  - tokenizer_config.json     # tokenizer configuration file
  - tokenizer.model           # tokenizer file
```
Due to factors such as training methods and training data, the following models are no longer recommended (they may still be useful in specific scenarios). Please prefer the recommended models in the previous section.
| Model name | Type | Training data | Base model required | Size | LoRA download |
|---|---|---|---|---|---|
| Chinese-LLaMA-7B | Base model | General 20G | Original LLaMA-7B | 770M | [Baidu] [Google] [HF] [ModelScope] |
| Chinese-LLaMA-13B | Base model | General 20G | Original LLaMA-13B | 1.0G | [Baidu] [Google] [HF] [ModelScope] |
| Chinese-LLaMA-33B | Base model | General 20G | Original LLaMA-33B | 2.7G | [Baidu] [Google] [HF] [ModelScope] |
| Chinese-Alpaca-7B | Instruction model | Instruction 2M | Original LLaMA-7B | 790M | [Baidu] [Google] [HF] [ModelScope] |
| Chinese-Alpaca-13B | Instruction model | Instruction 3M | Original LLaMA-13B | 1.1G | [Baidu] [Google] [HF] [ModelScope] |
| Chinese-Alpaca-33B | Instruction model | Instruction 4.3M | Original LLaMA-33B | 2.8G | [Baidu] [Google] [HF] [ModelScope] |
| Chinese-Alpaca-Plus-7B | Instruction model | Instruction 4M | Original LLaMA-7B & LLaMA-Plus-7B | 1.1G | [Baidu] [Google] [HF] [ModelScope] |
| Chinese-Alpaca-Plus-13B | Instruction model | Instruction 4.3M | Original LLaMA-13B & LLaMA-Plus-13B | 1.3G | [Baidu] [Google] [HF] [ModelScope] |
| Chinese-Alpaca-Plus-33B | Instruction model | Instruction 4.3M | Original LLaMA-33B & LLaMA-Plus-33B | 2.1G | [Baidu] [Google] [HF] [ModelScope] |
All of the above models can be downloaded from the Model Hub, and the Chinese LLaMA or Alpaca LoRA models can be loaded using transformers and PEFT. The model call names below refer to the model names passed to .from_pretrained().
Detailed list and model download address: https://huggingface.co/hfl
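As a concrete illustration of loading a LoRA model with transformers and PEFT, here is a minimal sketch; the local path and the Hugging Face repo ID are placeholders, so substitute the model call names from the list above.

```python
# Minimal sketch: attach a Chinese LoRA "patch" to the original LLaMA with PEFT.
# Paths and repo IDs are placeholders; use the model call names listed on the Model Hub.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

base_model_path = "path/to/original-llama-7b-hf"   # original LLaMA converted to HF format
lora_model_id = "hfl/chinese-alpaca-lora-7b"       # example LoRA repo name on the Model Hub

# The LoRA release ships a tokenizer with the expanded Chinese vocabulary.
tokenizer = LlamaTokenizer.from_pretrained(lora_model_id)

base_model = LlamaForCausalLM.from_pretrained(
    base_model_path, torch_dtype=torch.float16, low_cpu_mem_usage=True
)
# Resize embeddings to the expanded vocabulary (49953 for LLaMA, 49954 for Alpaca).
base_model.resize_token_embeddings(len(tokenizer))

# Load the LoRA weights on top of the base model for inference.
model = PeftModel.from_pretrained(base_model, lora_model_id, torch_dtype=torch.float16)
model.eval()
```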
As mentioned earlier, the LoRA models cannot be used on their own; they must be merged with the original LLaMA to obtain a complete model for inference, quantization, or further training. Please choose one of the following methods to convert and merge the model (a minimal merging sketch follows the table below).
| Method | Applicable scenarios | Tutorial |
|---|---|---|
| Online conversion | Colab users can use the notebook provided by this project to convert and quantize models online | Link |
| Manual conversion | Offline conversion; generates models in different formats for quantization or further fine-tuning | Link |
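As a rough illustration of what manual merging does, the sketch below uses PEFT's merge_and_unload() to fold the LoRA weights into the base model and save a full copy; paths are placeholders. The project's own merge script (see the manual conversion tutorial) additionally handles tokenizer replacement and multi-LoRA merging for the Plus/Pro models, so treat this only as a conceptual sketch.

```python
# Conceptual merge sketch: fold LoRA weights into the base model and save a full model.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

base_model_path = "path/to/original-llama-7b-hf"   # placeholder: original LLaMA in HF format
lora_path = "path/to/chinese_llama_lora_7b"        # placeholder: unpacked LoRA archive
output_dir = "path/to/chinese-llama-7b-merged"

tokenizer = LlamaTokenizer.from_pretrained(lora_path)
base = LlamaForCausalLM.from_pretrained(base_model_path, torch_dtype=torch.float16)
base.resize_token_embeddings(len(tokenizer))       # match the expanded vocabulary

merged = PeftModel.from_pretrained(base, lora_path).merge_and_unload()
merged.save_pretrained(output_dir)                 # full FP16 model; sizes as in the table below
tokenizer.save_pretrained(output_dir)
```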
The following are the FP16 and quantized (8-bit/4-bit) sizes of the merged models (a rough size estimate follows the table). Make sure the machine has sufficient memory and disk space before conversion (minimum requirements):
| Model version | 7B | 13B | 33B | 65B |
|---|---|---|---|---|
| Original model size (FP16) | 13 GB | 24 GB | 60 GB | 120 GB |
| Quantized size (8-bit) | 7.8 GB | 14.9 GB | 32.4 GB | ~60 GB |
| Quantized size (4-bit) | 3.9 GB | 7.8 GB | 17.2 GB | 38.5 GB |
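As a sanity check on the table, model size scales roughly with parameter count times bytes per weight (FP16 ≈ 2 bytes, 8-bit ≈ 1 byte, 4-bit ≈ 0.5 bytes) plus quantization overhead. The sketch below reproduces the order of magnitude; the parameter counts are approximate and real quantized formats store extra per-block scales, so the table values differ somewhat.

```python
# Back-of-the-envelope check of the FP16 sizes above: parameters × 2 bytes per weight.
# Quantized formats store extra per-block scales/zero-points, so the actual 8-bit and
# 4-bit files are somewhat larger than the naive 1-byte and 0.5-byte-per-weight estimates.
PARAMS = {"7B": 6.7e9, "13B": 13.0e9, "33B": 32.5e9, "65B": 65.2e9}  # approximate counts

def approx_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate on-disk size in GiB, ignoring file-format overhead."""
    return n_params * bits_per_weight / 8 / 1024**3

for name, n in PARAMS.items():
    print(f"{name}: FP16 ≈ {approx_size_gib(n, 16):.1f} GiB, "
          f"4-bit ≈ {approx_size_gib(n, 4):.1f} GiB (plus quantization overhead)")
```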
For details, please refer to this project >>> GitHub Wiki
The models in this project mainly support the following quantization, inference, and deployment methods.
| Inference and deployment method | Features | Platform | CPU | GPU | Quantized loading | GUI | Tutorial |
|---|---|---|---|---|---|---|---|
| llama.cpp | Rich quantization options and efficient local inference | General | ✅ | ✅ | ✅ | ❌ | link |
| 🤗Transformers | Native transformers inference interface | General | ✅ | ✅ | ✅ | ✅ | link |
| text-generation-webui | A front-end web UI for deployment | General | ✅ | ✅ | ✅ | ✅ | link |
| LlamaChat | A graphical chat interface for macOS | macOS | ✅ | ✅ | ✅ | ✅ | link |
| LangChain | LLM application development framework, suitable for secondary development | General | ✅† | ✅ | ✅† | ❌ | link |
| privateGPT | LangChain-based multi-document local Q&A framework | General | ✅ | ✅ | ✅ | ❌ | link |
| Colab Gradio Demo | Launch an interactive Gradio-based web service in Colab | General | ✅ | ✅ | ✅ | ✅ | link |
| API calls | A server demo emulating the OpenAI API interface | General | ✅ | ✅ | ✅ | ❌ | link |
†: Supported by the LangChain framework itself, but not implemented in this project's tutorial; please refer to the official LangChain documentation for details.
For details, please refer to this project >>> GitHub Wiki
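To make the "Transformers" row concrete, below is a minimal native-transformers inference sketch for a merged Chinese-Alpaca model. The model path and decoding settings are placeholder assumptions; the project's own inference script applies the instruction template for you when --with_prompt is passed, whereas here it is filled in manually.

```python
# Minimal native-transformers inference sketch for a merged Chinese-Alpaca model.
# The model path and generation hyperparameters are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "path/to/chinese-alpaca-7b-merged"   # output of the merge step above
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir, torch_dtype=dtype).to(device)
model.eval()

# Wrap the instruction in the Alpaca-style template (see the template sketch earlier).
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n请介绍一下大熊猫。\n\n### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(device)
with torch.no_grad():
    output_ids = model.generate(
        **inputs, max_new_tokens=256, do_sample=True,
        temperature=0.2, top_p=0.9, repetition_penalty=1.1,
    )
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```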
To quickly evaluate the actual text-generation quality of the models, this project compared Chinese-Alpaca-7B, Chinese-Alpaca-13B, Chinese-Alpaca-33B, Chinese-Alpaca-Plus-7B, and Chinese-Alpaca-Plus-13B on a set of common tasks using the same prompts. Generated replies are random and are affected by decoding hyperparameters, random seeds, and other factors, so the comparisons below are not absolutely rigorous; the results are for reference only, and you are welcome to try the models yourself.
This project also tested the models on objective NLU-style evaluation sets. This type of evaluation does not rely on subjective judgment; it only requires the model to output a given label (a label-mapping strategy needs to be designed; a minimal sketch of this idea appears after the results below), so it offers another perspective on a large model's capabilities. The project evaluated the models on the recently released C-Eval dataset, which contains 12.3K multiple-choice questions covering 52 subjects. Below are the validation and test set results (average) for some models; please refer to the technical report for the complete results.
| Model | Valid (zero-shot) | Valid (5-shot) | Test (zero-shot) | Test (5-shot) |
|---|---|---|---|---|
| Chinese-Alpaca-Plus-33B | 46.5 | 46.3 | 44.9 | 43.5 |
| Chinese-Alpaca-33B | 43.3 | 42.6 | 41.6 | 40.4 |
| Chinese-Alpaca-Plus-13B | 43.3 | 42.4 | 41.5 | 39.9 |
| Chinese-Alpaca-Plus-7B | 36.7 | 32.9 | 36.4 | 32.3 |
| Chinese-LLaMA-Plus-33B | 37.4 | 40.0 | 35.7 | 38.3 |
| Chinese-LLaMA-33B | 34.9 | 38.4 | 34.6 | 39.5 |
| Chinese-LLaMA-Plus-13B | 27.3 | 34.0 | 27.8 | 33.3 |
| Chinese-LLaMA-Plus-7B | 27.3 | 28.3 | 26.9 | 28.4 |
It should be noted that comprehensive evaluation of large-model capabilities remains an urgent open problem. Viewing the various large-model evaluation results rationally and critically will help the healthy development of large-model technology. Users are encouraged to test on the tasks they care about and choose the models suited to those tasks.
For the C-Eval inference code, please refer to this project >>> GitHub Wiki
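As a rough illustration of the label-mapping idea mentioned above (and not the project's official C-Eval decoding script, which is in the GitHub Wiki), the sketch below compares the model's next-token scores for the option letters of a multiple-choice question; the model path is a placeholder.

```python
# Label-mapping sketch for multiple-choice evaluation: pick the option letter whose
# first token receives the highest next-token score. Illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "path/to/chinese-alpaca-7b-merged"   # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir)
model.eval()

def pick_choice(question_with_options: str, choices=("A", "B", "C", "D")) -> str:
    """Return the option letter the model considers most likely as the next token."""
    inputs = tokenizer(question_with_options, return_tensors="pt")
    with torch.no_grad():
        next_token_logits = model(**inputs).logits[0, -1]
    choice_ids = [tokenizer(c, add_special_tokens=False).input_ids[-1] for c in choices]
    return choices[int(torch.argmax(next_token_logits[choice_ids]))]

print(pick_choice("中国的首都是哪里？\nA. 上海\nB. 北京\nC. 广州\nD. 深圳\n答案："))
```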
The entire training process consists of three parts: vocabulary expansion, pre-training, and instruction fine-tuning (a rough sketch of the vocabulary-expansion step follows).
For details, please refer to this project >>> GitHub Wiki
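As a hedged sketch of the vocabulary-expansion step (the project's actual tokenizer-merging script may differ; see the GitHub Wiki), the idea is to append SentencePiece pieces trained on Chinese corpora that are missing from the original LLaMA tokenizer. The example assumes a Chinese SentencePiece model has already been trained and that the sentencepiece and protobuf packages are installed; file paths are placeholders.

```python
# Vocabulary-expansion sketch: append Chinese SentencePiece pieces that the original
# LLaMA tokenizer does not already contain. Paths are placeholders.
from sentencepiece import sentencepiece_model_pb2 as sp_pb2  # requires protobuf

llama_sp = sp_pb2.ModelProto()
llama_sp.ParseFromString(open("llama/tokenizer.model", "rb").read())
chinese_sp = sp_pb2.ModelProto()
chinese_sp.ParseFromString(open("chinese_sp.model", "rb").read())  # trained on Chinese text

existing_pieces = {p.piece for p in llama_sp.pieces}
for p in chinese_sp.pieces:
    if p.piece not in existing_pieces:
        new_piece = sp_pb2.ModelProto.SentencePiece()
        new_piece.piece, new_piece.score = p.piece, 0.0
        llama_sp.pieces.append(new_piece)

with open("merged_tokenizer.model", "wb") as f:
    f.write(llama_sp.SerializeToString())
print(f"Expanded vocabulary size: {len(llama_sp.pieces)}")
```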
The FAQ covers some frequently asked questions. Please check the FAQ before opening an Issue.
Question 1: Why can't the full model weights be released?
Question 2: Will 33B and 65B versions be released later?
Question 3: The model performs poorly on some tasks!
Question 4: Why expand the vocabulary? Can't we just pre-train on Chinese data with the original LLaMA?
Question 5: The replies are very short
Question 6: Under Windows, the model cannot understand Chinese, generation is slow, etc.
Question 7: The Chinese-LLaMA 13B model cannot be launched with llama.cpp; it reports inconsistent dimensions
Question 8: Chinese-Alpaca-Plus performs poorly
Question 9: The model performs poorly on NLU-type tasks (text classification, etc.)
Question 10: Why is it called 33B? Shouldn't it be 30B?
Question 11: SHA256 does not match after merging the model
Please refer to this project for specific questions and answers >>> GitHub Wiki
Although the models in this project have certain Chinese understanding and generation capabilities, they also have limitations, including but not limited to:
If you find this project helpful for your research, or if you use the code or data from this project, please cite the project's technical report: https://arxiv.org/abs/2304.08177
```
@article{chinese-llama-alpaca,
  title={Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca},
  author={Cui, Yiming and Yang, Ziqing and Yao, Xin},
  journal={arXiv preprint arXiv:2304.08177},
  url={https://arxiv.org/abs/2304.08177},
  year={2023}
}
```
| Project name | Description | Type |
|---|---|---|
| Chinese-LLaMA-Alpaca-2 (official project) | Chinese LLaMA-2 & Alpaca-2 large models | Text |
| Visual-Chinese-LLaMA-Alpaca (official project) | Multimodal Chinese LLaMA & Alpaca large model | Multimodal |
Want to join the list? >>> Submit an application
This project is developed on top of the following open-source projects. We would like to express our gratitude to these projects and their developers.
| Base models and code | Quantization, inference, deployment | Data |
|---|---|---|
| LLaMA by Facebook, Alpaca by Stanford, alpaca-lora by @tloen | llama.cpp by @ggerganov, LlamaChat by @alexrozanski, text-generation-webui by @oobabooga | pCLUE and MT data by @brightmart, oasst1 by OpenAssistant |
The resources in this project are for academic research only; commercial use is strictly prohibited. When using parts that involve third-party code, please strictly follow the corresponding open-source licenses. Content generated by the models is affected by factors such as model computation, randomness, and quantization precision loss; this project cannot guarantee its accuracy. This project assumes no legal liability for any content output by the models, nor is it liable for any losses arising from the use of the related resources and outputs. This project is initiated and maintained by individuals and collaborators in their spare time, so we cannot guarantee timely responses to every issue.
If you have any questions, please submit them in a GitHub Issue. Please ask politely and help build a harmonious discussion community.