
This project open-sources LLaMA-family models that have been instruction fine-tuned on Chinese financial knowledge. The instruction dataset is built from public Chinese financial Q&A data plus crawled financial Q&A data, and is used to instruction fine-tune LLaMA models so that they answer financial questions better.
Building on the existing data and on Chinese financial data that we continue to crawl, we will keep using the GPT-3.5/4.0 API to build high-quality datasets, and will further expand the instruction datasets with sources such as the Chinese knowledge graph (finance) and the CFLEB financial dataset.
New financial models for Chinese scenarios (next-pretrain, multi-task SFT, RLHF) will be released in succession. Stay tuned.
[2023/05/10] Released an instruction fine-tuned model based on Chinese-LLaMA and Chinese financial data.
[2023/05/07] Released an instruction fine-tuned model based on Meta-LLaMA and Chinese financial data.
First, install the dependencies. Python 3.9+ is recommended.
pip install -r requirements.txt
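Before installing, the recommended interpreter version can be checked with a few lines of Python. This is a minimal sketch: the helper name `meets_min_python` is hypothetical and not part of this repository, and the only fact taken from this README is the 3.9+ recommendation.

```python
import sys

# Minimum version recommended by this README.
MIN_PYTHON = (3, 9)


def meets_min_python(version=None, minimum=MIN_PYTHON):
    """Return True if `version` (default: the running interpreter) is >= `minimum`.

    Only the (major, minor) pair is compared.
    """
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    found = sys.version.split()[0]
    if meets_min_python():
        print(f"Python {found} satisfies the recommended minimum")
    else:
        print(f"Warning: Python {found} is older than the recommended 3.9")
```

Running the check before `pip install -r requirements.txt` surfaces a version mismatch early instead of partway through dependency resolution.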
Next, install Git LFS so that the LLaMA base model can be downloaded locally.
git lfs install
# Download the 7B model locally
bash ./base_models/load.sh
LoRA weights can be downloaded from Hugging Face. The directory structure is as follows:
Fin-Alpaca-LoRA-7B-Meta/
- adapter_config.json # LoRA weight configuration file
- adapter_model.bin # LoRA weight file
| LoRA model download | Category | Base model | Training data | Training sequence length | Version |
|---|---|---|---|---|---|
| Fin-Alpaca-LoRA-7B-Meta | Chinese financial Q&A fine-tuned model | decapoda-research/llama-7b-hf | 12M instruction data | 512 | V1.0 |
| Fin-Alpaca-LoRA-7B-Linly | Chinese financial Q&A fine-tuned model | Linly-AI/Chinese-LLaMA-7B | 14M instruction data | 512 | V1.1 |
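After downloading, it can help to confirm that a LoRA directory has the layout listed above before pointing the inference script at it. The following is a minimal sketch using only the standard library. The function name `missing_adapter_files` is hypothetical; the two file names come from the directory listing in this README.

```python
import tempfile
from pathlib import Path

# File names taken from the LoRA directory listing in this README.
REQUIRED_FILES = ("adapter_config.json", "adapter_model.bin")


def missing_adapter_files(adapter_dir):
    """Return the required LoRA files that are absent from `adapter_dir`."""
    root = Path(adapter_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Demonstration on a throwaway directory that holds only the config file.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "adapter_config.json").write_text("{}")
        print(missing_adapter_files(tmp))  # ['adapter_model.bin']
```

An empty return value means both files are present; it does not verify that the weights match the base model named in the table.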
Some test cases are provided in ./instruction_data/infer.json. You can replace them with other datasets, but the format must stay consistent.
Run the inference script:
# Single-model inference
bash ./scripts/infer.sh
# Multi-model comparison
bash ./scripts/comparison_test.sh
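When swapping in your own test data, one way to check that the format stays consistent is to compare the field names of the new records against those of the shipped file. The sketch below assumes the file is either a JSON array or JSON Lines with one object per line; this README does not specify which. The function names are hypothetical, and no particular field names are assumed.

```python
import json


def load_records(text):
    """Parse a JSON array or JSON Lines text into a list of dicts.

    Assumption: the file is one of these two layouts; a single
    pretty-printed object spanning several lines is not handled.
    """
    text = text.strip()
    if text.startswith("["):
        return json.loads(text)
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def inconsistent_records(reference, candidate):
    """Return indices of candidate records whose keys differ from the
    keys of the first reference record."""
    expected = set(reference[0].keys())
    return [i for i, rec in enumerate(candidate) if set(rec.keys()) != expected]


if __name__ == "__main__":
    # Paths are placeholders: the first is the file shipped with the project,
    # the second is your replacement data.
    with open("./instruction_data/infer.json", encoding="utf-8") as f:
        reference = load_records(f.read())
    with open("./my_infer.json", encoding="utf-8") as f:
        candidate = load_records(f.read())
    print("Records with mismatched fields:", inconsistent_records(reference, candidate))
```

The check covers field names only; it does not validate value types or content.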
The previous version used public and crawled Chinese financial Q&A data covering insurance, wealth management, stocks, funds, loans, credit cards, social security, and more.
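An example of the instruction data (the dataset itself is in Chinese):
Question: What principles and rules must be followed when handling commercial bills of exchange?
Answer: The following principles and rules must be followed when handling commercial bills of exchange: 1. An entity using commercial bills must be a legal person with an account at a bank; 2. Commercial bills may be used both within the same city and across different locations; 3. Commercial bills must be issued on the basis of lawful commodity transactions; 4. Accepted commercial bills may be discounted at a bank; 5. All commercial bills are registered and may be transferred by endorsement; 6. The payment term is agreed by the two transacting parties and may not exceed 6 months; 7. Once a commercial bill has been accepted, the acceptor, that is, the payer, is unconditionally liable to pay the bill amount at maturity; 8. Commercial bills are printed and sold by banks.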
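A question and answer pair like the one above has to be stored as a structured record before it can be used for instruction fine-tuning. The sketch below shows one possible mapping. The Alpaca-style field names `instruction`, `input`, and `output` are an assumption and are not confirmed by this README, so check ./instruction_data/fin_data.json for the actual schema. Both function names are hypothetical.

```python
import json


def to_instruction_record(question, answer):
    """Map a Q&A pair to an Alpaca-style record.

    Assumption: the field names below are illustrative, not the
    project's confirmed schema.
    """
    return {
        "instruction": question.strip(),
        "input": "",
        "output": answer.strip(),
    }


def to_json_line(record):
    """Serialize one record as a single JSON line.

    ensure_ascii=False keeps Chinese text readable instead of
    emitting \\uXXXX escapes.
    """
    return json.dumps(record, ensure_ascii=False)


if __name__ == "__main__":
    record = to_instruction_record(
        "办理商业汇票应遵守哪些原则和规定?",
        "办理商业汇票应遵守下列原则和规定:...",  # answer shortened for the demo
    )
    print(to_json_line(record))
```

Writing one record per line keeps large datasets easy to append to and to stream.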
Because the earlier data was not always accurate and covered only a single type of task, we are now using the GPT-3.5/4.0 API to further refine the data and expand the Chinese financial knowledge base. We are designing a variety of prompt formats and multi-task formats to enrich the instruction datasets and to cover multiple business scenarios in the financial domain.
Latest model status: (to be released soon, stay tuned)

If you want to fine-tune LLaMA on your own data, build your dataset in the same format as ./instruction_data/fin_data.json.
Run the fine-tuning script:
bash ./scripts/finetune.sh
Training currently runs on an A100-SXM-80GB GPU for a total of 10 epochs. With batch_size=64, GPU memory usage is about 40 GB; with batch_size=96, it is about 65 GB. GPUs with 24 GB of memory or more, such as the 3090/4090, are expected to be supported as well; adjust batch_size to fit the available GPU memory.
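The two memory figures above can serve as a rough starting point when choosing batch_size for a smaller GPU. The sketch below fits a straight line through the two reported points and extrapolates. This is an assumption for illustration only: real memory usage is not strictly linear in batch size, and the fitted intercept is negative, which is not physically meaningful, so extrapolated values are optimistic and should be verified with a short trial run. The function names are hypothetical.

```python
import math

# (batch_size, GPU memory in GB) pairs reported in this README for the A100 run.
OBSERVED = ((64, 40.0), (96, 65.0))


def fit_line(points=OBSERVED):
    """Return (slope, intercept) of the line through the two observed points."""
    (b1, m1), (b2, m2) = points
    slope = (m2 - m1) / (b2 - b1)
    intercept = m1 - slope * b1
    return slope, intercept


def estimate_memory_gb(batch_size, points=OBSERVED):
    """Linearly interpolate or extrapolate GPU memory for a batch size."""
    slope, intercept = fit_line(points)
    return slope * batch_size + intercept


def max_batch_size(memory_gb, points=OBSERVED):
    """Largest batch size whose estimated memory fits in `memory_gb`."""
    slope, intercept = fit_line(points)
    return max(0, math.floor((memory_gb - intercept) / slope))


if __name__ == "__main__":
    slope, intercept = fit_line()
    print(f"about {slope:.3f} GB per sample, intercept {intercept:.1f} GB")
    print("upper-bound guess for a 24 GB card:", max_batch_size(24))
```

Treat the result for a 24 GB card as an upper bound and start well below it, since fixed costs such as model weights are not captured by a two-point fit.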
For experiment metrics and hyperparameter settings, see the Wandb link:
https://api.wandb.ai/links/1812316597/hkyiriw5
| Test input | Original LLaMA output | Cornucopia (Fin-Alpaca-LoRA-7B-Meta) output | Cornucopia (Fin-Alpaca-LoRA-7B-Linly) output | ERNIE Bot (Wenxin Yiyan) output | iFLYTEK Spark output |
|---|---|---|---|---|---|
| Is it better for the elderly to invest in wealth management products or to put their money in fixed-term deposits? | The financial management methods of the elderly should be suitable for their own situation. If he/she has enough time and investment ability, he/she is advised to use his/her current time and investment opportunities to conduct financial management activities. Because the elderly can obtain various income methods such as balance, insurance, real estate, etc., he/she can choose different types of financial management projects, including open markets, non-public markets, funds, trust plans, etc. Of course, the elderly should also pay attention to the financial security they actually need for. | If the investor has a relatively high risk tolerance, you can choose financial products to invest. If the investor has a relatively low risk tolerance, you can choose to invest in time deposits. Currently, financial products and time deposits are both guaranteed products, and will not lose principal, and the returns are determined by the investment target, so financial products and time deposits are more stable. It should be noted that financial products and fixed deposits do not have the same current interest rate, and financial products calculate their daily returns, so the higher the returns of financial products, the higher the probability of investors obtaining returns. 9QDII fund refers to open-end foreign-invested funds in the Chinese market. This type of fund mainly invests in: A-shares, Hong Kong stocks, US stocks, British stocks, German stocks, Japanese stocks, French stocks, Taiwan stocks, Hong Kong stocks, Singapore stocks, etc.; QFII fund refers to open-end foreign-invested funds on the China Exchange. This type of fund mainly invests in: A-shares, Hong Kong stocks, US stocks, British stocks, German stocks, Japanese stocks, French stocks, Taiwan stocks, Hong Kong stocks, Singapore stocks, etc. | For the elderly with relatively low risk tolerance, it is better to choose fixed deposits because they are relatively low in risk and have stable returns. For the elderly with relatively high risk tolerance, it is better to choose financial products because they are relatively high in risk and have relatively high returns. It should be noted that current financial products are not guaranteed products, and there is a possibility of losing principal, and they are divided into different risk levels according to the investment target. If the investor's risk tolerance is relatively low, then you can choose financial products with a risk level R2 and below. If the investor's risk tolerance is relatively high, you can choose financial products with a risk level R3 and above. Current financial products include: bond funds, stock funds, mixed funds, index funds and link funds, etc. Investors can choose appropriate products to invest according to their actual situation. | It is better for the elderly to save regularly. The elderly have no job income, and the total assets are gradually decreasing. Moreover, the elderly are prone to illness and have a large expenditure on medical expenses. If the elderly invest in financial products, they may suffer losses as the net value of their products change. For fixed deposits, the interest rate is generally fixed, and the principal and interest will be repaid after the expiration date. Relatively speaking, the safety factor is high and the profit is more lasting. | As an AI language model, I have no personal opinions or suggestions. However, financial management and regular storage for the elderly have their advantages and disadvantages. The advantage of financial management for the elderly is that they can better plan their financial situation, including expenditures in pensions, medical insurance, children's education, etc. In addition, the elderly can increase their wealth by investing in financial products such as stocks and funds. The advantages of regular deposits are that they have less risk and relatively stable returns. Elderly people can choose time deposits of different periods and choose according to their needs and risk tolerance. Overall, the elderly should choose a financial management method that suits them based on their actual situation and needs. If the elderly have sufficient funds and risk tolerance, they can consider investing in stocks, funds and other financial products to increase their wealth. If the elderly pay more attention to stability and safety, they can choose to save regularity to protect their financial situation. |
| Can stocks and funds be bought and sold on the same day? | Yes, stocks and funds can be bought and sold on the same day. | No, stocks and funds are subject to T+1 trading. Buy on the trading day and calculate the net value at the closing day of the purchase. Shares are confirmed on the second trading day, and the profits are calculated after the share is confirmed. Therefore, when buying on the trading day, you have to wait until the second trading day to obtain profits. For example, if an investor bought a stock on Monday and the net value announced in the market on Tuesday is 10 yuan, then investors can calculate the return based on these data: 1000×10=10000; 10000/10=1000. If an investor holds this stock for more than 30 working days, it can be directly converted into cash, that is, withdraw all or part of the stock in advance. It should be noted that there are currently no laws and regulations that restrict investors from buying funds and stocks on the trading day, so investors can conduct open trading on the same day when buying funds and stocks on the trading day. At the same time, after an investor buys stocks on the trading day, his stock will automatically be converted into cash, and after an investor buys stocks on the trading day, his stocks will automatically be converted into a limit of rise and fall (about 20%), which will affect the investor's handling fees and returns. | No, stock trading hours: Monday to Friday from 9:30 to 11:30 am, 13:00 to 15:00 pm, no trading during statutory holidays. Fund trading hours: Monday to Friday from 9:30 to 11:30 am, 13:00 to 15:00 pm, and no trading is allowed during statutory holidays. Stocks are subject to T+1 trading, and trade according to real-time market prices, following the principle of price priority and time priority, the stock price limit for Shanghai and Shenzhen stocks is 10%, the stock price limit for GEM and Science and Technology Innovation Board is 20%, the stock price limit for GEM and Science and Technology Innovation Board is 30%, the price limit for ST and *ST is 5% (the price limit for GEM and Science and Technology Innovation Board is not yet this limit), the price limit for Hong Kong stocks is 20%, and the price limit for the New Third Board is 10%. | No, stock funds implement T+1 trading. Buy on the trading day (trading time) and calculate based on the net value at the closing day of the purchase. Shares are confirmed on the second trading day. Only after the shares are confirmed can the fund be sold. Generally, buy on the trading day, sell on the third trading day, and sell on the third trading day. | Stocks and funds trade methods vary from securities market to securities market. In some securities markets, stocks and funds can be bought and sold as they want on the same day, which is the so-called "T+0" trading. However, in other securities markets, stocks and funds cannot be bought and sold as soon as they are on the same day, and they need to wait for a certain trading time before they can be traded. In the Chinese A-share market, stocks and funds cannot be bought and sold as soon as they are on the same day, and they need to trade T+1 or T+2. This means that if you buy a stock or fund today, you will have to wait until tomorrow to sell it. In some other countries and regions, stocks and funds may trade differently. Therefore, if you want to invest in stocks or funds, it is recommended that you understand the local trading rules and market conditions. |
This project was completed by Yangmu Yu and Wenhuan Hong of the Artificial Intelligence Department, Chengdu Institute of Computer Applications, Chinese Academy of Sciences (China Science Information).
This project draws on the following open-source projects, and we thank those projects and their developers.
The resources in this project are for academic research only; commercial use is strictly prohibited. When using parts that involve third-party code, strictly follow the corresponding open-source license. Content generated by the model is affected by factors such as model computation, randomness, and quantization precision loss, and this project cannot guarantee its accuracy. This project assumes no legal liability for any content output by the model, nor for any losses that may arise from using the related resources and output results.
If you use the data or code from this project, please cite it:
@misc{Cornucopia-LLaMA-Fin-Chinese,
title={Cornucopia-LLaMA-Fin-Chinese},
author={YangMu Yu},
year={2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/jerry1993-tech/Cornucopia-LLaMA-Fin-Chinese}},
}
If you would like to contribute data or code to this project, please refer to How to contribute.
You are welcome to like, follow, and share this project. If you have any questions, please open a GitHub Issue, or join the group for further discussion:
