bert4torch下載bert4torch源代碼下載

bert4torch

其他源碼

v0.5.4

下載

bert4torch

Documentation | Torch4keras | Examples | build_MiniLLM_from_scratch | bert4vector

1. 下載安裝

安裝穩定版

pip install bert4torch

安裝最新版

pip install git+https://github.com/Tongjilibo/bert4torch

注意事項：pip包的發布慢於git上的開發版本，git clone注意引用路徑，注意權重是否需要轉換
測試用例： git clone https://github.com/Tongjilibo/bert4torch ，修改example中的預訓練模型文件路徑和數據路徑即可啟動腳本
自行訓練：針對自己的數據，修改相應的數據處理代碼塊
開發環境：原使用torch==1.10版本進行開發，現已切換到torch2.0開發，如其他版本遇到不適配，歡迎反饋

2. 功能

LLM模型: 加載chatglm、llama、 baichuan、ziya、bloom等開源大模型權重進行推理和微調，命令行一行部署大模型
核心功能：加載bert、roberta、albert、xlnet、nezha、bart、RoFormer、RoFormer_V2、ELECTRA、GPT、GPT2、T5、GAU-alpha、ERNIE等預訓練權重繼續進行finetune、並支持在bert基礎上靈活定義自己模型
豐富示例：包含llm、pretrain、sentence_classfication、sentence_embedding、sequence_labeling、relation_extraction、seq2seq、serving等多種解決方案
實驗驗證：已在公開數據集實驗驗證，使用如下examples數據集和實驗指標
易用trick ：集成了常見的trick，即插即用
其他特性：加載transformers庫模型一起使用；調用方式簡潔高效；有訓練進度條動態展示；配合torchinfo打印參數量；默認Logger和Tensorboard簡便記錄訓練過程；自定義fit過程，滿足高階需求
訓練過程：

功能	bert4torch	transformers	備註
訓練進度條	✅	✅	進度條打印loss和定義的metrics
分佈式訓練dp/ddp	✅	✅	torch自帶dp/ddp
各類callbacks	✅	✅	日誌/tensorboard/earlystop/wandb等
大模型推理，stream/batch輸出	✅	✅	各個模型是通用的，無需單獨維護腳本
大模型微調	✅	✅	lora依賴peft庫，pv2自帶
豐富tricks	✅		對抗訓練等tricks即插即用
代碼簡潔易懂，自定義空間大	✅		代碼復用度高, keras代碼訓練風格
倉庫的維護能力/影響力/使用量/兼容性		✅	目前倉庫個人維護
一鍵部署大模型

3. 快速上手

3.1 上手教程

Quick-Start
快速上手教程，教程示例，實戰示例
bert4torch介紹(知乎)，bert4torch快速上手(知乎)，bert4torch又雙叒叕更新啦(知乎)

3.2 命令行快速部署大模型服務

本地/ 聯網加載

 # 联网下载全部文件
bert4torch-llm-server --checkpoint_path Qwen2-0.5B-Instruct

# 加载本地大模型，联网下载bert4torch_config.json
bert4torch-llm-server --checkpoint_path /data/pretrain_ckpt/Qwen/Qwen2-0.5B-Instruct --config_path Qwen/Qwen2-0.5B-Instruct

# 加载本地大模型，且bert4torch_config.json已经下载并放于同名目录下
bert4torch-llm-server --checkpoint_path /data/pretrain_ckpt/Qwen/Qwen2-0.5B-Instruct

命令行/ gradio網頁/ openai_api

 # 命令行
bert4torch-llm-server --checkpoint_path /data/pretrain_ckpt/Qwen/Qwen2-0.5B-Instruct --mode cli

# gradio网页
bert4torch-llm-server --checkpoint_path /data/pretrain_ckpt/Qwen/Qwen2-0.5B-Instruct --mode gradio

# openai_api
bert4torch-llm-server --checkpoint_path /data/pretrain_ckpt/Qwen/Qwen2-0.5B-Instruct --mode openai

命令行聊天示例

4. 版本和更新歷史

4.1 版本歷史

更新日期	bert4torch	torch4keras	版本說明
20240928	0.5.4	0.2.7	【新功能】增加deepseek系列、MiniCPM、MiniCPMV、llama3.2、Qwen2.5；支持device_map=auto;【修復】修復batch_generate和n>1的bug
20240814	0.5.3	0.2.6	【新功能】增加llama3.1/Yi1.5；自動選擇從hfmirror下載；支持命令行參數`bert4torch-llm-server`
20240801	0.5.2	0.2.5	【新功能】chatglm/qwen系列支持function call調用, 增加internlm2系列；【小優化】簡化pipeline中chat demo的調用，generate的終止token元素允許為列表, 統一rope_scaling參數名，增加rope衍生類；【bug】修復flash_attn2的推理bug, 修復bart的tie_word_embedding的bug

更多版本

4.2 更新歷史

更多歷史

5. 預訓練權重

預訓練模型支持多種代碼加載方式

 from bert4torch . models import build_transformer_model

# 1. 仅指定config_path: 从头初始化模型结构, 不加载预训练模型
model = build_transformer_model ( './model/bert4torch_config.json' )

# 2. 仅指定checkpoint_path: 
## 2.1 文件夹路径: 自动寻找路径下的*.bin/*.safetensors权重文件 + 需把bert4torch_config.json下载并放于该目录下
model = build_transformer_model ( checkpoint_path = './model' )

## 2.2 文件路径/列表: 文件路径即权重路径/列表, bert4torch_config.json会从同级目录下寻找
model = build_transformer_model ( checkpoint_path = './pytorch_model.bin' )

## 2.3 model_name: hf上预训练权重名称, 会自动下载hf权重以及bert4torch_config.json文件
model = build_transformer_model ( checkpoint_path = 'bert-base-chinese' )

# 3. 同时指定config_path和checkpoint_path(本地路径名或model_name排列组合): 
#    本地路径从本地加载，pretrained_model_name会联网下载
config_path = './model/bert4torch_config.json'  # 或'bert-base-chinese'
checkpoint_path = './model/pytorch_model.bin'  # 或'bert-base-chinese'
model = build_transformer_model ( config_path , checkpoint_path )

預訓練權重鏈接和bert4torch_config.json

*注：

高亮格式(如bert-base-chinese )的表示可直接build_transformer_model()聯網下載
國內鏡像網站加速下載
- HF_ENDPOINT=https://hf-mirror.com python your_script.py
- export HF_ENDPOINT=https://hf-mirror.com後再執行python代碼
- 在python代碼開頭如下設置
```
 import os
os . environ [ 'HF_ENDPOINT' ] = "https://hf-mirror.com" 
```

6. 鳴謝

感謝蘇神實現的bert4keras，本實現有不少地方參考了bert4keras的源碼，在此衷心感謝大佬的無私奉獻;
其次感謝項目bert4pytorch，也是在該項目的指引下給了我用pytorch來復現bert4keras的想法和思路。

7. 引用

 @misc{bert4torch,
  title={bert4torch},
  author={Bo Li},
  year={2022},
  howpublished={url{https://github.com/Tongjilibo/bert4torch}},
}

8. 其他

Wechat & Star History Chart
微信群人數超過200個（有邀請限制），可添加個人微信拉群

微信號

微信群

Star History Chart

展開

附加信息

版本 v0.5.4
類型其他源碼
更新時間 2025-04-19
大小 3.42MB
來自於 Github

相關應用

Google Dorks

2025-03-10
shepherd

2025-06-04
mongo express

2025-06-04
hidusbf

2025-02-14
Free Algorithms Books

2025-05-29
markdownpedia

2025-04-22

爲您推薦

chat.petals.dev

其他源碼

1.0.0
GPT Prompt Templates

其他源碼

1.0.0
GPTyped

其他源碼

GPTyped 1.0.5
Google Dorks

其他源碼

1.0
shepherd

其他源碼

v6.1.6-react-shepherd: Prepare Release (#3063)
mongo express

其他源碼

v1.1.0-rc-3
Google Dorks

其他源碼

1.0
shepherd

其他源碼

v6.1.6-react-shepherd: Prepare Release (#3063)
mongo express

其他源碼

v1.1.0-rc-3

相關資訊全部

模型分類	模型名稱	權重來源	權重鏈接/checkpoint_path	config_path
bert	bert-base-chinese	google-bert	`bert-base-chinese`	`bert-base-chinese`
	chinese_L-12_H-768_A-12	Google	tf權重 `Tongjilibo/bert-chinese_L-12_H-768_A-12`
	chinese-bert-wwm-ext	HFL	`hfl/chinese-bert-wwm-ext`	`hfl/chinese-bert-wwm-ext`
	bert-base-multilingual-cased	google-bert	`bert-base-multilingual-cased`	`bert-base-multilingual-cased`
	MacBERT	HFL	`hfl/chinese-macbert-base` `hfl/chinese-macbert-large`	`hfl/chinese-macbert-base` `hfl/chinese-macbert-large`
	WoBERT	追一科技	`junnyu/wobert_chinese_base` ， `junnyu/wobert_chinese_plus_base`	`junnyu/wobert_chinese_base` `junnyu/wobert_chinese_plus_base`
roberta	chinese-roberta-wwm-ext	HFL	`hfl/chinese-roberta-wwm-ext` `hfl/chinese-roberta-wwm-ext-large` (large的mlm權重是隨機初始化)	`hfl/chinese-roberta-wwm-ext` `hfl/chinese-roberta-wwm-ext-large`
	roberta-small/tiny	追一科技	`Tongjilibo/chinese_roberta_L-4_H-312_A-12` `Tongjilibo/chinese_roberta_L-6_H-384_A-12`
	roberta-base	FacebookAI	`roberta-base`	`roberta-base`
	guwenbert	ethanyt	`ethanyt/guwenbert-base`	`ethanyt/guwenbert-base`
albert	albert_zh albert_pytorch	brightmart	`voidful/albert_chinese_tiny` `voidful/albert_chinese_small` `voidful/albert_chinese_base` `voidful/albert_chinese_large` `voidful/albert_chinese_xlarge` `voidful/albert_chinese_xxlarge`	`voidful/albert_chinese_tiny` `voidful/albert_chinese_small` `voidful/albert_chinese_base` `voidful/albert_chinese_large` `voidful/albert_chinese_xlarge` `voidful/albert_chinese_xxlarge`
nezha	NEZHA NeZha_Chinese_PyTorch	huawei_noah	`sijunhe/nezha-cn-base` `sijunhe/nezha-cn-large` `sijunhe/nezha-base-wwm` `sijunhe/nezha-large-wwm`	`sijunhe/nezha-cn-base` `sijunhe/nezha-cn-large` `sijunhe/nezha-base-wwm` `sijunhe/nezha-large-wwm`
	nezha_gpt_dialog	bojone	`Tongjilibo/nezha_gpt_dialog`
xlnet	Chinese-XLNet	HFL	`hfl/chinese-xlnet-base`	`hfl/chinese-xlnet-base`
	tranformer_xl	huggingface	`transfo-xl/transfo-xl-wt103`	`transfo-xl/transfo-xl-wt103`
deberta	Erlangshen-DeBERTa-v2	IDEA	`IDEA-CCNL/Erlangshen-DeBERTa-v2-97M-Chinese` `IDEA-CCNL/Erlangshen-DeBERTa-v2-320M-Chinese` `IDEA-CCNL/Erlangshen-DeBERTa-v2-710M-Chinese`	`IDEA-CCNL/Erlangshen-DeBERTa-v2-97M-Chinese` `IDEA-CCNL/Erlangshen-DeBERTa-v2-320M-Chinese` `IDEA-CCNL/Erlangshen-DeBERTa-v2-710M-Chinese`
electra	Chinese-ELECTRA	HFL	`hfl/chinese-electra-base-discriminator`	`hfl/chinese-electra-base-discriminator`
ernie	ernie	百度文心	`nghuyong/ernie-1.0-base-zh` `nghuyong/ernie-3.0-base-zh`	`nghuyong/ernie-1.0-base-zh` `nghuyong/ernie-3.0-base-zh`
roformer	roformer	追一科技	`junnyu/roformer_chinese_base`	`junnyu/roformer_chinese_base`
	roformer_v2	追一科技	`junnyu/roformer_v2_chinese_char_base`	`junnyu/roformer_v2_chinese_char_base`
simbert	simbert	追一科技	`Tongjilibo/simbert-chinese-base` `Tongjilibo/simbert-chinese-small` `Tongjilibo/simbert-chinese-tiny`
	simbert_v2/roformer-sim	追一科技	`junnyu/roformer_chinese_sim_char_base` ， `junnyu/roformer_chinese_sim_char_ft_base` ， `junnyu/roformer_chinese_sim_char_small` ， `junnyu/roformer_chinese_sim_char_ft_small`	`junnyu/roformer_chinese_sim_char_base` `junnyu/roformer_chinese_sim_char_ft_base` `junnyu/roformer_chinese_sim_char_small` `junnyu/roformer_chinese_sim_char_ft_small`
gau	GAU-alpha	追一科技	`Tongjilibo/chinese_GAU-alpha-char_L-24_H-768`
uie	uie uie_pytorch	百度	`Tongjilibo/uie-base`
gpt	CDial-GPT	thu-coai	`thu-coai/CDial-GPT_LCCC-base` `thu-coai/CDial-GPT_LCCC-large`	`thu-coai/CDial-GPT_LCCC-base` `thu-coai/CDial-GPT_LCCC-large`
	cmp_lm(26億)	清華	`TsinghuaAI/CPM-Generate`	`TsinghuaAI/CPM-Generate`
	nezha_gen	huawei_noah	`Tongjilibo/chinese_nezha_gpt_L-12_H-768_A-12`
	gpt2-chinese-cluecorpussmall	UER	`uer/gpt2-chinese-cluecorpussmall`	`uer/gpt2-chinese-cluecorpussmall`
	gpt2-ml	imcaspar	torch BaiduYun(84dh)	`gpt2-ml_15g_corpus` `gpt2-ml_30g_corpus`
bart	bart_base_chinese	復旦fnlp	`fnlp/bart-base-chinese` v1.0	`fnlp/bart-base-chinese` `fnlp/bart-base-chinese-v1.0`
t5	t5	UER	`uer/t5-small-chinese-cluecorpussmall` `uer/t5-base-chinese-cluecorpussmall`	`uer/t5-base-chinese-cluecorpussmall` `uer/t5-small-chinese-cluecorpussmall`
	mt5	Google	`google/mt5-base`	`google/mt5-base`
	t5_pegasus	追一科技	`Tongjilibo/chinese_t5_pegasus_small` `Tongjilibo/chinese_t5_pegasus_base`
	chatyuan	clue-ai	`ClueAI/ChatYuan-large-v1` `ClueAI/ChatYuan-large-v2`	`ClueAI/ChatYuan-large-v1` `ClueAI/ChatYuan-large-v2`
	PromptCLUE	clue-ai	`ClueAI/PromptCLUE-base`	`ClueAI/PromptCLUE-base`
chatglm	chatglm-6b	THUDM	`THUDM/chatglm-6b` `THUDM/chatglm-6b-int8` `THUDM/chatglm-6b-int4` v0.1.0	`THUDM/chatglm-6b` `THUDM/chatglm-6b-int8` `THUDM/chatglm-6b-int4` `THUDM/chatglm-6b-v0.1.0`
	chatglm2-6b	THUDM	`THUDM/chatglm2-6b` `THUDM/chatglm2-6b-int4` `THUDM/chatglm2-6b-32k`	`THUDM/chatglm2-6b` `THUDM/chatglm2-6b-int4` `THUDM/chatglm2-6b-32k`
	chatglm3-6b	THUDM	`THUDM/chatglm3-6b` `THUDM/chatglm3-6b-32k`	`THUDM/chatglm3-6b` `THUDM/chatglm3-6b-32k`
	glm4-9b	THUDM	`THUDM/glm-4-9b` `THUDM/glm-4-9b-chat` `THUDM/glm-4-9b-chat-1m`	`THUDM/glm-4-9b` `THUDM/glm-4-9b-chat` `THUDM/glm-4-9b-chat-1m`
llama	llama	meta		`meta-llama/llama-7b` `meta-llama/llama-13b`
	llama-2	meta	meta-llama/Llama-2-7b-hf meta-llama/Llama-2-7b-chat-hf meta-llama/Llama-2-13b-hf meta-llama/Llama-2-13b-chat-hf	`meta-llama/Llama-2-7b-hf` `meta-llama/Llama-2-7b-chat-hf` `meta-llama/Llama-2-13b-hf` `meta-llama/Llama-2-13b-chat-hf`
	llama-3	meta	`meta-llama/Meta-Llama-3-8B` `meta-llama/Meta-Llama-3-8B-Instruct`	`meta-llama/Meta-Llama-3-8B` `meta-llama/Meta-Llama-3-8B-Instruct`
	llama-3.1	meta	`meta-llama/Meta-Llama-3.1-8B` `meta-llama/Meta-Llama-3.1-8B-Instruct`	`meta-llama/Meta-Llama-3.1-8B` `meta-llama/Meta-Llama-3.1-8B-Instruct`
	llama-3.2	meta	`meta-llama/Llama-3.2-1B` `meta-llama/Llama-3.2-1B-Instruct` `meta-llama/Llama-3.2-3B` `meta-llama/Llama-3.2-3B-Instruct`	`meta-llama/Llama-3.2-1B` `meta-llama/Llama-3.2-1B-Instruct` `meta-llama/Llama-3.2-3B` `meta-llama/Llama-3.2-3B-Instruct`
	Chinese-LLaMA-Alpaca	HFL		`hfl/chinese_alpaca_plus_7b` `hfl/chinese_llama_plus_7b`
	Chinese-LLaMA-Alpaca-2	HFL		待添加
	Chinese-LLaMA-Alpaca-3	HFL		待添加
	Belle_llama	LianjiaTech	BelleGroup/BELLE-LLaMA-7B-2M-enc	合成說明、 `BelleGroup/BELLE-LLaMA-7B-2M-enc`
	Ziya	IDEA-CCNL	IDEA-CCNL/Ziya-LLaMA-13B-v1 IDEA-CCNL/Ziya-LLaMA-13B-v1.1 IDEA-CCNL/Ziya-LLaMA-13B-Pretrain-v1	`IDEA-CCNL/Ziya-LLaMA-13B-v1` `IDEA-CCNL/Ziya-LLaMA-13B-v1.1`
	vicuna	lmsys	`lmsys/vicuna-7b-v1.5`	`lmsys/vicuna-7b-v1.5`
Baichuan	Baichuan	baichuan-inc	`baichuan-inc/Baichuan-7B` `baichuan-inc/Baichuan-13B-Base` `baichuan-inc/Baichuan-13B-Chat`	`baichuan-inc/Baichuan-7B` `baichuan-inc/Baichuan-13B-Base` `baichuan-inc/Baichuan-13B-Chat`
	Baichuan2	baichuan-inc	`baichuan-inc/Baichuan2-7B-Base` `baichuan-inc/Baichuan2-7B-Chat` `baichuan-inc/Baichuan2-13B-Base` `baichuan-inc/Baichuan2-13B-Chat`	`baichuan-inc/Baichuan2-7B-Base` `baichuan-inc/Baichuan2-7B-Chat` `baichuan-inc/Baichuan2-13B-Base` `baichuan-inc/Baichuan2-13B-Chat`
Yi	Yi	01-ai	`01-ai/Yi-6B` `01-ai/Yi-6B-200K` `01-ai/Yi-9B` `01-ai/Yi-9B-200K`	`01-ai/Yi-6B` `01-ai/Yi-6B-200K` `01-ai/Yi-9B` `01-ai/Yi-9B-200K`
	Yi-1.5	01-ai	`01-ai/Yi-1.5-6B` `01-ai/Yi-1.5-6B-Chat` `01-ai/Yi-1.5-9B` `01-ai/Yi-1.5-9B-32K` `01-ai/Yi-1.5-9B-Chat` `01-ai/Yi-1.5-9B-Chat-16K`	`01-ai/Yi-1.5-6B` `01-ai/Yi-1.5-6B-Chat` `01-ai/Yi-1.5-9B` `01-ai/Yi-1.5-9B-32K` `01-ai/Yi-1.5-9B-Chat` `01-ai/Yi-1.5-9B-Chat-16K`
bloom	bloom	bigscience	`bigscience/bloom-560m` `bigscience/bloomz-560m`	`bigscience/bloom-560m` `bigscience/bloomz-560m`
Qwen	Qwen	阿里雲	`Qwen/Qwen-1_8B` `Qwen/Qwen-1_8B-Chat` `Qwen/Qwen-7B` `Qwen/Qwen-7B-Chat` `Qwen/Qwen-14B` `Qwen/Qwen-14B-Chat`	`Qwen/Qwen-1_8B` `Qwen/Qwen-1_8B-Chat` `Qwen/Qwen-7B` `Qwen/Qwen-7B-Chat` `Qwen/Qwen-14B` `Qwen/Qwen-14B-Chat`
	Qwen1.5	阿里雲	`Qwen/Qwen1.5-0.5B` `Qwen/Qwen1.5-0.5B-Chat` `Qwen/Qwen1.5-1.8B` `Qwen/Qwen1.5-1.8B-Chat` `Qwen/Qwen1.5-7B` `Qwen/Qwen1.5-7B-Chat` `Qwen/Qwen1.5-14B` `Qwen/Qwen1.5-14B-Chat`	`Qwen/Qwen1.5-0.5B` `Qwen/Qwen1.5-0.5B-Chat` `Qwen/Qwen1.5-1.8B` `Qwen/Qwen1.5-1.8B-Chat` `Qwen/Qwen1.5-7B` `Qwen/Qwen1.5-7B-Chat` `Qwen/Qwen1.5-14B` `Qwen/Qwen1.5-14B-Chat`
	Qwen2	阿里雲	`Qwen/Qwen2-0.5B` `Qwen/Qwen2-0.5B-Instruct` `Qwen/Qwen2-1.5B` `Qwen/Qwen2-1.5B-Instruct` `Qwen/Qwen2-7B` `Qwen/Qwen2-7B-Instruct`	`Qwen/Qwen2-0.5B` `Qwen/Qwen2-0.5B-Instruct` `Qwen/Qwen2-1.5B` `Qwen/Qwen2-1.5B-Instruct` `Qwen/Qwen2-7B` `Qwen/Qwen2-7B-Instruct`
	Qwen2-VL	阿里雲	`Qwen/Qwen2-VL-2B-Instruct` `Qwen/Qwen2-VL-7B-Instruct`	`Qwen/Qwen2-VL-2B-Instruct` `Qwen/Qwen2-VL-7B-Instruct`
	Qwen2.5	阿里雲	`Qwen/Qwen2.5-0.5B` `Qwen/Qwen2.5-0.5B-Instruct` `Qwen/Qwen2.5-1.5B` `Qwen/Qwen2.5-1.5B-Instruct` `Qwen/Qwen2.5-3B` `Qwen/Qwen2.5-3B-Instruct` `Qwen/Qwen2.5-7B` `Qwen/Qwen2.5-7B-Instruct` `Qwen/Qwen2.5-14B` `Qwen/Qwen2.5-14B-Instruct`	`Qwen/Qwen2.5-0.5B` `Qwen/Qwen2.5-0.5B-Instruct` `Qwen/Qwen2.5-1.5B` `Qwen/Qwen2.5-1.5B-Instruct` `Qwen/Qwen2.5-3B` `Qwen/Qwen2.5-3B-Instruct` `Qwen/Qwen2.5-7B` `Qwen/Qwen2.5-7B-Instruct` `Qwen/Qwen2.5-14B` `Qwen/Qwen2.5-14B-Instruct`
InternLM	InternLM	上海人工智能實驗室	`internlm/internlm-7b` `internlm/internlm-chat-7b`	`internlm/internlm-7b` `internlm/internlm-chat-7b`
	InternLM2	上海人工智能實驗室	`internlm/internlm2-1_8b` `internlm/internlm2-chat-1_8b` `internlm/internlm2-7b` `internlm/internlm2-chat-7b` `internlm/internlm2-20b` `internlm/internlm2-chat-20b`	`internlm/internlm2-1_8b` `internlm/internlm2-chat-1_8b` `internlm/internlm2-7b` `internlm/internlm2-chat-7b`
	InternLM2.5	上海人工智能實驗室	`internlm/internlm2_5-7b` `internlm/internlm2_5-7b-chat` `internlm/internlm2_5-7b-chat-1m`	`internlm/internlm2_5-7b` `internlm/internlm2_5-7b-chat` `internlm/internlm2_5-7b-chat-1m`
Falcon	Falcon	tiiuae	`tiiuae/falcon-rw-1b` `tiiuae/falcon-7b` `tiiuae/falcon-7b-instruct`	`tiiuae/falcon-rw-1b` `tiiuae/falcon-7b` `tiiuae/falcon-7b-instruct`
DeepSeek	DeepSeek-MoE	深度求索	`deepseek-ai/deepseek-moe-16b-base` `deepseek-ai/deepseek-moe-16b-chat`	`deepseek-ai/deepseek-moe-16b-base` `deepseek-ai/deepseek-moe-16b-chat`
	DeepSeek-LLM	深度求索	`deepseek-ai/deepseek-llm-7b-base` `deepseek-ai/deepseek-llm-7b-chat`	`deepseek-ai/deepseek-llm-7b-base` `deepseek-ai/deepseek-llm-7b-chat`
	DeepSeek-V2	深度求索	`deepseek-ai/DeepSeek-V2-Lite` `deepseek-ai/DeepSeek-V2-Lite-Chat`	`deepseek-ai/DeepSeek-V2-Lite` `deepseek-ai/DeepSeek-V2-Lite-Chat`
	DeepSeek-Coder	深度求索	`deepseek-ai/deepseek-coder-1.3b-base` `deepseek-ai/deepseek-coder-1.3b-instruct` `deepseek-ai/deepseek-coder-6.7b-base` `deepseek-ai/deepseek-coder-6.7b-instruct` `deepseek-ai/deepseek-coder-7b-base-v1.5` `deepseek-ai/deepseek-coder-7b-instruct-v1.5`	`deepseek-ai/deepseek-coder-1.3b-base` `deepseek-ai/deepseek-coder-1.3b-instruct` `deepseek-ai/deepseek-coder-6.7b-base` `deepseek-ai/deepseek-coder-6.7b-instruct` `deepseek-ai/deepseek-coder-7b-base-v1.5` `deepseek-ai/deepseek-coder-7b-instruct-v1.5`
	DeepSeek-Coder-V2	深度求索	`deepseek-ai/DeepSeek-Coder-V2-Lite-Base` `deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct`	`deepseek-ai/DeepSeek-Coder-V2-Lite-Base` `deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct`
	DeepSeek-Math	深度求索	`deepseek-ai/deepseek-math-7b-base` `deepseek-ai/deepseek-math-7b-instruct` `deepseek-ai/deepseek-math-7b-rl`	`deepseek-ai/deepseek-math-7b-base` `deepseek-ai/deepseek-math-7b-instruct` `deepseek-ai/deepseek-math-7b-rl`
MiniCPM	MiniCPM	OpenBMB	`openbmb/MiniCPM-2B-sft-bf16` `openbmb/MiniCPM-2B-dpo-bf16` `openbmb/MiniCPM-2B-128k` `openbmb/MiniCPM-1B-sft-bf16`	`openbmb/MiniCPM-2B-sft-bf16` `openbmb/MiniCPM-2B-dpo-bf16` `openbmb/MiniCPM-2B-128k` `openbmb/MiniCPM-1B-sft-bf16`
	MiniCPM-V	OpenBMB	`openbmb/MiniCPM-V-2_6` `openbmb/MiniCPM-Llama3-V-2_5`	`openbmb/MiniCPM-V-2_6` `openbmb/MiniCPM-Llama3-V-2_5`
embedding	text2vec-base-chinese	shibing624	`shibing624/text2vec-base-chinese`	`shibing624/text2vec-base-chinese`
	m3e	moka-ai	`moka-ai/m3e-base`	`moka-ai/m3e-base`
	bge	BAAI	`BAAI/bge-large-en-v1.5` `BAAI/bge-large-zh-v1.5` `BAAI/bge-base-en-v1.5` `BAAI/bge-base-zh-v1.5` `BAAI/bge-small-en-v1.5` `BAAI/bge-small-zh-v1.5`	`BAAI/bge-large-en-v1.5` `BAAI/bge-large-zh-v1.5` `BAAI/bge-base-en-v1.5` `BAAI/bge-base-zh-v1.5` `BAAI/bge-small-en-v1.5` `BAAI/bge-small-zh-v1.5`
	gte	thenlper	`thenlper/gte-large-zh` `thenlper/gte-base-zh`	`thenlper/gte-base-zh` `thenlper/gte-large-zh`