xgen
1.0.0
Official research release for the family of XGen models (7B) from Salesforce AI Research:
Title: Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length
Authors: Erik Nijkamp*, Tian Xie*, Hiroaki Hayashi*, Bo Pang*, Congying Xia*, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause, Senthil Purushwalkam, Tong Niu, Wojciech Kryscinski, Lidiya Murakhovs'ka, Prafulla Kumar Choubey, Alex Fabbri, Ye Liu, Rui Meng, Lifu Tu, Meghana Bhat, Chien-Sheng Wu, Silvio Savarese, Yingbo Zhou, Shafiq Rayhan Joty, Caiming Xiong
(* indicates equal contribution)
Correspondence: Shafiq Rayhan Joty, Caiming Xiong
Model cards are released on the HuggingFace Hub:
Salesforce/xgen-7b-4k-base
Salesforce/xgen-7b-8k-base
Salesforce/xgen-7b-8k-inst
Tokenization uses the OpenAI tiktoken package, which can be installed via pip:

pip install tiktoken

These models can be used as auto-regressive samplers as follows:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# The XGen tokenizer is tiktoken-based and shipped with the model card,
# hence trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained("Salesforce/xgen-7b-8k-base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Salesforce/xgen-7b-8k-base", torch_dtype=torch.bfloat16)
inputs = tokenizer("The world is", return_tensors="pt")
# Greedy decoding up to a total length of 128 tokens.
sample = model.generate(**inputs, max_length=128)
print(tokenizer.decode(sample[0]))
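For chat-style use, the instruction-tuned checkpoint loads the same way. The snippet below is a minimal sketch, assuming the Salesforce/xgen-7b-8k-inst model card and a "### Human: / ### Assistant:" prompt format; the sampling parameters are illustrative, not recommended settings from this release.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Assumed instruction-tuned checkpoint; same loading pattern as the base model.
tokenizer = AutoTokenizer.from_pretrained("Salesforce/xgen-7b-8k-inst", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Salesforce/xgen-7b-8k-inst", torch_dtype=torch.bfloat16)

# Assumed dialogue prompt format for the instruct variant.
prompt = "### Human: What are long-context LLMs useful for?\n### Assistant:"
inputs = tokenizer(prompt, return_tensors="pt")
# Stochastic sampling; temperature and top_p are illustrative values.
sample = model.generate(**inputs, do_sample=True, temperature=0.7, top_p=0.95, max_new_tokens=256)
print(tokenizer.decode(sample[0], skip_special_tokens=True))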
To cite XGen:

@misc{XGen,
  title={Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length},
  author={Erik Nijkamp, Tian Xie, Hiroaki Hayashi, Bo Pang, Congying Xia, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause, Senthil Purushwalkam, Tong Niu, Wojciech Kryscinski, Lidiya Murakhovs'ka, Prafulla Kumar Choubey, Alex Fabbri, Ye Liu, Rui Meng, Lifu Tu, Meghana Bhat, Chien-Sheng Wu, Silvio Savarese, Yingbo Zhou, Shafiq Rayhan Joty, Caiming Xiong},
  howpublished={ArXiv},
  year={2023},
  url={https://arxiv.org/abs/2309.03450}
}