xgen
1.0.0
Official research release of the XGen family of models (7B) from Salesforce AI Research:
Title: Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length
Authors: Erik Nijkamp*, Tian Xie*, Hiroaki Hayashi*, Bo Pang*, Congying Xia*, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause, Senthil Purushwalkam, Tong Niu, Wojciech Kryscinski, Lidiya Murakhovs'ka, Prafulla Kumar Choubey, Alex Fabbri, Ye Liu, Rui Meng, Lifu Tu, Meghana Bhat, Chien-Sheng Wu, Silvio Savarese, Yingbo Zhou, Shafiq Rayhan Joty, Caiming Xiong
(* indicates equal contribution)
Correspondence: Shafiq Rayhan Joty, Caiming Xiong
Model cards are released on the HuggingFace Hub:
Tokenization uses the OpenAI Tiktoken package, which can be installed via pip:
pip install tiktoken
The models can be used as auto-regressive samplers as follows:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Salesforce/xgen-7b-8k-base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Salesforce/xgen-7b-8k-base", torch_dtype=torch.bfloat16)
inputs = tokenizer("The world is", return_tensors="pt")
sample = model.generate(**inputs, max_length=128)
print(tokenizer.decode(sample[0]))
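The call above uses the default greedy decoding with max_length. A minimal sketch of sampled decoding with the same checkpoint is shown below; the do_sample, temperature, top_p, and max_new_tokens settings are illustrative assumptions, not values recommended in the XGen release.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Same base checkpoint as above; the tiktoken-based tokenizer requires trust_remote_code.
tokenizer = AutoTokenizer.from_pretrained("Salesforce/xgen-7b-8k-base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Salesforce/xgen-7b-8k-base", torch_dtype=torch.bfloat16)

inputs = tokenizer("The world is", return_tensors="pt")
sample = model.generate(
    **inputs,
    do_sample=True,       # sample from the token distribution instead of greedy decoding
    temperature=0.8,      # illustrative value
    top_p=0.95,           # nucleus sampling; illustrative value
    max_new_tokens=128,   # prompt plus generation should stay within the 8K training context
)
print(tokenizer.decode(sample[0], skip_special_tokens=True))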
Citation:
@misc{XGen,
  title={Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length},
  author={Erik Nijkamp, Tian Xie, Hiroaki Hayashi, Bo Pang, Congying Xia, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause, Senthil Purushwalkam, Tong Niu, Wojciech Kryscinski, Lidiya Murakhovs'ka, Prafulla Kumar Choubey, Alex Fabbri, Ye Liu, Rui Meng, Lifu Tu, Meghana Bhat, Chien-Sheng Wu, Silvio Savarese, Yingbo Zhou, Shafiq Rayhan Joty, Caiming Xiong},
  howpublished={ArXiv},
  year={2023},
  url={https://arxiv.org/abs/2309.03450}
}