Official research release for the XGen family of models (7B) by Salesforce AI Research:
Title: Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length
Authors: Erik Nijkamp*, Tian Xie*, Hiroaki Hayashi*, Bo Pang*, Congying Xia*, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause, Senthil Purushwalkam, Tong Niu, Wojciech Kryscinski, Lidiya Murakhovs'ka, Prafulla Kumar Choubey, Alex Fabbri, Ye Liu, Rui Meng, Lifu Tu, Meghana Bhat, Chien-Sheng Wu, Silvio Savarese, Yingbo Zhou, Shafiq Rayhan Joty, Caiming Xiong.
(* indicates equal contribution)
Correspondence to: Shafiq Rayhan Joty, Caiming Xiong
Model cards are published on the Hugging Face Hub.
Tokenization uses the OpenAI tiktoken package, which can be installed via pip:
pip install tiktoken
The model can be used as an auto-regressive sampler as follows:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tiktoken-based tokenizer (needs trust_remote_code=True) and the model in bfloat16.
tokenizer = AutoTokenizer.from_pretrained("Salesforce/xgen-7b-8k-base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Salesforce/xgen-7b-8k-base", torch_dtype=torch.bfloat16)

# Encode a prompt, then generate up to 128 tokens in total (prompt included).
inputs = tokenizer("The world is", return_tensors="pt")
sample = model.generate(**inputs, max_length=128)
print(tokenizer.decode(sample[0]))

Citation:
@misc{XGen,
  title={Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length},
  author={Erik Nijkamp, Tian Xie, Hiroaki Hayashi, Bo Pang, Congying Xia, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause, Senthil Purushwalkam, Tong Niu, Wojciech Kryscinski, Lidiya Murakhovs'ka, Prafulla Kumar Choubey, Alex Fabbri, Ye Liu, Rui Meng, Lifu Tu, Meghana Bhat, Chien-Sheng Wu, Silvio Savarese, Yingbo Zhou, Shafiq Rayhan Joty, Caiming Xiong},
  howpublished={ArXiv},
  year={2023},
  url={https://arxiv.org/abs/2309.03450}
}
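
As a follow-up to the sampling example, the sketch below shows one way to pick a `max_length` value for `generate` that stays within the model's context window. In `transformers`, `max_length` counts the prompt tokens plus the generated tokens. The helper name `clamp_max_length` and the constant `MAX_CONTEXT_TOKENS` are hypothetical, and reading "8K" as exactly 8192 tokens is an assumption; the model's configuration is the authoritative source for the limit.

```python
# Assumption: "8K" is read as 8192 tokens. Check the model config for the exact limit.
MAX_CONTEXT_TOKENS = 8192


def clamp_max_length(prompt_tokens: int, new_tokens: int,
                     context_limit: int = MAX_CONTEXT_TOKENS) -> int:
    """Return a max_length (prompt + generated tokens) that fits the context limit.

    Hypothetical helper for illustration; it is not part of transformers or XGen.
    """
    if prompt_tokens < 0 or new_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if prompt_tokens >= context_limit:
        raise ValueError("prompt already fills the context window")
    # Cap the total length so prompt plus generation never exceeds the limit.
    return min(prompt_tokens + new_tokens, context_limit)
```

With the sampling example's variables, this could be used as `model.generate(**inputs, max_length=clamp_max_length(inputs["input_ids"].shape[1], 125))`, so that long prompts get fewer generated tokens instead of overrunning the window.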