PoliBERTweet

PoliBERTweet: A Language Model for Political Tweets

A transformer-based language model pre-trained on a large amount of politics-related Twitter data (83M tweets). This repo is the official resource for the following paper.

PoliBERTweet: A Pre-trained Language Model for Analyzing Political Content on Twitter, LREC 2022

Datasets

The datasets for the evaluation tasks presented in our paper are available below.

Poli-test & nonpoli-test - [Download]
Stance datasets - [Download] [Paper] [GitHub]

Pre-trained Models

All models are uploaded to my Hugging Face account, so you can load a model with just three lines of code.

PoliBERTweet (83M tweets) - Feel free to fine-tune this on any downstream task
PoliBERTweet-small (5M tweets)

Usage

We tested with pytorch v1.10.2 and transformers v4.18.0.

To fine-tune our models for a specific task (e.g., stance detection), see the Hugging Face documentation.
Please see the specific model pages above for more usage details. Below is a sample use case.

1. Load the model and tokenizer

from transformers import AutoModel, AutoTokenizer, pipeline
import torch

# Choose GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Select model path here
pretrained_LM_path = "kornosk/polibertweet-mlm"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(pretrained_LM_path)
model = AutoModel.from_pretrained(pretrained_LM_path)

2. Predict the masked word

# Fill mask
example = "Trump is the <mask> of USA"
fill_mask = pipeline('fill-mask', model=pretrained_LM_path, tokenizer=tokenizer)

outputs = fill_mask(example)
print(outputs)
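
The `fill_mask` call above returns a list of dictionaries, one per candidate token, each carrying `score`, `token`, `token_str`, and `sequence` keys. As a minimal sketch of reading that list, the helper below ranks the candidates and keeps the top few. The name `top_predictions` is hypothetical, and the sample values are placeholders rather than real model scores, included only so the snippet runs without downloading the model.

```python
def top_predictions(outputs, k=3):
    """Return the k highest-scoring (token, score) pairs from fill-mask output."""
    ranked = sorted(outputs, key=lambda item: item["score"], reverse=True)
    return [(item["token_str"].strip(), round(item["score"], 4)) for item in ranked[:k]]


# Hypothetical stand-in for `outputs = fill_mask(example)`; NOT real model scores.
sample_outputs = [
    {"score": 0.10, "token": 11, "token_str": "leader", "sequence": "Trump is the leader of USA"},
    {"score": 0.70, "token": 12, "token_str": "president", "sequence": "Trump is the president of USA"},
    {"score": 0.05, "token": 13, "token_str": "king", "sequence": "Trump is the king of USA"},
]

for token, score in top_predictions(sample_outputs, k=2):
    print(f"{token}: {score}")
```

With the real pipeline, pass its result directly: `top_predictions(fill_mask(example))`.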

3. See embeddings

# See embeddings
inputs = tokenizer(example, return_tensors="pt")
outputs = model(**inputs)
print(outputs)

# OR you can use this model to train on your downstream task!
# Please consider citing our paper if you feel this is useful :)
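
The model output above contains `last_hidden_state`, which holds one vector per token. If a single vector per tweet is needed, one common option (an assumption here, not something the authors prescribe) is to average the token vectors while ignoring padded positions. The sketch below uses random tensors in place of real model output; the helper name `mean_pool` is hypothetical and the hidden size of 768 is only a placeholder.

```python
import torch


def mean_pool(last_hidden_state, attention_mask):
    """Average token embeddings per sequence, skipping padded positions."""
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)  # (batch, seq, 1)
    summed = (last_hidden_state * mask).sum(dim=1)                    # (batch, hidden)
    counts = mask.sum(dim=1).clamp(min=1.0)                           # avoid division by zero
    return summed / counts


# Stand-in tensors shaped like transformer output: (batch, seq_len, hidden_size).
hidden = torch.randn(2, 5, 768)  # 768 is a placeholder hidden size
mask = torch.tensor([[1, 1, 1, 0, 0],
                     [1, 1, 1, 1, 1]])

sentence_vectors = mean_pool(hidden, mask)
print(sentence_vectors.shape)
```

With the real model, the call would be `mean_pool(outputs.last_hidden_state, inputs["attention_mask"])`.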

4. Fine-tune on downstream tasks such as stance detection

See details in the Hugging Face documentation.
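
As a rough starting point, the sketch below loads the pre-trained weights with a freshly initialised classification head through `AutoModelForSequenceClassification`. It is not the authors' training script: the label set, the `STANCE_LABELS` name, and the `load_stance_classifier` helper are all hypothetical and should be adapted to your dataset.

```python
# Hypothetical stance labels; replace with the label set of your dataset.
STANCE_LABELS = ["AGAINST", "FAVOR", "NONE"]
label2id = {label: idx for idx, label in enumerate(STANCE_LABELS)}
id2label = {idx: label for label, idx in label2id.items()}


def load_stance_classifier(model_path="kornosk/polibertweet-mlm"):
    """Load the pre-trained encoder with a new, untrained classification head."""
    # Imported here so the label maps above can be reused without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_path,
        num_labels=len(STANCE_LABELS),
        id2label=id2label,
        label2id=label2id,
    )
    return tokenizer, model


if __name__ == "__main__":
    print(label2id)
```

Calling `load_stance_classifier()` downloads the weights; the returned model still has to be trained on labelled data, for example with the `Trainer` API or a plain PyTorch loop as described in the Hugging Face documentation.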

✏ Citation

If you find our paper and resources useful, please consider citing our work!

@inproceedings{kawintiranon2022polibertweet,
  title     = {{P}oli{BERT}weet: A Pre-trained Language Model for Analyzing Political Content on {T}witter},
  author    = {Kawintiranon, Kornraphop and Singh, Lisa},
  booktitle = {Proceedings of the Language Resources and Evaluation Conference (LREC)},
  year      = {2022},
  pages     = {7360--7367},
  publisher = {European Language Resources Association},
  url       = {https://aclanthology.org/2022.lrec-1.801}
}