Mutate

A library to synthesize text datasets using Large Language Models (LLMs). Mutate reads through the examples in a dataset and generates similar examples using automatically generated few-shot prompts.
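To make the few-shot idea concrete, here is a minimal sketch of how such a prompt could be assembled from labeled examples. It is not Mutate's actual implementation: the helper `build_few_shot_prompt`, its parameters, and the prompt layout are assumptions chosen for illustration.

```python
import random


def build_few_shot_prompt(examples, task_desc, label, shot_count=5,
                          text_alias="Comment", label_alias="sentiment", seed=0):
    """Build a few-shot prompt for one class (hypothetical helper, not Mutate's API)."""
    # Keep only the examples that carry the requested label.
    same_class = [text for text, lab in examples if lab == label]
    # Sample up to `shot_count` demonstrations; the fixed seed keeps the sketch reproducible.
    rng = random.Random(seed)
    shots = rng.sample(same_class, min(shot_count, len(same_class)))

    lines = [task_desc, ""]
    for text in shots:
        lines.append(f"{label_alias}: {label}")
        lines.append(f"{text_alias}: {text}")
        lines.append("")
    # End with an open slot so a causal LLM continues with a new example.
    lines.append(f"{label_alias}: {label}")
    lines.append(f"{text_alias}:")
    return "\n".join(lines)


examples = [
    ("A dull, lifeless film.", "neg"),
    ("I loved every minute.", "pos"),
    ("Boring from start to finish.", "neg"),
]
prompt = build_few_shot_prompt(
    examples, "Each item contains a movie review and its sentiment.", "neg", shot_count=2
)
print(prompt)
```

The prompt ends with the label filled in and the text left open, so a causal model would be expected to complete a new example of that class.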

1. Installation

pip install mutate-nlp

or

pip install git+https://github.com/infinitylogesh/mutate

2. Usage

2.1 Synthesize text data from a local CSV file

from mutate import pipeline

pipe = pipeline("text-classification-synthesis",
                model="EleutherAI/gpt-neo-2.7B",
                device=1)

task_desc = "Each item in the following contains movie reviews and corresponding sentiments. Possible sentiments are neg and pos"

# Returns a Python generator
text_synth_gen = pipe("csv",
                      data_files=["local/path/sentiment_classfication.csv"],
                      task_desc=task_desc,
                      text_column="text",
                      label_column="label",
                      text_column_alias="Comment",
                      label_column_alias="sentiment",
                      shot_count=5,
                      class_names=["pos", "neg"])

# Loop through the generator to synthesize examples by class
for synthesized_examples in text_synth_gen:
    print(synthesized_examples)

Output

{
    "text": ["The story was very dull and was a waste of my time. This was not a film I would ever watch. The acting was bad. I was bored. There were no surprises. They showed one dinosaur,",
    "I did not like this film. It was a slow and boring film, it didn't seem to have any plot, there was nothing to it. The only good part was the ending, I just felt that the film should have ended more abruptly."],
    "label": ["neg", "neg"]
}

{
    "text" :[ "The Bell witch is one of the most interesting, yet disturbing films of recent years. It’s an odd and unique look at a very real, but very dark issue. With its mixture of horror, fantasy and fantasy adventure, this film is as much a horror film as a fantasy film. And it‘s worth your time. While the movie has its flaws, it is worth watching and if you are a fan of a good fantasy or horror story, you will not be disappointed." ],
    "label" :[ "pos" ]
}

# and so on .....
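Each item yielded by the generator has the shape shown above: parallel `text` and `label` lists. A minimal sketch for flattening those batches into rows and saving them as CSV follows. The helpers `flatten_batches` and `write_rows` are hypothetical names, not part of Mutate, and the sample batches stand in for real generator output.

```python
import csv
import io


def flatten_batches(batches):
    """Yield one (text, label) row per synthesized example."""
    for batch in batches:
        # Each batch pairs texts and labels by position.
        for text, label in zip(batch["text"], batch["label"]):
            yield text, label


def write_rows(rows, handle):
    """Write rows as CSV with a `text,label` header and return the row count."""
    writer = csv.writer(handle)
    writer.writerow(["text", "label"])
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count


# Stand-in batches with the same shape as the output above.
sample_batches = [
    {"text": ["I was bored.", "It had no plot."], "label": ["neg", "neg"]},
    {"text": ["Worth your time."], "label": ["pos"]},
]
buffer = io.StringIO()
written = write_rows(flatten_batches(sample_batches), buffer)
print(written)
```

In practice, `buffer` would be replaced by a file opened with `newline=""`, and `sample_batches` by the generator returned from the pipeline.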

2.2 Synthesize text data from Hugging Face Datasets

Under the hood, Mutate uses the Hugging Face Datasets library for dataset processing, so it supports Hugging Face datasets out of the box.

from mutate import pipeline

pipe = pipeline("text-classification-synthesis",
                model="EleutherAI/gpt-neo-2.7B",
                device=1)

task_desc = "Each item in the following contains customer service queries expressing the mentioned intent"

synthesizerGen = pipe("banking77",
                      task_desc=task_desc,
                      text_column="text",
                      label_column="label",
                      # if the `text_column` doesn't have a meaningful value
                      text_column_alias="Queries",
                      # if the `label_column` doesn't have a meaningful value
                      label_column_alias="Intent",
                      shot_count=5,
                      dataset_args=["en"])

for exp in synthesizerGen:
    print(exp)

Output

{ "text" :[ "How can i know if my account has been activated? (This is the one that I am confused about)" ,
         "Thanks! My card activated" ],
"label" :[ "activate_my_card" ,
         "activate_my_card" ]
}

{
"text" : [ "How do i activate this new one? Is it possible?" ,
         "what is the activation process for this card?" ],
"label" :[ "activate_my_card" ,
         "activate_my_card" ]
}

# and so on .....
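With many classes, as in an intent dataset, it can help to track how many examples have been synthesized per label. The sketch below assumes batches shaped like the output above. `count_labels` is a hypothetical helper and the sample data is a stand-in for real generator output.

```python
from collections import Counter


def count_labels(batches):
    """Count synthesized examples per label across batches."""
    counts = Counter()
    for batch in batches:
        counts.update(batch["label"])
    return counts


# Stand-in batches with the same shape as the output above.
sample_batches = [
    {"text": ["How can I know if my account has been activated?",
              "Thanks! My card activated"],
     "label": ["activate_my_card", "activate_my_card"]},
    {"text": ["How do I activate this new one?"],
     "label": ["activate_my_card"]},
]
label_counts = count_labels(sample_batches)
print(label_counts)
```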

2.3 I am feeling lucky: loop through the dataset indefinitely to generate an unlimited number of examples

Caution: looping through the dataset indefinitely increases the chance of generating duplicate examples.

from mutate import pipeline

pipe = pipeline("text-classification-synthesis",
                model="EleutherAI/gpt-neo-2.7B",
                device=1)

task_desc = "Each item in the following contains movie reviews and corresponding sentiments. Possible sentiments are neg and pos"

# Returns a Python generator
text_synth_gen = pipe("csv",
                      data_files=["local/path/sentiment_classfication.csv"],
                      task_desc=task_desc,
                      text_column="text",
                      label_column="label",
                      text_column_alias="Comment",
                      label_column_alias="sentiment",
                      class_names=["pos", "neg"],
                      # Flag to generate examples indefinitely
                      infinite_loop=True)

# Infinite loop: iterate over the generator created above
for exp in text_synth_gen:
    print(exp)
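Because an infinite generator never stops on its own and may repeat itself, one reasonable pattern is to cap the number of batches consumed and skip texts already seen. The following is a sketch under those assumptions: `unique_examples` and `fake_infinite_generator` are hypothetical names, and the fake generator stands in for the pipeline's output.

```python
from itertools import islice


def unique_examples(batches, seen=None):
    """Yield (text, label) pairs, skipping texts seen before (case- and whitespace-insensitive)."""
    seen = set() if seen is None else seen
    for batch in batches:
        for text, label in zip(batch["text"], batch["label"]):
            # Normalize case and whitespace so near-identical strings count as duplicates.
            key = " ".join(text.lower().split())
            if key in seen:
                continue
            seen.add(key)
            yield text, label


def fake_infinite_generator():
    """Stand-in for an endless generator; it repeats the same two batches forever."""
    while True:
        yield {"text": ["Great film.", "great  film."], "label": ["pos", "pos"]}
        yield {"text": ["Terrible pacing."], "label": ["neg"]}


# Cap the number of batches consumed; a stream of pure duplicates would otherwise never end.
limited = islice(fake_infinite_generator(), 10)
collected = list(unique_examples(limited))
print(collected)
```

The cap is applied to the batches read rather than to the unique examples kept, so the loop terminates even if the model only produces repeats.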

3. Support

3.1 Currently supported

Text classification dataset synthesis: few-shot text data synthesis for text classification datasets using causal LLMs (GPT-like).

3.2 Roadmap:

Synthesis of other types of text datasets: NER, sentence pairs, etc.
Fine-tuning support for better-quality generation
Pseudo-labelling

4. Credits

EleutherAI for democratizing large LMs.
This library uses Hugging Face Datasets and Hugging Face Transformers for processing datasets and models.

5. References

The idea of generating examples from large language models is inspired by the works below:

A Few More Examples May Be Worth Billions of Parameters, by Yuval Kirstain, Patrick Lewis, Sebastian Riedel, Omer Levy
GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation, by Kang Min Yoo, Dongju Park, Jaewook Kang, Sang-Woo Lee, Woomyeong Park
Data Augmentation using Pre-trained Transformer Models, by Varun Kumar, Ashutosh Choudhary, Eunah Cho

Additional information

Version: 1.0.0
Category: AI source code
Updated: 2025-09-11
Size: 132.95KB
Source: GitHub
