HumanPrompt

HumanPrompt is a framework for easier human-in-the-loop design, management, sharing, and usage of prompts and prompting methods. It is designed specifically for researchers. It is still in progress, and we warmly welcome new contributions of methods and modules. Check out our proposal here.

Content

Getting started
To speed up your research
- Config
- Run experiment
Architecture
Contributing
- Pre-commit
Used by
Citation

Getting started

First, clone this repo, then run:

pip install -e .

This installs the humanprompt package and adds a soft link to the hub at ./humanprompt/artifacts/hub .

Then you need to set some environment variables, such as the OpenAI API key:

export OPENAI_API_KEY="YOUR_OPENAI_API_KEY"
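
A missing key usually only surfaces at the first API call, so it can help to check the environment up front. The following is a minimal sketch, assuming only that the key is read from OPENAI_API_KEY as shown above; the helper name require_env is hypothetical and not part of the humanprompt package.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or empty.

    Hypothetical helper for illustration; not part of the humanprompt package.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

Calling require_env("OPENAI_API_KEY") at the top of a script then fails early with a readable message instead of failing inside the first model call.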

What comes next depends on how you want to use this repo. For now, its mission is to help researchers verify their ideas, so we have made it flexible to extend and use.

A minimal example of running a method is shown below.

Usage is quite simple and will feel familiar if you have used Hugging Face Transformers before.

For example, to use Chain-of-Thought on CommonsenseQA:

from humanprompt.methods.auto.method_auto import AutoMethod
from humanprompt.tasks.dataset_loader import DatasetLoader

# Get one built-in method
method = AutoMethod.from_config(method_name="cot")

# Get one dataset, select one example for demo
data = DatasetLoader.load_dataset(dataset_name="commonsense_qa", dataset_split="test")
data_item = data[0]

# Adapt the raw data to the method's input format (we will improve this part later)
data_item["context"] = "Answer choices: {}".format(
    " ".join(
        [
            "({}) {}".format(label.lower(), text.lower())
            for label, text in zip(
                data_item["choices"]["label"], data_item["choices"]["text"]
            )
        ]
    )
)

# Run the method
result = method.run(data_item)
print(result)
print(data_item)
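
The adaptation step in the example above can be tried on its own, without downloading the dataset or calling a model. This is a small sketch that assumes the item layout used above (a "choices" dict with parallel "label" and "text" lists); the function name format_choices and the sample item are made up for illustration.

```python
def format_choices(choices: dict) -> str:
    """Build the "Answer choices: ..." context string from parallel label/text lists.

    Hypothetical helper; it mirrors the inline formatting in the example above.
    """
    pairs = zip(choices["label"], choices["text"])
    body = " ".join(f"({label.lower()}) {text.lower()}" for label, text in pairs)
    return f"Answer choices: {body}"


# Made-up item for illustration, not taken from the dataset
example = {"label": ["A", "B", "C"], "text": ["Bank", "Library", "Mall"]}
print(format_choices(example))  # Answer choices: (a) bank (b) library (c) mall
```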

Zero-Shot Text2SQL:

import os
from humanprompt.methods.auto.method_auto import AutoMethod
from humanprompt.tasks.dataset_loader import DatasetLoader

method = AutoMethod.from_config("db_text2sql")
data = DatasetLoader.load_dataset(dataset_name="spider", dataset_split="validation")
data_item = data[0]

# Point the method at the SQLite file for this example's database
data_item["db"] = os.path.join(
    data_item["db_path"], data_item["db_id"], data_item["db_id"] + ".sqlite"
)

result = method.run(data_item)
print(result)
print(data_item)
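
The database path in the example above follows the pattern <db_path>/<db_id>/<db_id>.sqlite. A sketch of that step as a reusable function is given below; the function names are hypothetical, and the only assumption is that each item carries "db_path" and "db_id" keys as in the example.

```python
import os


def sqlite_path(db_root: str, db_id: str) -> str:
    """Return <db_root>/<db_id>/<db_id>.sqlite, the layout used in the example above."""
    return os.path.join(db_root, db_id, db_id + ".sqlite")


def attach_db(data_item: dict) -> dict:
    """Return a copy of `data_item` with a "db" key pointing at its SQLite file.

    Hypothetical helper; the input dict is left unmodified.
    """
    item = dict(data_item)
    item["db"] = sqlite_path(item["db_path"], item["db_id"])
    return item
```

Returning a copy rather than mutating the item keeps the raw dataset record unchanged, which can make repeated runs over the same data easier to reason about.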

To speed up your research

Config

We adopt a "one config, one experiment" paradigm to facilitate research, especially when benchmarking different prompting methods. In each experiment's config file (.yaml) under examples/configs/ , you can configure the dataset, the prompting method, and the metrics.

The following is an example config file for the Chain-of-Thought method on GSM8K:

---
  dataset:
    dataset_name: "gsm8k"                # dataset name, aligned with huggingface dataset if loaded from it
    dataset_split: "test"                # dataset split
    dataset_subset_name: "main"          # dataset subset name, null if not used
    dataset_key_map:                     # mapping original dataset keys to humanprompt task keys to unify the interface
      question: "question"
      answer: "answer"
  method:
    method_name: "cot"                   # method name to initialize the prompting method class
    method_config_file_path: null        # method config file path, null if not used (will be overridden by method_args)
    method_args:
      client_name: "openai"              # LLM API client name, adopted from github.com/HazyResearch/manifest
      transform: "cot.gsm8k.transform_cot_gsm8k.CoTGSM8KTransform"  # user-defined transform class to build the prompts
      extract: "cot.gsm8k.extract_cot_gsm8k.CoTGSM8KExtract"        # user-defined extract class to extract the answers from output
      extraction_regex: ".*The answer is (.*).\n?"                  # user-defined regex to extract the answer from output
      prompt_file_path: "cot/gsm8k/prompt.txt"                      # prompt file path
      max_tokens: 512                    # max generated tokens
      temperature: 0                     # temperature for generated tokens
      engine: code-davinci-002           # LLM engine
      stop_sequence: "\n\n"              # stop sequence for generation
  metrics:
    - "exact_match"                      # metrics to evaluate the results

Users can create their own transform and extract classes to customize prompt generation and answer extraction. The prompt file can be replaced or specified according to the user's needs.
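
To make the answer extraction step concrete, here is a minimal sketch of applying a regex like the one in the config. It assumes the extraction_regex value is meant to read .*The answer is (.*).\n? and uses only Python's re module; it is not the CoTGSM8KExtract class, and the function name and sample output are made up for illustration.

```python
import re
from typing import Optional

# ASSUMPTION: this is the intended reading of `extraction_regex` in the config above
EXTRACTION_REGEX = r".*The answer is (.*).\n?"


def extract_answer(output: str, pattern: str = EXTRACTION_REGEX) -> Optional[str]:
    """Return the first captured group, or None when the pattern does not match."""
    match = re.search(pattern, output)
    return match.group(1).strip() if match else None


# Made-up model output for illustration
output = "16 - 3 - 4 = 9 eggs are left. 9 * 2 = 18. The answer is 18.\n"
print(extract_answer(output))  # 18
```

Under this reading, the unescaped . after the capture group consumes exactly one character (here the final period). An output that ends in "The answer is 18" with no trailing period would therefore yield "1", so a custom extract class may want a stricter pattern.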

Run experiment

To run experiments, specify the experiment name and other meta configs on the command line from the examples/ directory.

For example, run the following command to run Chain-of-Thought on GSM8K:

python run_experiment.py \
  --exp_name cot-gsm8k \
  --num_test_samples 300

For a new combination of methods and tasks, you can simply add a new config file under examples/configs/ and run the command.
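
The "one config, one experiment" idea can be sketched as a small command-line entry point. The code below is not the actual run_experiment.py; it only reuses the two flags shown in the command above, and it assumes (hypothetically) that an experiment name maps to a file named <exp_name>.yaml in the configs directory and that omitting --num_test_samples means using all samples.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Parser with the two flags used in the command above."""
    parser = argparse.ArgumentParser(description="Run one experiment from one config")
    parser.add_argument("--exp_name", required=True, help="experiment name, e.g. cot-gsm8k")
    parser.add_argument(
        "--num_test_samples",
        type=int,
        default=None,
        help="number of samples to evaluate (ASSUMPTION: all samples if omitted)",
    )
    return parser


def config_path(exp_name: str, config_dir: str = "configs") -> Path:
    """ASSUMPTION: an experiment's config lives at <config_dir>/<exp_name>.yaml."""
    return Path(config_dir) / f"{exp_name}.yaml"


# Parse an explicit argument list so the sketch runs without a real command line
args = build_parser().parse_args(["--exp_name", "cot-gsm8k", "--num_test_samples", "300"])
print(args.exp_name, args.num_test_samples, config_path(args.exp_name))
```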

Architecture

 .
├── examples
│   ├── configs                    # config files for experiments
│   ├── main.py                    # one sample demo script
│   └── run_experiment.py          # experiment script
├── hub                            # hub contains static files for methods and tasks
│   ├── cot                        # method Chain-of-Thought
│   │   ├── gsm8k                  # task GSM8K, containing prompt file and transform/extract classes, etc.
│   │   └── ...
│   ├── ama_prompting              # method Ask Me Anything
│   ├── binder                     # method Binder
│   ├── db_text2sql                # method text2sql
│   ├── react                      # method ReAct
│   ├── standard                   # method standard prompting
│   └── zero_shot_cot              # method zero-shot Chain-of-Thought
├── humanprompt                    # humanprompt package, containing building blocks for the complete prompting pipeline
│   ├── artifacts
│   │   ├── artifact.py
│   │   └── hub
│   ├── components                 # key components for the prompting pipeline
│   │   ├── aggregate              # aggregate classes to aggregate the answers
│   │   ├── extract                # extract classes to extract the answers from output
│   │   ├── post_hoc.py            # post-hoc processing
│   │   ├── prompt.py              # prompt classes to build the prompts
│   │   ├── retrieve               # retrieve classes to retrieve in-context examples
│   │   └── transform              # transform classes to transform the raw data to the method's input format
│   ├── evaluators                 # evaluators
│   │   └── evaluator.py           # evaluator class to evaluate the dataset results
│   ├── methods                    # prompting methods, usually one method is related to one paper
│   │   ├── ama_prompting          # Ask Me Anything(https://arxiv.org/pdf/2210.02441.pdf)
│   │   ├── binder                 # Binder(https://arxiv.org/pdf/2210.02875.pdf)
│   │   └── ...
│   ├── tasks                      # dataset loading and preprocessing
│   │   ├── add_sub.py             # AddSub dataset
│   │   ├── wikitq.py              # WikiTableQuestions dataset
│   │   └── ...
│   ├── third_party                # third party packages
│   └── utils                      # utils
│       ├── config_utils.py
│       └── integrations.py
└── tests                          # test scripts
    ├── conftest.py
    ├── test_datasetloader.py
    └── test_method.py

Contributing

This repo is designed to give researchers quick usage and easy manipulation of different prompting methods. We spent a lot of time making it easy to extend and use, so we hope you will contribute to it.

If you are interested in contributing your method to this framework, you can:

- Open an issue about the method you want, and we will add it to our TODO list and implement it as soon as possible.
- Add your method to the humanprompt/methods folder yourself. To do that, follow these steps:
1. Clone the repo.
2. Create a branch from the main branch, named after your method.
3. Commit your code to your branch. You need to:
  1. Add your code under ./humanprompt/methods , placing your method in ./humanprompt/methods/your_method_name .
  2. Create a hub for your method in ./hub/your_method_name .
  3. Make sure ./hub/your_method_name contains an ./examples folder.
  4. Provide a minimal demo in ./examples for running and testing your method.
4. Create a usage demo in the folder.
5. Open a PR to merge your branch into the main branch.
6. We will handle the last few steps for you to make sure your method is well integrated into this framework.

Pre-commit

We use pre-commit to control code quality. Before you commit, run the commands below to check your code and fix any issues.

pip install pre-commit
pre-commit install # install all hooks
pre-commit run --all-files # trigger all hooks

You can use git commit --no-verify to skip the hooks and let us handle that later.

Used by

دفعت دفعة

Citation

If you find this repo useful, please cite our project and Manifest:

@software{humanprompt,
  author = {Tianbao Xie and
            Zhoujun Cheng and
            Yiheng Xu and
            Peng Shi and
            Tao Yu},
  title = {A framework for human-readable prompt-based method with large language models},
  howpublished = {\url{https://github.com/hkunlp/humanprompt}},
  year = 2022,
  month = October
}

@misc{orr2022manifest,
  author = {Orr, Laurel},
  title = {Manifest},
  year = {2022},
  publisher = {GitHub},
  howpublished = {\url{https://github.com/HazyResearch/manifest}},
}