brewval
1.0.0
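Large language model providers and models differ in capability, speed, and cost. To pick the most suitable combination for a given task, prompts need to be evaluated across those providers and models.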
from typing import Dict

from brewval.model import Prompt, Label
from brewval.eval import Evaluator
from langchain.llms import OpenAI, BaseLLM

# Few-shot prompt template with {description} and {result} placeholders
prompt = Prompt("""
Description: Feelings of disappointment, grief, hopelessness, disinterest, and dampened mood.
Emotion: sadness
Description: muscles become tense, your heart rate and respiration increase, and your mind becomes more alert, priming your body to either run from the danger or stand and fight
Emotion: fear
Description: {description}
Emotion: {result}""")

# Expected label for each test description
labels = [
    Label('fear', {'description': 'heart rate and respiration increase'}),
    Label('surprise', {'description': 'quite brief and is characterized by a physiological startle response following something unexpected'}),
    Label('anger', {'description': 'Characterized by feelings of hostility, agitation, frustration, and antagonism towards others.'})
]

# Models to compare, keyed by display name
models: Dict[str, BaseLLM] = {
    'OpenAI[davinci-003]': OpenAI(model_name='text-davinci-003'),
    'OpenAI[davinci-002]': OpenAI(model_name='text-davinci-002'),
    'OpenAI[ada-001]': OpenAI(model_name='text-ada-001')
}

evaluator = Evaluator(models)
results = evaluator.evaluate(prompt, labels)
for result in results:
    print(f'Model {result.model_name} accuracy: {result.accuracy * 100}%')

Output:
Model OpenAI[davinci-003] accuracy: 100.0%
Model OpenAI[davinci-002] accuracy: 33.3%
Model OpenAI[ada-001] accuracy: 0.0%
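The accuracy figures above are the share of labels each model reproduces. As a rough illustration of that idea, here is a minimal, self-contained sketch. It is not brewval's implementation: the `Example` class, the `accuracy` helper, the stub models, and the case-insensitive matching rule are all assumptions made for this example, and plain functions stand in for real LLM calls so it runs without an API key.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# Hypothetical stand-in for a labelled example; not a brewval class.
@dataclass
class Example:
    expected: str              # label the model should produce
    variables: Dict[str, str]  # values substituted into the template


TEMPLATE = "Description: {description}\nEmotion:"


def accuracy(model: Callable[[str], str], template: str,
             examples: List[Example]) -> float:
    """Return the fraction of examples whose completion matches the label."""
    if not examples:
        return 0.0
    correct = 0
    for ex in examples:
        completion = model(template.format(**ex.variables))
        # Assumed matching rule: ignore surrounding whitespace and case.
        if completion.strip().lower() == ex.expected.lower():
            correct += 1
    return correct / len(examples)


examples = [
    Example("fear", {"description": "heart rate and respiration increase"}),
    Example("anger", {"description": "feelings of hostility and frustration"}),
]


# Stub "models": plain functions standing in for real LLM calls.
def always_fear(prompt: str) -> str:
    return " fear"


def keyword_model(prompt: str) -> str:
    return "anger" if "hostility" in prompt else "fear"


models = {"always-fear": always_fear, "keyword": keyword_model}
scores = {name: accuracy(fn, TEMPLATE, examples) for name, fn in models.items()}
for name, score in scores.items():
    print(f"Model {name} accuracy: {score * 100:.1f}%")
```

Exact string matching is the simplest scoring rule; a real evaluator may need something more forgiving when models add punctuation or extra words.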
Install with Poetry:
poetry install
export OPENAI_API_KEY="your key"
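Since the OpenAI models read the key from the `OPENAI_API_KEY` environment variable, it can help to check that it is set before starting a run. A minimal sketch follows; the `has_api_key` helper is hypothetical and not part of brewval.

```python
import os
from typing import Mapping


def has_api_key(env: Mapping[str, str] = os.environ) -> bool:
    """Return True when OPENAI_API_KEY is set to a non-empty value."""
    return bool(env.get("OPENAI_API_KEY", "").strip())


if not has_api_key():
    print("OPENAI_API_KEY is not set; export it before running an evaluation.")
```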
Command line, using data from CSV files:
poetry run python3 -m brewval.cli -p examples/weather-umbrella/prompts.csv -l examples/weather-umbrella/labels.csv
Jupyter notebook (the evaluation example notebook under the docs folder):
poetry run jupyter notebook