brewval
1.0.0
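With many large language model providers and models available, each differing in capability, speed, and cost, prompts need to be evaluated across providers and models to choose the most suitable combination for a given task.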
from typing import Dict
from brewval.model import Prompt, Label
from brewval.eval import Evaluator
from langchain.llms import OpenAI, BaseLLM

prompt = Prompt("""
Description: Feelings of disappointment, grief, hopelessness, disinterest, and dampened mood.
Emotion: sadness
Description: muscles become tense, your heart rate and respiration increase, and your mind becomes more alert, priming your body to either run from the danger or stand and fight
Emotion: fear
Description: {description}
Emotion: {result}""")

# Each Label pairs an expected emotion with the values for the prompt's placeholders.
labels = [
    Label('fear', {'description': 'heart rate and respiration increase'}),
    Label('surprise', {'description': 'quite brief and is characterized by a physiological startle response following something unexpected'}),
    Label('anger', {'description': 'Characterized by feelings of hostility, agitation, frustration, and antagonism towards others.'})
]

# Models to compare, keyed by the name reported in the results.
models: Dict[str, BaseLLM] = {
    'OpenAI[davinci-003]': OpenAI(model_name='text-davinci-003'),
    'OpenAI[davinci-002]': OpenAI(model_name='text-davinci-002'),
    'OpenAI[ada-001]': OpenAI(model_name='text-ada-001')
}

evaluator = Evaluator(models)
results = evaluator.evaluate(prompt, labels)
for result in results:
    print(f'Model {result.model_name} accuracy: {result.accuracy * 100}%')

Output:
Model OpenAI[davinci-003] accuracy: 100.0%
Model OpenAI[davinci-002] accuracy: 33.3%
Model OpenAI[ada-001] accuracy: 0.0%
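
To make the accuracy figures above more concrete, here is a minimal sketch of how such a score could be computed: fill the template for each labeled example, ask the model for a completion, and count exact matches. This is not brewval's implementation. `TEMPLATE`, `accuracy`, and `stub_model` are hypothetical names introduced for illustration, and the stub stands in for a real LLM call so the example runs without an API key.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical template; brewval's Prompt class may handle placeholders differently.
TEMPLATE = "Description: {description}\nEmotion:"

# Each example pairs the expected answer with the template variables.
Example = Tuple[str, Dict[str, str]]


def accuracy(model: Callable[[str], str], examples: List[Example]) -> float:
    """Return the fraction of examples whose completion matches the expected label."""
    if not examples:
        return 0.0
    correct = 0
    for expected, variables in examples:
        completion = model(TEMPLATE.format(**variables))
        # Normalize surrounding whitespace and case before comparing.
        if completion.strip().lower() == expected.lower():
            correct += 1
    return correct / len(examples)


def stub_model(prompt: str) -> str:
    """Stand-in for an LLM call: answers 'fear' when the prompt mentions heart rate."""
    return " fear" if "heart rate" in prompt else " joy"


examples = [
    ("fear", {"description": "heart rate and respiration increase"}),
    ("surprise", {"description": "a brief startle response to something unexpected"}),
]
score = accuracy(stub_model, examples)
print(f"accuracy: {score * 100:.1f}%")  # prints "accuracy: 50.0%"
```

Exact matching after normalization is the simplest possible scoring rule; a real evaluator may need to tolerate extra tokens or punctuation in the completion.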
Install dependencies with Poetry:
poetry install
export OPENAI_API_KEY="your key"
Command line, using data from CSV files:
poetry run python3 -m brewval.cli -p examples/weather-umbrella/prompts.csv -l examples/weather-umbrella/labels.csv
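
The layout of the example CSV files is not shown here, so the following is only a sketch of how labeled rows could be read from a CSV file into (expected, variables) pairs. It assumes a header row with one column holding the expected answer and the remaining columns holding template variables. The column name `expected` and the helper `load_labels` are hypothetical and may not match what brewval.cli actually expects.

```python
import csv
import io
from typing import Dict, List, TextIO, Tuple


def load_labels(handle: TextIO, expected_column: str = "expected") -> List[Tuple[str, Dict[str, str]]]:
    """Read rows from a CSV with a header; split each into (expected, variables)."""
    rows = []
    for row in csv.DictReader(handle):
        variables = dict(row)
        # Assumed layout: one column is the expected answer, the rest fill the prompt.
        expected = variables.pop(expected_column)
        rows.append((expected, variables))
    return rows


# In-memory sample so the sketch runs without any files on disk.
sample = io.StringIO(
    "expected,description\n"
    "fear,heart rate and respiration increase\n"
)
loaded = load_labels(sample)
print(loaded)
```

Passing an open file object instead of the `io.StringIO` sample would read the same structure from disk.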
Jupyter notebook (docs/examples/evaluation.ipynb):
poetry run jupyter notebook