nlg-eval

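Evaluation code for various unsupervised automated metrics for NLG (Natural Language Generation). It takes as input a hypothesis file and one or more reference files, and outputs the metric values. Lines across these files must correspond to the same example.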
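Because the files are matched line by line, a line-count mismatch means the examples are out of step. The following is a minimal sketch of a pre-flight check, not part of nlg-eval itself; the function names `count_lines` and `check_alignment` are hypothetical.

```python
from pathlib import Path


def count_lines(path):
    """Return the number of lines in a UTF-8 text file."""
    with Path(path).open(encoding="utf-8") as handle:
        return sum(1 for _ in handle)


def check_alignment(hypothesis_path, reference_paths):
    """Raise ValueError if a reference file and the hypothesis file differ in line count."""
    expected = count_lines(hypothesis_path)
    for ref_path in reference_paths:
        actual = count_lines(ref_path)
        if actual != expected:
            raise ValueError(
                f"{ref_path} has {actual} lines, expected {expected}"
            )
    # All files agree, so return the shared number of examples.
    return expected
```

Calling `check_alignment("examples/hyp.txt", ["examples/ref1.txt", "examples/ref2.txt"])` before an evaluation run would catch a truncated or shifted reference file early.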

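Metrics

BLEU
METEOR
ROUGE
CIDEr
SPICE
SkipThought cosine similarity
Embedding Average cosine similarity
Vector Extrema cosine similarity
Greedy Matching score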

Setup

Install Java 1.8.0 (or higher).

To install the Python dependencies, run:

pip install git+https://github.com/Maluuba/nlg-eval.git@master

If you are using macOS High Sierra or later, run this to allow multithreading:

 export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES

For a simple setup (downloads the required data, e.g. models and embeddings, and external code files), run:

nlg-eval --setup

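If you are setting up from source, or you are on Windows and not using a Bash terminal, you might get errors about nlg-eval not being found. In that case you will need to locate the nlg-eval script yourself; the upstream README links to details.

Custom setup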

 # If you don't like the default path (~/.cache/nlgeval) for the downloaded data,
# then specify a path where you want the files to be downloaded.
# The value for the data path is stored in ~/.config/nlgeval/rc.json and can be overwritten by
# setting the NLGEVAL_DATA environment variable.
nlg-eval --setup ${data_path}

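Validate the setup (optional)

(These examples were made with Git Bash on Windows.)

All the data files should have been downloaded, and you should see sizes like: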

 $ ls -l ~/.cache/nlgeval/
total 6003048
-rw-r--r-- 1 ...  289340074 Sep 12  2018 bi_skip.npz
-rw-r--r-- 1 ...        689 Sep 12  2018 bi_skip.npz.pkl
-rw-r--r-- 1 ... 2342138474 Sep 12  2018 btable.npy
-rw-r--r-- 1 ...    7996547 Sep 12  2018 dictionary.txt
-rw-r--r-- 1 ...   21494787 Jan 22  2019 glove.6B.300d.model.bin
-rw-r--r-- 1 ...  480000128 Jan 22  2019 glove.6B.300d.model.bin.vectors.npy
-rw-r--r-- 1 ...  663989216 Sep 12  2018 uni_skip.npz
-rw-r--r-- 1 ...        693 Sep 12  2018 uni_skip.npz.pkl
-rw-r--r-- 1 ... 2342138474 Sep 12  2018 utable.npy

You can also verify some checksums:

 $ cd ~/.cache/nlgeval/
$ md5sum *
9a15429d694a0e035f9ee1efcb1406f3 *bi_skip.npz
c9b86840e1dedb05837735d8bf94cee2 *bi_skip.npz.pkl
022b5b15f53a84c785e3153a2c383df6 *btable.npy
26d8a3e6458500013723b380a4b4b55e *dictionary.txt
f561ab0b379e23cbf827a054f0e7c28e *glove.6B.300d.model.bin
be5553e91156471fe35a46f7dcdfc44e *glove.6B.300d.model.bin.vectors.npy
8eb7c6948001740c3111d71a2fa446c1 *uni_skip.npz
e1a0ead377877ff3ea5388bb11cfe8d7 *uni_skip.npz.pkl
5871cc62fc01b79788c79c219b175617 *utable.npy
$ sha256sum *
8ab7965d2db5d146a907956d103badfa723b57e0acffb75e10198ba9f124edb0 *bi_skip.npz
d7e81430fcdcbc60b36b92b3f879200919c75d3015505ee76ae3b206634a0eb6 *bi_skip.npz.pkl
4a4ed9d7560bb87f91f241739a8f80d8f2ba787a871da96e1119e913ccd61c53 *btable.npy
4dc5622978a30cddea8c975c871ea8b6382423efb107d27248ed7b6cfa490c7c *dictionary.txt
10c731626e1874effc4b1a08d156482aa602f7f2ca971ae2a2f2cd5d70998397 *glove.6B.300d.model.bin
20dfb1f44719e2d934bfee5d39a6ffb4f248bae2a00a0d59f953ab7d0a39c879 *glove.6B.300d.model.bin.vectors.npy
7f40ff16ff5c54ce9b02bd1a3eb24db3e6adaf7712a7a714f160af3a158899c8 *uni_skip.npz
d58740d46cba28417cbc026af577f530c603d81ac9de43ffd098f207c7dc4411 *uni_skip.npz.pkl
790951d4b08e843e3bca0563570f4134ffd17b6bd4ab8d237d2e5ae15e4febb3 *utable.npy
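Where `md5sum` or `sha256sum` is not available, the same check can be done in Python. This is a minimal sketch using only the standard library; `file_digest` and `verify` are hypothetical helper names, and the files are read in chunks because some of them are over 2 GB.

```python
import hashlib


def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Hash a file chunk by chunk so large files need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex, algorithm="md5"):
    """Return True if the file's digest matches the expected hex string."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

For example, `verify(os.path.expanduser("~/.cache/nlgeval/dictionary.txt"), "26d8a3e6458500013723b380a4b4b55e")` (after `import os`) compares against the MD5 value listed above, and passing `algorithm="sha256"` checks the SHA-256 values instead.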

Once you have confirmed that the setup was successful, you can run the tests:

pip install pytest
pytest

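This might take a few minutes and you might see warnings, but the tests should pass.

Usage

Once setup has completed, the metrics can be evaluated with the Python API or from the command line.

Examples of the Python API can be found in test_nlgeval.py.

Standalone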

 nlg-eval --hypothesis=examples/hyp.txt --references=examples/ref1.txt --references=examples/ref2.txt

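where each line in the hypothesis file is a generated sentence, and the corresponding lines across the reference files are the ground truth reference sentences for that hypothesis.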
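When the command is launched from a script, the repeated `--references` flag can be generated from a list of paths. The sketch below only builds the argument list from the two flags shown above; `build_nlg_eval_command` is a hypothetical helper, not something nlg-eval provides.

```python
def build_nlg_eval_command(hypothesis_path, reference_paths):
    """Build the nlg-eval argument list: one --references flag per reference file."""
    if not reference_paths:
        raise ValueError("at least one reference file is required")
    command = ["nlg-eval", f"--hypothesis={hypothesis_path}"]
    command.extend(f"--references={path}" for path in reference_paths)
    return command
```

The resulting list can be handed to `subprocess.run(command, check=True)` from the standard library, assuming the nlg-eval script is on the `PATH`.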

Functional API: for the entire corpus

from nlgeval import compute_metrics
metrics_dict = compute_metrics(hypothesis='examples/hyp.txt',
                               references=['examples/ref1.txt', 'examples/ref2.txt'])

Functional API: for only one sentence

from nlgeval import compute_individual_metrics
metrics_dict = compute_individual_metrics(references, hypothesis)

where references is a list of ground truth reference text strings and hypothesis is the hypothesis text string.

Object-oriented API for repeated calls in a script: single example

from nlgeval import NLGEval
nlgeval = NLGEval()  # loads the models
metrics_dict = nlgeval.compute_individual_metrics(references, hypothesis)

where references is a list of ground truth reference text strings and hypothesis is the hypothesis text string.

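Object-oriented API for repeated calls in a script: multiple examples

from nlgeval import NLGEval
nlgeval = NLGEval()  # loads the models
metrics_dict = nlgeval.compute_metrics(references, hypothesis)

where references is a list of lists of ground truth reference text strings and hypothesis is a list of hypothesis text strings. Each inner list in references is one set of references for the hypotheses: a list with a single reference string for each sentence in hypothesis, in the same order.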
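The nested shape is easy to get wrong, so here is a small sketch that checks it before the models are loaded. It reads the description above as "one inner list per reference set, each as long as the hypothesis list". The helper name `check_batch_shapes` and the sample sentences are illustrative assumptions.

```python
def check_batch_shapes(references, hypotheses):
    """Validate the multi-example input shape and return per-hypothesis reference groups."""
    for index, reference_set in enumerate(references):
        if len(reference_set) != len(hypotheses):
            raise ValueError(
                f"reference set {index} has {len(reference_set)} entries, "
                f"expected {len(hypotheses)}"
            )
    # Transpose: one group per hypothesis, holding its reference from every set.
    return [list(group) for group in zip(*references)]


hypotheses = ["a cat is on the mat", "the sky is blue"]
references = [
    ["a cat sits on the mat", "the sky is blue today"],      # first reference set
    ["there is a cat on the mat", "the sky looks blue"],     # second reference set
]
per_hypothesis = check_batch_shapes(references, hypotheses)
```

Here `per_hypothesis[0]` holds both references for the first hypothesis, which is a convenient view for spot-checking that the sets line up.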

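Reference

If you use this code as part of any published research, please cite the following paper:

Shikhar Sharma, Layla El Asri, Hannes Schulz, and Jeremie Zumer. "Relevance of Unsupervised Metrics in Task-Oriented Dialogue for Evaluating Natural Language Generation." arXiv preprint arXiv:1706.09799 (2017)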

@article{sharma2017nlgeval,
    author  = {Sharma, Shikhar and El Asri, Layla and Schulz, Hannes and Zumer, Jeremie},
    title   = {Relevance of Unsupervised Metrics in Task-Oriented Dialogue for Evaluating Natural Language Generation},
    journal = {CoRR},
    volume  = {abs/1706.09799},
    year    = {2017},
    url     = {http://arxiv.org/abs/1706.09799}
}

Example

Running

 nlg-eval --hypothesis=examples/hyp.txt --references=examples/ref1.txt --references=examples/ref2.txt

gives

 Bleu_1: 0.550000
Bleu_2: 0.428174
Bleu_3: 0.284043
Bleu_4: 0.201143
METEOR: 0.295797
ROUGE_L: 0.522104
CIDEr: 1.242192
SPICE: 0.312331
SkipThoughtsCosineSimilarity: 0.626149
EmbeddingAverageCosineSimilarity: 0.884690
VectorExtremaCosineSimilarity: 0.568696
GreedyMatchingScore: 0.784205
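If the command-line output is captured by another tool, the `name: value` lines can be turned into a dictionary. A minimal sketch, assuming the output keeps the format shown above; `parse_metrics` is a hypothetical helper, not part of nlg-eval.

```python
def parse_metrics(output_text):
    """Parse 'Name: value' lines into a dict of floats, skipping anything else."""
    metrics = {}
    for line in output_text.splitlines():
        name, separator, value = line.partition(":")
        if not separator:
            continue  # no colon on this line, so it is not a metric
        try:
            metrics[name.strip()] = float(value)
        except ValueError:
            continue  # text after the colon is not a number
    return metrics
```

Lines that are not metrics, such as warnings printed during evaluation, are ignored unless they happen to have the same `name: number` form.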

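Troubleshooting

If you have an issue with METEOR, you can try lowering the mem variable in meteor.py.

Important note

By default (with the idf parameter set to "corpus" mode), CIDEr computes IDF values from the reference sentences provided. As a result, the CIDEr score for a reference dataset with only 1 image (or 1 example, for NLG) will be zero. When evaluating with one (or a few) images, set idf to "coco-val-df" instead, which uses IDF values from the MSCOCO validation dataset for reliable results. This code has not been adapted for that mode. For this use case, apply the patches from vrama91/coco-caption.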
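The zero score follows from how an inverse document frequency behaves when the corpus holds a single example. The sketch below is a simplified illustration of a log-ratio IDF weight, not the code used by CIDEr itself; `idf_weight` is a hypothetical name.

```python
import math


def idf_weight(num_examples, document_frequency):
    """Simplified IDF: log(number of examples) minus log(examples containing the n-gram)."""
    return math.log(num_examples) - math.log(max(1.0, document_frequency))


# With a single example, every n-gram that occurs has document frequency 1,
# so its weight is log(1) - log(1) = 0 and all TF-IDF vectors vanish.
single_example_weight = idf_weight(1, 1)

# With a larger corpus, an n-gram seen in only a few examples gets a positive weight.
larger_corpus_weight = idf_weight(10, 2)
```

Under this simplification every TF-IDF weight is multiplied by zero when there is only one example, which is why IDF values taken from a larger external corpus are needed in that case.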

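External data directory

To mount an already prepared data directory into a Docker container, or to share it between users, you can set the NLGEVAL_DATA environment variable to tell nlg-eval where to find its models and data. For example: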

 NLGEVAL_DATA=~/workspace/nlg-eval/nlgeval/data

This variable overrides the value provided during setup (stored in ~/.config/nlgeval/rc.json).
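The precedence described here can be sketched as a small lookup: the environment variable first, then the stored setup value, then the default cache path. This is an illustration of the order only, not nlg-eval's own implementation. The function name `resolve_data_dir` is hypothetical, and the JSON key `"data_path"` is an assumption about the layout of rc.json.

```python
import json
import os
from pathlib import Path


def resolve_data_dir(environ=None,
                     rc_path="~/.config/nlgeval/rc.json",
                     default="~/.cache/nlgeval"):
    """Return the data directory: NLGEVAL_DATA, else rc.json, else the default."""
    environ = os.environ if environ is None else environ
    override = environ.get("NLGEVAL_DATA")
    if override:
        return Path(override).expanduser()
    rc_file = Path(rc_path).expanduser()
    if rc_file.is_file():
        with rc_file.open(encoding="utf-8") as handle:
            config = json.load(handle)
        # "data_path" is an assumed key name, used here for illustration only.
        if "data_path" in config:
            return Path(config["data_path"]).expanduser()
    return Path(default).expanduser()
```

Passing a plain dictionary as `environ` makes the lookup easy to exercise without touching the real environment.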

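Microsoft Open Source Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

License

See the license file.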

Additional information

Version: 2.4.1
Type: Other source code
Updated: 2025-04-18
Size: 94.33 MB
Source: GitHub