simple effective text matching pytorch

RE2

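This is a PyTorch implementation of the ACL 2019 paper "Simple and Effective Text Matching with Richer Alignment Features". The original TensorFlow implementation: https://github.com/alibaba-edu/simple-effective-text-matching.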

Quick Links

About
Setup
Usage

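Simple and Effective Text Matching

RE2 is a fast and strong neural architecture for general-purpose text matching applications. In a text matching task, a model takes two text sequences as input and predicts their relationship. This method aims to explore what is sufficient for strong performance in these tasks. It simplifies many slow components that were previously considered core building blocks in text matching, while keeping three key features directly available for inter-sequence alignment: original point-wise features, previous aligned features, and contextual features.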
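As a rough illustration of what aligning two sequences can look like, the sketch below uses plain dot-product attention over per-token feature vectors. It is not taken from this repository: the helper names (`token_features`, `align`), the choice of dot-product attention, and all numbers are assumptions made for the example, and the paper should be consulted for the actual layers.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def token_features(embedding, previous_aligned, contextual):
    """Concatenate the three feature groups for one token (hypothetical helper)."""
    return embedding + previous_aligned + contextual


def align(a, b):
    """For every vector in `a`, return the attention-weighted average of `b`."""
    dim = len(b[0])
    aligned = []
    for u in a:
        weights = softmax([dot(u, v) for v in b])
        aligned.append([sum(w * v[k] for w, v in zip(weights, b)) for k in range(dim)])
    return aligned


# Toy two-token sequences; every number here is made up for illustration.
seq_a = [token_features([1.0], [0.0], [0.5]), token_features([0.0], [1.0], [0.5])]
seq_b = [token_features([1.0], [0.0], [0.0]), token_features([0.0], [1.0], [0.0])]
a_aligned = align(seq_a, seq_b)  # one vector per token in seq_a
b_aligned = align(seq_b, seq_a)  # one vector per token in seq_b
```

Each token in one sequence ends up with a summary of the other sequence, weighted by similarity. Concatenating the three feature groups before alignment is what makes all of them directly visible to the similarity computation.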

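RE2 achieves performance on par with the state of the art on four benchmark datasets (SNLI, SciTail, Quora and WikiQA) across the tasks of natural language inference, paraphrase identification and answer selection, with no or few task-specific adaptations. Its inference is at least 6 times faster than that of similarly performing models.

The following table lists the major experiment results. The paper reports the average and standard deviation of 10 runs. Inference time (in seconds) is measured by processing a batch of 8 pairs of length 20 on Intel i7 CPUs. The computation time of the POS features used by CSRAN and DIIN is not included.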

Model	SNLI	SciTail	Quora	WikiQA	Inference Time
BiMPM	86.9	-	88.2	0.731	0.05
ESIM	88.0	70.6	-	-	-
DIIN	88.0	-	89.1	-	1.79
CSRAN	88.7	86.7	89.2	-	0.28
RE2	88.9±0.1	86.0±0.6	89.2±0.2	0.7618±0.0040	0.03~0.05

Refer to the paper for more details on the components and experiment results.
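Entries such as 88.9±0.1 summarize repeated runs as a mean and a standard deviation. A minimal sketch of that aggregation follows. The function name `summarize` and the scores are hypothetical, and whether the paper uses the sample or the population standard deviation is not stated here, so the sample version is an assumption.

```python
import statistics


def summarize(scores, digits=1):
    """Format repeated-run scores as 'mean±std'."""
    mean = statistics.mean(scores)
    # Sample standard deviation (n - 1 in the denominator); this is an assumption.
    # Use statistics.pstdev for the population version instead.
    std = statistics.stdev(scores)
    return f"{mean:.{digits}f}±{std:.{digits}f}"


# Made-up accuracies from 10 hypothetical runs, not real results.
runs = [88.8, 88.9, 89.0, 88.9, 88.7, 89.0, 88.9, 88.8, 89.1, 88.9]
print(summarize(runs))
```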

Setup

Install Python >= 3.6 and pip
pip install -r requirements.txt
Install PyTorch
Download the GloVe word vectors (glove.840B.300d) to resources/
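For orientation, GloVe text files store one token per line followed by its vector components, separated by spaces. The sketch below reads that format. It is not the loader used by this repository, and the function name `load_vectors` and the file name in the usage note are assumptions. Splitting from the right guards against tokens that themselves contain spaces, which is reported for some lines of the 840B release.

```python
import io


def load_vectors(lines, dim=300, vocab=None):
    """Parse GloVe-style text lines into a {token: [float, ...]} dict."""
    vectors = {}
    for line in lines:
        # Split from the right so a token containing spaces stays in one piece.
        parts = line.rstrip("\n").rsplit(" ", dim)
        if len(parts) != dim + 1:
            continue  # skip lines that do not carry exactly `dim` components
        token = parts[0]
        if vocab is not None and token not in vocab:
            continue  # keep only the words that are actually needed
        vectors[token] = [float(x) for x in parts[1:]]
    return vectors


# Tiny in-memory stand-in for the real file, using 3 dimensions instead of 300.
sample = io.StringIO("hello 0.1 0.2 0.3\n. . . 1 2 3\nbroken 0.1\n")
demo = load_vectors(sample, dim=3)
print(sorted(demo))
```

With the real vectors, the same function could be called on an open file handle, for example `open("resources/glove.840B.300d.txt", encoding="utf-8")`, where the `.txt` file name is an assumption. Passing a `vocab` set keeps memory use down, since the full file is large.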

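The data used in the paper are prepared as follows:

SNLI

Download and unzip SNLI (pre-processed by Tay et al.) to data/orig.
Decompress all the .gz files in the "data/orig/SNLI" folder. (cd data/orig/SNLI && gunzip *.gz)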
cd data && python prepare_snli.py

SciTail

Download and unzip the SciTail dataset to data/orig.
cd data && python prepare_scitail.py

Quora

Download and unzip the Quora dataset (pre-processed by Wang et al.) to data/orig.
cd data && python prepare_quora.py

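WikiQA

Download and unzip WikiQA to data/orig.
cd data && python prepare_wikiqa.py
Download and unzip the evaluation scripts. Use the make -B command to compile the source files in qg-emnlp07-data/eval/trec_eval-8.0. Move the binary file "trec_eval" to resources/.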
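Before running the preparation scripts, it can help to confirm that the downloads landed where the steps above expect them. The following is a small pre-flight sketch, not part of the repository. The function name `missing_paths` is hypothetical, and the GloVe file name with its `.txt` extension is an assumption.

```python
from pathlib import Path

# Paths taken from the setup steps; the GloVe file name is an assumption.
REQUIRED = [
    "resources/glove.840B.300d.txt",
    "data/orig",
    "resources/trec_eval",  # only needed for WikiQA evaluation
]


def missing_paths(root, required=REQUIRED):
    """Return the required paths that do not exist under `root`."""
    root = Path(root)
    return [p for p in required if not (root / p).exists()]


if __name__ == "__main__":
    for path in missing_paths("."):
        print("missing:", path)
```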

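Usage

To train a new text matching model, run the following command:

python train.py $config_file.json5

Example configuration files are provided in configs/:

configs/main.json5: replicates the main experiment results in the paper.
configs/robustness.json5: robustness checks
configs/ablation.json5: ablation study

Instructions for writing your own configuration files: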

[
    {
        name: 'exp1', // name of your experiment, can be the same across different data
        __parents__: [
            'default', // always put the default on top
            'data/quora', // data specific configurations in `configs/data`
            // 'debug', // use "debug" to quickly debug your code
        ],
        __repeat__: 5, // how many repetitions you want
        blocks: 3, // other configurations for this experiment
    },
    // multiple configurations are executed sequentially
    {
        name: 'exp2', // results under the same name will be overwritten
        __parents__: [
            'default',
            'data/quora',
        ],
        __repeat__: 5,
        blocks: 4,
    }
]
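One way to read the `__parents__` and `__repeat__` keys is as ordered inheritance followed by repetition. The sketch below illustrates that reading with plain dictionaries. It is not the repository's loader: the functions `resolve` and `expand` and the contents of `library` are hypothetical, and details such as parents that themselves declare parents are not handled.

```python
import copy


def resolve(experiment, library):
    """Merge parent configs in order, then apply the experiment's own keys."""
    merged = {}
    for parent in experiment.get("__parents__", []):
        # Later parents override earlier ones, so 'default' goes first.
        merged.update(copy.deepcopy(library[parent]))
    for key, value in experiment.items():
        if not key.startswith("__"):
            merged[key] = value  # experiment-level keys win over all parents
    return merged


def expand(experiments, library):
    """Return one (name, repetition index, config) tuple per scheduled run."""
    runs = []
    for experiment in experiments:
        config = resolve(experiment, library)
        for i in range(experiment.get("__repeat__", 1)):
            runs.append((config["name"], i, config))
    return runs


# Made-up parent configurations standing in for the files under configs/.
library = {
    "default": {"blocks": 2, "batch_size": 512},
    "data/quora": {"data_dir": "data/quora"},
}
experiments = [
    {"name": "exp1", "__parents__": ["default", "data/quora"], "__repeat__": 5, "blocks": 3},
    {"name": "exp2", "__parents__": ["default", "data/quora"], "__repeat__": 5, "blocks": 4},
]
runs = expand(experiments, library)
print(len(runs), runs[0][2])
```

Under this reading, the two experiments above would schedule ten runs in total, executed one after another.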

To check the configurations only, use

python train.py $config_file.json5 --dry

To evaluate an existing model, use python evaluate.py $model_path $data_file. Here is an example:

python evaluate.py models/snli/benchmark/best.pt data/snli/train.txt 
python evaluate.py models/snli/benchmark/best.pt data/snli/test.txt

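Note that multi-GPU training is not yet supported in the PyTorch implementation. A single 16G GPU is sufficient for training when blocks < 5 with hidden size 200 and batch size 512. All the results reported in the paper except the robustness checks can be reproduced with a single 16G GPU.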

Citation

Please cite the ACL paper if you use RE2 in your work:

@inproceedings{yang2019simple,
  title={Simple and Effective Text Matching with Richer Alignment Features},
  author={Yang, Runqi and Zhang, Jianhai and Gao, Xing and Ji, Feng and Chen, Haiqing},
  booktitle={Association for Computational Linguistics (ACL)},
  year={2019}
}

License

This project is under Apache License 2.0.
