fewshot textclassification
1.0.0
Few-shot text classification using the SetFit method.
Edit: I also ran some experiments with active learning, so there is now an active.py as well. I will organize it better one fine day.
```
~/Dev/projects/setfit$ python main.py --help
Usage: main.py [OPTIONS]

Options:
  -d, --dataset-name TEXT         The name of the dataset as it appears on
                                  the HuggingFace hub e.g. SetFit/SentEval-CR
                                  | SetFit/bbc-news | SetFit/enron_spam ...
  -c, --case INTEGER              0, 1, 2, or 3: which experiment are we
                                  running. See readme or docstrings to know
                                  more but briefly: **0**: SentTF ->
                                  Contrastive Pretrain -> +LogReg on task.
                                  **1**: SentTF -> +Dense on task. **2**:
                                  SentTF -> +LogReg on task. **3**:
                                  FewShotPrompting based Clf over Flan-t5-xl
                                  [required]
  -r, --repeat INTEGER            The number of times we should run the
                                  entire experiment (changing the seed).
  -bs, --batch-size INTEGER       ... you know what it is.
  -ns, --num-sents INTEGER        Size of our train set. Set short values
                                  (under 100).
  -e, --num-epochs INTEGER        Epochs for fitting Clf+SentTF on the main
                                  (classification) task.
  -eft, --num-epochs-finetune INTEGER
                                  Epochs for the contrastive pretraining of
                                  SentTF.
  -ni, --num-iters INTEGER        Number of text pairs to generate for
                                  contrastive learning. Values above 20 can
                                  get expensive to train.
  -tot, --test-on-test            If true, we report metrics on the test set.
                                  If not, on a 20% split of the train set.
                                  Off by default.
  -ft, --full-test                We truncate the test set of every dataset
                                  to have 100 instances. If you know what
                                  you're doing, you can test on the full
                                  dataset. NOTE that if you're running this
                                  in case 3 you should probably be a premium
                                  member and not be paying per use.
  --help                          Show this message and exit.
```
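For orientation, case 0 is the standard SetFit recipe: sample sentence pairs from the few labeled examples, fine-tune a sentence-transformer contrastively on them, then fit a logistic-regression head on the resulting embeddings. Below is a minimal sketch of that idea, assuming sentence-transformers + scikit-learn; it is not the actual code in main.py, and the model name, pair sampling and hyperparameters are placeholders.

```python
# Minimal sketch of the case-0 recipe: SentTF -> contrastive pretrain -> +LogReg.
# Not the repo's code; model name, pair sampling and hyperparameters are illustrative.
import random

from sentence_transformers import SentenceTransformer, InputExample, losses
from sklearn.linear_model import LogisticRegression
from torch.utils.data import DataLoader

def make_pairs(texts, labels, num_iters=20):
    """Sample text pairs; label 1.0 when the two classes match, else 0.0."""
    pairs = []
    for _ in range(num_iters):
        (t1, y1), (t2, y2) = random.sample(list(zip(texts, labels)), 2)
        pairs.append(InputExample(texts=[t1, t2], label=float(y1 == y2)))
    return pairs

texts = ["great phone", "battery died in a day", "love the camera", "total waste of money"]
labels = [1, 0, 1, 0]

model = SentenceTransformer("sentence-transformers/paraphrase-mpnet-base-v2")

# 1. Contrastive pretraining of the sentence encoder on the generated pairs.
loader = DataLoader(make_pairs(texts, labels), batch_size=8, shuffle=True)
model.fit(train_objectives=[(loader, losses.CosineSimilarityLoss(model))], epochs=1)

# 2. Fit a logistic-regression head on the (now adapted) sentence embeddings.
clf = LogisticRegression().fit(model.encode(texts), labels)
print(clf.predict(model.encode(["the screen is gorgeous"])))
```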
Note: if you want to query an LLM hosted on HuggingFace (case 3), you must create an account on the HuggingFace Hub and generate an access token, then paste it into the file
./hf_token.key. PS: don't worry, I have already added this file to .gitignore.
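For case 3, that token is what gets sent along with the request to the hosted model. A rough sketch of such a call, assuming the serverless Inference API and a hand-rolled few-shot prompt (illustrative only, not necessarily how main.py builds its prompts):

```python
# Rough sketch of querying a HuggingFace-hosted LLM for classification (case 3).
# The prompt format and post-processing are illustrative, not main.py's exact code.
import requests

with open("./hf_token.key") as f:
    hf_token = f.read().strip()

API_URL = "https://api-inference.huggingface.co/models/google/flan-t5-xl"
headers = {"Authorization": f"Bearer {hf_token}"}

# A tiny few-shot prompt: labeled examples followed by the instance to classify.
prompt = (
    "Classify the review as positive or negative.\n"
    "Review: the battery barely lasts an hour. Label: negative\n"
    "Review: best purchase I have made this year. Label:"
)

response = requests.post(API_URL, headers=headers, json={"inputs": prompt})
response.raise_for_status()
print(response.json())  # e.g. [{"generated_text": "positive"}]
```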
```
$ python active.py --help
Usage: active.py [OPTIONS]

Options:
  -d, --dataset-name TEXT     The name of the dataset as it appears on the
                              HuggingFace hub e.g. SetFit/SentEval-CR |
                              SetFit/bbc-news | SetFit/enron_spam | imdb ...
  -ns, --num-sents INTEGER    Size of our train set. I.e., the dataset at the
                              END of AL. Not the start of it.
  -nq, --num-queries INTEGER  Number of times we query the unlabeled set and
                              pick some examples to label. Set short values
                              (under 10).
  -ft, --full-test            We truncate the test set of every dataset to
                              have 100 instances. If you know what you're
                              doing, you can test on the full dataset. NOTE
                              that if you're running this in case 3 you
                              should probably be a premium member and not be
                              paying per use.
  --help                      Show this message and exit.
```
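Under the hood this is the usual pool-based active-learning loop: start from a small labeled seed, train a cheap classifier, score the unlabeled pool, and move the examples the model is least sure about into the train set until it reaches --num-sents. Here is a sketch of that loop with least-confident sampling; the query strategy and the LogisticRegression probe are assumptions, active.py's actual choices may differ.

```python
# Sketch of a pool-based active-learning loop matching the --num-sents / --num-queries
# semantics above. Least-confident sampling and the LogisticRegression probe are
# assumptions for illustration, not necessarily what active.py does.
import numpy as np
from sklearn.linear_model import LogisticRegression

def active_learning_loop(X_pool, y_pool, num_sents=100, num_queries=10, seed_size=20):
    """X_pool: (n, d) features (e.g. sentence embeddings); y_pool: oracle labels."""
    rng = np.random.default_rng(0)
    labeled = list(rng.choice(len(X_pool), size=seed_size, replace=False))
    per_query = (num_sents - seed_size) // num_queries  # examples labeled per round

    for _ in range(num_queries):
        clf = LogisticRegression(max_iter=1000).fit(X_pool[labeled], y_pool[labeled])
        unlabeled = np.setdiff1d(np.arange(len(X_pool)), labeled)
        probs = clf.predict_proba(X_pool[unlabeled])
        # Least-confident sampling: query the points whose top-class probability is lowest.
        picked = unlabeled[np.argsort(probs.max(axis=1))[:per_query]]
        labeled.extend(picked.tolist())

    return labeled  # indices of the num_sents examples we ended up labeling

# Toy usage with random features standing in for embeddings.
X = np.random.default_rng(1).normal(size=(500, 16))
y = (X[:, 0] > 0).astype(int)
print(len(active_learning_loop(X, y)))  # -> 100
```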
Alternatively, after installing the libraries in requirements.txt, you can simply run ./run.sh.
After that, you can run the notebook summarise.ipynb to aggregate and visualize (if I've added that code) the results.
PS: note the --full-test flag. By default, we truncate every test set to its first 100 instances.

These are all classification datasets cleaned up by the kind folks who made the SetFit lib. But you can use any HF dataset, as long as it has these three fields: (i) text (str), (ii) label (int), and (iii) label_text (str).
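To check whether a dataset will work, it is enough to load it and look at those columns; the snippet below also shows the default 100-instance test truncation mentioned above. The dataset name is one of the examples from the option help, the rest is illustrative.

```python
# Quick check that a HuggingFace dataset has the three expected columns, plus the
# default 100-instance test truncation described above.
from datasets import load_dataset

dataset = load_dataset("SetFit/enron_spam")

assert {"text", "label", "label_text"} <= set(dataset["train"].column_names)
print(dataset["train"][0])  # -> {'text': ..., 'label': 0 or 1, 'label_text': ...}

# Unless --full-test is passed, only the first 100 test instances are evaluated.
small_test = dataset["test"].select(range(100))
print(len(small_test))  # -> 100
```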
Here are my results:
The table below presents the results for this setup (+ active learning). Unless noted otherwise, we repeat every experiment 5 times. The numbers report task accuracy when we have only 100 instances in the train set.
|  | BBC News | SST2 | SentEval-CR | IMDB | Enron_spam |
|---|---|---|---|---|---|
| SetFit FT | 0.978±0.004 | 0.860±0.018 | 0.882±0.029 | 0.924±0.026 | 0.960±0.017 |
| SetFit FT without contrastive | 0.932±0.015 | 0.854±0.019 | 0.886±0.005 | 0.902±0.019 | 0.942±0.020 |
| Regular FT | 0.466±0.133 | 0.628±0.098 | 0.582±0.054 | 0.836±0.166 | 0.776±0.089 |
| LLM Prompting | 0.950±0.000 | 0.930±0.000 | 0.900±0.000 | 0.930±0.000 | 0.820±0.000 |
| Constrained | 0.980±0.000 | 0.910±0.000 | 0.910±0.000 | 0.870±0.000 | 0.980±0.000 |
[1]: LLM prompting is done with only 10 instances (the actual prompt may end up shorter than that). It is also not repeated across different seeds.
[2]: Constrained is also not repeated across different seeds.