Official repository of the paper "How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection". Please star, watch, and fork our repo for active updates!
See also: the Feedback Space for Detectors. Please feel free to leave your feedback there!

We propose the first Human vs. ChatGPT comparison corpus, named HC3.

The first version of the HC3 dataset is now available on Hugging Face Datasets. For the Chinese community, the HC3 dataset is also available on ModelScope.

For the train/test splits and the filtered versions used in the paper, refer to the Google Drive links in HC3/README.md.
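
As a rough illustration of how the corpus can be turned into detector training data, the sketch below flattens one HC3-style record into labeled question-answer rows. The field names `question`, `human_answers`, and `chatgpt_answers`, the label ids, and the toy record are assumptions made for this example; check them against the dataset you actually download.

```python
from typing import Any, Dict, Iterator, Tuple

# Hypothetical label ids chosen for this sketch.
HUMAN, CHATGPT = 0, 1


def flatten_record(record: Dict[str, Any]) -> Iterator[Tuple[str, str, int]]:
    """Yield (question, answer, label) triples from one HC3-style record.

    Assumes the keys "question", "human_answers" and "chatgpt_answers";
    this schema is an assumption, not something stated in this README.
    """
    question = str(record["question"])
    for answer in record.get("human_answers", []):
        yield question, str(answer), HUMAN
    for answer in record.get("chatgpt_answers", []):
        yield question, str(answer), CHATGPT


# Toy record, invented purely for illustration.
example = {
    "question": "Why is the sky blue?",
    "human_answers": ["Because of Rayleigh scattering.", "Short waves scatter more."],
    "chatgpt_answers": ["The sky appears blue due to the scattering of sunlight."],
}

rows = list(flatten_record(example))
for question, answer, label in rows:
    print(label, "|", question, "|", answer)
```

Keeping the question next to each answer lets the same rows feed either a single-text detector (use only the answer) or a question-answer detector (use both).
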

If a source dataset used in this corpus has a specific license that is stricter than CC-BY-SA, our products follow that license. Otherwise, they follow the CC-BY-SA license.

| English Split | Source | Source License | Note |
|---|---|---|---|
| reddit_eli5 | ELI5 | BSD License | |
| open_qa | WikiQA | PWC Custom | |
| wiki_csai | Wikipedia | CC-BY-SA | |
| medicine | Medical Dialog | Unknown | Asking |
| finance | FiQA | Unknown | Asking |

| Chinese Split | Source | Source License | Note |
|---|---|---|---|
| open_qa | WebTextQA & BaikeQA | MIT license | |
| baike | Baidu Baike | None | |
| nlpcc_dbqa | NLPCC-DBQA | Unknown | Asking |
| medicine | Chinese Medical Dialogue | CC-BY-NC 4.0 | |
| finance | FinanceZhidao | CC-BY 4.0 | |
| psychology | On Baidu AI Studio | CC0 | |
| law | LegalQA | Unknown | Asking |

(Hosted on Hugging Face Spaces)

We provide three kinds of detectors, all of which support both English and Chinese. All three are also available on the ModelScope platform for the Chinese community.

The model weights are all available on Hugging Face Models:

| Model Checkpoints | Comment |
|---|---|
| chatgpt-detector-roberta | To detect a single piece of text |
| chatgpt-qa-detector-roberta | To detect a question-answer pair |
| chatgpt-detector-roberta-chinese | To detect a single piece of text, Chinese version |
| chatgpt-qa-detector-roberta-chinese | To detect a question-answer pair, Chinese version |

The English models are based on roberta-base. The Chinese models are based on hfl/chinese-roberta-wwm-ext.

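
The checkpoint table above can be read as a small lookup: language plus input type (single text or question-answer pair) selects a model. The sketch below encodes that lookup and shows one possible way to run a checkpoint with the `transformers` text-classification pipeline. The `Hello-SimpleAI` namespace, the helper names, and the convention of passing the question as the first segment and the answer as the second are assumptions for illustration, not something this README confirms.

```python
from typing import Optional

# Assumed Hugging Face namespace; it is not stated in the text above.
HUB_ORG = "Hello-SimpleAI"

# (language, is_question_answer_pair) -> checkpoint name from the table above.
CHECKPOINTS = {
    ("en", False): "chatgpt-detector-roberta",
    ("en", True): "chatgpt-qa-detector-roberta",
    ("zh", False): "chatgpt-detector-roberta-chinese",
    ("zh", True): "chatgpt-qa-detector-roberta-chinese",
}


def pick_checkpoint(lang: str, qa: bool) -> str:
    """Return the (assumed) hub id for a language and input type."""
    key = (lang, qa)
    if key not in CHECKPOINTS:
        raise ValueError(f"no checkpoint for lang={lang!r}, qa={qa!r}")
    return f"{HUB_ORG}/{CHECKPOINTS[key]}"


def detect(answer: str, question: Optional[str] = None, lang: str = "en"):
    """Classify one text, or one question-answer pair if a question is given.

    Downloads the model on first use; requires the `transformers` package.
    """
    from transformers import pipeline  # imported lazily to keep the lookup dependency-free

    clf = pipeline("text-classification", model=pick_checkpoint(lang, question is not None))
    if question is None:
        return clf(answer, truncation=True)
    # Assumed convention: question as first segment, answer as second.
    return clf({"text": question, "text_pair": answer}, truncation=True)


print(pick_checkpoint("en", qa=False))
print(pick_checkpoint("zh", qa=True))
```

Calling `detect("some answer text")` would use the single-text English model, while `detect("some answer", question="some question", lang="zh")` would select the Chinese question-answer model.
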
| Events | Dates |
|---|---|
| Project Launch | 2022-12-09 ✅ |
| Comparison Data Collection | 2022-12-11 to now |
| Release ChatGPT Detector (Demo) | 2023-01-11 ✅ |
| Models Release | 2023-01-18 ✅ |
| Comparison Corpus Release | 2023-01-18 ✅ |
| Research Paper | 2023-01-19 ✅ |
| ... | ... |

Check out the paper: arXiv:2301.07597

@article{guo-etal-2023-hc3,
title = "How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection",
author = "Guo, Biyang and
Zhang, Xin and
Wang, Ziyuan and
Jiang, Minqi and
Nie, Jinran and
Ding, Yuxuan and
Yue, Jianwei and
Wu, Yupeng",
journal = "arXiv preprint arXiv:2301.07597",
year = "2023",
}

On December 9, 2022, the 10th day since the launch of ChatGPT, we started this project for two purposes:

You are welcome to follow our project! We have released a preview of our ChatGPT detectors, and the models and dataset will be open-sourced in about a week. We look forward to feedback from the community to help us improve the models and contribute to open academic research together :)

We are a group of insignificant researchers (in the shadow of ChatGPT) hoping to do some significant work for the community. The team for this project consists of PhD students and engineers from 6 universities/companies.

| Biyang Guo | Minqi Jiang | Ziyuan Wang | Xin Zhang |
| Jinran Nie | Yuxuan Ding | Jianwei Yue | Yupeng Wu |