Official repository of the paper "How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection". Please star, watch, and fork our repo for active updates!
See also: Feedback Space for Detectors. Please feel free to leave your feedback there!

We propose the first Human vs. ChatGPT comparison corpus, named HC3.

The first version of the HC3 dataset is now available on Hugging Face Datasets:
For the Chinese community, the HC3 dataset is also available on ModelScope:
For the train/test splits and filtered versions used in the paper, refer to the Google Drive links in HC3/README.md.
If a source dataset used in this corpus has a specific license that is stricter than CC-BY-SA, our products follow that license. Otherwise, they follow the CC-BY-SA license.
| English Split | Source | Source License | Note |
|---|---|---|---|
| reddit_eli5 | ELI5 | BSD License | |
| open_qa | WikiQA | PWC Custom | |
| wiki_csai | Wikipedia | CC-BY-SA | |
| medicine | Medical Dialog | Unknown | Asking |
| finance | FiQA | Unknown | Asking |
| Chinese Split | Source | Source License | Note |
|---|---|---|---|
| open_qa | WebTextQA & BaikeQA | MIT license | |
| baike | Baidu Baike | None | |
| nlpcc_dbqa | NLPCC-DBQA | Unknown | Asking |
| medicine | Chinese Medical Dialogue | CC-BY-NC 4.0 | |
| finance | FinanceZhidao | CC-BY 4.0 | |
| psychology | On Baidu AI Studio | CC0 | |
| law | LegalQA | Unknown | Asking |
(Hosted on Hugging Face Spaces)
We provide three kinds of detectors, all bilingual (English and Chinese):
On ModelScope, the Chinese community platform, all three kinds of detectors are also available:
The model weights are all available on Hugging Face Models:
| Model Checkpoints | Comment |
|---|---|
| chatgpt-detector-roberta | To detect a single piece of text |
| chatgpt-qa-detector-roberta | To detect a question-answer pair |
| chatgpt-detector-roberta-chinese | Detect single text, Chinese version |
| chatgpt-qa-detector-roberta-chinese | Detect a pair of QA text, Chinese version |
The English models are based on roberta-base. The Chinese models are based on hfl/chinese-roberta-wwm-ext.
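
As an illustration of how the four checkpoints in the table map to use cases, here is a minimal sketch that picks a checkpoint by language and input type, then loads it with the `transformers` text-classification pipeline. The helper names (`pick_checkpoint`, `load_detector`) and the `namespace` argument are hypothetical: this section does not state which Hugging Face account hosts the weights, so the caller has to supply it, and the label names returned by the pipeline are not documented here.

```python
# Checkpoint names copied from the table above, keyed by (language, is_qa_pair).
CHECKPOINTS = {
    ("en", False): "chatgpt-detector-roberta",
    ("en", True): "chatgpt-qa-detector-roberta",
    ("zh", False): "chatgpt-detector-roberta-chinese",
    ("zh", True): "chatgpt-qa-detector-roberta-chinese",
}


def pick_checkpoint(language: str, is_qa_pair: bool) -> str:
    """Return the checkpoint name for a language ("en" or "zh") and input type."""
    try:
        return CHECKPOINTS[(language, is_qa_pair)]
    except KeyError:
        raise ValueError(f"unsupported language: {language!r}") from None


def load_detector(namespace: str, language: str, is_qa_pair: bool):
    """Build a text-classification pipeline for the chosen checkpoint.

    `namespace` is the Hugging Face account hosting the weights. It is not
    stated in this section, so it is an assumption the caller must fill in.
    """
    # Imported lazily so the selection helper works without transformers installed.
    from transformers import pipeline

    model_id = f"{namespace}/{pick_checkpoint(language, is_qa_pair)}"
    return pipeline("text-classification", model=model_id)
```

Calling `load_detector(namespace, "en", False)` would download the single-text English checkpoint on first use. Keeping the selection logic separate from the loading step means the mapping can be checked without network access.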
| Events | Dates |
|---|---|
| Project Launch | 2022-12-09 ✅ |
| Comparison Data Collection | 2022-12-11 to now |
| ChatGPT Detector Release (Demo) | 2023-01-11 ✅ |
| Models Release | 2023-01-18 ✅ |
| Comparison Corpus Release | 2023-01-18 ✅ |
| Research Paper Release | 2023-01-19 ✅ |
| ... | ... |
Check out the paper on arXiv: 2301.07597
@article{guo-etal-2023-hc3,
title = "How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection",
author = "Guo, Biyang and
Zhang, Xin and
Wang, Ziyuan and
Jiang, Minqi and
Nie, Jinran and
Ding, Yuxuan and
Yue, Jianwei and
Wu, Yupeng",
journal = "arXiv preprint arXiv:2301.07597",
year = "2023",
}
On December 9, 2022, 10 days after the launch of ChatGPT, we started this project for two purposes:
Welcome to follow our project! We have released a preview of our ChatGPT detectors, and the models and dataset will be open-sourced in about a week. We look forward to receiving feedback from the community to help improve the models and to contribute to open academic research together :)
We are a group of insignificant researchers (in the shadow of ChatGPT) hoping to do some significant work for the community. The team for this project consists of PhD students and engineers from 6 universities/companies.
| Biyang Guo | Minqi Jiang | Ziyuan Wang | Xin Zhang |
| Jinran Nie | Yuxuan Ding | Jianwei Yue | Yupeng Wu |