
In recent years, the growth of online social media has made it far easier for people to communicate with each other, and also for fake news to spread. The basic countermeasure of checking websites against a list of labeled fake news sources is inflexible, because it cannot judge articles from sources that are not on the list. A machine learning approach is therefore desirable. Our project uses Natural Language Processing to detect fake news directly from the text content of news articles.
The goal is to develop a machine learning program that identifies when an article might be fake news. We use a corpus of labeled real and fake news articles to train a classifier that judges an article based on patterns learned from the text in that corpus.
train.csv: A full training dataset with the following attributes:

- id: unique id for a news article
- title: the title of a news article
- author: author of the news article
- text: the text of the article; could be incomplete
- label: a label that marks the article as potentially unreliable
  - 1: unreliable
  - 0: reliable

test.csv: A test dataset with the same attributes as train.csv, but without the label.
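
As a rough illustration of the layout described above, the sketch below parses rows with the train.csv columns using only Python's standard `csv` module. The sample rows, the `load_articles` helper, and its `has_label` flag are hypothetical and exist only for this example; the repository itself may well load the files with pandas instead.

```python
import csv
import io

# Tiny in-memory stand-in for train.csv; these rows are invented for illustration.
SAMPLE_TRAIN_CSV = """id,title,author,text,label
0,Budget vote delayed,Jane Roe,The council postponed the vote until next week.,0
1,Miracle cure found,,Doctors hate this one weird trick,1
2,Untitled,John Doe,,1
"""


def load_articles(handle, has_label=True):
    """Read rows in the train.csv / test.csv layout into a list of dicts."""
    articles = []
    for row in csv.DictReader(handle):
        article = {
            "id": int(row["id"]),
            "title": row["title"] or "",
            "author": row["author"] or "",
            # The text field may be empty or incomplete, so fall back to "".
            "text": row["text"] or "",
        }
        if has_label:
            # 1 = unreliable, 0 = reliable
            article["label"] = int(row["label"])
        articles.append(article)
    return articles


train_articles = load_articles(io.StringIO(SAMPLE_TRAIN_CSV))
print(len(train_articles), "articles loaded")
```

For the real files, passing `open("train.csv", newline="", encoding="utf-8")` as `handle` should work, assuming the files are UTF-8 encoded; for test.csv, call the helper with `has_label=False`.
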
Clone the repo to your local machine:
> git clone https://github.com/sanikamal/fake-news-detector.git
> cd fake-news-detector
Make sure you have all the dependencies installed:

- Python 3.6+
- numpy
- pandas
- matplotlib
- scikit-learn (imported as sklearn)
- nltk

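
One way to check the environment before running anything is a short script like the following. It is only a sketch: the `missing_modules` and `python_is_supported` helpers are hypothetical names introduced here, and the script checks that each package can be found, not which version is installed.

```python
import importlib.util
import sys

# Import names of the packages listed above (scikit-learn is imported as "sklearn").
REQUIRED_MODULES = ["numpy", "pandas", "matplotlib", "sklearn", "nltk"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_supported(version_info=sys.version_info):
    """Check the interpreter against the Python 3.6+ requirement."""
    return tuple(version_info[:2]) >= (3, 6)


if not python_is_supported():
    print("Python 3.6 or newer is required")

missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed dependencies were found")
```
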
| Model | Accuracy |
|---|---|
| Logistic Regression | 72.94% |
| MultinomialNB | 88.42% |
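
A comparison like the one in the table could be set up roughly as follows. This is a minimal sketch, not the repository's actual code: the TF-IDF features, the `evaluate` helper, and the toy articles are all assumptions made for illustration, so it will not reproduce the accuracies reported above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented toy data standing in for the text and label columns
# (1 = unreliable, 0 = reliable).
train_texts = [
    "shocking miracle cure doctors hate this secret trick",
    "you will not believe this shocking celebrity secret",
    "secret plot exposed shocking truth they hide from you",
    "city council approves budget for road repairs",
    "central bank holds interest rates steady this quarter",
    "university publishes annual report on student enrollment",
]
train_labels = [1, 1, 1, 0, 0, 0]
test_texts = [
    "shocking secret trick exposed",
    "council publishes annual budget report",
]
test_labels = [1, 0]


def evaluate(classifier):
    """Fit TF-IDF features plus the given classifier, then score on the held-out texts."""
    model = make_pipeline(TfidfVectorizer(), classifier)
    model.fit(train_texts, train_labels)
    return accuracy_score(test_labels, model.predict(test_texts))


results = {
    "Logistic Regression": evaluate(LogisticRegression(max_iter=1000)),
    "MultinomialNB": evaluate(MultinomialNB()),
}
for name, accuracy in results.items():
    print(f"{name}: {accuracy:.2%}")
```

Wrapping the vectorizer and classifier in one pipeline keeps the vocabulary fitted on the training texts only, which avoids leaking test data into the features.
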
