Sentiment Analysis with Logistic Regression

This repository contains a Jupyter notebook and the data needed to implement sentiment analysis of tweets using logistic regression. Please open the notebook for more information.
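As a rough illustration of the idea, here is a minimal sketch of logistic regression over bag-of-words counts, written in plain Python. It is not the notebook's code: the toy sentences, the `tokenize`, `featurize`, `train`, and `predict_proba` helpers, the learning rate, and the number of epochs are all assumptions made for this example, and the notebook most likely relies on scikit-learn instead.

```python
import math
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def featurize(text, vocab):
    """Map a text to a bag-of-words count vector over `vocab`."""
    vec = [0.0] * len(vocab)
    for tok in tokenize(text):
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    return vec

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, lr=0.5, epochs=200):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this example: p(y=1|x) - y
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(text, vocab, w, b):
    """Return the estimated probability that `text` is positive."""
    x = featurize(text, vocab)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Toy training data, made up for illustration (1 = positive, 0 = negative).
texts = [
    "i love this great day",
    "what a wonderful happy morning",
    "i hate this awful day",
    "what a terrible sad morning",
]
labels = [1, 1, 0, 0]

# Build the vocabulary from the training texts only.
vocab = {}
for t in texts:
    for tok in tokenize(t):
        vocab.setdefault(tok, len(vocab))

X = [featurize(t, vocab) for t in texts]
w, b = train(X, labels)

p_pos = predict_proba("such a happy great day", vocab, w, b)
p_neg = predict_proba("such an awful sad day", vocab, w, b)
print(p_pos > 0.5, p_neg < 0.5)
```

A probability above 0.5 is read as positive and below 0.5 as negative, matching the 1/0 labels used in the dataset.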

The dataset

The dataset was obtained from a Kaggle competition. It is divided into a training set and a test set. Each record contains the following fields:

| Field name | Meaning |
|---|---|
| ItemID | ID of the tweet |
| Sentiment | Sentiment label (1 = positive, 0 = negative) |
| SentimentText | Text of the tweet |
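The snippet below sketches how records with these three fields could be parsed. It assumes a comma-separated file with a header row, which this README does not confirm; the sample rows are made up, and `load_records` is a hypothetical helper rather than something from the notebook. pandas, listed in the requirements, would work equally well.

```python
import csv
import io

# Made-up sample rows using the three documented fields. The real file
# name and delimiter are not stated in this README, so an in-memory
# CSV stands in for the dataset here.
sample = io.StringIO(
    "ItemID,Sentiment,SentimentText\n"
    "1,0,this has been such a sad week\n"
    "2,1,what a lovely morning\n"
)

def load_records(handle):
    """Parse rows into dicts with an int id, an int label, and the tweet text."""
    records = []
    for row in csv.DictReader(handle):
        records.append({
            "ItemID": int(row["ItemID"]),
            "Sentiment": int(row["Sentiment"]),
            "SentimentText": row["SentimentText"].strip(),
        })
    return records

records = load_records(sample)
positives = [r for r in records if r["Sentiment"] == 1]
print(len(records), len(positives))
```

To read an actual file, pass an open file handle (opened with `newline=""`, as the `csv` module recommends) instead of the `io.StringIO` sample.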

Web app

To try the classifier right away, use the small web app included in this repository. Run:

cd site
python app.py

Then open a browser at the default address (http://127.0.0.1:5000/) and try it out.

Requirements

The notebook runs on Python 3.5 or later. The following packages are required:

  • bokeh
  • flask
  • nltk
  • numpy
  • pandas
  • scikit-learn

Limitations

Because the training set contains only English tweets, this classifier only works on English tweets.
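One way to see why, assuming the classifier uses word-count features built from the training vocabulary (an assumption, since this README does not describe the features): words that never appeared in training contribute nothing to the feature vector. The sketch below uses made-up sentences and hypothetical `build_vocab` and `vectorize` helpers to show a non-English tweet collapsing to an all-zero vector.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())

def build_vocab(texts):
    """Assign an index to every distinct token seen in the training texts."""
    vocab = {}
    for text in texts:
        for tok in tokenize(text):
            vocab.setdefault(tok, len(vocab))
    return vocab

def vectorize(text, vocab):
    """Count known tokens; tokens outside the vocabulary are ignored."""
    vec = [0] * len(vocab)
    for tok in tokenize(text):
        if tok in vocab:
            vec[vocab[tok]] += 1
    return vec

# Made-up English training texts for illustration.
english_training = ["i love this song", "this movie was awful"]
vocab = build_vocab(english_training)

english = vectorize("i love this movie", vocab)
spanish = vectorize("me encanta esta película", vocab)

print(sum(english), sum(spanish))
```

With an all-zero vector, a linear model can only fall back on its bias term, so its output for such a tweet carries no information about the tweet's actual sentiment.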