Twitter Sentiment Analysis with Lambda Architecture

About the Project

Sentiment analysis, or opinion mining, is the process of determining whether a block of text expresses a positive or negative reaction to something. The goal of this project is to present a working Lambda Architecture that performs sentiment analysis on tweets matching specific keywords.
The Lambda Architecture is implemented with Apache Hadoop for the Batch Layer, Apache Storm for the Speed Layer and Apache HBase for the Serving Layer.
The stream of tweets is reproduced using the Twitter API, through the Twitter4J library.
A GUI, made with JavaFX, is provided to make the project easier to use. LingPipe is used to process the tweets.
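
As a rough illustration of how the three layers fit together, the sketch below shows the query-time merge that a Lambda Architecture typically performs: the batch view (complete but stale) is combined with the real-time view (only the tweets that arrived after the last batch run). All class and method names here are hypothetical and do not come from this repository. Plain in-memory maps stand in for the HBase tables, and the sketch assumes the two views never count the same tweet twice.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from the repository.
final class LambdaMergeSketch {

    /** Positive and negative tweet counts for one keyword. */
    static final class SentimentCount {
        final long positive;
        final long negative;

        SentimentCount(long positive, long negative) {
            this.positive = positive;
            this.negative = negative;
        }

        SentimentCount plus(SentimentCount other) {
            return new SentimentCount(positive + other.positive, negative + other.negative);
        }
    }

    private static final SentimentCount ZERO = new SentimentCount(0, 0);

    /**
     * Answers a query by adding the batch view to the real-time view.
     * Assumes the real-time view holds only tweets not yet covered by the batch view.
     */
    static SentimentCount query(String keyword,
                                Map<String, SentimentCount> batchView,
                                Map<String, SentimentCount> speedView) {
        SentimentCount batch = batchView.getOrDefault(keyword, ZERO);
        SentimentCount speed = speedView.getOrDefault(keyword, ZERO);
        return batch.plus(speed);
    }

    public static void main(String[] args) {
        // In-memory maps stand in for the serving-layer tables.
        Map<String, SentimentCount> batchView = new HashMap<>();
        batchView.put("java", new SentimentCount(120, 45));
        Map<String, SentimentCount> speedView = new HashMap<>();
        speedView.put("java", new SentimentCount(3, 1));

        SentimentCount total = query("java", batchView, speedView);
        System.out.println("java: +" + total.positive + " / -" + total.negative);
    }
}
```

A keyword missing from either view contributes zero, so a query still returns a result when the batch layer has not yet processed any tweets for it.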

Built with

Apache Hadoop (3.2.1): the Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.
Apache Storm (2.1.0): Apache Storm is a free and open source distributed realtime computation system.
Apache HBase (2.3.4): Apache HBase is an open-source, distributed, versioned, non-relational database.
Twitter4J: Twitter4J is an unofficial Java library for the Twitter API. With Twitter4J, you can easily integrate your Java application with the Twitter service.
LingPipe (4.1.0): LingPipe is a toolkit for processing text using computational linguistics.
JavaFX: JavaFX is an open source, next generation client application platform for desktop, mobile and embedded systems built on Java.

Datasets

Sentiment140: the Sentiment140 dataset contains 1,600,000 tweets extracted using the Twitter API.
FullCorpus

Usage

To run the code, you need to get your own Twitter Developer credentials and put them in the placeholder text file in the repo. Next, start the servers by running Apache Hadoop, Storm and HBase in turn.
Then run ClassifierLambdaArchitecture to train and store the model required by the Lambda Architecture. Before running it, set the dataset paths and the file in which the classifier model will be stored.
Finally, execute the classes in the following order:

Topology (pass the keywords for the query as args)
BatchDriver
GUILauncher
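
Before launching the classes above, it can help to check that the credentials file is complete. The following is a minimal sketch that assumes the file uses a key=value layout with the key names from Twitter4J's twitter4j.properties convention. The actual layout of the placeholder file in this repo may differ, and the class name CredentialsCheck is hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical sketch: the key=value layout is an assumption, not the repository's format.
final class CredentialsCheck {

    // Key names follow Twitter4J's twitter4j.properties convention.
    static final String[] REQUIRED_KEYS = {
        "oauth.consumerKey",
        "oauth.consumerSecret",
        "oauth.accessToken",
        "oauth.accessTokenSecret"
    };

    /** Parses key=value pairs from the given source. */
    static Properties load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns the required keys that are absent or blank. */
    static List<String> missingKeys(Properties props) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED_KEYS) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // A string keeps the demo self-contained; in practice pass a FileReader for the credentials file.
        String demo = "oauth.consumerKey=abc\noauth.consumerSecret=def\n";
        List<String> missing = missingKeys(load(new StringReader(demo)));
        System.out.println("Missing keys: " + missing);
    }
}
```

An empty list means all four credentials are present, so the Topology can be started without failing at authentication time for a missing value.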

Authors

Lorenzo Gianassi

Acknowledgments

Parallel Computing Project © Course held by Professor Marco Bertini - Computer Engineering Master Degree @University of Florence
