Real-Time Analysis of Twitter hashtags with Apache Storm

SunNEET/Storm-Twitter-Hashtags

Storm-Twitter-Hashtags

This is a course project from Udacity: a stream processing pipeline that performs real-time analysis of the top hashtags on Twitter, using Apache Storm.

Demo

Requirements

  • Vagrant - virtual environment manager.
  • Oracle VM VirtualBox - general-purpose virtualizer.
  • An SSH client, such as PuTTY.
  • Java 8 or newer - older versions fail with an "Unsupported major.minor version" error.
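
The Java requirement can be checked before building. The following is a minimal sketch, not part of this repository, that reads the major version out of the text printed by `java -version`. The function names are hypothetical, and the sketch assumes the usual `version "..."` line in that output and a Python 3 interpreter.

```python
import re


def extract_version(java_version_output):
    """Pull the quoted version (e.g. 1.8.0_292) out of `java -version` output."""
    match = re.search(r'version "([^"]+)"', java_version_output)
    if match is None:
        raise ValueError("no quoted version found in the output")
    return match.group(1)


def java_major_version(version_string):
    """Return the major Java version for strings like '1.8.0_292' or '11.0.2'."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version_string.strip())
    if match is None:
        raise ValueError("unrecognized Java version: %r" % version_string)
    first = int(match.group(1))
    second = match.group(2)
    # Before Java 9 the scheme was 1.<major>, so 1.8.0_292 means Java 8.
    if first == 1 and second is not None:
        return int(second)
    return first


def meets_requirement(version_string, minimum=8):
    """True if the given version is at least the required major version."""
    return java_major_version(version_string) >= minimum
```

For example, passing the captured output of `java -version` through `extract_version` and then `meets_requirement` gives `False` for a Java 7 installation, which is the case that triggers the error mentioned above.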

Getting Started

  1. Spin up the VM: vagrant up
  2. SSH into the VM, using an SSH client or vagrant ssh
    2.1. cd ..
  3. Run the visualization web server
    3.1. Inside the VM: cd /vagrant/viz
    3.2. python app.py
  4. Package the topology
    4.1. Inside the VM (open a new SSH session): cd /vagrant
    4.2. mvn clean
    4.3. mvn package - this may take a while the first time.
  5. Execute the packaged topology
    5.1. Inside the VM: cd /vagrant
    5.2. storm jar target/storm-twitter-top-hashtags-0.0.1-SNAPSHOT-jar-with-dependencies.jar storm.TopNTweetTopology
  6. View the live results at http://127.0.0.1:5000.
  7. Shut down the VM: vagrant halt
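
If the results page in step 6 does not load, it can help to confirm that the visualization server from step 3 is answering at all. Below is a small sketch using only the Python 3 standard library. The helper `is_reachable` is hypothetical and not part of this repository; it only reports whether something responds at the URL, not whether the topology is producing data.

```python
import urllib.error
import urllib.request


def is_reachable(url, timeout=3.0):
    """Return True if an HTTP GET to url receives any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a malformed URL.
        return False
```

Calling `is_reachable("http://127.0.0.1:5000")` on the host should return `True` once `python app.py` is running inside the VM, assuming port 5000 is forwarded to the host as the URL in step 6 suggests.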

Topology

[Figure: topology diagram]
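
As a rough, Storm-free illustration of what a top-hashtags pipeline computes, the sketch below extracts hashtags from tweet texts and ranks them by count. It is not the project's actual spouts or bolts: the function names, the regular expression, and the sample tweets are all assumptions made for illustration, and a real topology would usually update the counts incrementally as tweets stream in rather than over a fixed list.

```python
import re
from collections import Counter

# Assumed definition of a hashtag: '#' followed by word characters.
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def extract_hashtags(tweet_text):
    """Return the lowercased hashtags found in one tweet."""
    return [tag.lower() for tag in HASHTAG_PATTERN.findall(tweet_text)]


def top_hashtags(tweets, n=3):
    """Count hashtags across all tweets and return the n most frequent."""
    counts = Counter()
    for text in tweets:
        counts.update(extract_hashtags(text))
    return counts.most_common(n)


# Made-up sample data, only to show the call.
sample_tweets = [
    "Learning #Storm today",
    "#storm and #Java",
    "#java #STORM #bigdata",
]
print(top_hashtags(sample_tweets, n=2))
```

In Storm terms, `extract_hashtags` corresponds loosely to a parsing stage and `top_hashtags` to a counting and ranking stage; how this repository divides that work is shown in the diagram, not in this sketch.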
