Analyzed 157 US Energy stocks (Jan-Dec '23), identified Bullish/Bearish trends and risk categories. Used KMeans, Hierarchical, Spectral Clustering, revealing balanced returns and low volatility. Integrated data with Kafka for seamless subscriptions.

razamehar/Financial-Stock-Analysis-and-Clustering

Financial Stock Analysis and Clustering

Project Overview

In this analysis of 157 US energy stocks from January to December 2023, I used technical indicators to identify bullish and bearish trends and to flag overbought and oversold stocks. A risk analysis separated profitable investments from stocks facing losses and pinpointed the most volatile stocks in the dataset. I then applied KMeans, Hierarchical, and Spectral Clustering to group the stocks into distinct clusters. Together, these views are intended to help investors make informed decisions and manage risk. The processed data is also integrated with Kafka, so stakeholders can subscribe to real-time updates.

Analysis Steps

  1. Data Distribution Analysis:

    • Analyzing the distribution of data for insights into stock performance.
  2. Feature Engineering:

    • Enhancing data features to improve the accuracy of subsequent analyses.
  3. Technical Indicators:

    • Employing technical indicators to evaluate stocks as Bullish, Bearish, or Overbought/Oversold.
  4. Risk Analysis:

    • Categorizing stocks as Profitable, Unprofitable, or Volatile.
  5. Cluster Analysis:

    • Utilizing clustering techniques like KMeans, Agglomerative Hierarchical, and Spectral Clustering to identify distinct groups among the analyzed stocks.
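The cluster analysis step can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy are available (neither is listed in the prerequisites) and uses synthetic return and volatility features rather than the project's real data; the feature choice, the cluster count of 2, and all variable names are illustrative assumptions, not the repository's actual configuration.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans, SpectralClustering
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for per-stock features: [annual return, annual volatility].
# These values are made up for illustration, not taken from the dataset.
rng = np.random.default_rng(42)
features = np.vstack([
    rng.normal(loc=[0.25, 0.20], scale=0.03, size=(50, 2)),   # gains, low volatility
    rng.normal(loc=[-0.10, 0.60], scale=0.03, size=(50, 2)),  # losses, high volatility
])

# Standardize so that neither feature dominates the distance computation.
scaled = StandardScaler().fit_transform(features)

# The three techniques named above, each asked for two clusters.
models = {
    "kmeans": KMeans(n_clusters=2, n_init=10, random_state=0),
    "hierarchical": AgglomerativeClustering(n_clusters=2, linkage="ward"),
    "spectral": SpectralClustering(n_clusters=2, random_state=0),
}
labels = {name: model.fit_predict(scaled) for name, model in models.items()}

for name, assigned in labels.items():
    print(name, "cluster sizes:", np.bincount(assigned).tolist())
```

On real data the groups are rarely this cleanly separated, so the number of clusters would normally be chosen with a diagnostic such as the silhouette score rather than fixed in advance.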

Observations

Stock Data Distribution Analysis

Top and Bottom 10 Stocks

Clusters Visualization

Findings

The analysis reveals a common theme: energy sector stocks tend to offer a balance of medium to high returns with relatively low volatility.
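To make the terms "return" and "volatility" concrete, here is a minimal standard-library sketch of how the two measures and the Profitable / Unprofitable / Volatile categories could be computed from daily closing prices. The 252-day annualization factor is a common convention, and the 0.40 volatility threshold and the function names are hypothetical; the project's own definitions may differ.

```python
import math
import statistics

TRADING_DAYS = 252  # conventional assumption for annualizing daily figures


def daily_returns(closes):
    """Simple day-over-day returns from a list of closing prices."""
    return [closes[i] / closes[i - 1] - 1.0 for i in range(1, len(closes))]


def total_return(closes):
    """Return over the whole period, from first close to last close."""
    return closes[-1] / closes[0] - 1.0


def annualized_volatility(closes):
    """Sample standard deviation of daily returns, scaled to one year."""
    return statistics.stdev(daily_returns(closes)) * math.sqrt(TRADING_DAYS)


def risk_category(closes, vol_threshold=0.40):
    """Label a stock; vol_threshold is an illustrative cut-off, not the project's."""
    if annualized_volatility(closes) > vol_threshold:
        return "Volatile"
    return "Profitable" if total_return(closes) > 0 else "Unprofitable"


# Made-up price series for illustration only.
steady = [100 * 1.001 ** i for i in range(30)]            # slow, smooth rise
falling = [100 * 0.999 ** i for i in range(30)]           # slow, smooth decline
choppy = [100 if i % 2 == 0 else 110 for i in range(30)]  # large daily swings

for name, series in [("steady", steady), ("falling", falling), ("choppy", choppy)]:
    print(name, risk_category(series))
```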

Integration with Kafka

The processed data is published to Kafka, so other systems can subscribe to it for further analysis.
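A minimal sketch of what publishing the results could look like with kafka-python follows. The topic name `energy-stock-clusters`, the broker address `localhost:9092`, the record fields, and the function names are all assumptions made for illustration; the repository's actual topic and message schema are not described here.

```python
import json


def encode_record(ticker, cluster, risk):
    """Build a (key, value) pair of bytes for one stock; the schema is hypothetical."""
    key = ticker.encode("utf-8")
    payload = {"ticker": ticker, "cluster": cluster, "risk": risk}
    value = json.dumps(payload, sort_keys=True).encode("utf-8")
    return key, value


def publish(records, topic="energy-stock-clusters", bootstrap_servers="localhost:9092"):
    """Send (ticker, cluster, risk) tuples to Kafka; needs a reachable broker."""
    # Imported here so encode_record can be used without kafka-python or a broker.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers=bootstrap_servers)
    try:
        for ticker, cluster, risk in records:
            key, value = encode_record(ticker, cluster, risk)
            producer.send(topic, key=key, value=value)
        producer.flush()  # block until buffered messages are delivered
    finally:
        producer.close()


# Example ticker and values are placeholders, not results from the analysis.
key, value = encode_record("XOM", 1, "Profitable")
print(key, value)
```

Keying each message by ticker keeps all updates for one stock on the same partition, which preserves their order for subscribers. Calling `publish([("XOM", 1, "Profitable")])` would send the record once a broker is running at the assumed address.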

Usage

Prerequisites

  • Python 3.10.12
  • Required Python packages:
    • kafka-python
    • yfinance
    • pyspark
    • TA-Lib

Environment Setup

  1. Install the required libraries using the following commands:

    pip install kafka-python
    pip install yfinance --upgrade --no-cache-dir
    pip install pyspark
    pip install TA-Lib
    
  2. Download and install the TA-Lib C library, which the TA-Lib Python package depends on, using the following instructions:

    wget http://prdownloads.sourceforge.net/ta-lib/ta-lib-0.4.0-src.tar.gz
    tar -xzvf ta-lib-0.4.0-src.tar.gz
    cd ta-lib/
    ./configure --prefix=/usr
    make
    make install
    cd /content
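
After setup, a quick check that the required packages are importable can save debugging time later. The sketch below uses only the standard library; the mapping from package names to import names (`kafka` for kafka-python, `talib` for TA-Lib) reflects how those packages are imported, while the helper name `missing_packages` is made up for this example.

```python
import importlib.util

# pip package name -> module name used in an import statement
REQUIRED_MODULES = {
    "kafka-python": "kafka",
    "yfinance": "yfinance",
    "pyspark": "pyspark",
    "TA-Lib": "talib",
}


def missing_packages(modules=REQUIRED_MODULES):
    """Return the pip names of packages whose module cannot be found."""
    return [
        package
        for package, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

If TA-Lib is reported as missing, the usual cause is that the C library was not installed before running `pip install TA-Lib`.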
    

License

This project is licensed under the Raza Mehar License. See the LICENSE.md file for details.

Contact

For any questions or clarifications, please contact Raza Mehar at [[email protected]], Pujan Thapa at [[email protected]] or Syed Najam Mehdi at [[email protected]].
