Skip to content

This repo contains the codes, images, report and slides for the project of the course - MTH552A: Statistical & AI Techniques In Data Mining at IIT Kanpur during the academic year 2022-2023.

Notifications You must be signed in to change notification settings

ArkaB-DS/SpectralClustering

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

12 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SpectralClustering

This repo contains the codes, images, report and slides for the project of the course - MTH552A: Statistical & AI Techniques in Data Mining at IIT Kanpur during the academic year 2022-2023.

Project Title

Spectral Clustering: Theory and Applications [Report]

Abstract

In this report, we present a class of popular clustering algorithms called Spectral Clustering algorithms. We introduce graph theoretic notations required to understand the report. We discuss similarity graphs and graph Laplacians, along with their important properties. Three popular clustering algorithms are presented. Choice of optimal number of clusters, similarity functions, similarity graphs and graph Laplacians are also discussed. We then present Spectral clustering through different looking glasses. Finally, we apply Spectral clustering to simulated and real life datasets. This report is primarily based on [1].

Table of Content

Section Topic
1 Introduction
2 Graph Theory: Some defintions and notations
    2.1 Clustering problem formulation based on graphs
    2.2 Notations
3 Similarity Graphs
    3.1 Construction
      3.1.1 Choice of similarity function
      3.1.2 Choice of similarity graph
      3.1.3 Choice of similarity graph parameters
4 Graph Laplacian: Different types and their properties
    4.1 The unnormlaized graph Laplacian
    4.2 The normlaized graph Laplacian
5 Three Spectral Clustering Algorithms
6 Choice of cluster number
7 Choice of graph Laplacian
    7.1 Clustering Objectives
    7.2 Asymptotics
8 Spectral Clustering: Different Points of view
    8.1 Graph Cut point of view
      8.1.1 Approximating RatioCut for $k=2$
      8.1.2 Approximating RatioCut for arbitrary k=2
      8.1.3 Approximating NCut for $k=2$
      8.1.4 Approximating NCut for arbitrary k
    8.2 Random Walk point of view
9 Applications
    9.1 Synthetic Data
      9.1.1 k-means vs. Spectral Clustering
      9.1.2 Image Segmentation
    9.2 Real Data
      9.2.1 Iris dataset

Primary Reference

[1]. Von Luxburg, Ulrike. "A tutorial on spectral clustering." Statistics and computing 17, no. 4 (2007): 395-416.

About

This repo contains the codes, images, report and slides for the project of the course - MTH552A: Statistical & AI Techniques In Data Mining at IIT Kanpur during the academic year 2022-2023.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published