Modified Newman-Girvan modularity applied in transactional data to perform clustering.


khanhbrandy/Graph_based_clustering


Created on 2019-08-20

Project description

Implement the modified Newman-Girvan modularity algorithm from scratch and apply it to transactional data to perform clustering.

Dataset Information

The transactional dataset contains the transactions of a retail company over a given period.

  • InvoiceNo: Invoice number. Nominal, a 6-digit integer uniquely assigned to each transaction. If this code starts with the letter 'c', it indicates a cancellation.

  • StockCode: Product (item) code. Nominal, a 5-digit integer uniquely assigned to each distinct product.

  • Quantity: The quantity of each product (item) per transaction. Numeric.
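The fields above can be read with a few lines of standard-library Python. This is only a minimal sketch: the CSV layout, the sample rows, the helper name `load_transactions`, and the decision to drop cancelled invoices (matching the prefix case-insensitively) are assumptions made for illustration, not details taken from the repository.

```python
import csv
import io

# Made-up sample rows for illustration; column names follow the field list above.
SAMPLE_CSV = """InvoiceNo,StockCode,Quantity
536365,85123,6
536365,71053,6
536366,85123,2
C536367,71053,-1
"""


def load_transactions(handle):
    """Return (invoice, stock_code, quantity) tuples, skipping cancellations."""
    rows = []
    for record in csv.DictReader(handle):
        invoice = record["InvoiceNo"].strip()
        # Assumption: cancelled invoices are dropped rather than netted out,
        # and the 'c' prefix is matched case-insensitively.
        if invoice.lower().startswith("c"):
            continue
        rows.append((invoice, record["StockCode"].strip(), int(record["Quantity"])))
    return rows


# Hypothetical usage on the in-memory sample; a real file handle works the same way.
transactions = load_transactions(io.StringIO(SAMPLE_CSV))
```

Codes are kept as strings so that a prefix such as 'c' and any leading zeros survive parsing.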

Graph mining techniques

The transactional data is loaded and then treated as a bipartite graph with pre-defined sources and targets. The bipartite graph (B) is then transformed into a weighted undirected graph (G), which is analysed using the modified Newman-Girvan modularity.
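One common way to turn a bipartite invoice-item graph into a weighted item-item graph is a co-occurrence projection, sketched below. The weighting rule (number of invoices in which two items appear together), the choice of InvoiceNo as source and StockCode as target, and the function name `project_to_item_graph` are assumptions; the repository may define sources, targets, and weights differently.

```python
from collections import defaultdict
from itertools import combinations


def project_to_item_graph(pairs):
    """Project (invoice, item) pairs of a bipartite graph B onto a weighted item graph G.

    Returns a dict mapping an ordered item pair (a, b) with a < b to its weight.
    Assumption: the weight is the number of invoices containing both items.
    """
    items_by_invoice = defaultdict(set)
    for invoice, item in pairs:
        items_by_invoice[invoice].add(item)

    weights = defaultdict(int)
    for items in items_by_invoice.values():
        # Every pair of items bought on the same invoice shares one more edge unit.
        for a, b in combinations(sorted(items), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Made-up example: three invoices over three items.
pairs = [
    ("inv1", "A"), ("inv1", "B"), ("inv1", "C"),
    ("inv2", "A"), ("inv2", "B"),
    ("inv3", "C"),
]
graph = project_to_item_graph(pairs)
```

Storing each edge once under a sorted key keeps the graph undirected without duplicating entries.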

The communities detected after the (n) optimal splits that yield maximum modularity are considered the optimal clusters of items (StockCode) and are compared with the results of other traditional clustering techniques.
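To make "maximum modularity" concrete, the sketch below scores a partition with the standard weighted Newman-Girvan modularity, Q = Σ_c [ w_c / m − (d_c / 2m)² ], where m is the total edge weight, w_c the weight of edges inside community c, and d_c the summed weighted degree of its nodes. This is the textbook form, not necessarily the modified variant this project implements, and the function name, edge format, and toy graph are assumptions.

```python
from collections import defaultdict


def modularity(weights, community_of):
    """Standard weighted modularity of a partition.

    weights: dict mapping an undirected edge (u, v) to its weight (no self-loops).
    community_of: dict mapping each node to a community label.
    """
    total = sum(weights.values())  # m, the total edge weight
    if total == 0:
        return 0.0

    internal = defaultdict(float)  # w_c: weight of edges inside community c
    degree = defaultdict(float)    # d_c: summed weighted degree of community c
    for (u, v), w in weights.items():
        cu, cv = community_of[u], community_of[v]
        degree[cu] += w
        degree[cv] += w
        if cu == cv:
            internal[cu] += w

    return sum(internal[c] / total - (degree[c] / (2 * total)) ** 2 for c in degree)


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = {(0, 1): 1, (0, 2): 1, (1, 2): 1, (3, 4): 1, (3, 5): 1, (4, 5): 1, (2, 3): 1}
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "a" for node in range(6)}

q_split = modularity(edges, two_groups)   # 5/14, about 0.357
q_single = modularity(edges, one_group)   # 0.0
```

Comparing the score across the candidate partitions produced by successive splits and keeping the highest one is the selection step the paragraph above describes.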
