datagseoane/Movie-Recommendation-System

Sprint 11_Final Project

This repository houses the solution to my Final Project, which involved developing a movie recommendation algorithm by leveraging data from the extensive MovieLens database.

My name is Guillermo Seoane and I'm a #DataScience student at IT Academy by Barcelona Activa.

Movie-Recommendation-System

🎬 Context:

The main objective of this project is to develop a movie recommendation algorithm that can utilize the MovieLens dataset to provide personalized recommendations to users. To achieve this objective, the following specific goals have been defined:

  • Explore and understand the MovieLens dataset.
  • Identify patterns and trends in the data that can be used for personalized recommendations.
  • Develop and test two different recommendation methods: cosine similarity and Pearson correlation coefficient.
  • Use Gephi for graph visualization and analysis to gain a better understanding of the network structure of users and movies in the dataset.
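
As a rough illustration of the two similarity measures named above, here is a minimal sketch in plain Python. It is not the project's actual implementation: the toy `ratings` dictionary and the helper names (`common_ratings`, `cosine_similarity`, `pearson_correlation`) are hypothetical, and both measures are computed only over the movies that two users have both rated, which is one common convention among several.

```python
import math

# Hypothetical toy data: {userId: {movieId: rating}}
ratings = {
    1: {10: 5.0, 20: 3.0, 30: 4.0},
    2: {10: 4.0, 20: 2.0, 30: 5.0},
    3: {10: 1.0, 20: 5.0},
}


def common_ratings(a, b):
    """Return the paired ratings for the movies both users rated."""
    shared = sorted(set(a) & set(b))
    return [a[m] for m in shared], [b[m] for m in shared]


def cosine_similarity(a, b):
    """Cosine of the angle between the two co-rated rating vectors."""
    x, y = common_ratings(a, b)
    norm = math.sqrt(sum(v * v for v in x)) * math.sqrt(sum(v * v for v in y))
    if norm == 0:
        return 0.0  # no shared movies, or an all-zero vector
    return sum(p * q for p, q in zip(x, y)) / norm


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity of the mean-centered ratings."""
    x, y = common_ratings(a, b)
    n = len(x)
    if n < 2:
        return 0.0  # correlation is undefined for fewer than two points
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    denom = math.sqrt(sum(v * v for v in dx)) * math.sqrt(sum(v * v for v in dy))
    if denom == 0:
        return 0.0  # one user gave identical ratings to every shared movie
    return sum(p * q for p, q in zip(dx, dy)) / denom


cos_12 = cosine_similarity(ratings[1], ratings[2])
pearson_12 = pearson_correlation(ratings[1], ratings[2])
pearson_13 = pearson_correlation(ratings[1], ratings[3])
```

Because Pearson subtracts each user's mean rating first, it is less sensitive than cosine similarity to users who rate everything high or everything low. In this toy data, users 1 and 3 disagree on both shared movies, so their Pearson correlation is negative even though their raw rating vectors still point in a broadly similar direction.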

📚 Files:

In the repository you can find:

  • Dataset: the MovieLens dataset, a collection of movie ratings gathered by the GroupLens research group at the University of Minnesota.
  • Algorithm: the movie recommendation algorithm.
  • Graph: the Gephi graph used to visualize and analyze the structure of the network of users and movies in the dataset.
  • A paper and presentation.pdf explaining how I reached the solution.

🦾 Dataset Dictionary:

The main variables in the MovieLens dataset are the following:

  • userId: a unique identifier for each user who rated movies in the dataset.
  • movieId: a unique identifier for each movie in the dataset.
  • rating: a numerical rating from 1 to 5 that the user gave the movie.
  • timestamp: the moment the user rated the movie (MovieLens stores it as seconds since the Unix epoch).
  • title: the movie title.
  • genre: the genre of the film, such as comedy, drama, or action.
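
To make the dictionary above concrete, the following sketch models the six variables as two small record types and joins them on `movieId`. It is only an illustration under stated assumptions: the class names, the `join_ratings_with_movies` helper, and the sample values are hypothetical, and `timestamp` is assumed to be in seconds since the Unix epoch, as in the standard MovieLens ratings file.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Rating:
    userId: int
    movieId: int
    rating: float
    timestamp: int  # assumption: seconds since the Unix epoch (UTC)


@dataclass
class Movie:
    movieId: int
    title: str
    genre: str


def join_ratings_with_movies(ratings, movies):
    """Attach title and genre to each rating, matching on movieId."""
    movies_by_id = {m.movieId: m for m in movies}
    rows = []
    for r in ratings:
        movie = movies_by_id.get(r.movieId)
        if movie is None:
            continue  # skip ratings for movies missing from the movie table
        rows.append({
            "userId": r.userId,
            "title": movie.title,
            "genre": movie.genre,
            "rating": r.rating,
            # Convert the raw epoch seconds into a timezone-aware datetime.
            "rated_at": datetime.fromtimestamp(r.timestamp, tz=timezone.utc),
        })
    return rows


# Hypothetical sample values, not taken from the real dataset.
sample_movies = [Movie(10, "Movie A", "comedy")]
sample_ratings = [Rating(1, 10, 4.0, 946684800), Rating(1, 99, 2.0, 946684800)]
rows = join_ratings_with_movies(sample_ratings, sample_movies)
```

The second sample rating refers to a `movieId` with no matching movie, so it is dropped by the join. Whether to drop or keep such rows is a design choice that depends on the analysis.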

📊 Visualization:

Visualizing and analyzing the graph in Gephi clarified the network structure of users and movies in the dataset and helped identify patterns and trends that can be used to further improve the algorithm's performance.
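
One possible way to get a user-movie network into Gephi is to export the ratings as an edge list CSV, which Gephi's spreadsheet importer reads through its `Source`, `Target`, and `Weight` columns. The sketch below assumes that approach; it is not necessarily how the graph in this repository was built, and the `write_edge_list` helper, the `u`/`m` node prefixes, and the sample triples are hypothetical.

```python
import csv
import io


def write_edge_list(ratings, out):
    """Write (userId, movieId, rating) triples as a Gephi-style edge list.

    User and movie nodes get distinct prefixes so that a user and a movie
    sharing the same numeric id do not collapse into a single node.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for user_id, movie_id, rating in ratings:
        writer.writerow([f"u{user_id}", f"m{movie_id}", rating])


# Hypothetical sample triples: (userId, movieId, rating)
sample = [(1, 10, 5.0), (1, 20, 3.0), (2, 10, 4.0)]

buffer = io.StringIO()
write_edge_list(sample, buffer)
edge_csv = buffer.getvalue()
```

To produce a file for import, pass an open file instead of the in-memory buffer, for example `open("edges.csv", "w", newline="")`. Using the rating as the edge weight lets Gephi's layouts and filters take rating strength into account.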
