In today’s busy world, finding and dating a romantic partner seems more time-consuming than ever. As a result, many people have turned to speed dating, which lets them meet and interact with a large number of potential partners in a short amount of time. In this report, we explore what people look for in their speed dating matches, what it takes to win approval from a potential partner, whether there are any gender differences, and whether other factors (such as the order in which you met your partner) influence people’s decisions. Finally, we’d like to determine whether people really know what they want by comparing their self-reported answers to what actually influences their decisions.
The data set explored in this project is the Speed Dating Experiment, available on Kaggle.com. It was compiled by professors Ray Fisman and Sheena Iyengar of Columbia Business School, originally for their paper Gender Differences in Mate Selection: Evidence From a Speed Dating Experiment and later for Racial Preferences in Dating. It was generated from a series of experimental speed dating events held from 2002 to 2004 and includes data on demographics, dating habits, lifestyle information, an attribute evaluation questionnaire completed when participants signed up, and each participant’s ratings of others during the 4-minute interactions. Finally, individuals were asked whether they would like a second date with their partners, and they answered similar questions again after the event, once matches had met and gone on several dates.
The aim of this project is to get hands-on practice with the workflow typical of a machine learning project. The basic workflow for this project is:
- Exploratory Data Analysis (EDA)
- Pre-processing
- Modelling
- Test Analysis
Note: For more details, please refer to the notebook for each part. For visualizations, you can have a look at this
We have a pretty straightforward setup.
- Set up a new Python env
If you have conda installed, run the following command:
conda create -n machine_learning python=3.8 -y
- Now clone the repository
git clone https://github.com/mohammadzainabbas/Speed-Dating-ML.git
cd Speed-Dating-ML/
- Install the required dependencies
conda activate machine_learning
pip install -r requirements.txt
Now, everything that you need is installed and ready to go.
| Model | Accuracy | Precision | Recall | F1 score (macro) |
|---|---|---|---|---|
| Logistic Regression | 0.754 | 0.721 | 0.678 | 0.745 |
| Support Vector Machine | 0.747 | 0.713 | 0.674 | 0.738 |
| k-Nearest Neighbours | 0.665 | 0.630 | 0.511 | 0.644 |
| Gradient Boosting | 0.755 | 0.725 | 0.680 | 0.746 |
| Voting Ensemble | 0.751 | 0.739 | 0.634 | 0.738 |
| Stacking Ensemble | 0.756 | 0.726 | 0.679 | 0.747 |
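
For readers unfamiliar with the metrics reported above, here is a minimal pure-Python sketch of how they can be computed for a binary problem. It is only an illustration, not the code used in the notebooks: the function name `binary_metrics` and the toy labels are hypothetical, and it assumes that precision and recall refer to the positive class (label 1) while the F1 score is macro-averaged over both classes.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, positive-class precision/recall, and macro-averaged F1.

    Hypothetical helper for illustration; labels are expected to be 0 or 1.
    """
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    # Confusion-matrix counts, with label 1 as the positive class.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def f1(hits: int, false_pos: int, false_neg: int) -> float:
        denom = 2 * hits + false_pos + false_neg
        return 2 * hits / denom if denom else 0.0

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # Macro F1: compute F1 per class, then take the unweighted mean.
    # For class 0, the roles swap: tn are its hits, fn its false positives,
    # and fp its false negatives.
    macro_f1 = (f1(tp, fp, fn) + f1(tn, fn, fp)) / 2

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "macro_f1": macro_f1,
    }


if __name__ == "__main__":
    # Toy labels, not taken from the data set.
    truth = [1, 1, 1, 0, 0, 0, 0, 0]
    preds = [1, 1, 0, 1, 0, 0, 0, 0]
    print(binary_metrics(truth, preds))
```

Because macro F1 averages the per-class F1 scores, it can be higher than the F1 implied by the positive-class precision and recall alone when the negative class is predicted well.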