martins-jean/Employee-turnover-prediction-in-R

Employee turnover prediction in R

Contextual overview

Employee turnover is a major burden for companies. It creates direct costs, such as hiring, training, lost productivity and the opportunity cost of accounts left unmanaged, as well as indirect costs, such as the loss of institutional knowledge and lower employee morale.

This is the second in a series of projects on workforce analytics. I first explored employee turnover in Python using a decision tree classifier. This time I use a different dataset, model and language to dive deeper into how employee turnover can be analyzed and predicted.

Project objectives

  1. Calculate the turnover rate and explore it across different dimensions.
  2. Identify talent segments and combine relevant data from multiple HR data sources to derive better insights.
  3. Use feature engineering to create new variables and exemplify the concept of information value (IV).
  4. Build a logistic regression model to predict turnover while accounting for multicollinearity among variables.
  5. Evaluate the accuracy of the model and categorize employees into specific risk buckets.
  6. Formulate an intervention strategy and estimate its return on investment (ROI).
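
As a rough illustration of objectives 1, 4 and 5, here is a minimal sketch in base R. It is not the notebook's actual code. The data is simulated, and the column names (`level`, `tenure_years`, `percent_hike`, `turnover`) and the risk thresholds (0.3 and 0.6) are assumptions chosen for the example. The variance inflation factor (VIF) is computed by hand as a simple multicollinearity check so that no extra packages are needed.

```r
# Synthetic data: every column name and value here is hypothetical.
set.seed(42)
n <- 500
emp <- data.frame(
  level        = sample(c("Analyst", "Specialist", "Manager"), n, replace = TRUE),
  tenure_years = round(runif(n, 0, 10), 1),
  percent_hike = round(runif(n, 0, 20), 1)
)

# Simulate exits: shorter tenure and smaller raises raise the odds of leaving.
log_odds <- 0.5 - 0.25 * emp$tenure_years - 0.10 * emp$percent_hike
emp$turnover <- rbinom(n, size = 1, prob = plogis(log_odds))

# Objective 1: overall turnover rate, then the rate within each level.
overall_rate  <- mean(emp$turnover)
rate_by_level <- tapply(emp$turnover, emp$level, mean)

# Objective 4: logistic regression of turnover on the two numeric predictors.
model <- glm(turnover ~ tenure_years + percent_hike,
             data = emp, family = binomial)

# Multicollinearity check: VIF = 1 / (1 - R^2), where R^2 comes from
# regressing one predictor on the other predictors. Values near 1
# indicate little collinearity.
r2_tenure  <- summary(lm(tenure_years ~ percent_hike, data = emp))$r.squared
vif_tenure <- 1 / (1 - r2_tenure)

# Objective 5: predicted probability of leaving, grouped into risk buckets.
# The cut points 0.3 and 0.6 are illustrative, not taken from the project.
emp$risk   <- predict(model, type = "response")
emp$bucket <- cut(emp$risk,
                  breaks = c(0, 0.3, 0.6, 1),
                  labels = c("low", "medium", "high"),
                  include.lowest = TRUE)

print(overall_rate)
print(rate_by_level)
print(vif_tenure)
print(table(emp$bucket))
```

In a real analysis the thresholds would be chosen from the model's calibration and the cost of an intervention, and the model would be evaluated on held-out data rather than on the rows used to fit it.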

Reproducibility guidelines

For notebook-based projects, please refer directly to the Google Colab notebook I uploaded to this repository.

Technologies

  • R libraries:
    readr
    dplyr
    ggplot2
    lubridate
    Information
    caret

About

Predicted turnover with a multiple logistic regression model in R.
