CMU 95791-Data Mining Group Project on Recidivism Forecasting

TINAF1109/Recidivism-Forecasting

Recidivism-Forecasting

95-791 Data Mining (Fall 2021) - Final Project - README

Names: Jamie Lim, Thomas Tam, Tina Feng


This repository contains

  • The original dataset: NIJ_s_Recidivism_Challenge_Full_Dataset.csv
  • The cleaned dataset: df_cleaned.csv
  • 4 Jupyter notebook files
    • part1_intro+data_cleaning.ipynb
    • part2_question1.ipynb
    • part3_question2.ipynb
    • Part4_question3+summary.ipynb
  • 4 HTML files of Jupyter notebook slides
  • codebook

Run the Jupyter notebooks in the following order:

  • part1_intro+data_cleaning.ipynb
    • Contains the data and problem overview, EDA, and preliminary tests
    • Generates the cleaned dataset: df_cleaned.csv
  • part2_question1.ipynb
    • Contains modeling and analysis for question 1
  • part3_question2.ipynb
    • Contains modeling and analysis for question 2
  • part4_question3+summary.ipynb
    • Contains modeling and analysis for question 3
    • Contains conclusions, discussions, and future work
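
The run order above can also be scripted instead of opening each notebook by hand. The following is a minimal sketch, not part of the repository: it assumes Jupyter and `nbconvert` are installed and that the notebooks sit in the current directory. The helper names `build_command` and `run_all` are hypothetical.

```python
import subprocess
from pathlib import Path

# Notebook order as given in the run instructions. The file listing spells the
# last one "Part4_question3+summary.ipynb"; adjust the name to match the file
# on disk if your filesystem is case-sensitive.
NOTEBOOKS = [
    "part1_intro+data_cleaning.ipynb",
    "part2_question1.ipynb",
    "part3_question2.ipynb",
    "part4_question3+summary.ipynb",
]


def build_command(notebook):
    """Return the nbconvert command that executes one notebook in place."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        notebook,
    ]


def run_all(repo_dir="."):
    """Execute the notebooks in order, stopping at the first failure."""
    repo = Path(repo_dir)
    for notebook in NOTEBOOKS:
        if not (repo / notebook).exists():
            raise FileNotFoundError(f"Notebook not found: {repo / notebook}")
        # check=True raises CalledProcessError if a notebook fails, so later
        # notebooks never run against a missing or stale df_cleaned.csv.
        subprocess.run(build_command(notebook), cwd=repo, check=True)
```

Calling `run_all()` from the repository root runs part 1 first, so `df_cleaned.csv` is generated before any later notebook is executed.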