Using a neural network model to determine what causes diabetes in the Pima Native American community.
We will be looking into a dataset that describes the medical records of Pima Native American patients and whether or not each patient has an onset of diabetes. The goal is to determine which factors are most likely to be leading causes of diabetes in the Pima Native American community. By using a neural network model, one should expect to gauge how much each patient characteristic matters when classifying a patient as diabetic or non-diabetic.
Here is a link to the Pima Native American Diabetes dataset information.
To get a local copy up and running, download neural-network.R and the input file, pima_diabetes.csv. Then run the code in an IDE such as RStudio, with the working directory set to the location of the CSV file.
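As a minimal sketch of that setup step, the snippet below checks for the CSV file before reading it. The `data_dir` variable is a hypothetical placeholder rather than something taken from neural-network.R; point it at the folder where the file was saved.

```r
# Hypothetical location of the download; "." means the current working directory.
data_dir <- "."
csv_path <- file.path(data_dir, "pima_diabetes.csv")

if (file.exists(csv_path)) {
  # Read the dataset and show column types and dimensions.
  pima <- read.csv(csv_path)
  str(pima)
} else {
  # Fail softly with a hint instead of an opaque read.csv error.
  message("pima_diabetes.csv not found in ", normalizePath(data_dir))
}
```

Building the path with `file.path` has the same effect as calling `setwd()` first, without changing the session's working directory.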
The code guides you through the following:
- Importing the CSV file
- Inspecting the structure of the variables (data types, number of rows/columns, measures of central tendency, summary statistics, etc.)
- Installing the packages needed to build a neural network model
- Scaling the independent variables to help speed up the learning phase of the model
- Exploring histograms to find interesting variables
- Setting the seed to allow for reproducibility and splitting the dataset into a training set and a test set
- Fitting the neural network model and choosing initial parameters, such as weights and hidden layers
- Plotting the neural network and evaluating the confusion matrix
- Changing parameters to improve accuracy
- Comparing model evaluation measures such as sensitivity, specificity, positive predictive value, negative predictive value, and prevalence
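The scaling, splitting, fitting, and evaluation steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in neural-network.R: the data frame is synthetic, the column names (`glucose`, `bmi`, `age`, `diabetes`) are hypothetical, the helper `confusion_metrics` is made up for this example, and the `nnet` package is a stand-in because this README does not name the package the script uses.

```r
# Synthetic stand-in for pima_diabetes.csv; column names are hypothetical.
set.seed(123)  # fixed seed so the example is reproducible
n <- 200
pima <- data.frame(
  glucose = rnorm(n, mean = 120, sd = 30),
  bmi     = rnorm(n, mean = 32, sd = 7),
  age     = round(runif(n, min = 21, max = 80))
)
# Made-up rule for generating an outcome, for illustration only.
risk <- plogis(0.03 * (pima$glucose - 120) + 0.08 * (pima$bmi - 32))
pima$diabetes <- factor(ifelse(runif(n) < risk, "pos", "neg"),
                        levels = c("neg", "pos"))

# Scale the independent variables to mean 0 and standard deviation 1.
predictors <- c("glucose", "bmi", "age")
pima[predictors] <- as.data.frame(scale(pima[predictors]))

# Split into a 70% training set and a 30% test set.
train_idx <- sample(seq_len(n), size = round(0.7 * n))
train <- pima[train_idx, ]
test  <- pima[-train_idx, ]

# Evaluation measures derived from the four cells of a 2x2 confusion matrix.
confusion_metrics <- function(actual, predicted, positive = "pos") {
  tp <- sum(predicted == positive & actual == positive)
  tn <- sum(predicted != positive & actual != positive)
  fp <- sum(predicted == positive & actual != positive)
  fn <- sum(predicted != positive & actual == positive)
  total <- tp + tn + fp + fn
  c(
    accuracy    = (tp + tn) / total,
    sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp),
    ppv         = tp / (tp + fp),
    npv         = tn / (tn + fn),
    prevalence  = (tp + fn) / total
  )
}

# Fit a single-hidden-layer network if nnet is installed (an assumption;
# the original script may rely on a different package).
if (requireNamespace("nnet", quietly = TRUE)) {
  fit <- nnet::nnet(diabetes ~ glucose + bmi + age, data = train,
                    size = 3, maxit = 200, trace = FALSE)
  predicted <- predict(fit, newdata = test, type = "class")
  print(table(actual = test$diabetes, predicted = predicted))
  print(round(confusion_metrics(test$diabetes, predicted), 3))
}
```

To try this on the real data, replace the synthetic data frame with the result of `read.csv("pima_diabetes.csv")` and substitute the actual column names. Changing `size` (the number of hidden units) and `maxit` is one way to experiment with the parameter tuning step.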
Karishma Mathur - [email protected]
Project Link: https://github.com/Mathurkarishma/pima-diabetes
- Dr. Firdu Bati at University of Maryland Global Campus - Fall 2019
- Pima Native American Diabetes Dataset Description