Using association rules to understand which variables affect each other, either negatively or positively or independently.

Mathurkarishma/burn-victim-factors

Burn Victim Factors & Characteristics

Using association rules to learn what causes major and minor burns.
Explore the docs »

Report Bug · Request Feature

Table of Contents

  1. About The Project
  2. Getting Started
  3. Usage
  4. Conclusion
  5. Contact
  6. Acknowledgements

About The Project

This project looks into a Burn Study dataset and derives association rules among its variables. Association analysis, or association rule mining, discovers relationships within a dataset so that analysts can understand patterns in the data and use that knowledge to make key decisions. The Apriori algorithm of Agrawal and Srikant is one of the most popular methods for this, and it is the method applied here. The goal is to generate, evaluate, and record the association rules found in the burn study dataset in order to make informed decisions.
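To make the idea concrete, here is a minimal, language-neutral sketch of the level-wise search behind Apriori, written in Python with only the standard library. It is not the project's R script, and the item labels and the 0.4 support threshold are made up for illustration; they are not taken from burn.csv.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori search: return {itemset: support} for frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every single item that appears in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Keep the candidates whose support meets the threshold.
        level = {}
        for candidate in candidates:
            support = sum(1 for t in transactions if candidate <= t) / n
            if support >= min_support:
                level[candidate] = support
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-item subset of a candidate must be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Toy transactions with hypothetical labels (NOT rows from the burn dataset).
toy = [
    {"cause=flame", "burn=major"},
    {"cause=flame", "burn=major"},
    {"cause=flame", "burn=minor"},
    {"cause=other", "burn=minor"},
    {"cause=other", "burn=minor"},
]

itemsets = frequent_itemsets(toy, min_support=0.4)
for itemset, support in sorted(itemsets.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), support)
```

The pruning step is what distinguishes Apriori from brute-force counting: an itemset can only be frequent if all of its subsets are frequent, so most candidates are discarded before their support is ever counted.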

Here is a link to the Burn Study dataset information.

Built With

Getting Started

To get a local copy up and running, download apriori_burn.R and the input file, burn.csv. Open the script in an IDE such as RStudio, set the working directory to the folder that contains the CSV file, and then run the code.

Usage

The code guides you through the following:

  1. Importing the CSV file
  2. Inspecting the structure of the variables (data types, number of rows/columns, measures of central tendency, summary statistics, etc.)
  3. Exploring through histograms to find interesting variables
  4. Pre-processing such as cleanup, reduction, and transformation (we removed key identifiers because they add no value, then performed discretization and factoring)
  5. Running the Apriori algorithm, generating rules, and inspecting those rules
  6. Changing the parameters (for example, the minimum support and confidence thresholds) to improve the quality of the rules
  7. Visualizing our rules using a matrix plot
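The pre-processing step can be sketched as follows. This is a Python illustration of the idea only, not the code in apriori_burn.R: the column names (`id`, `age`, `flame`), the cut points, and the bin labels are assumptions chosen for the example. It drops the identifier, bins a numeric variable into labeled intervals, and turns each record into a set of `name=value` items that a rule miner can consume.

```python
# Hypothetical cut points and labels for a numeric "age" column.
AGE_CUTS = [18, 60]
AGE_LABELS = ["child", "adult", "senior"]


def discretize(value, cut_points, labels):
    """Return the label of the interval that a numeric value falls into.

    labels must have exactly one more entry than cut_points.
    """
    for cut, label in zip(cut_points, labels):
        if value < cut:
            return label
    return labels[-1]


def to_transaction(record, id_field="id"):
    """Convert one record (a dict) into a set of 'name=value' items."""
    items = set()
    for key, value in record.items():
        if key == id_field:
            continue  # identifiers are unique per row and add no information
        if key == "age":
            value = discretize(value, AGE_CUTS, AGE_LABELS)
        items.add(f"{key}={value}")
    return frozenset(items)


# Made-up records for illustration (NOT rows from burn.csv).
records = [
    {"id": 1, "age": 4, "flame": "yes"},
    {"id": 2, "age": 35, "flame": "no"},
    {"id": 3, "age": 71, "flame": "yes"},
]

transactions = [to_transaction(r) for r in records]
for t in transactions:
    print(sorted(t))
```

Discretization matters because association rule mining works on categorical items: a continuous variable would otherwise produce a distinct item for almost every row, and no item would reach the support threshold.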

Conclusion

The plot below shows the antecedents on the x-axis and the two consequents on the y-axis. The lift color key on the right side of the plot uses dark red for the highest lift, the strongest rule, and the lightest red for the lowest lift, the weakest rule. Even the "weakest" rule here has a lift above 1, which still indicates a positive association between antecedent and consequent. The strongest rules apply to burns not caused by flames, since the darkest red appears in the top half of the graph. The "weakest" rule applies to burns caused by flames, since the lightest color is in the bottom half of the graph. It is interesting, and telling, that there are more associations for non-flame-related burns. My reading is that the individuals in this dataset needed to visit a burn facility either because they did not see flames and accidentally burned themselves or because they did not realize something was highly flammable. Thus, I would assume that a majority of burns occur by accident.

[Figure: apriori matrix plot of the association rules, colored by lift]
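For readers unfamiliar with lift, the sketch below computes support, confidence, and lift for a single rule. It is a Python illustration on made-up transactions with hypothetical item labels, not the project's R code and not the results shown in the plot. Lift is the rule's confidence divided by the overall frequency of the consequent, so a value above 1 means the antecedent makes the consequent more likely than it is in general, and a value below 1 means less likely.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    antecedent = frozenset(antecedent)
    consequent = frozenset(consequent)
    n = len(transactions)
    n_antecedent = sum(1 for t in transactions if antecedent <= t)
    n_consequent = sum(1 for t in transactions if consequent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)

    support = n_both / n
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    # Lift compares the rule's confidence with the consequent's base rate.
    lift = confidence / (n_consequent / n) if n_consequent else 0.0
    return support, confidence, lift


# Eight toy transactions with hypothetical labels (NOT the burn dataset).
toy = [
    frozenset({"flame=no", "outcome=a"}),
    frozenset({"flame=no", "outcome=a"}),
    frozenset({"flame=no", "outcome=a"}),
    frozenset({"flame=no", "outcome=b"}),
    frozenset({"flame=yes", "outcome=a"}),
    frozenset({"flame=yes", "outcome=b"}),
    frozenset({"flame=yes", "outcome=b"}),
    frozenset({"flame=yes", "outcome=b"}),
]

stronger = rule_metrics(toy, {"flame=no"}, {"outcome=a"})
weaker = rule_metrics(toy, {"flame=yes"}, {"outcome=a"})
print("flame=no  -> outcome=a:", stronger)
print("flame=yes -> outcome=a:", weaker)
```

In this toy data the first rule has a lift of 1.5 and the second a lift of 0.5, which is the kind of contrast a lift color key encodes.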

Contact

Karishma Mathur - [email protected]

Project Link: https://github.com/Mathurkarishma/burn-victim-factors

Acknowledgements
