HarvardX: PH125.9x: Data Science - Capstone Breast Cancer Diagnosis Project
altaflab/breast-cancer
This repo was created to store and share the second of two projects for the HarvardX Data Science Professional Certificate (see https://courses.edx.org/dashboard/programs/3c32e3e0-b6fe-4ee4-bd4f-210c6339e074/).

The objective of this project was to train several algorithms to diagnose breast cancer accurately by predicting whether a given sample of cells came from a malignant (cancerous) or benign (non-cancerous) tumour mass. The algorithms were trained and tested on the Wisconsin Breast Cancer (Diagnostic) dataset, which is available to download from the UCI Machine Learning Repository (see https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29).

This repo includes a script file (.R) containing all of the code used for the exploratory analyses and for developing, testing and presenting the results of each model, a markdown file (.Rmd), and the final report (.pdf) generated from that markdown file. The .Rmd file refers to two supporting files: preamble.tex, which was created to relax the LaTeX rules on floating figures/tables within the PDF report, and references.bib, which holds the references cited in the report in BibTeX format. Both of these files are included in the repo for information.
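
For readers unfamiliar with how an .Rmd file pulls in these two supporting files, the fragment below is a minimal sketch of an R Markdown YAML header that would do so. It is an illustration only and is not copied from this repo's .Rmd file: the title is a placeholder, and the sketch assumes that preamble.tex and references.bib sit in the same folder as the .Rmd file.

```yaml
---
title: "Report title"              # placeholder, not the actual report title
output:
  pdf_document:
    includes:
      in_header: preamble.tex      # LaTeX commands added to the document preamble
bibliography: references.bib       # BibTeX entries, cited in the text as [@key]
---
```

With a header like this, knitting the .Rmd file to PDF applies the float settings from preamble.tex and resolves each citation against references.bib. The contents of preamble.tex are not reproduced here; see the file in the repo for the settings actually used.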