laurahdezlorenzo / CSF_clustering Public


Clustering of CSF biomarkers in different AD cohorts


figures
manuscript_figures
results
.gitignore
LICENSE
README.md
clustering_statistics.ipynb
clusters_description.py
datasets_description.ipynb
kmeans_clustering.py
prepare_datasets_ADNI.py
prepare_datasets_HCSC.py
supp_generalize_results.ipynb
survival_analysis.ipynb


AD_biomarkers_clustering

Description

This is the code repository for the paper "A data-driven approach to complement the A/T/(N) classification system using CSF biomarkers". The repository follows the methodology and results presented in that work.

The Python scripts present in this repository are organized as follows:

prepare_datasets_HCSC.py - prepare data for HCSC dataset
prepare_datasets_ADNI.py - prepare data for ADNI dataset
kmeans_clustering.py - script for KMeans clustering using CSF biomarker data from the different data sources
clusters_description.py - main functions to obtain several metrics from the obtained clusters
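
As an illustration of the clustering step, the following is a minimal sketch and not the code in kmeans_clustering.py. It assumes the biomarkers are standardized before clustering and that candidate values of k are compared with the silhouette score; the function name `cluster_biomarkers`, the range of k, and the synthetic data are all hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def cluster_biomarkers(values, k_range=range(2, 6), seed=0):
    """Fit KMeans for each k and return (best_k, labels, scores_by_k)."""
    # Standardize so that no single biomarker dominates the distances
    scaled = StandardScaler().fit_transform(values)
    scores, labelings = {}, {}
    for k in k_range:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed)
        labelings[k] = model.fit_predict(scaled)
        scores[k] = silhouette_score(scaled, labelings[k])
    # Keep the k with the highest silhouette score
    best_k = max(scores, key=scores.get)
    return best_k, labelings[best_k], scores


# Synthetic stand-in data: three columns playing the role of CSF biomarkers
rng = np.random.default_rng(0)
group_a = rng.normal(loc=[0.0, 0.0, 0.0], scale=0.3, size=(50, 3))
group_b = rng.normal(loc=[5.0, 5.0, 5.0], scale=0.3, size=(50, 3))
synthetic = np.vstack([group_a, group_b])

best_k, labels, scores = cluster_biomarkers(synthetic)
print(best_k, {k: round(s, 3) for k, s in scores.items()})
```

With real data, `values` would be the biomarker columns of the prepared HCSC or ADNI tables, which are not distributed with this repository.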

In addition, there are several Jupyter Notebooks dedicated to specific tasks:

datasets_description.ipynb - descriptive statistics of the datasets (number, sociodemographics, MMSE, biomarker values)
clustering_statistics.ipynb - descriptive statistics of the clusters (number, sociodemographics, MMSE, biomarker values, tests)
survival_analysis.ipynb - survival analysis using Kaplan-Meier plots and Cox regression models
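
To illustrate what a Kaplan-Meier curve represents, here is a minimal pure-Python sketch of the product-limit estimator. It is not taken from survival_analysis.ipynb, which relies on lifelines; the function name `kaplan_meier` and the follow-up data below are hypothetical.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time."""
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # events and censorings both leave the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        if deaths[t]:
            # Multiply by the fraction of at-risk subjects surviving time t
            survival *= 1.0 - deaths[t] / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical follow-up times (months) and event flags (1 = event, 0 = censored)
durations = [6, 12, 12, 18, 24, 30]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(durations, events)
for time, probability in curve:
    print(time, round(probability, 4))
```

Censored subjects contribute to the risk set until they drop out but never lower the curve themselves, which is why the estimate only steps down at times where an event was observed.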

Other subdirectories present in this repository:

data - data files used in this work. Please note that the data files are not available in this repository for privacy reasons.
results - SI scores of the clustering results. Other result files are likewise not available in this repository for privacy reasons.
figures - figures obtained for the manuscript.

Implementation

The code in this work was built using:

Scikit-Learn for building clustering models.
SciPy for statistical analyses.
lifelines for survival analyses.
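
As a hedged example of the kind of statistical comparison SciPy supports, the sketch below tests whether a biomarker differs across clusters with a Kruskal-Wallis test. The choice of test is an assumption, since the README does not state which tests the notebooks use; the function name `compare_across_clusters` and the synthetic values are hypothetical.

```python
import numpy as np
from scipy import stats


def compare_across_clusters(values, labels):
    """Run a Kruskal-Wallis test on one variable grouped by cluster label."""
    values = np.asarray(values)
    labels = np.asarray(labels)
    groups = [values[labels == c] for c in np.unique(labels)]
    return stats.kruskal(*groups)


# Synthetic stand-in for one biomarker measured in two clusters
rng = np.random.default_rng(1)
biomarker = np.concatenate([
    rng.normal(loc=1.0, scale=0.2, size=40),
    rng.normal(loc=2.0, scale=0.2, size=40),
])
cluster_labels = np.array([0] * 40 + [1] * 40)

result = compare_across_clusters(biomarker, cluster_labels)
print(result.statistic, result.pvalue)
```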

Contact

Please refer any questions to: Laura Hernández-Lorenzo - GitHub - email
