Notebooks for "A topic model analysis of TCGA transcriptomic data of breast and lung cancer"

fvalle1/topicTCGA

A topic model analysis of TCGA

Notebooks and libraries for "A Topic Modeling Analysis of TCGA Breast and Lung Cancer Transcriptomic Data"

Analyse results

To analyse the results and reproduce the plots in the paper without rerunning hSBM, use the notebook hSBM_postprocess.ipynb.

This repository follows the structure of the paper and is divided into three parts, plus a submodule for plotting hierarchies. See the Readme.md in each folder for a detailed description of its pipeline.

breast

breast analyses, stochastic block modelling and predictor

lung

lung analyses, stochastic block modelling, survival analysis and predictor

unified lung

lung data from unified dataset as discussed in the paper

tree plotter

A submodule for plotting hierarchies

Run

You can start a Docker container with all dependencies installed:

docker run -v $PWD:/home/jovyan/work -p 8888:8888 --rm -it --name topic_tcga docker.pkg.github.com/fvalle1/topictcga/topic:latest

then point your browser to localhost:8888 (the port published by the command above).
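
If port 8888 is already taken on the host, only the host side of the `-p` mapping needs to change. The following is a minimal sketch of a wrapper around the command above; `run_topic_tcga` and `DOCKER_BIN` are hypothetical names introduced here for illustration and are not part of the repository. It assumes the service inside the container listens on 8888, as the original port mapping suggests.

```shell
# Sketch: start the container on a configurable host port.
# run_topic_tcga and DOCKER_BIN are hypothetical names, not part of the repository.
run_topic_tcga() {
    host_port="${1:-8888}"   # port on the host; the container side stays 8888 (assumption)
    # DOCKER_BIN can be overridden (e.g. with "echo") to preview the command.
    ${DOCKER_BIN:-docker} run \
        -v "$PWD":/home/jovyan/work \
        -p "${host_port}:8888" \
        --rm -it --name topic_tcga \
        docker.pkg.github.com/fvalle1/topictcga/topic:latest
}
```

Calling `run_topic_tcga 9999` would then serve the notebooks at localhost:9999, while calling it without arguments reproduces the original command.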

hSBM_Topicmodel

The run_graph.ipynb notebook can be used to run hierarchical Stochastic Block Modelling.

Data

Data processed in our analysis that are not available through git can be retrieved with Data Version Control (DVC):

dvc pull -r mydrive name_of_the_file_to_download.dvc
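
When several files are needed, the command above can be repeated for every `.dvc` file in a folder. The sketch below assumes it is run from inside the cloned repository and that the remote is named `mydrive` as in the command above; `pull_dvc_files` and `DVC_BIN` are hypothetical names introduced here for illustration.

```shell
# Sketch: pull every DVC-tracked file found under a directory.
# pull_dvc_files and DVC_BIN are hypothetical names, not part of the repository.
pull_dvc_files() {
    dir="${1:-.}"            # directory to search for .dvc files
    remote="${2:-mydrive}"   # DVC remote name, as used in this README
    find "$dir" -type f -name '*.dvc' | sort | while IFS= read -r f; do
        # DVC_BIN can be overridden (e.g. with "echo") to preview the commands.
        # stdin is redirected so the command cannot consume the file list.
        ${DVC_BIN:-dvc} pull -r "$remote" "$f" < /dev/null
    done
}
```

For instance, `pull_dvc_files breast` would issue one `dvc pull` per `.dvc` file under the breast folder.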

License

Please see LICENSE