scANTIPODE

NOTICE: This package is under heavy development until publication and is subject to change until release 0.1.

Single Cell Ancestral Node Taxonomy Inference by Partitioning Of Differential Expression. The model is an extension of the scVI paradigm: a structured generative, variational inference model developed for the simultaneous analysis (DE) and categorization (taxonomy generation) of cell types across evolution (or now any covariate) using single-cell RNA-seq data. It began as a hack of a simplified scANVI model and is built on Pyro, the PyTorch-based probabilistic programming language (PPL). The model acts as an integration method that learns interpretable differential expression in the process. Note that this means ANTIPODE will fail to integrate datasets that are too different from one another, or datasets with large disparities in quality or gene mean dispersions.

The complete procedure runs in 3 phases (but can also run fully supervised using only phase 2):

1. The Fuzzy Phase: Cells may belong to multiple types, with membership sampled from a Bernoulli distribution. This phase learns an integrated latent space with covariate effects, but is less straightforward to interpret.
2. The Supervised Phase: Discrete clustering is initialized from a supervised initialization (or defaults to a de novo k-means clustering in the latent space). This phase can take a supervised clustering and/or latent space for cells.
3. The Free Phase: All parameters are released for unconstrained learning.
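
To make the difference between fuzzy and discrete membership concrete, here is a minimal pure-Python sketch. It is not ANTIPODE's code: the function names, the example probabilities, and the use of an argmax for the discrete case are all assumptions made for illustration. It only shows that independent Bernoulli draws can place one cell in several types (or none), while a discrete assignment picks exactly one.

```python
import random


def sample_fuzzy_membership(probs, rng):
    """Draw one independent Bernoulli(p) per type.

    A cell can end up in several types at once, or in none.
    """
    return [1 if rng.random() < p else 0 for p in probs]


def discrete_membership(probs):
    """Assign the cell to exactly one type (one-hot on the most probable type).

    Using argmax here is an illustrative assumption, not ANTIPODE's rule.
    """
    best = max(range(len(probs)), key=lambda k: probs[k])
    return [1 if k == best else 0 for k in range(len(probs))]


rng = random.Random(0)  # seeded so the sketch is reproducible

# Hypothetical per-type membership probabilities for a single cell.
cell_type_probs = [0.9, 0.6, 0.05]

fuzzy = sample_fuzzy_membership(cell_type_probs, rng)
hard = discrete_membership(cell_type_probs)

print("fuzzy membership:   ", fuzzy)
print("discrete membership:", hard)
```

The fuzzy vector may contain more than one 1, which is part of why that phase is harder to interpret; the discrete vector always sums to 1.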

You can read about the generative model here. You can look at example runs here.

Installation

First create a conda environment with Python >= 3.10, then run:

git clone git@github.com:mtvector/scANTIPODE.git
# Install PyTorch with CUDA 12.1 support (CUDA 11.7 should work too)
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
conda install jax jaxlib -c conda-forge
cd scANTIPODE
pip install -e .
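
After installing, it can help to confirm that the environment looks as expected. The following is a small sketch, not part of the package: `check_environment` is a hypothetical helper, and the list of module names it checks (`torch`, `pyro`, `jax`) is assumed from the install commands above. It uses only the standard library to look for the modules and asks PyTorch about CUDA only if PyTorch can be imported.

```python
import importlib.util
import sys


def check_environment(packages=("torch", "pyro", "jax")):
    """Report whether Python is recent enough and which packages are importable."""
    report = {"python>=3.10": sys.version_info >= (3, 10)}

    # find_spec returns None for a top-level module that is not installed.
    for name in packages:
        report[name] = importlib.util.find_spec(name) is not None

    # Only query CUDA when PyTorch is actually present and imports cleanly.
    if report.get("torch"):
        try:
            import torch

            report["cuda_available"] = bool(torch.cuda.is_available())
        except Exception:
            report["cuda_available"] = False

    return report


report = check_environment()
for key, ok in report.items():
    print(f"{key}: {'ok' if ok else 'missing'}")
```

If `cuda_available` is reported as missing on a GPU machine, the PyTorch build and the installed CUDA version may not match.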

Please reach out to let me know if you try ANTIPODE on a dataset and it works (or doesn't work). The model is (forever) a work in progress!

Note that the model can be VRAM hungry, with parameters scaling as #covariates x #genes x (#clusters or #modules). If you run out of VRAM, you might need to: 1. fix a GPU memory leak, 2. use fewer genes, latent dimensions, or clusters, 3. get a bigger GPU.
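
As a rough guide to how that scaling plays out, the sketch below estimates the size of a single dense parameter tensor of shape covariates x genes x clusters. It is a back-of-the-envelope calculation under stated assumptions, not a measurement of ANTIPODE: the function names and example sizes are hypothetical, values are assumed to be 32-bit floats, and real usage will be higher because of other parameters, activations, and optimizer state (optimizers such as Adam keep extra per-parameter buffers).

```python
def estimate_param_bytes(n_covariates, n_genes, n_clusters, bytes_per_value=4):
    """Bytes for one dense tensor of shape (covariates, genes, clusters).

    bytes_per_value=4 assumes float32 storage.
    """
    return n_covariates * n_genes * n_clusters * bytes_per_value


def to_mib(n_bytes):
    """Convert bytes to mebibytes."""
    return n_bytes / 1024**2


# Hypothetical problem sizes, chosen only to show the scaling.
small = estimate_param_bytes(n_covariates=5, n_genes=2000, n_clusters=50)
large = estimate_param_bytes(n_covariates=5, n_genes=10000, n_clusters=200)

print(f"small: {to_mib(small):.1f} MiB")
print(f"large: {to_mib(large):.1f} MiB")
print(f"ratio: {large // small}x")
```

Because the three factors multiply, cutting the number of genes or clusters reduces this term proportionally, which is why using fewer genes or clusters is one of the suggested remedies.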

Coming soon

Improved plotting functionality
Expanded tutorials
PyPI release
Gene expression histogram normalization
Phylogeny regression

Next challenges

Parameter variance estimation
Improved clustering
