CBCE

This repository provides an R package for multi-view data analysis. Consider two types of high-dimensional measurements taken on the same samples. CBCE (the Correlation Bi-Community Extraction method) finds a set of features A from the first measurement type and a set of features B from the second measurement type such that the features in A and B are correlated with each other in aggregate.

Formally, the pair (A, B) is called a bimodule. The algorithm that finds it, the Bimodule Search Procedure (BSP), is introduced in [1]. We have used this method to analyze multi-view data in areas such as genomics and climate science.
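
To make "correlated in aggregate" concrete, here is a minimal base R sketch that scores a candidate pair (A, B) by the sum of squared sample cross-correlations. The helper name `aggregate_cross_cor` and the choice of score are illustrative assumptions, not part of the cbce API and not necessarily the statistic that BSP tests; see [1] for the formal definition.

```r
# Hypothetical helper (not part of cbce): sum of squared sample
# correlations between the columns of X in A and the columns of Y in B.
aggregate_cross_cor <- function(X, Y, A, B) {
  R <- cor(X[, A, drop = FALSE], Y[, B, drop = FALSE])
  sum(R^2)
}

set.seed(1)
n <- 200
X <- matrix(rnorm(n * 10), nrow = n)
Y <- matrix(rnorm(n * 10), nrow = n)

# Plant a signal: columns 1:3 of Y depend on columns 1:2 of X
Y[, 1:3] <- Y[, 1:3] + rowSums(X[, 1:2])

# Score the planted pair and an unrelated pair of the same size
planted   <- aggregate_cross_cor(X, Y, A = 1:2, B = 1:3)
null_pair <- aggregate_cross_cor(X, Y, A = 6:7, B = 6:8)

planted > null_pair
```

The planted pair should score far higher than the unrelated one. Roughly speaking, CBCE searches for such pairs without being told where they are.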

Features of CBCE

  • Rcpp implementation of the iterative testing framework; runs on multiple cores when using ROpen.
  • Multiple backends to calculate p-values. It is also easy to use your own backend.
  • Code tested using testthat.
  • A simple GUI to monitor progress and terminate early.
  • Documented using Roxygen and pkgdown.

How to install CBCE

You can install the latest version of cbce directly from the GitHub repository. This requires devtools, which the snippet below installs if it is missing.

# Install devtools only if it is not already available
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}
devtools::install_github("miheerdew/cbce")

macOS compilation issues

Updated in July 2023.

If compilation fails on macOS because the gfortran library cannot be loaded, check your compiler setup. I was able to fix this on my computer by running macrtools::gfortran_install() from the macrtools package.
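
As a quick first diagnostic, the sketch below uses base R's Sys.which() to see whether a gfortran executable is on the PATH. This is only a rough check and rests on an assumption: R can be configured to find gfortran somewhere other than the PATH, so a miss here does not by itself prove the toolchain is broken.

```r
# Look up gfortran on the PATH; Sys.which() returns "" when it is not found.
gfortran_path <- Sys.which("gfortran")

if (nzchar(gfortran_path)) {
  message("gfortran found at: ", gfortran_path)
} else {
  message("gfortran was not found on the PATH.")
}
```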

Example usage

library(cbce)

#Sample size
n <- 40
#Dimension of measurement 1
dx <- 20
#Dimension of measurement 2
dy <- 50

#Correlation strength
rho <- 0.5

set.seed(1245)

# Assume first measurement is gaussian
X <- matrix(rnorm(dx*n), nrow=n, ncol=dx)

# Measurements 3:6 in set 2 are correlated with 4:5 in set 1
Y <- matrix(rnorm(dy*n), nrow=n, ncol=dy)
Y[, 3:6] <- sqrt(1-rho)*Y[, 3:6] + sqrt(rho)*rowSums(X[, 4:5])

res <- cbce(X, Y)

# Recovers the indices 4:5 for X and 3:6 for Y.
# If the correlation were stronger,
# all the indices could be recovered.
res$comms
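
To see the planted signal without running cbce, the following base R sketch regenerates the same simulated data and compares the mean squared sample cross-correlation inside the planted block (X columns 4:5, Y columns 3:6) with the rest of the cross-correlation matrix. The variable names `R` and `in_block` are introduced here for illustration only; this is a sanity check on the simulation, not a reproduction of what cbce computes.

```r
# Same simulation settings as the example above
n <- 40; dx <- 20; dy <- 50; rho <- 0.5
set.seed(1245)
X <- matrix(rnorm(dx * n), nrow = n, ncol = dx)
Y <- matrix(rnorm(dy * n), nrow = n, ncol = dy)
Y[, 3:6] <- sqrt(1 - rho) * Y[, 3:6] + sqrt(rho) * rowSums(X[, 4:5])

# dx-by-dy matrix of sample cross-correlations
R <- cor(X, Y)

# Logical mask marking the planted block
in_block <- matrix(FALSE, nrow = dx, ncol = dy)
in_block[4:5, 3:6] <- TRUE

# Mean squared correlation inside the block vs. everywhere else
mean(R[in_block]^2)
mean(R[!in_block]^2)
```

The first value should be clearly larger than the second, which is the kind of aggregate cross-correlation a bimodule captures.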

Documentation

More information is available on the software webpage.

Acknowledgement

This project has been funded by NIH grant R01 HG009125-01.

References

[1] Dewaskar, Miheer, John Palowitch, Mark He, Michael I. Love, and Andrew Nobel. "Finding Stable Groups of Cross-Correlated Features in Multi-View data." arXiv preprint arXiv:2009.05079 (2020).