Accompanying data and source code for the manuscript

The different subdirectories contain:

  • 00-algorithm: source code for our network generation algorithm, and a raw MySQL dump of the database used to perform the study
  • 01-pathways-per-genus: source code for the pathway inference algorithm
    • 01-pathways-per-genus/validate_inference: code and data for validating genus-level core genome prediction (used here for pathway inference) using MAGs and isolate genomes
  • 02-network-generation: commands and auxiliary data used for network generation and annotation
    • 02-network-generation/ commands used to generate the individual per-environment networks
    • 02-network-generation/01-output: output of running the commands in 02-network-generation/ For each environment, this includes a presence-absence matrix of genera in samples, and the resulting networks in the gpickle and xml formats
    • 02-network-generation/ commands used to combine the individual environmental networks into a multi-environment network, and perform phylogenetic and functional annotation
    • 02-network-generation/02-output: output of running the commands in 02-network-generation/
      • 02-network-generation/02-output/merged6.2.unlooped.avgFunPhyl.pathways.xml: final combined and annotated network
      • 02-network-generation/02-output/merged6.2.networkTable.csv: network table containing annotations for each node in the network
      • 02-network-generation/02-output/consensusNet.sif: consensus network (edge support > 70) in the SIF format
    • 02-network-generation/goodMetaCyc: accompanying data
      • 02-network-generation/goodMetaCyc/oct2020.combined_noPWY0-1324.tsv: fraction of genomes from each genus containing any given pathway
      • 02-network-generation/goodMetaCyc/phylodist_clean.table.tsv: phylogenetic distances between genera
      • Other accessory files linking MetaCyc pathway IDs to pathway names and broader functional categories
  • 03-analysis: R code used to analyze the results, and the resulting figures
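
The consensus network listed above is distributed in the SIF format. As a minimal sketch, assuming the file follows the usual Cytoscape SIF layout (a source node, an interaction type, and one or more target nodes per line, tab-delimited if any tab is present and whitespace-delimited otherwise), its edges could be read without extra dependencies. The function name parse_sif and the sample node names are hypothetical and only serve as illustration; they are not taken from the repository.

```python
import io


def parse_sif(handle):
    """Yield (source, interaction, target) tuples from a SIF-formatted file object."""
    for line in handle:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Assumption: tab-delimited when a tab is present, whitespace otherwise
        fields = line.split("\t") if "\t" in line else line.split()
        if len(fields) < 3:
            # A line holding only a node name declares an isolated node, no edge
            continue
        source, interaction = fields[0], fields[1]
        # One SIF line may list several targets for the same source
        for target in fields[2:]:
            yield (source, interaction, target)


# Made-up sample data, not the content of consensusNet.sif
sample = io.StringIO("GenusA\tpp\tGenusB\tGenusC\nGenusD\n")
edges = list(parse_sif(sample))
print(edges)
```

To read the actual file, the same function could be called on open("02-network-generation/02-output/consensusNet.sif").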

The following conda environment should provide the libraries required to run the different steps of the analysis:

    conda create -c conda-forge -c r -n microbialNetworks networkx==1.11 lxml pandas scipy rpy2 mysqlclient cython r r-ade4 r-ggplot2 r-reshape2 r-purrr r-gplots r-dendextend r-svglite r-stringr

For reproducing the results in 01-pathways-per-genus/validate_inference, mOTUlizer can be installed with:

    python3 -m pip install mOTUlizer==0.2.4
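
After activating the environment, a quick check that the Python libraries are importable can save time before running the longer steps. The sketch below is not part of the repository. It assumes the usual import names for the conda packages listed above (mysqlclient imports as MySQLdb, cython as Cython), and the helper name missing_modules is hypothetical.

```python
import importlib.util

# Assumed import names for the Python packages in the conda command
REQUIRED = ["networkx", "lxml", "pandas", "scipy", "rpy2", "MySQLdb", "Cython"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required Python modules were found.")
```

This only checks that the modules can be located; it does not verify versions such as networkx==1.11 or the R packages.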

