Skip to content

PanACEA is a suite of PERL scripts that allows for visualization of prokaryotic pan-chromosomes, including core and variable regions.

License

Notifications You must be signed in to change notification settings

JCVenterInstitute/PanACEA

Repository files navigation

PanACEA (Pan-genome Atlas and Chromosome Explorer and Analyzer)

PanACEA is a suite of PERL scripts that allows users to create an interconnected set of html, javascript and json files that allows for visualization of prokaryotic pan-chromosomes, including core and variable regions. Users can also include additional information such as gene annotation, metadata annotation, and genome phylogenies. Detailed information about the PanACEA suite and usages can be found in the PDF manual included on the site.

Also included in the Github is a set of exemplar data from a PanOCT run on 19 Acinetobacter baumannii complete genomes and a shell script which can generate the PanACEA output files from the examples is available as a demonstration.

Additionally, the gzipped output from a PanACEA run using the same data and the gzipped PanACEA output from the finished PanOCT run on Enterobactor hormaechei from Chavda et al 2016 are available.

Example data

The example data is the raw output from a PanOCT run on 19 A. baumannii genomes plus annotation that will allow the user to observe all the necessary steps to move from the raw files to the html files. Unix shell users such as Mac and Linux users can use the "./run_panacea_example.sh" in the git download directory to run all the steps. Likewise, Windows users can use the batch file in the same directory called "run_panacea_example.bat" to generate the PanACEA files. Finally, I have included a list of all four commands with $DIR as the git root directory:

  1. Makes the PanACEA flatfile
  • perl $DIR\make_panacea_flatfile.pl -i $DIR\example_dir\ -o $DIR\example_dir\PanACEA.flatfile.txt -t Pangenome
  1. Makes the ARO list files from the RGI dataSummary table
  • perl $DIR\make_rgi_clusters.pl -i $DIR\example_dir\dataSummary.txt -o $DIR\example_dir/aro_centroid.list.txt
  1. Makes the configure file for the annotation files
  • perl $DIR\make_conf_file.pl -d $DIR\example_dir\ -o $DIR\example_dir\ -a $DIR\aro.obo -g $DIR\gene_ontology.1_2.obo
  1. Makes all the output HTML, JSON and javscript files
  • perl $DIR\make_panacea.pl -i $DIR\example_dir\PanACEA.flatfile.txt -o $DIR\example_dir\ -f $DIR\example_dir\func_file.conf.txt -t $DIR\example_dir\genomes.tree -a $DIR\example_dir\cluster_alignments,$DIR\example_dir\combined.fasta

Included PERL scripts

make_panacea.pl

make_panacea.pl generates an interconnected set of html, javascript and json files from a PanACEA flat file generated from a possible suite of programs, including PanOCT, that allows users to explore multiple related prokaryotic genomes through their pan-chromosomes, including core and flexible regions. While the basic version only shows the location and relative position of the core and flexible regions, the user can also add information on the gene cluster annotation and alignment as well as the genomic annotation and phylogeny to obtain a more complete view of the relationship between the genomes, their function and any regions with their associated genes that might drive this relationship. The end result is a central html file that can be opened in many web browsers.

running make_panacea.pl

make_panacea.pl [options]
Options:
	-i	--input 	Input PanACEA Flat File. Required.
	-o	--output 	Output Directory for all the files. Default is current directory.
	-n	--name		Header name of HTML and SVG files. Default is \"PanHTML\"
	-d	--dir_name	Root directory of the HTML for use in the browser. Default is current directory.
	-f	--function	File containing information about additional gene cluster annotation.
	-a	--fasta		Either directory containing the cluster multiple alignment files or a single fasta file. Multiple inputs could be added in a comma separated list
	-t	--tree		File containing the phylogeny of the genomes in newick format. (Requires BioPerl to use)
	-m	--metafiles	File containing group metadata for the genomes. Only used in connection with tree.
	-g	--graphic	File containing the graphics configure file to change the size of the whole image and parts of the image

make_panacea_flatfile.pl

make_panacea_flatfile.pl generates the panchromosome PanACEA flat file from pangenome output from various programs, including PanOCT. This file is required to run the PanACEA visualizer.

running make_panacea_flatfile.pl

make_panacea_flatfile.pl [options]
Options:
	-i	--input 	Input directory of the Pangenome information. Required.
	-o	--output 	Output file. Default is PanACEA.[date].txt
	-t	--type		Input type of the Pangenome. Currently supports "Pangenome" <default>, Pangenome "Iterative", and "Panoct".

make_conf_file.pl

make_conf_file.pl generates a functional configuration file used by make_panacea.pl to color the genes and make the tables for the terms. It is designed to work on the output of the JCVI Pangenome pipeline.

running make_conf_file.pl

make_conf_file.pl [options]
Options:
    -d	--dir input pangenome directory. Default is current directory
	-o	--out output directory. Default is current directory
    -g	--go location of go term obo file. Default is current directory
    -a	--aro location of aro obo file. Default is current directory
    -h 	--help 

make_rgi_clusters.pl

make_rg\i_clusters.pl translates the RGI output file dataSummary.txt into the PanGenome annotation table format. Required to make it readable for PanACEA. Either best or all hits can be outputed

running make_rgi_clusters.pl

make_rgi_clusters.pl [options] -d directory -t type -o outputFlatFile
Options:
    -i	--input Location of the dataSummary.txt file. Required
	-o	--output output file. Default is "aro_centroid.list.txt" in the current directory
    -t 	--type Output type of the file, either "best" (only the best hit) or "all". Best is default 

Included Other Files

PanACEA.manual.pdf

PDF manual including screen shots and usage guides for both the command line and the web interface for PanACEA

run_panacea_example.sh

Shell script to run the scripts required to build the PanACEA web pages for the example data

example_dir/

Directory containing the example PanACEA data derived from the PanOCT run on the 19 Acinetobacter genomes:

Ehormaechei_PanACEA.tar.gz

Compressed file containing the multi-file output of the PanACEA run on Enterobactor cited in Chan et al 2016.

Acinetobacter_PanACEA.tar.gz

Compressed file containing the multi-file output of the PanACEA run on Acinetobacter genomes in the example files

About

PanACEA is a suite of PERL scripts that allows for visualization of prokaryotic pan-chromosomes, including core and variable regions.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages