speed up #14

Open
mictadlo opened this issue Jan 3, 2019 · 3 comments

Comments

@mictadlo

mictadlo commented Jan 3, 2019

Hi,
I am comparing two assemblies (2.7 G) against each other, but Satsuma has been running for more than 140 hours. Does anyone know whether there is a way to split the data and run it on multiple nodes?

Thank you in advance,

Michal

@jonwright99
Contributor

Hi Michal,

For larger genomes (>1 Gb), we recommend using one chromosome of one genome as the query sequence and the entire other genome as the target sequence, and processing the alignments one query chromosome at a time.

Best,
Jon
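
A minimal sketch of the splitting step this recommendation implies, assuming the query genome is a single multi-FASTA file with one record per chromosome. The function name `split_fasta` and the output layout are hypothetical and are not part of Satsuma2.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Satsuma2): write each record of a
# multi-FASTA file to its own file, named after the sequence ID.
# Usage: split_fasta genome.fasta outdir
split_fasta() {
    local fasta="$1" outdir="$2"
    mkdir -p "$outdir"
    awk -v dir="$outdir" '
        /^>/ {
            if (out) close(out)                  # avoid running out of file handles
            name = substr($1, 2)                 # ID = header text up to the first space
            gsub(/[^A-Za-z0-9._-]/, "_", name)   # keep the file name shell-safe
            out = dir "/" name ".fasta"
        }
        out { print > out }                      # header and sequence lines go to the current file
    ' "$fasta"
}
```

Each resulting file could then be passed as the query (`-q`) with the whole other genome as the target (`-t`), one job per chromosome, which also allows the jobs to be spread over several nodes.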

@mictadlo
Author

mictadlo commented Feb 26, 2019

Hi Jon,
How can I merge Satsuma's per-chromosome output files so that I can create a MizBee input file?

Thank you in advance,

Michal
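
One possible approach, offered as a sketch only: concatenate the per-chromosome `satsuma_summary.chained.out` files into one file. This assumes each run wrote to its own subdirectory, that the chained output is plain line-oriented text with no header line, and that no alignment appears in more than one run. None of this is confirmed in this thread, and neither is whether MizBee accepts the merged file directly. The function name `merge_chained` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: concatenate per-chromosome chained outputs.
# Assumed layout: <root>/<chromosome>/satsuma_summary.chained.out
# Usage: merge_chained results_root merged.chained.out
merge_chained() {
    local root="$1" merged="$2" f
    : > "$merged"                                   # start from an empty file
    for f in "$root"/*/satsuma_summary.chained.out; do
        [ -e "$f" ] || continue                     # skip if the glob matched nothing
        cat "$f" >> "$merged"
    done
}
```

It would be worth checking the line count of the merged file against the sum of the inputs before converting it any further.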

@kushalsuryamohan

Hi @mictadlo, were you able to resolve this issue?
If not, dear @jonwright99 / @bjclavijo or other developers of Satsuma2, I'm running into a similar issue: I'm comparing 10 de novo genomes (chromosome-level assemblies) against a reference genome. All genomes are ~1.5-1.8 Gb in size and the jobs have been running for more than 24 hours.

If I were to run Satsuma2 for each chromosome, I would appreciate guidance on how to combine the output from Satsuma and generate a figure using the ChromosomePaint command.
Here's my guess at how to proceed:

  1. Generate chained outputs per reference genome chromosome
  2. Generate block display output per chromosome
  3. Run ChromosomePaint per chromosome
  4. Repeat steps 1-3 for all reference genome chromosomes
  5. Use an image editing tool such as Illustrator to combine the ChromosomePaint outputs.

Here's my command per chromosome of the target reference genome:

#Chromosome 1 of target ref genome
SatsumaSynteny2 -t ref_chr1.fasta -q query_all_scaffs_1mb_longer.fasta -o . -slaves 6
BlockDisplaySatsuma -i satsuma_summary.chained.out -q query_all_scaffs_1mb_longer.fasta -t ref_chr1.fasta -s 1000000 > query_chr1_synteny_blockdisplay.txt
ChromosomePaint -i query_chr1_synteny_blockdisplay.txt -o Chrom_paint_query_vs_ref_chr1.ps

I have 10 genomes of interest so if there is an easier way to do this, that would be much appreciated.
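
The per-chromosome commands above could be wrapped in a loop along these lines. This is a sketch, not a tested pipeline: the flags are copied unchanged from the commands in this comment, while the directory layout, the output file names `blockdisplay.txt` and `chrom_paint.ps`, the `DRY_RUN` switch, and the function names are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: repeat the three commands from the comment once per reference chromosome.

run() {
    # Log the command to stderr; execute it unless DRY_RUN=1 (hypothetical switch).
    printf '%s\n' "$*" >&2
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Usage: paint_per_chromosome ref_chr_dir query.fasta out_root
paint_per_chromosome() {
    local chr_dir="$1" query="$2" out_root="$3" chr name out
    for chr in "$chr_dir"/*.fasta; do
        [ -e "$chr" ] || continue               # skip if the glob matched nothing
        name=$(basename "$chr" .fasta)
        out="$out_root/$name"                   # one output directory per chromosome
        mkdir -p "$out"
        run SatsumaSynteny2 -t "$chr" -q "$query" -o "$out" -slaves 6
        run BlockDisplaySatsuma -i "$out/satsuma_summary.chained.out" \
            -q "$query" -t "$chr" -s 1000000 > "$out/blockdisplay.txt"
        run ChromosomePaint -i "$out/blockdisplay.txt" -o "$out/chrom_paint.ps"
    done
}

# Example (dry run first to inspect the commands):
#   DRY_RUN=1 paint_per_chromosome ref_chrs query_all_scaffs_1mb_longer.fasta results
```

For 10 query genomes, the same function could be called once per genome with a different `out_root`. Because each chromosome writes to its own directory, the iterations are independent and could be submitted as separate cluster jobs.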
