Error during reference-based clustering (line 35: 38236 Floating point exception(core dumped)) #89

Open
ValeriiaLadyhina opened this issue Feb 21, 2024 · 2 comments

Comments

@ValeriiaLadyhina

Description of the problem

  1. Question regarding memory: I ran

/export/opt/sw/opera-ms/0.8.3/OPERA-MS//tools_opera_ms//mash dist -p 20 -d 0.90 /proj/pig_amresistance/NOBACKUP/Pilot_study/Tem_comp/OPERA_MS_DB/genomes.msh /proj/pig_amresistance/NOBACKUP/Pilot_study/Tem_comp/OPERA_MS/Sample_1_5A/intermediate_files/reference_clustering/MASH//PARTIAL_SKETCH//partial_Sketch0.msh > /proj/pig_amresistance/NOBACKUP/Pilot_study/Tem_comp/OPERA_MS/Sample_1_5A/intermediate_files/reference_clustering/MASH//mash_dist_0.dat

with a few different settings: 1) 20 CPUs, 20G RAM; 2) 15 CPUs, 30G RAM; 3) 10 CPUs, 50G RAM (the maximum available is 500 RAM), and I always get this Floating point exception (core dumped). The error appears within at most 40 seconds of starting the job.

The files I am working with are 11.31 Gbp of paired Illumina reads and 2.113282 Gbp of nanopore reads (environmental samples).

Can you also try to run the mash help command to check whether this is a compilation/system issue?

There is no problem with mash help.

@jsgounot
Contributor

Thank you Valeriia. This one will be difficult to resolve remotely. How big are the two mash files?

/proj/pig_amresistance/NOBACKUP/Pilot_study/Tem_comp/OPERA_MS_DB/genomes.msh
/proj/pig_amresistance/NOBACKUP/Pilot_study/Tem_comp/OPERA_MS/Sample_1_5A/intermediate_files/reference_clustering/MASH//PARTIAL_SKETCH//partial_Sketch0.msh

Would you be able to share the files when you have time? You can use WeTransfer if the files are big. Those files are mash sketches (k-mer subsamples of your genomes) and carry limited information (one could only guess which species are in your read set).

If you can't share the file for any reason, there is not much I can offer you. As it's purely a mash issue, you could try to install a newer version of mash (with conda for example) and rerun the command line with a different executable. My intuition remains that there is not enough memory, but this is just an intuition. Mash pairwise comparison can be heavy memory-wise.
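
One way to probe the memory hypothesis is to look at how the process died. The sketch below assumes a POSIX shell that reports a command killed by signal N as exit status 128+N, and Linux signal numbering (SIGFPE is 8, SIGKILL is 9). On Linux an out-of-memory kill typically shows up as SIGKILL, whereas "Floating point exception" is the shell's message for SIGFPE, which is often raised by an integer division by zero. The helper name describe_status and the MASH_CMD placeholder are hypothetical and introduced here only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: translate a shell exit status into a short diagnosis.
# Assumes the usual 128+N convention for a command killed by signal N.
describe_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "ok"
    elif [ "$status" -gt 128 ]; then
        sig=$((status - 128))
        case "$sig" in
            8) echo "killed by signal 8 (SIGFPE)" ;;
            9) echo "killed by signal 9 (SIGKILL, possibly out of memory)" ;;
            *) echo "killed by signal $sig" ;;
        esac
    else
        echo "exited with status $status"
    fi
}

# Usage (MASH_CMD stands in for the full mash dist command line above):
#   $MASH_CMD > mash_dist_0.dat
#   describe_status $?
```

If the status decodes to SIGFPE with every memory setting, that would point away from a plain out-of-memory kill, although it does not rule out a size-related overflow inside the program.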

@ValeriiaLadyhina
Author

Good day!

  1. genomes.msh is 112G
  2. partial_Sketch0.msh is 3.3M

I tried installing a newer version of mash, but it didn't help.

If it is indeed a memory problem, do you have a sense of how much memory might be enough for data of this size?
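
A rough first check, not an answer from the maintainers, is to compare the requested memory with the combined size of the two sketch files. The sketch below rests on an assumption that is not confirmed in this thread: that mash dist needs at least the reference sketch in memory. The helper names total_gib and fits_in are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the combined size of the given files in whole GiB,
# rounded up. wc -c is used for portability across systems.
total_gib() {
    total=0
    for f in "$@"; do
        bytes=$(wc -c < "$f")
        total=$((total + bytes))
    done
    # 1 GiB = 1073741824 bytes; add (1 GiB - 1) to round up.
    echo $(( (total + 1073741823) / 1073741824 ))
}

# Hypothetical helper: succeed when the requested memory (in GiB) is at least
# the combined file size. This is a lower bound only, under the assumption
# stated above; real usage may be higher.
fits_in() {
    requested=$1
    shift
    [ "$requested" -ge "$(total_gib "$@")" ]
}

# Usage:
#   fits_in 50 genomes.msh partial_Sketch0.msh || echo "request more memory"
```

With the sizes reported above (112G and 3.3M), a 50G allocation would fail this check, so under that assumption a job request above the size of genomes.msh would be the next thing to try.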
