Error with Mummer #70

Open
iek opened this issue Sep 29, 2021 · 12 comments

@iek

iek commented Sep 29, 2021

Hello, I am trying to run Opera-MS on the test dataset and have run into an error about Gapfilling and Mummer. To troubleshoot, I removed the "> /dev/null 2> /dev/null" line in "run_mummer_large_ref.pl" because I was unable to see exactly what the error was. I then received this message:

[Screenshot of the error message; image not available]

It seems that Mummer is missing the Query file?

@jsgounot
Contributor

Hi. Thanks for the bug report. I will look at this as soon as I have time.

@iek
Author

iek commented Sep 29, 2021

Thank you very much!

@jsgounot
Contributor

Hi iek, sorry for the late reply on this, we've been very busy. Could you share your mummer/cmd_delta.txt file and see if you can run one of the command lines found inside?

@jsgounot
Contributor

Hi everyone, we've worked on this issue with another user, and it may have several causes.

If you see this error for the first time

Please look first at the first lines of your intermediate file mummer/cmd_delta.txt which should contain multiple nucmer command lines. Pick one and try to run it without the redirection at the end (> /dev/null 2> /dev/null). Note that the behavior of this command line might be different if you run it inside an interactive session or from a cluster job. Don't forget to load the environment used for your OPERA-MS run too.
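A minimal sketch of that first step, assuming each line of cmd_delta.txt ends with the exact redirection quoted above. The function name first_nucmer_cmd is hypothetical and not part of OPERA-MS; it only prints the command so it can be reviewed before being run by hand.

```shell
#!/usr/bin/env bash
# Print the first command from a cmd_delta.txt file with the trailing
# "> /dev/null 2> /dev/null" redirection removed, so that nucmer's own
# messages stay visible when the command is run manually.
# first_nucmer_cmd is a hypothetical helper name, not an OPERA-MS tool.
first_nucmer_cmd() {
    local cmd_file="$1"
    # Take the first line, then strip the redirection suffix if present.
    head -n 1 "$cmd_file" |
        sed -E 's#[[:space:]]*>[[:space:]]*/dev/null[[:space:]]+2>[[:space:]]*/dev/null[[:space:]]*$##'
}
```

For example, `first_nucmer_cmd mummer/cmd_delta.txt` prints the cleaned command. Run the printed command in the same environment (interactive session or cluster job) as the OPERA-MS run, since the behavior may differ between them.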

See below if your error has been listed; otherwise let us know by posting your error in this thread as a new comment.

Can't locate Foundation.pm in @INC

It looks like Perl is not able to find the Foundation module used by MUMmer. Try a clean reinstall of MUMmer in tools_opera_ms/MUMmer3.23. Look at tools_opera_ms/install_mummer3.23.sh to see how it was installed automatically with OPERA-MS. Confirm that the MUMmer directory has not been moved. On some clusters the configuration can lead to errors, so check with your system administrator. Ultimately, you need to check that the executable tools_opera_ms/MUMmer3.23/nucmer works. This means that you can also provide another nucmer executable through a symlink; note, however, that we have not tested OPERA-MS with MUMmer versions other than 3.23.
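A rough sketch of a pre-check for the installation described above. It assumes only that the MUMmer directory should contain an executable named nucmer and a file named Foundation.pm somewhere below it. The function name check_mummer_dir is hypothetical and not part of OPERA-MS or MUMmer.

```shell
#!/usr/bin/env bash
# Report whether a MUMmer directory contains an executable nucmer and a
# Foundation.pm module. Returns 0 only if both are present.
# check_mummer_dir is a hypothetical helper name used for illustration.
check_mummer_dir() {
    local mummer_dir="$1"
    local status=0
    if [ -x "$mummer_dir/nucmer" ]; then
        echo "nucmer: executable"
    else
        echo "nucmer: missing or not executable"
        status=1
    fi
    # Search the whole directory rather than assuming where the module lives.
    if [ -n "$(find "$mummer_dir" -type f -name 'Foundation.pm' 2>/dev/null)" ]; then
        echo "Foundation.pm: found"
    else
        echo "Foundation.pm: not found"
        status=1
    fi
    return "$status"
}
```

A typical call would be `check_mummer_dir tools_opera_ms/MUMmer3.23`. This only checks that the files exist. The decisive test remains running nucmer itself, since that is where the "Can't locate Foundation.pm" message would appear.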

In the long run

This issue is part of our motivation to provide a conda packaging with external tools not being directly linked to OPERA-MS. We're still working on this, sorry for the delay.

@Bordeterre

Bordeterre commented Feb 17, 2023

I have a similar error on my own dataset (it works fine on the test dataset):
The OPERA-MS log says there's an error in gap filling, and invites me to check gap_filling.err.
I do, and it says there was an error during tilling generation, and invites me to check tilling_1.out and tilling_1.err.
The .out exists but is empty, and in the .err, in the " *** run the mummer mapping" step, there's an "Error in during nucmer."

In the "intermediate_files/opera_long_read/GAPFILLING/mummer/cmd_delta.txt" file, there's only a single nucmer command line :
/scratch/nimauric/metagenomic_benchmark/workflow/dependencies/OPERA-MS//tools_opera_ms//MUMmer3.23//nucmer --nosimplify --maxmatch -p /scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713///intermediate_files/opera_long_read/GAPFILLING/mummer/split_1.fa_split_1.fa /scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713///intermediate_files/opera_long_read/GAPFILLING/TILLING/REF/split_1.fa /scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713///intermediate_files/opera_long_read/GAPFILLING/TILLING/QUERY/split_1.fa > /dev/null 2> /dev/null

When I run it without the "> /dev/null 2> /dev/null", I get this output :
1: PREPARING DATA
2,3: RUNNING mummer AND CREATING CLUSTERS
# reading input file "/scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713///intermediate_files/opera_long_read/GAPFILLING/mummer/split_1.fa_split_1.fa.ntref" /scratch/nimauric/metagenomic_benchmark/workflow/dependencies/OPERA-MS/tools_opera_ms/MUMmer3.23/mummer: empty sequence in multiple fasta file
ERROR: mummer and/or mgaps returned non-zero

I check the content of "/scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713///intermediate_files/opera_long_read/GAPFILLING/mummer/split_1.fa_split_1.fa.ntref", and it only contains this line :
>allcontigs /scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713/intermediate_files/opera_long_read/GAPFILLING/TILLING/REF/split_1.fa

"/scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713/intermediate_files/opera_long_read/GAPFILLING/TILLING/REF/split_1.fa" itself exists, but is an empty file.

I also have a "nucmer.error" file, which contains : 20230217|150203| 12678| ERROR: mummer and/or mgaps returned non-zero

I tried reinstalling mummer, and it didn't fix the problem.

@jsgounot
Contributor

Hi Bordeterre,

sorry to hear that! It looks like OPERA-MS generated an empty fasta file in your case. I'll look into this as soon as possible but it might be a rare unlucky error. Maybe resampling reads could solve this issue if you want to try a very quick fix.

Note for me: Check whether this results from a special case within the split_fasta_file function.

JS
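To see which intermediate FASTA files are empty, like the split_1.fa reported above, something along these lines may help. It is a sketch that assumes the files of interest end in .fa or .fasta. The function name list_empty_fasta is hypothetical.

```shell
#!/usr/bin/env bash
# List zero-byte FASTA files below a directory, one path per line.
# list_empty_fasta is a hypothetical helper name used for illustration.
list_empty_fasta() {
    local dir="$1"
    # -size 0 matches only files that occupy zero blocks, i.e. empty files.
    find "$dir" -type f \( -name '*.fa' -o -name '*.fasta' \) -size 0
}
```

Pointing it at the GAPFILLING directory of a run (for example `list_empty_fasta intermediate_files/opera_long_read/GAPFILLING`) lists every empty FASTA file in one pass instead of opening them one by one.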

@jsgounot
Contributor

Hi Bordeterre,

could you check for me the content of your file /scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713/intermediate_files/opera_long_read/GAPFILLING/TILLING/consensus.fa?

If it's not empty, could you send it to me for testing purposes?
If it is empty, could you check your log file at intermediate_files/opera_long_read/GAPFILLING/consensus_cmd.sh and try to run one of the commands here (without the log redirection)?

Thank you very much and sorry for this issue!
JS

@Bordeterre

Bordeterre commented Feb 21, 2023

Hi JS,

/scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713/intermediate_files/opera_long_read/GAPFILLING/TILLING/consensus.fa is indeed an empty file.

intermediate_files/opera_long_read/GAPFILLING/consensus_cmd.sh is also an empty file.

Concerning the resampling of reads (perhaps I should have mentioned this sooner): I have a biased dataset. The short and long reads were sequenced from the same community, but I filtered the long reads to keep only those that map to one of two specific bacteria present in the community, as a way to produce a lighter dataset for faster testing. This filtered dataset did produce satisfactory contigs with a pure long-read assembler. Do you think the problem might stem from this discrepancy between the short- and long-read datasets?

Thanks for this assembler and your help on this issue,
NM

@jsgounot
Contributor

OK, this is definitely very weird. I'm not sure of the exact source of the error, though.

Could you share the content of your gapfilling directory?
tree /scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713/intermediate_files/opera_long_read/GAPFILLING/

If you have the file extract_read.err inside this folder, could you paste its content here?

Thanks,
JS

@Bordeterre

With ls -lh, I get :

-rw-r--r-- 1 nimauric genscale    0 Feb 17 14:32 consensus_cmd.sh
-rw-r--r-- 1 nimauric genscale    0 Feb 17 14:32 consensus_cmd.sh.log
-rw-r--r-- 1 nimauric genscale    0 Feb 17 14:32 contig_extention.log
-rw-r--r-- 1 nimauric genscale    0 Feb 17 14:32 edge_err
-rw-r--r-- 1 nimauric genscale  783 Feb 17 14:32 extract_read.err
-rw-r--r-- 1 nimauric genscale 4.7K Feb 17 14:32 gap_filling.err
-rw-r--r-- 1 nimauric genscale    0 Feb 17 14:32 gap_size.dat
drwxr-xr-x 2 nimauric genscale    0 Feb 17 14:32 LOG
drwxr-xr-x 2 nimauric genscale    5 Feb 17 14:53 mummer
drwxr-xr-x 4 nimauric genscale    5 Feb 17 14:32 TILLING

In extract_read.err, I have:

 *** Starting the sequence extraction
rm -rf /scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713///intermediate_files/opera_long_read/GAPFILLING/LOG
mkdir -p /scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713///intermediate_files/opera_long_read/GAPFILLING/LOG
 *** Number of edges selected for gapfilling 0
 /scratch/nimauric/metagenomic_benchmark/workflow/../data/assemblies/opearams_SRR8073716_toy-SRR8073713///intermediate_files/opera_long_read/GAPFILLING/contig_extention.log
 *** Extract gap sequences from read file /scratch/nimauric/metagenomic_benchmark/workflow/../data/input_reads/toy-SRR8073713.fastq
 *** Reading contig file
 *** Read the scaffold file and fill gaps

@jsgounot
Contributor

jsgounot commented Mar 2, 2023

Sorry for the late reply. There are no edges selected for gapfilling, which is why OPERA-MS is crashing. This is something we should be able to fix, but I try to touch the OPERA-MS code as little as I can. I imagine this would not happen with real data and might originate from your test dataset. Would you mind increasing the number of long reads, if you haven't done so already?
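This diagnosis can be checked directly in the log. A minimal sketch, assuming extract_read.err contains the "Number of edges selected for gapfilling" line in the form quoted in this thread. The function name gapfilling_edge_count is hypothetical and not part of OPERA-MS.

```shell
#!/usr/bin/env bash
# Print the edge count reported in an extract_read.err log.
# Assumes a line of the form shown in this thread:
#   " *** Number of edges selected for gapfilling 0"
# gapfilling_edge_count is a hypothetical helper name used for illustration.
gapfilling_edge_count() {
    local log_file="$1"
    # Keep only the number that follows the message; use the first match.
    sed -n 's/.*Number of edges selected for gapfilling[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$log_file" |
        head -n 1
}
```

If `gapfilling_edge_count intermediate_files/opera_long_read/GAPFILLING/extract_read.err` prints 0, the run matches the failure described here, and adding long reads is the first thing to try.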

@Bordeterre

Bordeterre commented Mar 8, 2023

It did come from the biased subsampling; OPERA-MS produces an assembly on the full dataset.

Thank you!
