problem at step Strain clustering and assembly: Invalid version format (non-numeric data) #57

Open
DDavila10 opened this issue Apr 8, 2021 · 10 comments


@DDavila10

Dear all,

I ran the test files and they worked perfectly. When I ran my own data, I hit a problem at the Strain clustering and assembly step. This is the error:

*** End window construction
Invalid version format (non-numeric data) at /beegfs/group_dv/home/DDavila/.perl/5.24.1/lib/perl5/Statistics/R.pm line 369

Could you please help me with this problem?

Thank you so much in advance for your time!

@jsgounot
Contributor

jsgounot commented Apr 12, 2021

Hi Davila,

Thanks for using OPERA-MS! To help you, I will need to look at your species coverage file. You can locate which species is problematic with: grep "Invalid version format" outdir/intermediate_files/strain_analysis/*.log. Look at the file name and, using the species name, could you please send me this file: outdir/intermediate_files/strain_analysis/**name_of_your_species**/contigs_window_cov?

Regards,
jsgounot
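
The lookup described above can be wrapped in a small helper that prints only the affected species names. This is a sketch, not part of OPERA-MS: the function name `list_failing_species` is hypothetical, and it assumes one `<species>.log` file per species directly under the strain analysis directory, as the grep command above suggests.

```shell
# Hypothetical helper: print the species whose strain-analysis log
# contains the "Invalid version format" error.
# Assumes one <species>.log file per species in the given directory.
list_failing_species() {
    for log in "$1"/*.log; do
        [ -e "$log" ] || continue        # glob matched nothing
        if grep -q "Invalid version format" "$log"; then
            basename "$log" .log         # strip directory and .log suffix
        fi
    done
}

# Usage (directory taken from the comment above):
# list_failing_species outdir/intermediate_files/strain_analysis
```

Each printed name corresponds to a sub-directory that should hold the `contigs_window_cov` file requested above.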

@DDavila10
Author

Hi Jean,

Thanks a lot for your answer and valuable help.
grep "Invalid version format" *.log showed that all species in the analysis are problematic: 46 species in total show the same message. For example:

Acutalibacter_muris.log:Invalid version format (non-numeric data) at /beegfs/group_dv/home/DDavila/.perl/5.24.1/lib/perl5/Statistics/R.pm line 369.

I will send you the contigs_window_cov file for this single species at "jsgounot+github" [email protected]. Please let me know if you do not receive the file.

Thanks a lot for your help!!!

Best,
Daniel

@jsgounot
Contributor

I'm able to read and process your file without issues. Can you check your Perl, Statistics::R module, and R versions?

perl --version
R --version
cpan -D Statistics::R

@DDavila10
Author

Hi Jean,

This is the output that I got:

Loading rlang/3.6.3
Loading requirement: gcc/6.3.0 openssl/1.1.0c

This is perl 5, version 24, subversion 1 (v5.24.1) built for x86_64-linux

Statistics::R

    (no description)
    F/FA/FANGLY/Statistics-R-0.34.tar.gz
    /beegfs/group_dv/home/DDavila/.perl/5.24.1/lib/perl5/Statistics/R.pm
    Installed: 0.34
    CPAN:      0.34  up to date
    Florent Angly (FANGLY)
    CENSORED

@DDavila10
Author

Hi Jean,

I re-ran the pipeline and still have the problem at line 369 of Statistics::R.

The problem is in this part:

  if ( version->parse($self->version) < version->parse('2.5.0') ) {

Do you have any idea on how to solve this?

Thanks a lot for the valuable help!

Best,

Daniel Davila
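
The line quoted above compares the R version string obtained by Statistics::R against 2.5.0 using Perl's core `version` module. The error text in this thread is what `version->parse` raises when its input does not begin with a number. The sketch below reproduces that behaviour in isolation. It does not call Statistics::R, the helper name `try_parse_version` is hypothetical, and the second sample string (a module-load banner ending up in the version string) is only an assumption for illustration, not a confirmed diagnosis.

```shell
# Hypothetical helper: feed a string to Perl's version->parse, as the quoted
# check does, and print "ok" or the error message that Perl raises.
try_parse_version() {
    perl -Mversion -e '
        my $v = eval { version->parse($ARGV[0]) };   # eval traps the die
        if (defined $v) { print "ok\n" } else { print $@ }
    ' -- "$1"
}

# A clean version string is accepted:
# try_parse_version "3.6.3"
# Anything starting with non-numeric text is rejected:
# try_parse_version "Loading rlang/3.6.3"
```

If the second call reproduces the message seen in the logs, inspecting what R actually prints at startup in the pipeline's environment would be a reasonable next step.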

@DDavila10
Author

Hi Jean,

Besides the problem above, I also have another error in nucmer.error which is:

20210413|122427|196763| ERROR: The following critical files could not be used
20210413|122427|196763| /beegfs/group_dv/home/DDavila/OPERA-MS/opera_assembly_reference_metagenome/../beegfs/group_dv/home/DDavila/OPERA-MS/opera_assembly_reference_metagenome/mouse1.opera.DB/intermediate_files/reference_clustering/NUCMER_OUT/temp_genome/258329-vs-33091_0.fa
20210413|122427|196763| Check your paths and file permissions and try again

Could this be related to the problem at line 369 in Statistics::R?

@jsgounot
Contributor

Hi Daniel,

The problem is linked to the Perl module and maybe to your version of Perl. I'm using the same version of the Statistics::R module as you, so I don't really know what's happening here.

First, you can try to update or reinstall the module. I don't think it will work, but it costs nothing to try and should be fast:
cpan install Statistics::R

But I think you should try OPERA-MS with a fresh Perl installation (I'm using v5.26.1).

Before running the whole OPERA-MS pipeline, you can check whether the libraries can be loaded. To do this, start an interactive session with perl -de1 and enter the following:

use warnings "all";
use Statistics::Basic qw(:all);
use Statistics::R;

I hope this will work,
Jean-Sebastien
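
The same load test can also be run non-interactively, which may be more convenient on a cluster node. A minimal sketch, assuming `perl` is on the PATH: the function name `check_perl_modules` is hypothetical, and the module names in the usage line are the ones from the comment above.

```shell
# Hypothetical helper: report whether each named Perl module can be loaded.
# Returns non-zero if at least one module fails to load.
check_perl_modules() {
    status=0
    for module in "$@"; do
        # -M<module> makes perl "use" the module before running the empty program
        if perl -M"$module" -e 1 >/dev/null 2>&1; then
            echo "OK      $module"
        else
            echo "MISSING $module"
            status=1
        fi
    done
    return "$status"
}

# check_perl_modules Statistics::Basic Statistics::R
```

Note that a module loading cleanly only shows that it is installed for this Perl. As the next comment shows, the error can still occur later at run time.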

@rhysnewell

I'm having the same issue. My Perl version is 5.26.2 and I have all Perl dependencies installed. Running the Perl debugger as described above and loading the statistics modules does not produce any errors or warnings.

@rhysnewell

rhysnewell commented Feb 21, 2022

I've figured out the NUCMER error. It seems OPERA-MS assumes you are running your analysis from within the folder where OPERA-MS is installed. For me that is in my home directory: /home/my_user_name/git/OPERA-MS/OPERA-MS.pl. So the nucmer step prepends that entire path to the location where it looks for the files, even if you aren't in your home directory, i.e. /home/n10853499/git/OPERA-MS/lustre/scratch/microbiome/n10853499/03-aviary_testing/00-zymo_hybrid_assembly_benchmark/opera-ms/SRR10084344/intermediate_files/
NOTE that I am in /lustre/scratch/microbiome/n10853499/03-aviary_testing/00-zymo_hybrid_assembly_benchmark/opera-ms/SRR10084344/intermediate_files/

I assume this produces the nucmer error and the subsequent strain analysis error.

Even if I run everything from within my home directory, inside the folder where I've downloaded the OPERA-MS files, it still does the same thing: /home/n10853499/git/OPERA-MS/home/n10853499/git/OPERA-MS/SRR10084344/intermediate_files/reference_clustering/NUCMER_OUT/temp_genome/1540-vs-1660_0.fa

Is there a particular file setup that OPERA-MS requires, or are we missing something?
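
The doubled paths quoted above are the kind of result produced when a base directory is joined unconditionally to a path that is already absolute. The sketch below contrasts such a naive join with one that leaves absolute paths alone. It is not OPERA-MS code: the function names are hypothetical and the example paths are shortened stand-ins for the ones in this comment.

```shell
# Naive join: always prefixes the base directory, even when the
# second argument is already an absolute path.
naive_join() {
    printf '%s/%s\n' "${1%/}" "${2#/}"
}

# Safer join: a path with a leading "/" is returned unchanged;
# only relative paths are placed under the base directory.
safe_join() {
    case "$2" in
        /*) printf '%s\n' "$2" ;;
        *)  printf '%s/%s\n' "${1%/}" "$2" ;;
    esac
}

# naive_join /home/user/git/OPERA-MS /lustre/scratch/run
# safe_join  /home/user/git/OPERA-MS /lustre/scratch/run
```

The naive version yields a path under the install directory that does not exist, which matches the shape of the paths in the nucmer error message.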

@jsgounot
Contributor

Regarding the nucmer issue, please see the answer in the dedicated thread, #63.


3 participants