nanodisco characterize error task 1 failed and RDS file issue #53

Open
BioRB opened this issue Sep 13, 2022 · 1 comment


BioRB commented Sep 13, 2022

Hello,
we are running nanodisco and got this error at the characterize step:

nanodisco characterize -p 4 -b baumani -d analysis/merged_difference/baumani_difference.RDS -o analysis/baumani_motifs -m GATC,CCWGG,GCACNNNNNNGTT,AACNNNNNNGTGC -t nn -r reference_genome/Acinetobacter_baumannii_ATCC_BAA_747.fasta
[2022-09-13 14:11:52] Load supplied current differences.
[2022-09-13 14:11:52] Check current differences file version.
Models for Guppy version 6.3.4+cfaa134 is not yet available but we are working on it.
Motif characterization will still proceed with the default model but obtained results might not be optimal.
Additional information can be found in our GitHub repository.
[2022-09-13 14:11:52] Determine motif signature center.
[2022-09-13 14:11:52]   Process GATC.
[2022-09-13 14:11:52]     Tag GATC occurrences.
[2022-09-13 14:11:55]     Score GATC modified position.
[2022-09-13 14:11:56]   Process CCWGG.
[2022-09-13 14:11:56]     Tag CCWGG occurrences.
[2022-09-13 14:11:57]     Score CCWGG modified position.
[2022-09-13 14:11:57]   Process GCACNNNNNNGTT.
[2022-09-13 14:11:57]     Tag GCACNNNNNNGTT occurrences.
[2022-09-13 14:11:59]     Score GCACNNNNNNGTT modified position.
[2022-09-13 14:11:59]   Process AACNNNNNNGTGC.
[2022-09-13 14:11:59]     Tag AACNNNNNNGTGC occurrences.
[2022-09-13 14:12:01]     Score AACNNNNNNGTGC modified position.
Error in { : 
  task 1 failed - "arguments imply differing number of rows: 1, 0"
Calls: find.signature.center -> %do% -> <Anonymous>
Execution halted

So we had a look at the RDS file generated during the nanodisco difference step. It looks like this:

contig	position	dir	strand	N_wga	N_nat	mean_diff	t_test_pval	u_test_pval
9a03e25654c44fe8_1	1	rev	t	0	0	NA	NA	NA
9a03e25654c44fe8_1	5001	rev	t	0	0	NA	NA	NA
9a03e25654c44fe8_1	10001	rev	t	0	0	NA	NA	NA
9a03e25654c44fe8_1	15001	rev	t	0	0	NA	NA	NA
9a03e25654c44fe8_1	20001	rev	t	0	0	NA	NA	NA
9a03e25654c44fe8_1	25001	rev	t	0	0	NA	NA	NA
9a03e25654c44fe8_1	30001	rev	t	0	0	NA	NA	NA
9a03e25654c44fe8_1	35001	rev	t	0	0	NA	NA	NA
9a03e25654c44fe8_1	40001	rev	t	0	0	NA	NA	NA
9a03e25654c44fe8_1	45001	rev	t	0	0	NA	NA	NA
9a03e25654c44fe8_1	50001	rev	t	0	0	NA	NA	NA
9a03e25654c44fe8_1	55001	rev	t	0	0	NA	NA	NA
9a03e25654c44fe8_1	60001	rev	t	0	0	NA	NA	NA
9a03e25654c44fe8_1	65001	rev	t	0	0	NA	NA	NA
9a03e25654c44fe8_1	70001	rev	t	0	0	NA	NA	NA
9a03e25654c44fe8_1	75001	rev	t	0	0	NA	NA	NA
9a03e25654c44fe8_1	80001	rev	t	0	0	NA	NA	NA
9a03e25654c44fe8_1	85001	rev	t	0	0	NA	NA	NA
9a03e25654c44fe8_1	90001	rev	t	0	0	NA	NA	NA
9a03e25654c44fe8_1	95001	rev	t	0	0	NA	NA	NA
9a03e25654c44fe8_1	100001	rev	t	0	0	NA	NA	NA
9a03e25654c44fe8_1	105001	rev	t	0	0	NA	NA	NA
9a03e25654c44fe8_1	110001	rev	t	0	0	NA	NA	NA
9a03e25654c44fe8_1	115001	rev	t	0	0	NA	NA	NA

No current differences are detected. This could be a biological issue, but it is also strange that positions are reported only every 5000 bp and only for reverse sequences. Do you have an explanation for this?
Do we have to set a parameter in the difference step, or upstream, to have all the bases covered (1, 2, 3, ...)?
Thanks for your kind help.
RB
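
For anyone hitting the same symptom, here is a minimal sketch of how the table above could be summarized before running characterize. It assumes the RDS content has first been exported from R to tab-separated text (for example with `readRDS` followed by `write.table`); the function name `summarize_differences` is hypothetical and not part of nanodisco. It only reports what the table contains and does not explain why coverage is missing.

```python
import csv
import io


def summarize_differences(tsv_text):
    """Summarize a tab-separated export of the current-difference table.

    Assumption: the column names match the table shown in this issue
    (contig, position, dir, strand, N_wga, N_nat, ...).
    """
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    # Distinct positions across all rows (this example has a single contig).
    positions = sorted({int(r["position"]) for r in rows})
    # Gaps between consecutive reported positions.
    steps = {b - a for a, b in zip(positions, positions[1:])}
    # Rows where both the WGA and the native sample have at least one read.
    covered = [r for r in rows if int(r["N_wga"]) > 0 and int(r["N_nat"]) > 0]
    return {
        "n_rows": len(rows),
        "n_covered": len(covered),
        "directions": sorted({r["dir"] for r in rows}),
        "position_steps": sorted(steps),
    }


# First three rows of the table posted above.
example = (
    "contig\tposition\tdir\tstrand\tN_wga\tN_nat\tmean_diff\tt_test_pval\tu_test_pval\n"
    "9a03e25654c44fe8_1\t1\trev\tt\t0\t0\tNA\tNA\tNA\n"
    "9a03e25654c44fe8_1\t5001\trev\tt\t0\t0\tNA\tNA\tNA\n"
    "9a03e25654c44fe8_1\t10001\trev\tt\t0\t0\tNA\tNA\tNA\n"
)
summary = summarize_differences(example)
print(summary)
```

If `n_covered` is 0, as it is for the rows shown here, no position has reads from both samples, so there are no current differences for the characterize step to score.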

BioRB changed the title from "nanodisco characterize error task 1 failed" to "nanodisco characterize error task 1 failed and RDS file issue" on Sep 13, 2022
Member

touala commented Nov 28, 2022

Hello @BioRB,

Thank you for reaching out, and sorry for the major delay. I hope you have figured out the issue already, but if not, I'll try to help you sort it out.

The information contained in the difference file suggests that there was no data to process during the nanodisco difference step. I would suggest loading the native and WGA .bam files from nanodisco preprocess in IGV to look at the data directly. Without more information, I would think that the genome used does not match the strain/species sequenced, or that your dataset is too shallow. Otherwise, an issue might have occurred during nanodisco preprocess.
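
As a quick complement to looking in IGV, one rough check is whether the contig names in the reference FASTA match those recorded in the .bam header. The sketch below is only an illustration. It assumes the header text was obtained separately (for example with `samtools view -H`), the helper names are hypothetical, and the contig names in the example are made up. Matching names do not prove that the reference is the right strain; they only rule out one simple mismatch.

```python
def fasta_contigs(fasta_text):
    """Return contig names: the first whitespace-delimited token of each '>' line."""
    names = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                names.append(parts[0])
    return names


def sam_header_contigs(header_text):
    """Return the SN values of the @SQ lines in a SAM/BAM text header."""
    names = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names


def compare_contigs(fasta_text, header_text):
    """Report which contig names are shared or unique to each source."""
    ref = set(fasta_contigs(fasta_text))
    bam = set(sam_header_contigs(header_text))
    return {
        "shared": sorted(ref & bam),
        "only_in_reference": sorted(ref - bam),
        "only_in_bam": sorted(bam - ref),
    }


# Hypothetical inputs for illustration only (not taken from the dataset in this issue).
fasta_text = ">contig_A some description\nACGT\n>contig_B\nGGCC\n"
header_text = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:contig_A\tLN:4\n@SQ\tSN:contig_C\tLN:4\n"
report = compare_contigs(fasta_text, header_text)
print(report)
```

Any name listed under `only_in_reference` or `only_in_bam` would be worth investigating before re-running nanodisco difference.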

Feel free to reach out again with more questions. I'll be more available in the future.

Best,

Alan
