Nanodisco difference: no signal difference for majority of a genome #68

Open
christinehe opened this issue Mar 10, 2023 · 5 comments

@christinehe

christinehe commented Mar 10, 2023

Hi,

I'm running nanodisco against a curated genome sequence known to be present in the sample (from Illumina data and assembly of the ONT data). The alignments from nanodisco preprocess show reasonably convincing read support across the genome.

However, nanodisco is unable to find any motifs. Over 70% of the positions in the merged signal differences file have no current values. It seems odd that a current difference would be found at one position, with no current values for the neighboring positions:

"contig" "position" "dir" "strand" "N_wga" "N_nat" "mean_diff" "t_test_pval" "u_test_pval"
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2244 "fwd" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2244 "rev" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2245 "fwd" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2245 "rev" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2246 "fwd" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2246 "rev" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2247 "fwd" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2247 "rev" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2248 "fwd" "t" 11 16 13.7768837637467 0.000430954878020719 6.13595983093897e-07
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2248 "rev" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2249 "fwd" "t" 6 22 1.68108663813591 0.0982472566548756 0.0995487604183256
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2249 "rev" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2250 "fwd" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2250 "rev" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2251 "fwd" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2251 "rev" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2252 "fwd" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2252 "rev" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2253 "fwd" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2253 "rev" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2254 "fwd" "t" 6 15 0.320795190325512 0.791074140784466 0.62218045112782
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2254 "rev" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2255 "fwd" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2255 "rev" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2256 "fwd" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2256 "rev" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2257 "fwd" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2257 "rev" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2258 "fwd" "t" 5 23 3.37158443622192 0.00454686242416758 0.00400895400895401
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2258 "rev" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2259 "fwd" "t" 13 10 -1.6600018896479 0.0614309265440312 0.101011654922006
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2259 "rev" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2260 "fwd" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2260 "rev" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2261 "fwd" "t" 5 11 10.6930921222628 0.000372580357190029 0.0086996336996337
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2261 "rev" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2262 "fwd" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2262 "rev" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2263 "fwd" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2263 "rev" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2264 "fwd" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2264 "rev" "t" 0 0 NA NA NA

Any advice is appreciated! Happy to provide more info if helpful.
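For anyone wanting to quantify how much of such a table is empty, here is a minimal Python sketch. It assumes the merged differences have been exported as the space-separated, quoted text shown above (the column names are taken from that excerpt). The helper name `summarize` and the embedded sample are illustrative only and are not part of nanodisco.

```python
import csv
import io

# A few rows copied from the excerpt above (space-separated, quoted strings).
SAMPLE = '''"contig" "position" "dir" "strand" "N_wga" "N_nat" "mean_diff" "t_test_pval" "u_test_pval"
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2247 "fwd" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2247 "rev" "t" 0 0 NA NA NA
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2248 "fwd" "t" 11 16 13.7768837637467 0.000430954878020719 6.13595983093897e-07
"SRVP_Atabeyarchaeota_1_curated.FINAL" 2248 "rev" "t" 0 0 NA NA NA
'''


def summarize(handle):
    """Return {dir: (total_rows, rows_without_mean_diff)} for a text export."""
    reader = csv.DictReader(handle, delimiter=" ", quotechar='"')
    counts = {}
    for row in reader:
        total, missing = counts.get(row["dir"], (0, 0))
        is_missing = 1 if row["mean_diff"] == "NA" else 0
        counts[row["dir"]] = (total + 1, missing + is_missing)
    return counts


if __name__ == "__main__":
    for direction, (total, missing) in summarize(io.StringIO(SAMPLE)).items():
        print(f"{direction}: {missing}/{total} rows without signal values")
```

Splitting the count by the "dir" column makes it easier to see whether the missing values are spread evenly or concentrated on one strand direction.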

@touala
Member

touala commented Mar 11, 2023

Hi @christinehe,

Thank you for trying nanodisco. It is indeed surprising to have partial data. What is the general coverage, or the coverage distribution, of the native and WGA files? I've implemented a minimum coverage threshold of 5x that might have kicked in and stopped it from reporting lower-confidence statistics.

Best,

Alan
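A quick way to sanity-check the threshold explanation against the rows quoted in the opening post is sketched below in Python. The tuples are the (position, N_wga, N_nat) values of the rows that carry statistics in that excerpt. The `passes` function is a hypothetical stand-in that assumes both samples must reach the minimum, which may not be how nanodisco applies the threshold internally.

```python
# (position, N_wga, N_nat) for the fwd rows in the excerpt that have statistics.
REPORTED = [
    (2248, 11, 16),
    (2249, 6, 22),
    (2254, 6, 15),
    (2258, 5, 23),
    (2259, 13, 10),
    (2261, 5, 11),
]


def passes(n_wga, n_nat, min_cov=5):
    """Hypothetical filter: both samples need at least min_cov reads."""
    return n_wga >= min_cov and n_nat >= min_cov


min_wga = min(n_wga for _, n_wga, _ in REPORTED)
min_nat = min(n_nat for _, _, n_nat in REPORTED)
all_pass = all(passes(n_wga, n_nat) for _, n_wga, n_nat in REPORTED)

if __name__ == "__main__":
    print(f"lowest N_wga: {min_wga}, lowest N_nat: {min_nat}, all pass 5x: {all_pass}")
```

In the excerpt no reported row has fewer than 5 WGA reads, which is at least consistent with a 5x cutoff being applied.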

@christinehe
Author

Thanks @touala, this makes sense as the mean coverage depth in the WGA sample is unfortunately only 1.7x. Knowing the caveats, I'd still like to try running nanodisco with a lower threshold. I edited the threshold in difference.sh and am running in a Singularity sandbox. Any other recommendations for how best to change this threshold?
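To get a rough feel for how restrictive a 5x minimum is at a mean depth of 1.7x, here is a small Python sketch. It assumes per-position depth follows a Poisson distribution, which is a simplification (real coverage, especially from amplified samples, is often much less even), and the function name `prob_at_least` is illustrative only.

```python
import math


def prob_at_least(k_min, mean_depth):
    """P(depth >= k_min) when depth is modeled as Poisson(mean_depth)."""
    below = sum(
        math.exp(-mean_depth) * mean_depth**k / math.factorial(k)
        for k in range(k_min)
    )
    return 1.0 - below


if __name__ == "__main__":
    p = prob_at_least(5, 1.7)
    print(f"Expected fraction of positions with >= 5 reads: {p:.3f}")
```

Under this simplified model only roughly 3% of positions would reach 5 reads, so a large share of empty positions is expected at this depth. The exact share depends on how unevenly the reads are actually distributed.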

@touala
Member

touala commented Mar 15, 2023

You're welcome. To be clear, we do not recommend lowering the minimum coverage threshold in general, as it will result in a loss of accuracy. Adding WGA data will always be the best solution. But feel free to experiment with this threshold. BTW, you can use the -e options in nanodisco difference to modify the coverage requirement (see here).

Please let me know how it goes.

Alan

@christinehe
Author

Unfortunately lowering the coverage threshold did not help. Thanks for your responsiveness - I'll see if generating more WGA data is an option.

@touala
Member

touala commented Mar 16, 2023

It was indeed a long shot, but thanks for the feedback. If you're going to generate more data, the native sample would also benefit from increased coverage, assuming the 10-25x reported above is global. Extended Data Fig. 7 of our paper shows the positive effect of increased coverage on the analysis (up to 200x).

Alan
