fast5_pass only or also fast5_fail? #59

Open
ecpierce opened this issue Nov 30, 2022 · 1 comment

Comments

@ecpierce

Hi!

I am currently re-basecalling my raw fast5 files using the fast5_out option so that I can input them into nanodisco. Do you recommend basecalling fast5 files from both the fast5_pass and fast5_fail folders that Guppy creates during live basecalling, or should I only use the fast5_pass files? Would you expect interesting modifications (not necessarily just methylation of specific residues) to lead to reads with lower quality scores on average, so that fast5_fail might also be interesting?

Thanks! Emily

@touala
Member

touala commented Dec 22, 2022

Hi Emily,

nanodisco was implemented using unfiltered input fast5 files so that it can handle most situations users will face. In practice, I would consider two things when deciding whether to use pass reads only or all data. First, if coverage is limited with pass-only reads, then adding the remaining reads could help. Second, if you observe an enrichment of fail reads in certain regions of interest, including them may be worthwhile. From my experience, I do not expect that using only pass reads would miss motifs, since methylation motifs have many occurrences across the genome, but there might be rare situations that I'm not aware of.
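
The two checks above can be sketched in a few lines of Python. This is a rough illustration only and not part of nanodisco: the function names, the `min_coverage` threshold, and the per-region read counts are all hypothetical, and you would need to obtain the read lengths and region counts from your own basecalling or mapping output.

```python
def mean_coverage(read_lengths, genome_size):
    """Approximate mean depth as total sequenced bases / genome length."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(read_lengths) / genome_size


def should_add_fail_reads(pass_read_lengths, genome_size, min_coverage):
    """True when pass-only coverage is below a user-chosen threshold.

    min_coverage is a hypothetical cutoff; pick it for your own experiment.
    """
    return mean_coverage(pass_read_lengths, genome_size) < min_coverage


def fail_fraction_by_region(pass_counts, fail_counts):
    """Share of reads in each region that are fail reads.

    Both arguments map a region name to a read count (hypothetical inputs).
    Regions with no reads at all are skipped.
    """
    fractions = {}
    for region in set(pass_counts) | set(fail_counts):
        n_pass = pass_counts.get(region, 0)
        n_fail = fail_counts.get(region, 0)
        total = n_pass + n_fail
        if total:
            fractions[region] = n_fail / total
    return fractions
```

A region whose fail fraction is clearly higher than the rest of the genome would be a candidate for the second case, where including the fail reads might matter.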

I hope this helps.

Best,

Alan
