fast5_pass only or also fast5_fail? #59

ecpierce · 2022-11-30T21:49:00Z

Hi!

I am currently re-basecalling my raw fast5 files with the fast5_out option so that I can input them into nanodisco. Do you recommend basecalling fast5 files from both the fast5_pass and fast5_fail folders that Guppy creates during live basecalling, or should I only use the fast5_pass files? Would you expect interesting modifications (not necessarily just methylation of specific residues) to produce reads with lower quality scores on average, which would make fast5_fail interesting as well?

Thanks! Emily

touala · 2022-12-22T10:04:16Z

Hi Emily,

nanodisco was implemented using unfiltered input fast5 files so that it can handle most situations users will face. In practice, I would consider two things when deciding between pass-only reads and all reads. First, if coverage is limited with pass-only reads, then adding the remaining reads could help. Second, if you observe an enrichment of fail reads in certain regions of interest, including them is worth considering. In my experience, I do not expect pass-only reads to miss motifs, because methylation motifs have many occurrences across the genome, but there might be rare situations that I'm not aware of.
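
As a rough illustration of the first consideration, here is a minimal sketch that compares mean coverage from pass-only reads with coverage from pass plus fail reads. It is not part of nanodisco: the function names, the `min_coverage` threshold, and the toy read lengths and genome size are all assumptions made for the example, and the right threshold depends on your own data.

```python
def mean_coverage(read_lengths, genome_size):
    """Mean depth, estimated as total sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(read_lengths) / genome_size


def should_add_fail_reads(pass_lengths, fail_lengths, genome_size, min_coverage):
    """Suggest adding fail reads when pass-only coverage is below min_coverage.

    min_coverage is a hypothetical, user-chosen threshold, not a nanodisco setting.
    Returns (add_fail, pass_only_coverage, pass_plus_fail_coverage).
    """
    pass_cov = mean_coverage(pass_lengths, genome_size)
    all_cov = mean_coverage(list(pass_lengths) + list(fail_lengths), genome_size)
    return pass_cov < min_coverage, pass_cov, all_cov


# Toy numbers, for illustration only.
pass_lengths = [5000] * 800   # 4,000,000 bases in pass reads
fail_lengths = [4000] * 250   # 1,000,000 bases in fail reads
genome_size = 100_000         # hypothetical genome length in bases

add_fail, pass_cov, all_cov = should_add_fail_reads(
    pass_lengths, fail_lengths, genome_size, min_coverage=50
)
print(f"pass only: {pass_cov:.1f}x, pass + fail: {all_cov:.1f}x, add fail reads: {add_fail}")
```

In practice the read lengths would come from your own basecalling output, and the same comparison could be repeated per region to look for the fail-read enrichment mentioned in the second consideration.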

I hope this helps.

Best,

Alan
