No data in chunks despite gzip conversion #64
Comments
Hello @Haidermanzer, Thank you for trying Best, Alan |
IGV is a mapping visualisation software. You can find the documentation and download link at this address: https://software.broadinstitute.org/software/igv/UserGuide. Basically, I wanted to make sure reads were properly mapped by Is your dataset public? Alan |
Hi @Haidermanzer, You can send me the files and I'll look ASAP. To diagnose the issue, I would need native and WGA fast5, and the reference you used. Best, Alan |
Great! Which email address should I share the OneDrive link with? |
You can share them at: [email protected] |
Hi @Haidermanzer, I got your data and I was able to run
# Download and unzip your dataset
# Using the most recent compatible guppy version, i.e. 6.3.8
# (later versions discontinued --fast5_out)
guppy_basecaller --disable_pings --recursive -i Nativefast5_pass --fast5_out -s Nativefast5_basecalled -c dna_r9.4.1_450bps_sup.cfg --device cuda:0
guppy_basecaller --disable_pings --recursive -i WGAfast5_pass --fast5_out -s WGAfast5_basecalled -c dna_r9.4.1_450bps_sup.cfg --device cuda:0
compress_fast5 -t 20 --recursive -c gzip -i Nativefast5_basecalled -s Nativefast5_basecalled_gzip
compress_fast5 -t 20 --recursive -c gzip -i WGAfast5_basecalled -s WGAfast5_basecalled_gzip
# Using an older singularity version to avoid having to bind directories
conda create -n singularity_3.5.2 -c conda-forge singularity=3.5.2
conda activate singularity_3.5.2
run_nanodisco="singularity exec <path_to/nanodisco> nanodisco"
$run_nanodisco preprocess -p 20 -f Nativefast5_basecalled_gzip -s native -o results/preprocessed_gzip -r Reference.fasta
$run_nanodisco preprocess -p 20 -f WGAfast5_basecalled_gzip -s wga -o results/preprocessed_gzip -r Reference.fasta
$run_nanodisco difference -nj 2 -nc 1 -p 5 -f 281 -l 290 -i results/preprocessed_gzip -o results/difference -w wga -n native -r Reference.fasta
# Make sure to process more, if not all, of the genome chunks for better performance
# $run_nanodisco chunk_info -r Reference.fasta
# Check output within R:
# summary(readRDS("results/difference/chunk.281.difference.rds"))
I'm not sure why it didn't work for you earlier. Was the Please let me know if this fixed your issue or if you have any other problem. Best, Alan |
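The commands above only process chunks 281 to 290. As a minimal sketch of covering the whole genome in batches, the helper below prints `first last` pairs that can be passed to `-f` and `-l`. The function name `chunk_ranges`, the batch size, the total of 25 chunks, and the assumption that chunks are numbered from 1 are all illustrative; the real chunk count should come from `nanodisco chunk_info -r Reference.fasta`. The loop only echoes the commands so they can be reviewed first.

```shell
#!/bin/sh
# Print "first last" chunk ranges covering 1..total in batches of a given size.
# Assumption: chunks are numbered starting at 1.
chunk_ranges() {
  total=$1
  size=$2
  first=1
  while [ "$first" -le "$total" ]; do
    last=$((first + size - 1))
    if [ "$last" -gt "$total" ]; then
      last=$total
    fi
    echo "$first $last"
    first=$((last + 1))
  done
}

# Hypothetical value: replace with the count reported by chunk_info.
total_chunks=25

chunk_ranges "$total_chunks" 10 | while read -r first last; do
  # echo prints the command instead of running it; drop "echo" to execute.
  echo nanodisco difference -nj 2 -nc 1 -p 5 -f "$first" -l "$last" \
    -i results/preprocessed_gzip -o results/difference \
    -w wga -n native -r Reference.fasta
done
```

When running through singularity as above, `nanodisco` in the echoed command would be replaced by `$run_nanodisco`.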
Hi Alan, Great! The steps you outlined look very similar to what I tried, apart from some version differences, but I'll now try the versions you used. In the meantime, could you use the OneDrive to send me the output from those steps, so I can work out where our output deviates from yours? |
OK. It could indeed be version specific. I've uploaded two current difference files: one for a single chunk and one for the whole genome. Do you know if a motif is modified in this sample so I can do a sanity check? Feel free to continue the discussion by email if you'd rather not discuss specifics here. Alan |
@touala just in case it makes a difference: which version of ont-fast5-api/compress_fast5 are you using? |
@ecpierce Unfortunately I didn't keep track of that and I have multiple versions available... Although the most likely is |
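For anyone comparing environments, one way to record the installed ont-fast5-api version is to read it from `pip show`. This is a small sketch, assuming the package was installed with pip into the Python environment that `python3` resolves to; the helper name `parse_pip_version` is made up for illustration.

```shell
#!/bin/sh
# Extract the value of the "Version:" field from `pip show` output on stdin.
parse_pip_version() {
  awk '/^Version:/ {print $2}'
}

# Query the active environment; prints a fallback note if the package is absent.
version=$(python3 -m pip show ont-fast5-api 2>/dev/null | parse_pip_version)
echo "ont-fast5-api: ${version:-not found in this environment}"
```

With several conda environments, this needs to be run inside the one that provided `compress_fast5`.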
I have the same issue. When I run nanodisco difference I get no data for any chunk. I followed the procedure above and also tried changing the singularity version, but nothing helped. Could you run a test on my data? Thanks |
Hello @BioRB, Sure, feel free to share a subset of data. I've posted my email above. Alan |
OK, I just sent you the files in a Google Drive folder. |
Dear @touala, do you have any news about the subset I sent you? Did you give it a try? |
Hello,
I've been trying to use Nanodisco and have run into the issue that many others have had, where the calculate differences step gives the "no data in chunk" error. It seems that almost all of the others who ran into this issue resolved it by using the ONT tools to convert the files to gzip compression. I tried that as well but am still running into the same error.
Do you have any alternative solutions to this problem?
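One thing that may be worth ruling out is whether the converted files really are gzip-compressed. The sketch below classifies the dataset properties printed by `h5dump -p -H` (from the HDF5 command-line tools). It assumes that gzip datasets are reported as `COMPRESSION DEFLATE` and that ONT's VBZ compression appears as a user-defined filter with ID 32020; the function name `classify_fast5_compression` and the example path are hypothetical.

```shell
#!/bin/sh
# Classify `h5dump -p -H` output read from stdin.
# Assumptions: gzip shows up as "COMPRESSION DEFLATE" and VBZ as
# "FILTER_ID 32020". VBZ is checked first so that a file with any
# remaining VBZ dataset is reported as vbz.
classify_fast5_compression() {
  props=$(cat)
  if printf '%s\n' "$props" | grep -q 'FILTER_ID 32020'; then
    echo vbz
  elif printf '%s\n' "$props" | grep -q 'COMPRESSION DEFLATE'; then
    echo gzip
  else
    echo unknown
  fi
}

# Usage (placeholder path, requires h5dump to be installed):
# h5dump -p -H Nativefast5_basecalled_gzip/example.fast5 | classify_fast5_compression
```

A result of `vbz` after conversion would suggest the files nanodisco reads are not the converted ones, or that the conversion did not take effect.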