No such file #131

Open
Johnsonzcode opened this issue Dec 14, 2021 · 17 comments

Comments

@Johnsonzcode

(nanome) [poultrylab1@pbsnode01 nanome]$ nextflow run TheJacksonLaboratory/nanome -profile test,docker
N E X T F L O W  ~  version 21.10.0
Launching `TheJacksonLaboratory/nanome` [chaotic_hamilton] - revision: c181f907e9 [master]
NANOME - NF PIPELINE (v1.3.6)
by Li Lab at The Jackson Laboratory
https://github.com/TheJacksonLaboratory/nanome
=================================
dsname              : CIEcoli
input               : https://github.com/TheJacksonLaboratory/nanome/raw/master/test_data/ecoli_ci_test_fast5.tar.gz
genome              : ecoli

Running settings   : --------
processors          : 2
chrSet              : NC_000913.3
dataType            : ecoli
runBasecall         : Yes
runNanopolish       : Yes
runMegalodon        : Yes
runDeepSignal       : Yes
runGuppy            : Yes

Pipeline settings  : --------
Working dir         : /storage-04/chicken/ont_methylation/nanome/work
Output dir          : outputs
Launch dir          : /storage-04/chicken/ont_methylation/nanome
Script dir          : /storage-01/poultrylab1/.nextflow/assets/TheJacksonLaboratory/nanome
User                : poultrylab1
Profile             : test,docker
Config Files        : /storage-01/poultrylab1/.nextflow/assets/TheJacksonLaboratory/nanome/nextflow.config
Pipeline Release    : master
Container           : docker - liuyangzzu/nanome:v1.2
=================================
executor >  local (1)
[6e/af606b] process > EnvCheck (EnvCheck) [100%] 1 of 1 ✔
[-        ] process > Untar               -
[-        ] process > Basecall            -
[-        ] process > QCExport            -
[-        ] process > Resquiggle          -
[-        ] process > Nanopolish          -
[-        ] process > NplshComb           -
[-        ] process > Megalodon           -
[-        ] process > MgldnComb           -
[-        ] process > DeepSignal          -
[-        ] process > DpSigComb           -
[-        ] process > Guppy               -
[-        ] process > GuppyComb           -
[-        ] process > Report              -
No such file: https://github.com/TheJacksonLaboratory/nanome/raw/master/test_data/ecoli_ci_test_fast5.tar.gz


@liuyangzzu
Collaborator

This error is strange, since the input file is hosted on GitHub. Could you please rerun to see whether it is reproducible, and post the Nextflow log file named .nextflow.log? Thank you!

@Johnsonzcode
Author

Yes, I checked the link https://github.com/TheJacksonLaboratory/nanome/raw/master/test_data/ecoli_ci_test_fast5.tar.gz, and ecoli_ci_test_fast5.tar.gz exists.
I realized it may be my internet connection. As you may know, access to github.com is unreliable from here.

@liuyangzzu
Collaborator

In this case, you can download the file first, then pass its location with --input <file-location>.
Example:

nextflow run TheJacksonLaboratory/nanome -profile test,singularity \
      --input <the-input-file-location>
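
A minimal sketch of this workaround, assuming `curl` is installed and the download happens on a machine that can reach GitHub. The URL is the one from the log above; the `INPUT_DIR` and `RUN` variables and the function names are made up for illustration. The `docker` profile matches the original command; use `singularity` instead if that is your setup.

```shell
#!/usr/bin/env bash
# Sketch: fetch the test archive separately, then point the pipeline
# at the local copy instead of the URL.

# URL copied from the pipeline log above.
url="https://github.com/TheJacksonLaboratory/nanome/raw/master/test_data/ecoli_ci_test_fast5.tar.gz"

# Hypothetical download location; any path readable by the pipeline works.
input_file="${INPUT_DIR:-$PWD}/$(basename "$url")"

# Step 1: download the archive.
# -f fails on HTTP errors instead of saving an error page, -L follows redirects.
fetch_input() {
    curl -fL -o "$input_file" "$url"
}

# Step 2: run the test profile against the local file.
run_nanome() {
    nextflow run TheJacksonLaboratory/nanome -profile test,docker \
        --input "$input_file"
}

# Set RUN=1 to actually download and launch; by default only print the plan.
if [ "${RUN:-0}" = "1" ]; then
    fetch_input && run_nanome
else
    echo "would download: $url"
    echo "would run with: --input $input_file"
fi
```

If the compute node itself cannot reach GitHub, the download step can be done elsewhere and the archive copied over; only the second step needs to run where the pipeline runs.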

@Johnsonzcode
Author

OK, I will try to solve it following your instructions. Thank you for your quick reply!

@liuyangzzu
Collaborator

Not at all. Let me know how it goes.

@Johnsonzcode
Author

It works fine, thank you so much!
By the way, can NANOME or METEORE detect 5mC in chicken? I know some of the models were trained on human data.
Also, if it works well in chicken: some of the top four performers rely on a GPU and may otherwise be slow. Would it be enough to run only some of them to detect 5mC?

@liuyangzzu
Collaborator

None of the 5mC methylation-calling tools is restricted to a particular species. I am currently working on a branch that lets the NANOME pipeline accept any reference genome; it will be tested and merged this week. (ref: #130)

I recommend running jobs on a GPU, since basecalling and the top performers (e.g. Megalodon) are deep learning models that run much faster on a GPU and are very slow on a CPU-only machine.

@Johnsonzcode
Author

So for now I can't apply NANOME to my species, right?
Also, we don't have a GPU server. How long does it take on a CPU machine?

Thank you in advance, and thanks for your great work.

@liuyangzzu
Collaborator

You can try your species on the enhance5 branch. Documentation: https://github.com/TheJacksonLaboratory/nanome/blob/enhance5/docs/Usage.md#3-support-for-other-reference-genome

Since the branch is not fully tested, I have not merged it into master yet, so please use the example below for your species and reference genome.
Example:

nextflow pull TheJacksonLaboratory/nanome -r enhance5

nextflow run TheJacksonLaboratory/nanome -r enhance5 \
    -profile singularity \
    --input [input-file] \
    --genome [reference-genome-dir] \
    --chrSet '[chromosomes separated by a space]'
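
For a non-model species, the space-separated chromosome list for `--chrSet` can be tedious to type by hand. The sketch below derives it from the header lines of a reference FASTA. This is only an illustration: the `reference/genome.fa` path, the `FASTA` variable and the `chr_set` helper are hypothetical, and the layout NANOME expects inside `[reference-genome-dir]` is described in the linked documentation, not here.

```shell
#!/usr/bin/env bash
# Hypothetical path to the reference FASTA; adjust to your assembly.
fasta="${FASTA:-reference/genome.fa}"

# Print the sequence names of a FASTA file on one line, separated by spaces.
# A name is the text between '>' and the first whitespace of a header line.
chr_set() {
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$1" | paste -sd ' ' -
}

# Show the resulting option only if the FASTA actually exists.
if [ -f "$fasta" ]; then
    printf "%s\n" "--chrSet '$(chr_set "$fasta")'"
fi
```

An assembly with many unplaced scaffolds will produce a very long list, so it may be worth filtering the names down to the chromosomes of interest before passing them to the pipeline.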

@Johnsonzcode
Author

Thank you. After reading the paper "DNA methylation-calling tools for Oxford Nanopore sequencing: a survey and human epigenome-wide evaluation",
if I have no access to a GPU server, maybe I can use the tools that run fast on a CPU, right?

@Johnsonzcode
Author

If my basecalling is already done, how can I skip this step and provide FASTQ files to the enhance5 pipeline?

@liuyangzzu
Collaborator

If there are no GPU resources, the GPU-supported tools (Megalodon, etc.) may be slow, but you can still run any of them without problems. Nanopolish does not need a GPU and runs fast.

Currently, the NANOME pipeline does not support skipping the basecalling step. I will add this feature to the branch soon and let you know.

@Johnsonzcode
Author

Johnsonzcode commented Dec 14, 2021

Thank you so much. I set runBasecall to false, and indeed it is not supported.

liuyangzzu added a commit that referenced this issue Dec 19, 2021
* support basecalled input by `--skipBasecall` (for issue: #131 )
* support any reference genome (for issue: #130 )
* update XGBoost model
* add Lifebit CloudOS report
* update docker to v1.3
* update megalodon to 2.4.1
@liuyangzzu
Collaborator

Hi, I added support for basecalled input to NANOME; please check https://github.com/TheJacksonLaboratory/nanome/blob/master/docs/SpecificUsage.md. Let me know if you have any issues.

Best

@Johnsonzcode
Author

Thanks a lot. I will try it!

@Johnsonzcode
Author

> Hi, I added support for basecalled input to NANOME; please check https://github.com/TheJacksonLaboratory/nanome/blob/master/docs/SpecificUsage.md. Let me know if you have any issues.
>
> Best

What if my basecalled input comes from several runs, i.e. I have several input folders? How should I handle them: merge them into one folder, or something else?

@liuyangzzu
Collaborator

NANOME supports running a separate job on each folder. There are two options:

(1) Put all folder paths in a file whose name ends with .filelist.txt, like: https://github.com/TheJacksonLaboratory/nanome/blob/master/inputs/na12878_chr22.filelist.txt

(2) Use wildcard path matching, such as --input 'all_data_folders/*'. Note that the single quotes around the wildcard string are required.

The methylation calling will be automatically invoked on each folder in parallel.
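
As a rough sketch of option (1), the file list can be generated rather than written by hand. The snippet assumes one folder path per line, which should be checked against the linked example file; the `all_data_folders` layout, the `DATA_ROOT` and `FILELIST` variables, the `my_runs.filelist.txt` name and the `make_filelist` helper are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical parent directory holding one sub-folder per sequencing run.
data_root="${DATA_ROOT:-all_data_folders}"
# The file name must end with .filelist.txt (option 1 above).
filelist="${FILELIST:-my_runs.filelist.txt}"

# Write the path of every sub-folder of $1 to the file $2, one per line.
make_filelist() {
    for dir in "$1"/*/; do
        [ -d "$dir" ] || continue     # skip the unexpanded pattern if nothing matches
        printf '%s\n' "${dir%/}"      # drop the trailing slash
    done > "$2"
}

# Only build the list if the data directory is present.
if [ -d "$data_root" ]; then
    make_filelist "$data_root" "$filelist"
fi

# Either form is then passed to the pipeline along with your other options:
#   option 1:  --input my_runs.filelist.txt
#   option 2:  --input 'all_data_folders/*'
```

The single quotes in option (2) keep the shell from expanding the `*` itself, so that the pattern reaches Nextflow unchanged.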
