Taxonomic assignment with GTDBtk

Francisco Zorrilla edited this page Mar 22, 2021 · 3 revisions

GTDB-Tk is implemented in the Snakefile as follows:

rule GTDBtk:
    input:
        f'{config["path"]["root"]}/dna_bins/{{IDs}}'
    output:
        directory(f'{config["path"]["root"]}/GTDBtk/{{IDs}}')
    benchmark:
        f'{config["path"]["root"]}/benchmarks/{{IDs}}.GTDBtk.benchmark.txt'
    message:
        """
        The dna_bins folder is expected to contain one subfolder per sample ID, each holding that sample's refined and reassembled DNA bins.
        """
    shell:
        """
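        # Activate the GTDB-Tk conda environment (set +u avoids unbound-variable errors during activation)
        set +u;source activate gtdbtk-tmp;set -u;
        # Point GTDB-Tk at its reference database; adjust this path for your own installation
        export GTDBTK_DATA_PATH=/g/scb2/patil/zorrilla/conda/envs/gtdbtk/share/gtdbtk-1.1.0/db/
        # Work on a local copy of the bins in the scratch directory
        cd "$SCRATCHDIR"
        cp -r {input} .
        # Classify every bin with the .fa extension
        gtdbtk classify_wf --genome_dir $(basename {input}) --out_dir GTDBtk -x fa --cpus {config[cores][gtdbtk]}
        # Move the results into the final output folder
        mkdir -p {output}
        mv GTDBtk/* {output}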
        """