
Any preconfigured databases? #88

Open
jolespin opened this issue Oct 6, 2021 · 1 comment


jolespin commented Oct 6, 2021

I'm looking for a method that can say whether an assembly is prokaryotic or eukaryotic. Was thinking this software might be helpful. Do you have any preconfigured databases that have both prokaryotes and eukaryotes?

Owner

dnbaker commented Oct 6, 2021

Hi there!

We don't have any preconfigured databases currently, but that is something we plan to put together for Dashing2 in the near future.

You'd have to download the set of genomes from RefSeq. I have a script which you could use to download them, at which point you could compare your assembly against them.

You could do something like the following:

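# Download the RefSeq genomes (assumed to land under ./ref, which the find below searches)
python3 download_genomes.py all
# List the reference genomes, one path per line
find ref -name '*fna.gz' > refs.txt
# One-line query list; quoted so a path containing spaces stays intact
echo "$PATH_TO_ASSEMBLY" > query.txt
# Compare the query against every reference with k=11 on 24 threads
dashing dist -Q query.txt -F refs.txt -k11 -Orefseq.matches -o refseq.sizes -p24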

But a pre-built database would be much easier to work with, and you wouldn't need the disk space. I'll let you know when that changes.
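
Once the comparison has run, the prokaryote-or-eukaryote call could be made from the closest reference. The following is only a rough sketch of that idea, not part of Dashing: it assumes you have already reduced the results to a mapping from reference path to a similarity score (higher meaning closer), and that each reference path contains its RefSeq group name (for example `bacteria` or `fungi`) as a directory. The names `GROUP_TO_DOMAIN`, `domain_of`, and `classify` are hypothetical, and parsing of the actual output file is left out because its layout is not described here.

```python
from pathlib import PurePosixPath

# RefSeq genome groups mapped to a coarse domain label.
# Assumption: each reference path has its RefSeq group as one directory component.
GROUP_TO_DOMAIN = {
    "archaea": "prokaryote",
    "bacteria": "prokaryote",
    "fungi": "eukaryote",
    "invertebrate": "eukaryote",
    "plant": "eukaryote",
    "protozoa": "eukaryote",
    "vertebrate_mammalian": "eukaryote",
    "vertebrate_other": "eukaryote",
}


def domain_of(path):
    """Return the domain implied by a reference path, or None if no group matches."""
    for part in PurePosixPath(path).parts:
        if part in GROUP_TO_DOMAIN:
            return GROUP_TO_DOMAIN[part]
    return None


def classify(similarities):
    """Label a query by the domain of its most similar labelled reference.

    similarities: dict mapping reference path -> similarity (higher = closer).
    Returns "prokaryote", "eukaryote", or None if no reference could be labelled.
    """
    labelled = []
    for path, similarity in similarities.items():
        domain = domain_of(path)
        if domain is not None:
            labelled.append((similarity, domain))
    if not labelled:
        return None
    # Pick the entry with the highest similarity and report its domain.
    _, best_domain = max(labelled, key=lambda pair: pair[0])
    return best_domain


# Illustrative scores only, not real results.
example = {
    "ref/bacteria/genome_a.fna.gz": 0.02,
    "ref/fungi/genome_b.fna.gz": 0.40,
}
print(classify(example))
```

A single best hit is a crude heuristic; if the top similarity is very low, or the best hits are split between domains, the call should be treated as unresolved rather than trusted.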

Thanks!

Daniel
