taxonomy improvement #67

Open
mihinduk opened this issue Mar 2, 2022 · 4 comments

Comments

@mihinduk
Contributor

mihinduk commented Mar 2, 2022

Hi Mike,

In working with the SIV data, I realized that there are spaces in taxonomic fields. For example, for families:
Verrucomicrobia subdivision 3
Verrucomicrobia subdivision 6

This will make pulling reads by family more difficult. Could all spaces in taxonomy fields be replaced with underscores in the next update?

Thank you,
Kathie

@beardymcjohnface
Collaborator

It's a tab-separated file so spaces shouldn't be a problem. How were you planning on parsing the files?

@mihinduk
Contributor Author

mihinduk commented Mar 4, 2022

I was trying to write a helper script to pull reads from a family of interest. I had a shell script in mind, but I could do it differently.

@beardymcjohnface
Collaborator

If you're using awk, you'll just need to pass -F '\t' to change the field separator to tabs instead of whitespace.
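
For illustration, here is a minimal sketch of that approach. The three-column layout (read ID, family, count) and the sample rows are made up for the example and are not the real table layout, so the field number will need adjusting to wherever the family column actually sits.

```shell
# Hypothetical table: read ID <TAB> family <TAB> count (NOT the real column layout).
tsv=$(mktemp)
printf 'read1\tVerrucomicrobia subdivision 3\t12\n' > "$tsv"
printf 'read2\tVerrucomicrobia subdivision 6\t7\n' >> "$tsv"
printf 'read3\tSiphoviridae\t4\n' >> "$tsv"

family='Verrucomicrobia subdivision 3'

# -F '\t' makes awk split on tabs only, so the spaces stay inside field 2.
# -v passes the family name in as an awk variable for an exact match.
result=$(awk -F '\t' -v fam="$family" '$2 == fam { print $1 }' "$tsv")
echo "$result"

rm -f "$tsv"
```

Quoting the family name in the shell variable keeps it as one string, and the exact comparison avoids also matching "subdivision 6".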

@mihinduk
Contributor Author

Hi Mike,
Uploading 2021_05_18_Viral_Baltimore_full_classification_table_ICTV2020.txt…

I just created an updated taxonomy database with Baltimore classification, which I am having trouble uploading here. This is the latest ICTV release. I will email it to you and Rob.

2 participants