Phage Fasta? #3

Open
leannmlindsey opened this issue Sep 5, 2022 · 2 comments

Comments

@leannmlindsey

Hello! Thank you for your previous help. I now have a set of prophages identified by Prophage_tracer that I want to investigate further, and I would like to better understand the output files. I was hoping to find the FASTA sequences of the candidate phages somewhere in the output. The only FASTA file I see is *.SR.reads.fasta, which, as far as I can tell, simply contains the sequence of each split read. Is that correct? If so, should I use the coordinates in the candidate phage output file to cut the corresponding sequences out of the contigs? Would that be the correct way to get the phage sequences?

@WangLab-SCSIO
Owner

WangLab-SCSIO commented Sep 6, 2022

Hi, you can install the tool seqkit and use the command

    seqkit subseq -r attL_start:attR_end reference.fasta > prophage.fsa

to extract the candidate prophage sequence. attL_start and attR_end are the predicted prophage ends given in the output file. If you are going to deal with many prophages, you can print all the commands to a file and run that file with bash:

    awk 'BEGIN{FS="\t";OFS="\t"} NR>1{print "seqkit subseq -r "$3":"$6" reference.fasta >"$1".fasta"}' strain1.prophage.out > jobfile
    bash jobfile
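
Before running the job file, it can help to check what the awk step actually generates. The sketch below is a dry run on made-up data: the file contents, candidate names, and coordinates are hypothetical, and the only thing assumed about the column layout is what the awk command above already implies (column 1 is the candidate name, column 3 is attL_start, column 6 is attR_end). It does not call seqkit; it only prints the commands that would be run.

```shell
# Dry run: build a small, hypothetical prophage output table and inspect
# the seqkit commands that the awk one-liner would generate from it.
# Nothing here runs seqkit; it only writes and prints the job file.
workdir=$(mktemp -d)
table="$workdir/strain1.prophage.out"
jobfile="$workdir/jobfile"

# Header line plus two tab-separated records. Only columns 1, 3 and 6 are
# used by awk; the other column names and values are placeholders.
printf 'name\tcol2\tattL_start\tcol4\tcol5\tattR_end\n' > "$table"
printf 'prophage1\tx\t1000\tx\tx\t42015\n' >> "$table"
printf 'prophage2\tx\t90000\tx\tx\t131012\n' >> "$table"

# Skip the header (NR>1) and emit one seqkit command per candidate.
awk 'BEGIN{FS="\t"} NR>1{print "seqkit subseq -r "$3":"$6" reference.fasta > "$1".fasta"}' \
    "$table" > "$jobfile"

cat "$jobfile"
# Prints:
# seqkit subseq -r 1000:42015 reference.fasta > prophage1.fasta
# seqkit subseq -r 90000:131012 reference.fasta > prophage2.fasta
```

If the generated lines look right for the real output file, the job file can then be run with bash as described above. One thing to keep in mind: `seqkit subseq -r` applies the region to every record in the input FASTA, so if reference.fasta contains more than one contig it may be worth selecting the relevant contig first (for example with `seqkit grep -p`).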

@leannmlindsey
Author

Thank you so much for this quick reply. I appreciate it.
