Hi,
Thank you for developing TEQUILA-seq!
I ran into a problem when using Find_AS_Events.py to define alternative splicing events.
Here is my command:
python ${TEQUILA_Dir}/Find_AS_Events.py -i ${Output_Dir}/samples_N2_R0_updated.gtf -r ${IndexDir}/gencode.vM33.basic.annotation.gtf -o ${Output_Dir}/DirectRNA_AS_envents.tsv
Here is the error:
'ENSMUSG00000002059.19' exists in both input transcripts.gtf and GENCODE.gtf.
Could you give me some advice?
Best wishes,
Kiritio
I think I've found the problem: line 284 of Find_AS_Events.py does not handle the case where a gene has no canonical transcript.
I added a line above it:
if gene_id in canonDict:
And added an else branch below it:
else:
    output.append([row['chr'], row['strand'], gene_id, row['transcript_id'], 'NA', 'NA', 'NA', 'intergenic_transcript'])
It's working properly now.
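For anyone who wants to see the shape of that guard in isolation, here is a minimal sketch. It is not the actual code from Find_AS_Events.py: the function name `classify_transcripts`, the `rows` list of dicts, and the placeholder values on the normal path are all assumptions made for illustration. Only the `else` branch mirrors the line quoted above.

```python
def classify_transcripts(rows, canon_dict):
    """Build output rows, falling back when a gene has no canonical transcript.

    rows: list of dicts with 'chr', 'strand', 'gene_id', 'transcript_id' (hypothetical layout).
    canon_dict: maps gene ID -> canonical transcript ID (stand-in for canonDict).
    """
    output = []
    for row in rows:
        gene_id = row['gene_id']
        if gene_id in canon_dict:
            # Normal path: a canonical transcript exists for this gene.
            # The real script compares splicing here; this sketch only records
            # the canonical ID with placeholder values in the other columns.
            output.append([row['chr'], row['strand'], gene_id, row['transcript_id'],
                           canon_dict[gene_id], 'NA', 'NA', 'placeholder_event'])
        else:
            # Fallback from the patch: no canonical transcript in the reference,
            # so emit an 'NA' row instead of raising a KeyError on the lookup.
            output.append([row['chr'], row['strand'], gene_id, row['transcript_id'],
                           'NA', 'NA', 'NA', 'intergenic_transcript'])
    return output
```

Checking membership before the lookup is what avoids the failure: a gene missing from the dictionary is reported as a row rather than stopping the run.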
It looks like the Ensembl canonical transcript for ENSMUSG00000002059.19 is not a "basic" transcript. The canonical transcript is in gencode.vM33.annotation.gtf, but not in gencode.vM33.basic.annotation.gtf. I was able to reproduce the error on another dataset by using the GENCODE basic GTF.
Here's a PR based on the code change you suggested. With that change there is no error even with the gencode basic annotation: #2
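To check ahead of time which genes in a reference GTF would hit this case, something like the sketch below may help. It assumes the reference marks canonical transcripts with `tag "Ensembl_canonical";` in the attribute column, as GENCODE GTFs do; the function name `genes_without_canonical` is made up for this example and is not part of TEQUILA-seq.

```python
import re
from collections import defaultdict

GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')

def genes_without_canonical(gtf_lines):
    """Return sorted gene IDs that have transcript records but none tagged Ensembl_canonical."""
    has_canonical = defaultdict(bool)
    for line in gtf_lines:
        if line.startswith('#'):
            continue  # skip GTF header/comment lines
        fields = line.rstrip('\n').split('\t')
        if len(fields) < 9 or fields[2] != 'transcript':
            continue  # only transcript records carry the tag we care about
        match = GENE_ID_RE.search(fields[8])
        if match is None:
            continue
        gene_id = match.group(1)
        # Assumption: canonical transcripts carry this exact tag in column 9.
        has_canonical[gene_id] |= 'tag "Ensembl_canonical"' in fields[8]
    return sorted(g for g, ok in has_canonical.items() if not ok)

# Usage (path is whatever reference GTF you pass with -r):
# with open('gencode.vM33.basic.annotation.gtf') as fh:
#     print(genes_without_canonical(fh))
```

Running it on both the basic and the comprehensive annotation should show whether a gene is only missing its canonical transcript in the basic file.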