You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi,
I tried to quantify the intron rentention level for a transposon gene and manually made a genome file and a gtf file.
IGV visualization of the bam files in the tmp folder shows clear difference of intron rentetion, but the output file from rmats shows 0. The intron there is 41nt.
rMATS usually needs the gtf to have a transcript with the intron included and another transcript with the intron spliced out. This post has some details: #17
From the IGV screenshot it looks like the gtf might only have a transcript with the intron spliced out (if the top blue line is the entire gene and not a transcript)
Hi,
Thanks for the answer.
There was a transcript line for the full length gene in the gtf file.
After I checked the issue #17 , I changed the "transcript" to "exon" and now rmats detects that intron retention.
Does that mean I need to create an exon annotation for each possible intron rentention event?
For example:
If multiple exons in a gene are retained, how could I let rmats to detect all these events? Will a full-length transcript annotation plus every exon annotation be enough or I have to annotate every possible new exon produced by intron retention?
Thank you for your help.
I think you'll need to have an exon annotation for each retained intron event that you want rmats to report. The code needs the "exon" that is actually two exons with the intron retained. In that example it looks like there are 4 exons. You could use a gtf that has each of those 4 exons defined and also 3 additional exons which go from the start of one exon to the end of the next: exon1-exon2, exon2-exon3, exon3-exon4
Hi,
I tried to quantify the intron rentention level for a transposon gene and manually made a genome file and a gtf file.
IGV visualization of the bam files in the tmp folder shows clear difference of intron rentetion, but the output file from rmats shows 0. The intron there is 41nt.
Here's the script I used:
rmats.py --s1 S1.txt --s2 S2.txt
--gtf genome/Tc1/Tc1.test.gtf
--readLength 150
--bi genome/Tc1/STAR_Index
-t paired
--od test/
--tmp tmp/
--variable-read-length
--allow-clipping
--novelSS
--mil 30
The summary.txt shows this:
EventType TotalEventsJC TotalEventsJCEC SignificantEventsJC SigEventsJCSample1HigherInclusion SigEventsJCSample2HigherInclusion SignificantEventsJCEC SigEventsJCECSample1HigherInclusion SigEventsJCECSample2HigherInclusion
SE 0 0 0 0 0 0 0 0
A5SS 0 0 0 0 0 0 0 0
A3SS 0 0 0 0 0 0 0 0
MXE 0 0 0 0 0 0 0 0
RI 0 0 0 0 0 0 0 0
Here's the "read_outcomes_by_bam.txt"
tmp/2024-06-06-19_16_17_795004_bam1_1/Aligned.sortedByCoord.out.bam
USED: 965
NOT_PAIRED: 0
NOT_NH_1: 20
NOT_EXPECTED_CIGAR: 49
NOT_EXPECTED_READ_LENGTH: 0
NOT_EXPECTED_STRAND: 0
EXON_NOT_MATCHED_TO_ANNOTATION: 0
JUNCTION_NOT_MATCHED_TO_ANNOTATION: 0
CLIPPED: 0
TOTAL_FOR_BAM: 1034
tmp/2024-06-06-19_16_17_795004_bam2_1/Aligned.sortedByCoord.out.bam
USED: 1893
NOT_PAIRED: 0
NOT_NH_1: 16
NOT_EXPECTED_CIGAR: 110
NOT_EXPECTED_READ_LENGTH: 0
NOT_EXPECTED_STRAND: 0
EXON_NOT_MATCHED_TO_ANNOTATION: 0
JUNCTION_NOT_MATCHED_TO_ANNOTATION: 3
CLIPPED: 0
TOTAL_FOR_BAM: 2022
Could you help with this?
Thanks a lot.
The text was updated successfully, but these errors were encountered: