MaxEntScan SWA module interpretation #492

Open
zyh4482 opened this issue Apr 14, 2022 · 3 comments

zyh4482 commented Apr 14, 2022

In MaxEntScan header:

 If 'SWA' is specified as a command-line argument, a sliding window algorithm
 is applied to subsequences containing the reference and alternate alleles to
 identify k-mers with the highest donor and acceptor splice site scores. To assess
 the impact of variants, reference comparison scores are also provided. For SNVs,
 the comparison scores are derived from sequence in the same frame as the highest
 scoring k-mers containing the alternate allele. For all other variants, the
 comparison scores are derived from the highest scoring k-mers containing the
 reference allele. The difference between the reference comparison and alternate
 scores (SWA_REF_COMP - SWA_ALT) are also provided.

It will create 8 scores: alt, ref, diff, ref_comp for acceptor and donor sites.

I'm confused about the interpretation of these scores.

  1. What is the difference between the ref and ref_comp scores? I noticed that some are the same but some are different.

  2. In the header:

The difference between the reference comparison and alternate scores (SWA_REF_COMP - SWA_ALT) are also provided

Does it mean diff is calculated by (ref_comp - alt) ?

In your article, Figure 1b shows how to interpret the impact a variant can have on splicing.

Spliceogenicity was assessed using the reference (ref), alternate (alt) and difference (diff; ref–alt) maximum entropy scores and the ENIGMA score thresholds.

Here the difference seems to be defined differently (ref - alt rather than ref_comp - alt)? This is related to my first question. If I want to calculate the percent change of the scores, I can't tell which score I should use.

  3. The documentation says that SWA scores can be used to assess de novo gain and native loss. How can these be distinguished in the output? Depending on which site is being considered, the diff score could be interpreted in opposite ways.

  4. What do negative alt, ref, diff and ref_comp scores each mean in the SWA output? Lower potential for splicing?

Thank you so much.

@nuno-agostinho nuno-agostinho self-assigned this Apr 14, 2022
@nuno-agostinho
Contributor

Hi there @zyh4482, hope you are having a great day!

I'll need more time to go through your questions, but I'll leave some notes here for now:

What is the difference between ref and ref_comp score? I noticed that some are same but some are different.

  • ref is the highest splice acceptor/donor reference sequence score
  • ref_comp is the acceptor/donor reference comparison sequence score

ref_comp is effectively equal to ref, except for SNVs where the comparison scores are derived from sequence in the same frame as the highest scoring k-mers containing the alternate allele.

Does it mean diff is calculated by (ref_comp - alt) ?

Yes, the plugin calculates the difference based on ref_comp - alt. As per the code:

$results{'MES-SWA_acceptor_alt'} = $score;
# ...
$results{'MES-SWA_acceptor_diff'} = $results{'MES-SWA_acceptor_ref_comp'} - $score;
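
To make the sign convention of that subtraction concrete, here is a minimal Python sketch. The function name and the scores are made up for illustration and are not part of the plugin; the only thing taken from the code above is diff = ref_comp - alt.

```python
def swa_diff(ref_comp: float, alt: float) -> float:
    """Mirror the subtraction quoted above: diff = ref_comp - alt."""
    return ref_comp - alt


# Hypothetical scores, chosen only to illustrate the sign of diff.
higher_alt = swa_diff(ref_comp=-2.0, alt=8.0)  # alt scores higher than ref_comp
lower_alt = swa_diff(ref_comp=8.0, alt=-2.0)   # alt scores lower than ref_comp

print(higher_alt, lower_alt)
```

Under this convention a negative diff means the alternate k-mer scores higher than the reference comparison, and a positive diff means it scores lower.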

I'll try to get back to you soon regarding the remaining questions.

Kind regards,
Nuno

Author

zyh4482 commented Apr 15, 2022

@nuno-agostinho Thank you. Your answer explained a lot.
For question 3, I assume native sites are something like exon-intron junctions, but I'm not sure about that.

SWA is useful for calculating variant impact, but many variants are identified outside the exon-intron junction region. Should variants outside the junction region be treated as "de novo" by this module? Or is there a native site reference dataset that provides chromosome positions and can be used to distinguish de novo variants from native site variants?

(BTW, I think I missed another important module, NCSS, because SWA seems to require NCSS to detect whether a de novo site outcompetes the adjacent native site. I'll update my example data once the calculation is done.)

For question 4, I could give an example for you.
Assume there are several de novo variants.

id         donor_alt   donor_diff   donor_ref
variant1    9.137      -17.915       -8.778
variant2    3.275      -18.375      -15
variant3    7          -16.258       -9.258
variant4   -0.546      -16.164      -16.710

variant1 should have a high impact on splicing, but what kind of impact? Does the negative ref score indicate low splicing potential, and does the site gain splicing potential after the mutation? Do I understand this correctly?
Variants 2 and 3 are similar to variant1 but with a different level of impact on splicing.
For variant4, donor_alt is negative and therefore below 6.2. Should it be considered to have low splicing potential?
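
As a quick consistency check on the table above, here is a small Python sketch that rearranges diff = ref_comp - alt into ref_comp = diff + alt and compares the result with the reported donor_ref. The helper name is hypothetical and the rows are copied from this comment; it says nothing about biological impact, it only redoes the arithmetic.

```python
# (id, donor_alt, donor_diff, donor_ref) copied from the table in this comment.
rows = [
    ("variant1", 9.137, -17.915, -8.778),
    ("variant2", 3.275, -18.375, -15.0),
    ("variant3", 7.0, -16.258, -9.258),
    ("variant4", -0.546, -16.164, -16.710),
]


def implied_ref_comp(alt: float, diff: float) -> float:
    """Rearrange diff = ref_comp - alt to recover ref_comp (rounded to 3 dp)."""
    return round(diff + alt, 3)


for vid, alt, diff, ref in rows:
    rc = implied_ref_comp(alt, diff)
    status = "matches donor_ref" if abs(rc - ref) < 1e-6 else "differs from donor_ref"
    print(vid, rc, status)
```

Where the implied ref_comp differs from donor_ref, either the pasted values were rounded or ref and ref_comp are genuinely different for that variant.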

Thank you.

Author

zyh4482 commented Apr 15, 2022

I've finished the NCSS calculation. For some variants, I get scores such as MES-NCSS_upstream_acceptor and MES-NCSS_upstream_donor.

Question 5: How should NCSS scores be compared with SWA scores, and how should the result be interpreted?

For instance:

variant1  NCSS_downstream_donor=8.481;MES-SWA_donor_alt=-5.130;MES-SWA_donor_diff=6.576;
variant2  MES-NCSS_downstream_donor=9.452;MES-SWA_donor_alt=6.426;MES-SWA_donor_diff=-7.754;

For variant2, MES-SWA_donor_alt < NCSS_downstream_donor. Does this mean the variant has low impact according to Figure 1b?
For variant1, MES-SWA_donor_alt=-5.130 < 0. This relates to my previous question about how to interpret a negative SWA score.

Question 6 (the last one): Some previous studies calculated the percentage change of the MES score (MaxEntScan_diff / MaxEntScan_ref) and set a threshold, e.g. <15% for no impact and >15% for potential impact. Is this approach also suitable for the SWA module?
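
To make the comparison explicit, here is a rough Python sketch that parses the key=value string for variant2 and puts the SWA and NCSS scores side by side. The parse_scores helper is hypothetical (it is not part of VEP or the plugin), and the comparison does not by itself decide whether the variant is impactful.

```python
def parse_scores(field: str) -> dict:
    """Parse a 'KEY=value;KEY=value;' string into a dict of floats."""
    scores = {}
    for item in field.strip().split(";"):
        if not item:
            continue  # skip the empty piece after a trailing ';'
        key, _, value = item.partition("=")
        scores[key.strip()] = float(value)
    return scores


variant2 = parse_scores(
    "MES-NCSS_downstream_donor=9.452;MES-SWA_donor_alt=6.426;MES-SWA_donor_diff=-7.754;"
)

alt = variant2["MES-SWA_donor_alt"]
ncss = variant2["MES-NCSS_downstream_donor"]
# diff = ref_comp - alt, so ref_comp can be recovered as diff + alt.
ref_comp = round(variant2["MES-SWA_donor_diff"] + alt, 3)

print("alt below NCSS donor:", alt < ncss)
print("implied ref_comp:", ref_comp)
```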
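
For reference, the ratio described above can be written as a short Python sketch. The function name and the first example are hypothetical, and whether a 15% threshold carries over to SWA scores is not answered in this thread. The second example reuses variant1 from my earlier table and shows one thing to watch for: with a negative reference score the sign of the ratio flips.

```python
def percent_change(diff: float, ref: float):
    """Return 100 * diff / ref, or None when ref is zero (ratio undefined)."""
    if ref == 0:
        return None
    return 100.0 * diff / ref


# Hypothetical positive reference: diff of 2.0 on a ref of 8.0.
print(percent_change(2.0, 8.0))

# variant1 from the earlier table: both diff and ref are negative,
# so the ratio comes out positive even though alt is higher than ref.
print(percent_change(-17.915, -8.778))
```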

Thank you so much. I've asked so many questions. Sorry for bothering.

2 participants