HyPhy message board
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl?num=1380877327

Message started by Asutu on Oct 4th, 2013 at 2:02am

Title: Test for branch specific selection
Post by Asutu on Oct 4th, 2013 at 2:02am
Hello,

I am very new to HyPhy, but I would like to use it to test some hypotheses on my data, which consist of a set of populations (plus outgroups).

One of these populations appears to have undergone introgression from one of the outgroups; however, there are already several (potentially important) nucleotide differences between the two. This suggests to me that the introgression might have had a negative effect, which could have accelerated the rate of evolution in my population of interest (this population is associated with a specific niche).

The analysis workflow I am thinking of following is:
  • select a nucleotide model (NucModelCompare.bf)
  • detect recombination (GARD.bf); I have done a preliminary run of SingleBreakpointRecomb.bf and no recombination was found
  • fit a global model (AnalyzeCodonData.bf)
  • fit a local model (BranchSiteREL.bf)


    Do you think these analyses are appropriate?
    Can this be done with my populations and with the introgressed region?
    Should I use other outgroups for fitting a model?
    How can I compare the fit of each model (global vs local) on the command line? Is it just an LRT, or is there a batch file to run?
    Finally, do you have any other suggestions for testing my hypothesis in HyPhy?

    Thanks in advance.

    Pedro

    Title: Re: Test for branch specific selection
    Post by Asutu on Oct 8th, 2013 at 5:07pm
    Hello again,

    I have run BranchSiteREL.bf on the data described above (300 codon positions from 29 sequences aligned in protein space, plus a maximum likelihood tree built from this alignment), and there seems to be statistical evidence for episodic positive selection at a proportion of sites on two of the branches (one being an outgroup). However, I have some questions about interpreting the result files and would be very grateful for your help with them. Below are some lines of interest (let me know if you need me to attach the whole file):

    Branch,Mean_dNdS,RateClasses,OmegaOver1,WtOmegaOver1,LRT,p,p_Holm,BranchLength
    Outgroup4,0.6338530211413603,2,39.50934283283482,0.05867113793959489,12.57492751991231,0.0001954782029494062,0.01075130116221734,0.2337171125395624
    8,0.7937461041192106,2,513.7712033581857,0.003730739712586084,10.18154377818246,0.0007092668219026432,0.03830040838274273,0.006621847307589422
    25,10,1,950694.3545098663,1,0.6620847626218165,0.2079124535772852,1,0.001100459286759653


    1) If I understand correctly, this means that, taking branch 8 as an example, the mean dNdS for this branch is around 0.8 but a fraction of about 0.37% of sites has a dNdS of about 513? Why was this branch identified as significant while branch 25 was not? (Is it because the latter was assigned only one omega rate class, despite its much higher omega value?)
    2) In the output PS file, the branch leading to Outgroup4 is drawn as a thick blue line with red at the tip end, but branch 8 is drawn only as a thick black line. Is this OK? Shouldn't this branch be colored? Or does this have to do with the fraction of sites identified as being under diversifying selection?

    Other more general questions (if I may):
    3) I am mainly using intrapopulation data with an introgressed region in one of the populations. Is this ok for this type of analysis?
    4) Does BranchSiteREL also test for a global fit, or does this have to be done separately?

    Sorry if these all sound like naive questions, but I am only now getting into selection analysis.
    Thanks.
    Pedro

    Title: Re: Test for branch specific selection
    Post by Martin on Oct 9th, 2013 at 1:29pm
    Hi Pedro,

    Hopefully I can answer a few of your questions.

    1. Yes, a mean dNdS of ~0.8 was found for branch 8, while about 0.37% of sites
    have a dNdS of ~513. This branch was deemed significant because the null hypothesis
    where no sites are under positive selection could be rejected using the
    likelihood ratio test. Branch 25 was not found to be significant because the
    likelihood ratio test did not support rejection of the null hypothesis,
    likely because the likelihood increase vs a neutral omega was small. It is
    also likely that power is low at branch 25 due to its short length.
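
As a rough illustration of the test described in point 1, the p and p_Holm columns in the table posted above can be reproduced from the LRT column. The sketch below rests on two assumptions that are inferred from the posted numbers and are not confirmed anywhere in this thread: that p is the upper tail of a 50:50 mixture of a point mass at zero and a chi-square distribution with 1 degree of freedom, and that p_Holm is a Holm step-down correction over the 2 * 29 - 3 = 55 branches of an unrooted binary tree with 29 tips. The function names are made up for this example.

```python
import math

def lrt_p_value(lrt):
    """Upper tail of a 50:50 mixture of a point mass at 0 and chi-square(1).

    ASSUMPTION: this mixture is inferred from the posted table, not taken
    from the BranchSiteREL.bf source.
    """
    if lrt <= 0:
        return 1.0
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(lrt / 2.0))

def holm_adjust(p, rank, n_tests):
    """Holm-adjusted p-value for the test at 0-based ascending rank.

    Simplified: the monotonicity step of the full procedure is omitted.
    """
    return min(1.0, (n_tests - rank) * p)

# (branch, LRT, p, p_Holm) exactly as posted in the thread
rows = [
    ("Outgroup4", 12.57492751991231, 0.0001954782029494062, 0.01075130116221734),
    ("8", 10.18154377818246, 0.0007092668219026432, 0.03830040838274273),
    ("25", 0.6620847626218165, 0.2079124535772852, 1.0),
]

n_tips = 29
n_branches = 2 * n_tips - 3  # branches in an unrooted binary tree

for name, lrt, p_posted, p_holm_posted in rows:
    print(name, "recomputed p:", lrt_p_value(lrt), "posted p:", p_posted)

# ASSUMPTION: Outgroup4 and branch 8 have the two smallest p-values overall.
for rank, (name, _, p_posted, p_holm_posted) in enumerate(rows[:2]):
    print(name, "recomputed p_Holm:", holm_adjust(p_posted, rank, n_branches),
          "posted p_Holm:", p_holm_posted)
```

Under these assumptions, branch 25 fails the test simply because its LRT statistic (about 0.66) is small, whatever the size of its estimated omega.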

    2. Each branch is colored by omega value and proportion, with blue
    being dNdS<1, grey being dNdS=1 and red being dNdS>1. Line thickness
    indicates significance of positive selection. If you cannot see the red
    segment, it is likely because the proportion of sites is so small that it
    is hidden or rounded off.

    3. BSREL is potentially applicable. The usual assumptions apply, as do the
    usual concerns about their violation, e.g. are nucleotide proportions
    roughly the same across the topology, should alphas vary across the tree if
    branch length instability occurs as a result of ultra-high omegas, etc.
    However, because you are using within-population data, not all variants have
    been driven to fixation by selection, so the dN/dS results are somewhat more
    difficult to interpret. Proceed with caution.
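
One of the checks mentioned in point 3, whether nucleotide proportions are roughly the same across the tree, can be eyeballed by tabulating base frequencies per sequence. This is only a minimal sketch: the sequences are toy placeholders rather than data from this thread, and the function name is hypothetical.

```python
from collections import Counter

def base_frequencies(seq):
    """Return the A/C/G/T proportions of one aligned sequence, ignoring gaps and ambiguity codes."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    return {b: (counts[b] / total if total else 0.0) for b in "ACGT"}

# Toy placeholder alignment (NOT data from the thread)
alignment = {
    "seqA": "ATGGCCAAATTT",
    "seqB": "ATGGCGAAGTTC",
    "seqC": "ATGGGGCCGCCC",
}

freqs = {name: base_frequencies(seq) for name, seq in alignment.items()}
gc_content = {name: f["G"] + f["C"] for name, f in freqs.items()}

for name in alignment:
    print(name, {b: round(v, 3) for b, v in freqs[name].items()},
          "GC:", round(gc_content[name], 3))
```

Large differences in composition between lineages (as between seqA and seqC in the toy data) would be a hint that a single set of equilibrium frequencies may fit poorly; this is a quick screen, not a formal test.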

    4. A likelihood ratio test between global and local fits can be done
    separately, manually or through modification of BranchSiteREL.bf.
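
A manual likelihood ratio test of the kind mentioned in point 4 could look roughly like the sketch below. It assumes the two models are nested and that the standard chi-square approximation applies, with degrees of freedom equal to the number of extra free parameters in the local model. The log-likelihoods are placeholders, not results from this thread, and the parameter difference of 54 assumes one omega per branch on a 55-branch tree versus a single shared omega. The chi-square tail is computed with the closed-form series for integer degrees of freedom so that only the standard library is needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum_k x^(k-1) / (1*3*...*(2k-1))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    total = 0.0
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * k + 1)
    return math.erfc(math.sqrt(half)) + total

def likelihood_ratio_test(lnl_null, lnl_alt, extra_params):
    """Return (LRT statistic, p-value) for nested models."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, extra_params)

# PLACEHOLDER values for illustration only
lnl_global = -3000.0   # hypothetical log-likelihood of the global (single omega) fit
lnl_local = -2950.0    # hypothetical log-likelihood of the local (per-branch omega) fit
extra_params = 54      # assumed: 55 branch omegas minus 1 shared omega

stat, p = likelihood_ratio_test(lnl_global, lnl_local, extra_params)
print("LRT =", stat, "p =", p)
```

The log-likelihood values themselves would come from the output of the global and local fits.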

    Martin

    Martin Smith
    PhD Candidate
    Viral Evolution Group
    Bioinformatics and Systems Biology
    University of California San Diego

    Title: Re: Test for branch specific selection
    Post by Asutu on Oct 10th, 2013 at 3:28pm
    Hi Martin,

    Thanks a lot for your insightful comments and your help with interpreting these results. I further reduced the complexity of my data by using only 9 sequences (one from each outgroup species, plus one from each of the species of interest and another from my introgressed lineage), and the same result stands, which may indicate that there is sufficient power to detect these signs of episodic diversifying selection. These results are really cool and very promising. I think I will need to run a global fit and compare it to this local one, although I do expect significance. I think this can also be done with AnalyzeCodonData.bf, is that right? Should I also test the branch-site model in PAML? Since only two lineages were identified as being positively selected, should I expect the two models to give the same results?

    Thanks in advance.
    Pedro

    Title: Re: Test for branch specific selection
    Post by Sergei on Oct 31st, 2013 at 3:06pm
    Hi Pedro,

    1). I do not think the test of the global model vs the fully local model is needed. It is not a particularly interesting test (a positive result, i.e. rejection of the global model, is not informative, because you would most likely want to know WHERE selection took place), and it would only make sense if you had NO positively selected branches. In the latter case, you might see some evidence of deviation from the global model overall, but not enough signal on any single branch to reach significance.

    2). I have no idea what PAML's models would tell you. The two approaches differ in so many assumptions that agreement (or lack of agreement) between the outcomes is not informative. What we HAVE shown is that PAML's models (which force a uniform model of evolution on background branches) can have very poor statistical properties when their assumptions are violated. Generally, branch-site REL models can accommodate ALL of the scenarios included in PAML's models, but the reverse is not true.

    Sergei



