Test for branch specific selection (Read 4558 times)
Asutu
Test for branch specific selection
Oct 4th, 2013 at 2:02am
 
Hello,

I am very new to HyPhy, but I would like to use it to test a hypothesis on my data, which consist of a set of populations (plus outgroups).

One of these populations seems to have undergone introgression from one of the outgroups; however, there are already several (potentially) important nucleotide differences between the two lineages involved. This suggests to me that the introgression might have had a negative effect, which could have accelerated the rate of evolution in my population of interest (this population is associated with a specific niche).

The workflow I am thinking of following is:
  • select a nucleotide model (NucModelCompare.bf)
  • detect recombination (GARD.bf): I have already run SingleBreakpointRecomb.bf as a preliminary check and no breakpoint was found
  • fit a global model (AnalyzeCodonData.bf)
  • fit a local model (BranchSiteREL.bf)


Do you think these analyses are a reasonable way to proceed?
Can this be done with my populations and with the introgressed region?
Should I use other outgroups for fitting a model?
How can I compare the fit of each model (global vs local) on the command line? Is it just an LRT, or is there a batch file to run?
Finally, do you have any other suggestions for how I could test my hypothesis in HyPhy?

Thanks in advance.

Pedro
 
Asutu
Re: Test for branch specific selection
Reply #1 - Oct 8th, 2013 at 5:07pm
 
Hello again,

I have run BranchSiteREL.bf on the data described above (300 codon positions from 29 sequences aligned in protein space, plus a maximum likelihood tree built from this alignment), and there seems to be statistical evidence for episodic positive selection at a proportion of sites on two of the branches (one of them an outgroup). However, I have some questions about interpreting the results files and would be very grateful if you could help me with them. Below are some lines of interest (let me know if you need me to attach the whole file):

Branch,Mean_dNdS,RateClasses,OmegaOver1,WtOmegaOver1,LRT,p,p_Holm,BranchLength
Outgroup4,0.6338530211413603,2,39.50934283283482,0.05867113793959489,12.57492751991231,0.0001954782029494062,0.01075130116221734,0.2337171125395624
8,0.7937461041192106,2,513.7712033581857,0.003730739712586084,10.18154377818246,0.0007092668219026432,0.03830040838274273,0.006621847307589422
25,10,1,950694.3545098663,1,0.6620847626218165,0.2079124535772852,1,0.001100459286759653
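
For anyone who wants to read this kind of table programmatically rather than by eye, here is a minimal Python sketch. It assumes the output is plain CSV with exactly the header shown above, and it embeds the three rows quoted in this post with the wrapped lines re-joined. The function name `significant_branches` and the 0.05 cutoff are illustrative choices, not part of HyPhy.

```python
import csv
import io

# The three rows quoted in the post, with wrapped lines re-joined.
BSREL_CSV = """\
Branch,Mean_dNdS,RateClasses,OmegaOver1,WtOmegaOver1,LRT,p,p_Holm,BranchLength
Outgroup4,0.6338530211413603,2,39.50934283283482,0.05867113793959489,12.57492751991231,0.0001954782029494062,0.01075130116221734,0.2337171125395624
8,0.7937461041192106,2,513.7712033581857,0.003730739712586084,10.18154377818246,0.0007092668219026432,0.03830040838274273,0.006621847307589422
25,10,1,950694.3545098663,1,0.6620847626218165,0.2079124535772852,1,0.001100459286759653
"""


def significant_branches(csv_text, alpha=0.05):
    """Return (branch, omega_plus, weight, p_holm) for rows with p_Holm < alpha.

    `alpha` is a hypothetical cutoff chosen for illustration.
    """
    hits = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        p_holm = float(row["p_Holm"])
        if p_holm < alpha:
            hits.append(
                (
                    row["Branch"],
                    float(row["OmegaOver1"]),
                    float(row["WtOmegaOver1"]),
                    p_holm,
                )
            )
    return hits


for branch, omega, weight, p_holm in significant_branches(BSREL_CSV):
    print(
        f"{branch}: omega+ = {omega:.1f} at {100 * weight:.2f}% of sites, "
        f"Holm-corrected p = {p_holm:.4f}"
    )
```

The WtOmegaOver1 column is a proportion, so multiplying by 100 gives the percentage of sites assigned to the omega > 1 class on that branch.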


1) If I understand correctly, this means that, for branch 8 for example, the mean dNdS is around 0.8, but a fraction of about 0.4% of sites has a dNdS of 513? Why was this branch identified as significant while branch 25 was not? (Is it because the latter was assigned only one omega rate class, despite its much higher omega value?)
2) In the output PS file I see the branch leading to Outgroup4 as a thick blue line with red at the tip end, but for branch 8 I only see a thick black line. Is this OK? Shouldn't this branch be colored? Or does this have to do with the fraction of sites identified as being under diversifying selection?

Other more general questions (if I may):
3) I am mainly using intrapopulation data with an introgressed region in one of the populations. Is this OK for this type of analysis?
4) Does BranchSiteREL also test a global fit, or does this have to be done separately?

Sorry if these all sound like naive questions, but I am only just getting into selection analysis.
Thanks.
Pedro
 
Martin
Re: Test for branch specific selection
Reply #2 - Oct 9th, 2013 at 1:29pm
 
Hi Pedro,

Hopefully I can answer a few of your questions.

1. Yes, a mean dNdS of ~0.8 was found for branch 8, while about 0.4% of sites
(WtOmegaOver1 = 0.0037) have a dNdS of 513. This branch was deemed significant
because the null hypothesis that no sites are under positive selection could
be rejected using the likelihood ratio test. Branch 25 was not found to be
significant because the likelihood ratio test did not support rejection of
the null hypothesis, likely because the likelihood increase vs a neutral
omega was small. It is also likely that power is low at branch 25 due to its
short length.

2. Each branch is colored by omega value and proportion of sites, with blue
being dNdS<1, grey being dNdS=1 and red being dNdS>1. Line thickness
indicates significance of positive selection. If you cannot see the red tip,
it is likely because the proportion of sites is so small that it is hidden or
rounded off.

3. BSREL is potentially applicable. The usual assumptions apply, as do the
consequences of violating them: e.g. are nucleotide proportions roughly the
same across the topology, should alphas vary across the tree if branch length
instability occurs as a result of ultra-high omegas, etc. However, because
not all variants in population-level data have been driven to fixation by
selection, the dN/dS results are somewhat more difficult to interpret.
Proceed with caution.

4. A likelihood ratio test between global and local fits can be done
separately, manually or through modification of BranchSiteREL.bf.
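
As a worked check on point 1, the p and p_Holm columns in Pedro's table can be reproduced from the LRT column. The sketch below rests on two assumptions inferred only from the three quoted rows, not from the HyPhy documentation: that each branch p-value is half the upper tail of a chi-square distribution with 1 degree of freedom (a 50:50 mixture with a point mass at zero), and that the Holm correction was applied over 55 tests (2 * 29 - 3 branches for an unrooted tree of 29 sequences). The function names are illustrative.

```python
import math


def bsrel_branch_p(lrt):
    """p-value under an assumed 50:50 mixture of a point mass at 0 and chi-square(1)."""
    if lrt <= 0.0:
        return 1.0
    # P(chi2_1 > x) = erfc(sqrt(x / 2)); the mixture halves it.
    return 0.5 * math.erfc(math.sqrt(lrt / 2.0))


def holm_smallest(p_sorted, n_tests):
    """Holm-adjust the k smallest p-values (given in ascending order) out of n_tests."""
    adjusted = []
    running_max = 0.0
    for rank, p in enumerate(p_sorted):
        # The i-th smallest p-value (0-based) is multiplied by (n_tests - i),
        # and adjusted values are forced to be non-decreasing.
        running_max = max(running_max, min(1.0, (n_tests - rank) * p))
        adjusted.append(running_max)
    return adjusted


# LRT values quoted in the thread.
lrts = {
    "Outgroup4": 12.57492751991231,
    "8": 10.18154377818246,
    "25": 0.6620847626218165,
}
p_vals = {name: bsrel_branch_p(x) for name, x in lrts.items()}

N_TESTS = 55  # assumption: one test per branch of an unrooted 29-taxon tree
p_holm = holm_smallest([p_vals["Outgroup4"], p_vals["8"]], N_TESTS)

for name, p in p_vals.items():
    print(f"branch {name}: LRT = {lrts[name]:.3f}, p = {p:.3g}")
print("Holm-adjusted (two smallest):", [f"{p:.3g}" for p in p_holm])
```

Seen this way, branch 25 is not significant because its LRT statistic (about 0.66) is small, regardless of how large the fitted omega is: an extreme omega on a very short branch barely changes the likelihood.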

Martin

Martin Smith
PhD Candidate
Viral Evolution Group
Bioinformatics and Systems Biology
University of California San Diego
 
Asutu
Re: Test for branch specific selection
Reply #3 - Oct 10th, 2013 at 3:28pm
 
Hi Martin,

Thanks a lot for your insightful comments and help with interpreting these results. I further reduced the complexity of my data by using only 9 sequences (one from each outgroup species, one from each of the species of interest, and another from my introgressed lineage) and the same result stands, which may indicate that there is sufficient power to detect these signs of episodic diversifying selection. These results are really cool and very promising. I think I will need to run a global fit and compare it to this local one, although I do expect significance. I think this can also be done with AnalyzeCodonData.bf, is that right? Should I also test the branch-site model in PAML? Since only two lineages were identified as being positively selected, should I expect the two models to give the same results?

Thanks in advance.
Pedro
 
Sergei
Re: Test for branch specific selection
Reply #4 - Oct 31st, 2013 at 3:06pm
 
Hi Pedro,

1). I do not think a test of the global model vs the fully local model is needed. It is not a particularly interesting test (a positive result, i.e. rejection of the global model, is not informative, because you would most likely want to know WHERE selection took place), and would only make sense if you had NO positively selected branches. In the latter case, you might see some evidence of deviation from the global model overall, but not enough signal on any single branch to reach significance.

2). I have no idea what PAML's models would tell you. There are so many different assumptions between them that agreement (or lack of agreement) between the outcomes is not informative. What we HAVE shown is that PAML's models (which force a uniform model of evolution on background branches) can have very poor statistical properties when their assumptions are violated. Generally, branch-site REL models can accommodate ALL of the scenarios included in PAML's models, but the reverse is not true.
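
For the situation in point 1 where a global-vs-local comparison is still wanted, a manual likelihood ratio test can be sketched as follows. This assumes the two models are nested, that the standard chi-square approximation applies (it may not if parameters sit on a boundary), and that you have read the two log-likelihoods off your own HyPhy output. The log-likelihood values and the parameter difference below are hypothetical placeholders, not results from this thread.

```python
import math


def chi2_sf(x, df):
    """Upper tail P(chi2_df > x) for integer df >= 1, using only the stdlib.

    Uses Q(df + 2) = Q(df) + (x/2)^(df/2) * exp(-x/2) / Gamma(df/2 + 1),
    starting from Q(1) = erfc(sqrt(x/2)) or Q(2) = exp(-x/2).
    """
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1
    else:
        q, k = math.exp(-half), 2
    while k < df:
        q += math.exp((k / 2.0) * math.log(half) - half - math.lgamma(k / 2.0 + 1.0))
        k += 2
    return q


def lrt(lnl_null, lnl_alt, extra_params):
    """Return (statistic, p-value) for nested models differing by extra_params."""
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(max(stat, 0.0), extra_params)


# Hypothetical numbers for illustration only.
LNL_GLOBAL = -2510.4  # one dN/dS shared by all branches
LNL_LOCAL = -2478.9   # one dN/dS per branch
EXTRA_PARAMS = 54     # e.g. 55 branch-specific ratios instead of 1

stat, p = lrt(LNL_GLOBAL, LNL_LOCAL, EXTRA_PARAMS)
print(f"LRT statistic = {stat:.2f}, df = {EXTRA_PARAMS}, p = {p:.3g}")
```

A small p-value here would only say that a single tree-wide dN/dS fits worse than branch-specific ones. It would not say which branches are responsible.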

Sergei

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego