HyPhy message board | |
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl
Methodology Questions >> How to >> use user-defined branch lengths for analyses http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl?num=1240046106 Message started by Seraina on Apr 18th, 2009 at 2:15am |
Title: use user-defined branch lengths for analyses Post by Seraina on Apr 18th, 2009 at 2:15am
Hello,
I'm trying to use the branch lengths specified in the tree file for the estimation of site specific substitution rates while estimating the model parameters (actually, a GTR) from the data. How do I have to proceed to constrain the branch lengths before the likelihood optimization? Thanks for your help! Seraina |
Title: Re: use user-defined branch lengths for analyses Post by wayne on Apr 22nd, 2009 at 10:29am
Hi Seraina,
you can use the SiteRates2.bf batch file in ./TemplateBatchFiles/ which will implement a site by site rate estimation with other parameters shared across the tree and branch lengths constrained to that in your tree file. cheers ./w |
Title: Re: use user-defined branch lengths for analyses Post by Seraina on Apr 22nd, 2009 at 11:56pm
Hi Wayne,
thanks for your reply. Actually, I've already used (and modified for my purpose) the batch file SiteRates2.bf. However, when it comes to the step "Optimize (resNull,lf)", not only the model parameters, but also the branch lengths of my input tree are modified. So what I'm looking for is a way to constrain the optimization procedure to only estimate the parameters of the GTR model from the data, but not modifying the branch lengths. Is this possible? Thanks for the help! Seraina |
Title: Re: use user-defined branch lengths for analyses Post by Seraina on Apr 23rd, 2009 at 12:25am
Okay, maybe I've missed something there - I was actually only using the SiteRates.bf. Do I correctly understand that the line
"ReplicateConstraint ("this1.?.?:=this2.?.?__",givenTree,givenTree);" is constraining all the branch lengths of the tree to the actual values? And "this1.?.?" means all the branches of the tree (with this1.? being all the nodes?)? Is there any reference file where I can find details about the class definitions in HyPhy? Thanks again! Seraina |
Title: Re: use user-defined branch lengths for analyses Post by wayne on Apr 23rd, 2009 at 4:34pm
Hi Seraina, I'll try answer this post and the one to the feedback board here. so yes there is a difference between SiteRates.bf and SiteRates2.bf.
SiteRates.bf will first fit a model of the entire alignment in which it estimates both branch length parameters and model parameters (AC, GC etc depending on your model). In the site wise model, the model parameters are constrained to those estimated across the alignment, and the branch lengths (per site) are set to someFactorX*alignmentWideBranchLength. So the site wise analysis returns a substitution rate at a site relative to the average across all sites. It does not fit independent model parameters for each site (AC, GT, etc are all shared across all sites). SiteRates2.bf will do the same, but the BranchLengths in someFactorX*alignmentWideBranchLength are taken from file. This would be useful if you had some independent estimates of length of branches in units of time. to answer your direct ? ReplicateConstraint ("this1.?.?:=this2.?.?__",givenTree,givenTree) in the SiteRates2.bf file will set the givenTree to exact values from the tree read from file. These will apply to all parameters associated with branches, but not with model parameters that are global across all branches. There was a problem with the previous SiteRates2.bf in that not all the someFactorX's were being applied. I've attached an edit that will also go into the next release. If what I've described above is what you want then use this code. But from your later posts I think you actually want independent model parameters (GTR) for each site. hope this helps somewhat ./w http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl?action=downloadfile;file=SiteRates2.bf (7 KB | )
|
Title: Re: use user-defined branch lengths for analyses Post by Seraina on Apr 24th, 2009 at 5:20am
Hi Wayne,
thank you for your answers! What I actually need is exactly what the program DNARates by Gary Olsen has done, which is to calculate site specific rates with reference to a "known" tree with fixed branch lengths. So the branch lengths should be global (and not be recalculated or scaled for each site). The rate should then be given with reference to the branch length in the original file, and not relative to any average over sites. The rates estimated at each site would accordingly be the same no matter whether I use the whole alignment or only the first 20 sites. Thanks for the help! Seraina |
Title: Re: use user-defined branch lengths for analyses Post by Seraina on Apr 24th, 2009 at 8:44am
Dear Sergei,
thanks a lot for your answer - this previous discussion with Francesc was exactly what I was looking for. I think I now more or less understand your approach, and it should really give similar results to DNARates. What HyPhy actually does better is to estimate a rate of zero for invariant positions - in contrast to DNARates. Indeed, the branch lengths I have are in units of time - so a chronogram as in Francesc's case. I get very similar rate distributions in HyPhy and DNARates when first re-estimating branch lengths under the molecular clock constraint in HyPhy and then estimating the rates (with a modified SiteRates.bf). However, the rate estimates are rather different when using the SiteRates2.bf and the same branch lengths as I used for input in DNARates - I'll try to figure out why this is the case. One last question: does the branch parameter constraint in the SiteRates2.bf really work? : global rateX = 1; ReplicateConstraint ("this1.?.?:=rateX*this2.?.?__",givenTree,givenTree); Does this really freeze the branch length to the values at the moment (or only keeping them proportional with rateX, or even doing actually nothing)? Just wondering because in the example for constraints in the manual (a=3; b=2; c:=a+b; a=5; --> c=7), the constraint is for the parameters and not their current values. Sincerely Seraina |
Title: Re: use user-defined branch lengths for analyses Post by Sergei on Apr 24th, 2009 at 9:12am
Dear Seraina,
[code] x:=x__; [/code] can be read as 'constrain x to its current value'. The '__' postfix is the evaluator operation that replaces the variable (x) with its current value before the formula is evaluated. Cheers, Sergei |
Title: Re: use user-defined branch lengths for analyses Post by Seraina on May 8th, 2009 at 8:17am
Dear Sergei (and others),
Looking at the SiteRates2.bf again, I found out what the problem was. Actually, the rates were output correctly on the screen (i.e. in substitutions per unit of time, with the unit of time as specified by the branch lengths in the input tree), but then there was some code at the bottom of the file which rescaled all the rates according to the average site rate of the alignment. The rates saved to file and output in the diagram were thus wrong (or let's say, not as one would expect with a dated tree). Attached is thus a new batch file which can be used with user-specified branch lengths. Could it be included in the next release of HyPhy? Or do I just post it in the Contributions? One last question still: The implementation is currently working with the 3 trees that you mentioned in your posts to Francesc: - the input tree - the expected tree (which is rateX*inputTree) - the site tree (which is siteRate*expectedTree) But then, you also use a parameter "scalingfactor" which is (total tree length of the expected tree) / (total tree length of input tree). In my understanding, this should be the same as the factor "rateX", but it attains only a quite similar value. Why do you have to make the apparent detour via the estimated branch lengths? Finally, thanks a lot for all the help (also to Wayne and artpoon)! Seraina http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl?action=downloadfile;file=SiteRates_ForDatedTree.bf (7 KB | )
|
HyPhy message board » Powered by YaBB 2.5.2! YaBB Forum Software © 2000-2024. All Rights Reserved. |