HyPhy message board - Print Page

Dear Jennifer,

Your message raises a number of points. To answer your main question: if you try to estimate a parameter such as the transition/transversion ratio from too little data, the ML estimate may be highly biased. As an extreme case, imagine you have so few changes that there are no transversions at all; the ML estimate of the ratio will then be infinity. For small datasets, I'd strongly recommend inspecting the confidence intervals on the parameters. HyPhy provides a simple interface for calculating confidence intervals based on profile likelihood.
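To make the point about small samples concrete, here is a minimal Python sketch. It is a toy illustration only, not what HyPhy does: it assumes each observed change is independently a transition with probability R/(1+R), where R is the ratio, and it ignores the tree, multiple hits and base frequencies. The function names (`log_lik`, `ti_tv_interval`) and the search limits are made up for this example. Because the toy model has a single parameter, the profile likelihood interval reduces to a plain likelihood-ratio interval: all values of R whose log-likelihood lies within 3.84/2 of the maximum.

```python
import math

# 95% quantile of the chi-square distribution with 1 degree of freedom.
CHI2_95_1DF = 3.841458820694124

# Search limits for the ratio (arbitrary choices for this sketch).
LO, HI = 1e-12, 1e12


def log_lik(ratio, n_ts, n_tv):
    """Binomial log-likelihood of the counts, up to a constant.

    Toy assumption: each change is a transition with probability
    ratio / (1 + ratio) and a transversion otherwise.
    """
    ll = 0.0
    if n_ts:
        ll += n_ts * (math.log(ratio) - math.log1p(ratio))
    if n_tv:
        ll -= n_tv * math.log1p(ratio)
    return ll


def ti_tv_interval(n_ts, n_tv):
    """Return (mle, lower, upper) for the ratio under the toy model."""
    if n_ts + n_tv == 0:
        raise ValueError("need at least one observed change")

    if n_tv == 0:
        mle, max_ll = math.inf, 0.0   # likelihood keeps rising as R grows
    elif n_ts == 0:
        mle, max_ll = 0.0, 0.0        # likelihood keeps rising as R shrinks
    else:
        mle = n_ts / n_tv
        max_ll = log_lik(mle, n_ts, n_tv)

    def excess(ratio):
        # Negative inside the 95% interval, positive outside it.
        return 2.0 * (max_ll - log_lik(ratio, n_ts, n_tv)) - CHI2_95_1DF

    def boundary(inside, outside):
        # Bisection on the log scale between a point inside the
        # interval and a point outside it.
        for _ in range(200):
            mid = math.sqrt(inside * outside)
            if excess(mid) < 0.0:
                inside = mid
            else:
                outside = mid
        return inside

    lower = 0.0 if n_ts == 0 else boundary(min(mle, HI), LO)
    upper = math.inf if n_tv == 0 else boundary(max(mle, LO), HI)
    return mle, lower, upper


if __name__ == "__main__":
    for counts in [(10, 0), (8, 2), (80, 20)]:
        print(counts, ti_tv_interval(*counts))
```

With 10 transitions and no transversions the estimate is infinite and the interval has a finite lower bound but no upper bound. With 8 and 2 the estimate is 4 but the interval is wide, and with 80 and 20 the same estimate comes with a much tighter interval. That is the pattern to look for in the profile likelihood intervals on a small alignment.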

As an aside, you should be careful about interpreting the results obtained by fitting an HKY85 model, as it may not be the best model for your data. Spencer Muse has a chapter in Keith Crandall's Evolution of HIV book which shows that HIV genes often don't conform to one of the simple 'named' models such as HKY85, TN93, etc. John Huelsenbeck et al. also demonstrated that the best-fitting models are not always 'named' models. Also have a look at David Posada and Thomas Buckley's model selection paper in Systematic Biology. If you want to get a quick idea of which model is best, upload the data to DataMonkey and use the model selection option, which will fit 203 reversible nucleotide models to your data. There's also an option in HyPhy that allows you to do the same thing, but unless you have an MPI-enabled cluster of computers handy, DataMonkey will be much faster.

In addition, it may be wise to use the tree based on the alignment of interest, rather than the tree estimated from the whole genome, as it is less likely to be affected by recombination. Even though a short alignment contains less information, the tree shouldn't affect the estimates of the substitution parameters much.

Best wishes
Simon

Number of Sites Required for Ti/Tv
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl?num=1108651563
Message started by Jennifer Knies on Feb 17th, 2005 at 6:46am