Dear Jeff,
Quote:I have another question if you don't mind. I'd like to estimate the actual number of specific changes that have occurred along a particular lineage. For example, the number of A->G transitions that have occurred along the human branch subsequent to the split between humans and chimps. I'm using local rate matrices, so that each branch has its own set of rates. And the matrices are nonreversible as I've mentioned, so that the A->G rate is separate from the G->A rate. What would be the best way to go about this?
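This is a fairly tricky question. One easy option is to infer the ancestral sequences and simply count the number of A->G changes (or changes for any other pair) along the branches. This is likely to undercount the number of changes, because multiple substitutions at the same site along one branch are invisible when you compare only the two ends of the branch. The problem with this approach is twofold: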
(a). We are treating ancestral states as known, ignoring reconstruction errors (this can be corrected fairly easily by computing the support for a given branch labeling, much like we did in the WAC method).
(b). Secondly, what you are really after is the expected number of A->G transitions over any branch. Computing this quantity is much more involved, because it requires integrating over all paths of the Markov chain that take it from one state to another (not necessarily from A or to G, but any of the 16 possible pairs of end states) and that include at least one A->G substitution. Dutheil et al address this to an extent (see equations 4-7). Their approach still does not directly yield the expected number of A->G substitutions, but I believe it can be modified to approximate it.
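To make the two options concrete, here is a minimal Python sketch. It is not the method of Dutheil et al and not anything built into a particular package. The function names, the rate matrix Q and the branch length are all made up for illustration. The first function does the naive count between a reconstructed ancestor and its descendant. The second computes, for every pair of end states (a, b), the expected number of A->G changes on a branch, conditioned on starting in a and ending in b. It does this by integrating P_aA(s) * q_AG * P_Gb(t-s) over s from 0 to t, and evaluates that integral using a block-matrix exponential (Van Loan's identity).

```python
# Illustrative sketch only: Q, the branch length and all names are assumptions.
NUC = "ACGT"

def naive_count(ancestor, descendant, src="A", dst="G"):
    """Count sites that are src in the ancestor and dst in the descendant."""
    return sum(1 for a, d in zip(ancestor, descendant) if a == src and d == dst)

def matmul(X, Y):
    """Plain dense matrix product for lists of lists."""
    cols = list(zip(*Y))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in X]

def expm(M, squarings=10, terms=20):
    """Matrix exponential: truncated Taylor series plus scaling and squaring."""
    n = len(M)
    scaled = [[v / 2 ** squarings for v in row] for row in M]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in matmul(term, scaled)]
        result = [[r + s for r, s in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

def expected_counts(Q, t, src="A", dst="G"):
    """Return (P, E): P = exp(Q t), and E[a][b] is the expected number of
    src->dst changes on a branch of length t given start a and end b."""
    n = len(Q)
    i, j = NUC.index(src), NUC.index(dst)
    # Block matrix [[Q, B], [0, Q]] * t, where B holds only the src->dst rate.
    # The upper-right block of its exponential equals
    # integral_0^t exp(Q s) B exp(Q (t - s)) ds.
    big = [[0.0] * (2 * n) for _ in range(2 * n)]
    for r in range(n):
        for c in range(n):
            big[r][c] = Q[r][c] * t
            big[r + n][c + n] = Q[r][c] * t
    big[i][j + n] = Q[i][j] * t
    ex = expm(big)
    P = [row[:n] for row in ex[:n]]
    joint = [row[n:] for row in ex[:n]]
    E = [[joint[a][b] / P[a][b] if P[a][b] > 0 else 0.0 for b in range(n)]
         for a in range(n)]
    return P, E

# Made-up nonreversible rate matrix (rows sum to zero, A->G differs from G->A).
Q = [
    [-0.9, 0.2, 0.6, 0.1],
    [0.3, -1.0, 0.2, 0.5],
    [0.4, 0.1, -0.7, 0.2],
    [0.1, 0.7, 0.2, -1.0],
]

P, E = expected_counts(Q, t=0.05)
print("naive A->G count:", naive_count("AAGTA", "GAGTG"))
for a, row in zip(NUC, E):
    print(a, [round(v, 4) for v in row])
```

To get a per-branch estimate from this, one would presumably weight E[a][b] at each site by the posterior probability that the branch starts in a and ends in b, and sum over sites. With local matrices, Q and t would be the ones fitted for that particular branch.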
HTH,
Sergei