Interpretation of Q matrix (Read 4338 times)
Jeff
Mar 30th, 2007 at 9:41am
 
Hi Sergei,

I am interested in using likelihood to compare substitution rates between specific nucleotides.  The problem I am running into is in interpretation of the various rates that are calculated, because it seems there are at least three different categories of 'rates'.

The first category is simply the parameters used to define a particular model.  For instance, we use the parameters a, b, c, d, e, and f to define the general time-reversible model.  I think these are called rate ratios.

The second category is defined in Q, the instantaneous rate matrix, where each rate ratio is multiplied by the appropriate nt frequency.  So each element, qXY, of the matrix represents the instantaneous rate from nt X to nt Y.  For example,
qAC = a * freqC
qAG = b * freqG

A third category is also commonly calculated, where the instantaneous rate is multiplied by another nt frequency parameter.  So you get a rate defined by qXY * freqX.  For instance, this rate is used when testing for reversibility of a Q matrix (Does qXY * freqX equal qYX * freqY?).
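
A minimal Python sketch of the three quantities described above, using made-up numbers. The frequencies, the rate-ratio values, and the assignment of c through f to nucleotide pairs are assumptions for illustration; only qAC = a * freqC and qAG = b * freqG come from the post.

```python
# Hypothetical GTR-style inputs; the numbers are illustrative only.
NUCS = "ACGT"
freq = {"A": 0.1, "C": 0.2, "G": 0.3, "T": 0.4}

# Category 1: symmetric rate ratios, one value per unordered nucleotide pair.
ratio = {
    frozenset("AC"): 1.0,  # a
    frozenset("AG"): 4.0,  # b
    frozenset("AT"): 0.5,  # c (pair assignment assumed)
    frozenset("CG"): 0.8,  # d (pair assignment assumed)
    frozenset("CT"): 3.0,  # e (pair assignment assumed)
    frozenset("GT"): 1.0,  # f (pair assignment assumed)
}


def q(x, y):
    """Category 2: instantaneous rate from x to y (rate ratio times target frequency)."""
    return ratio[frozenset((x, y))] * freq[y]


def flux(x, y):
    """Category 3: q(x, y) weighted by the frequency of the source state x."""
    return freq[x] * q(x, y)


for x in NUCS:
    for y in NUCS:
        if x < y:
            # q differs by direction, while the flux is the same both ways.
            print(x, y, q(x, y), q(y, x), flux(x, y), flux(y, x))
```

With symmetric rate ratios the third quantity is the same in both directions for every pair, which is exactly the reversibility test mentioned above.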

Here's what I'd like to know: 1) what are the proper names for each of these three different categories of rates, 2) what is the precise interpretation of these different categories (if there is one), and 3) do you know of any references that provide a good explanation of these various rates?

Thanks!
Jeff
 
Sergei
Re: Interpretation of Q matrix
Reply #1 - Apr 2nd, 2007 at 11:20am
 
Dear Jeff,

Here are a couple of facts about the rate matrix Q that you may find useful.

1). For any finite-state, time-homogeneous (i.e. the rate matrix is independent of time) Markov process there exists an equilibrium (stationary) distribution pi (a row vector) such that pi*Q = 0, or equivalently, pi*T(t) = pi for all t >= 0, where T(t) = exp(Qt) is the transition matrix.

2). If, in addition, the process is time-reversible (i.e. pi_i * q_ij = pi_j * q_ji for all i,j), then the off-diagonal entries of Q can be written as q_ij = r_ij * pi_j with r_ji = r_ij (take r_ij = q_ij / pi_j, assuming every pi_j > 0). This is the usual description of the substitution process. For non-reversible models this decomposition does not in general exist.

3). q_ij is the raw substitution rate from state i to state j. Note that even if r_ab and r_ac are equal, the raw substitution rates q_ab = r_ab * pi_b and q_ac = r_ac * pi_c are not in general equal (since pi_b and pi_c are in general not equal). This substitution bias is required to maintain reversible equilibrium (i.e. substitutions into more prevalent states have to be relatively more frequent than those into relatively less prevalent states, otherwise the frequencies will drift).

4). One almost always wants to compare the r_ij components of q_ij because (i) they are directly comparable between datasets with different base compositions (because pi_j take care of the differences); (ii) they directly measure the biases in substitution frequencies.
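
Points 1) to 4) can be checked numerically. Below is a minimal Python sketch; the matrix r, the two frequency vectors, and the helper names `build_q` and `row_times_matrix` are all made up for illustration. It builds Q from a symmetric r and a frequency vector, confirms pi*Q = 0, and shows that equal r_ij give unequal raw rates q_ij unless the target frequencies are equal.

```python
def build_q(r, pi):
    """Rate matrix with q_ij = r_ij * pi_j off the diagonal; each row sums to zero."""
    n = len(pi)
    q = [[r[i][j] * pi[j] if i != j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        q[i][i] = -sum(q[i])
    return q


def row_times_matrix(v, m):
    """Row vector v times matrix m."""
    n = len(v)
    return [sum(v[i] * m[i][j] for i in range(n)) for j in range(n)]


# Hypothetical symmetric r (states ordered A, C, G, T); r_AC == r_AG on purpose.
r = [
    [0.0, 2.0, 2.0, 1.0],
    [2.0, 0.0, 1.0, 3.0],
    [2.0, 1.0, 0.0, 1.0],
    [1.0, 3.0, 1.0, 0.0],
]
pi_one = [0.1, 0.2, 0.3, 0.4]      # skewed base composition
pi_two = [0.25, 0.25, 0.25, 0.25]  # equal base composition

q_one = build_q(r, pi_one)
q_two = build_q(r, pi_two)

print(row_times_matrix(pi_one, q_one))  # pi*Q: every entry is zero up to rounding
print(q_one[0][1], q_one[0][2])         # same r, different raw rates A->C and A->G
print(q_two[0][1], q_two[0][2])         # equal frequencies: the raw rates agree
```

The same r produces different Q matrices under the two base compositions, which is why the r_ij, rather than the q_ij, are the quantities to compare across datasets.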

Hope this helps,
Sergei

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
 
Aidan Budd
Re: Interpretation of Q matrix
Reply #2 - Jul 21st, 2008 at 8:37am
 
Hi Jeff/Sergei,

Was looking through the forum, and found your post on interpretation of Q that is relevant to some problems I have understanding it.

Sergei - you mention that
Quote:
2). If in addition, the process is time-reversible (i.e. pi_i * q_ij = pi_j * q_ji for all i,j), then Q can be written as q_ij = r_ij * pi_j, and r_ji = r_ij. This is the usual description of the substitution process. For non-reversible models this decomposition does not in general exist.

But, when I read something like Huelsenbeck et al. 2002 (Syst Biol 51(1):32-43 "Inferring the Root of a Phylogenetic Tree" PMID:11943091), or Swofford et al. 1996 (in the Molecular Systematics book by Hillis et al.), a reversible matrix is shown of the form q_ij = r_ij * pi_j (for both articles, this is equation 3).

Not sure how to interpret the difference between these representations of these reversible matrices, and your comment...? Is it that, as these general non-reversible models are stationary, there are indeed pi_i for each state, and that there are r_ij for each possible transition, but that these are perhaps non-unique or something like that?

Illumination much appreciated

Thanks

Aidan

Aidan Budd, Computational Biologist, EMBL Heidelberg, Germany
 
Sergei
Re: Interpretation of Q matrix
Reply #3 - Jul 22nd, 2008 at 3:01pm
 
Dear Aidan,

I am a bit confused; the papers you cited state exactly the same decomposition that my comment mentioned.

This q = r * pi decomposition for reversible matrices necessarily implies that pi are the stationary frequencies. Those frequencies are unique for a given Markov process (finite-state ergodic, which all phylogenetic models are). They tell you where the process will be if you run it long enough, i.e. lim (t->infty) T(t) = PI, where T(t) = exp(Qt) is the transition matrix and PI is the matrix whose every row equals pi (so every entry of column i is pi_i).
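
The limit can be seen numerically. Here is a minimal pure-Python sketch with a made-up three-state reversible model; the values of r and pi and the helper names `mat_mul` and `mat_exp` are assumptions for illustration. It approximates exp(Qt) with a truncated Taylor series plus repeated squaring, which is adequate for a small, well-behaved Q but is not a general-purpose matrix exponential.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(q, t, squarings=20, terms=12):
    """Approximate exp(Q t): Taylor series on Q t / 2**squarings, then repeated squaring."""
    n = len(q)
    scale = t / (2 ** squarings)
    small = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term becomes small**k / k!
        term = [[x / k for x in row] for row in mat_mul(term, small)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


# Hypothetical reversible model: symmetric r, stationary frequencies pi.
pi = [0.2, 0.3, 0.5]
r = [[0.0, 1.0, 2.0],
     [1.0, 0.0, 3.0],
     [2.0, 3.0, 0.0]]
q = [[r[i][j] * pi[j] if i != j else 0.0 for j in range(3)] for i in range(3)]
for i in range(3):
    q[i][i] = -sum(q[i])

for t in (0.1, 1.0, 20.0):
    print(t)
    for row in mat_exp(q, t):
        print("  ", [round(x, 6) for x in row])
```

For small t the rows still depend on the starting state; for large t every row approaches pi, regardless of where the process started.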

Cheers,
Sergei
 
Aidan Budd
Re: Interpretation of Q matrix
Reply #4 - Jul 23rd, 2008 at 5:03am
 
Hi Sergei,

Firstly - many many thanks for taking the time to help me with these questions.

One reason my question was confusing is that I mis-typed something - I meant to say that when looking at other descriptions of the 12-model matrix (e.g. in Swofford et al.), the non-reversible matrix has q_ij = r_ij * pi_j   even if r_ji =/= r_ij - and I understood from your message that
Quote:
For non-reversible models this decomposition does not in general exist.

(which I thought meant that the decomposition of q_ij into r_ij * pi_j was not possible for these matrices).

The reason why I'm asking along these lines is that I'm keen to get a reasonable idea of the implications (and interpretation) of non-reversible models. In particular, what I don't understand is how the equilibrium state of these models is maintained. For the reversible models, pi_i * q_ij = pi_j * q_ji for all i,j - so at the steady-state/equilibrium, in a given time interval, the same number of transitions occur from i to j as from j to i, for every pair of states (e.g. with a two-state model, the same number of 1->0 transitions occur as 0->1 transitions).

Do I understand right when I read that the above model is reversible, and that for a non-reversible model then, at equilibrium (and taking the binary state model), the number of 1->0 transitions is DIFFERENT from the number of 0->1 transitions? i.e. if for example in a binary model q_01 is greater than q_10 (i.e. r_10 > r_01), then at equilibrium the number of changes of state from 0->1 is greater than the number of changes from state 1->0.

But if that's the case, I'd expect the frequency of state 1 to be increasing with time.... but we're at equilibrium, so the frequencies of the different states should be constant...? This is the contradiction I can't understand - I think that my problem lies somewhere with my interpretation of what it means to be in equilibrium - perhaps you can see where I'm making a mistake in my understanding?

Thanks

Aidan
 
Sergei
Re: Interpretation of Q matrix
Reply #5 - Jul 25th, 2008 at 3:12pm
 
Dear Aidan,

Aidan Budd wrote on Jul 23rd, 2008 at 5:03am:
One reason my question was confusing is that I mis-typed something - I meant to say that when looking at other descriptions of the 12-model matrix (e.g. in Swofford et al.), the non-reversible matrix has q_ij = r_ij * pi_j   even if r_ji =/= r_ij - and I understood from your message that

(which I thought meant that the decomposition of q_ij into r_ij * pi_j was not possible for these matrices).



I am pretty sure that that decomposition does NOT exist in general for  non-reversible models (because rates and frequencies are now interdependent); I'll check the papers you quote to carefully examine their statements. I'm in an airport now and don't have much time to think about this carefully, but I'll come up with an example and post it here.


Quote:
The reason why I'm asking along these lines is that I'm keen to get a reasonable idea of the implications (and interpretation) of non-reversible models. In particular, what I don't understand is how the equilibrium state of these models is maintained. For the reversible models, pi_i * q_ij = pi_j * q_ji for all i,j - so at the steady-state/equilibrium, in a given time interval, the same number of transitions occur from i to j as from j to i, for every pair of states (e.g. with a two-state model, the same number of 1->0 transitions occur as 0->1 transitions).

Do I understand right when I read that the above model is reversible, and that for a non-reversible model then, at equilibrium (and taking the binary state model), the number of 1->0 transitions is DIFFERENT from the number of 0->1 transitions? i.e. if for example in a binary model q_01 is greater than q_10 (i.e. r_10 > r_01), then at equilibrium the number of changes of state from 0->1 is greater than the number of changes from state 1->0.


Detailed balance is not necessary for stationarity in general (for models with more than two states): in a three-state model, for example, you can have more 1->2 than 2->1 transitions, as long as the difference is compensated for by transitions involving the third state. For a two-state model, the form q_ij = r_ij pi_j together with stationarity of pi implies reversibility. Indeed, for

Q = [[-r01 pi_1, r01 pi_1], [r10 pi_0, -r10 pi_0]]

to have (pi_0, pi_1), with pi_1 = 1 - pi_0, as its stationary distribution, pi*Q = 0 must hold; its first component reads:

-r01 pi_1 pi_0 + r10 pi_0 pi_1 = 0; in other words (for pi_0, pi_1 > 0) r01 = r10 must hold (hence it will be reversible).
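
A quick numerical check of the two-state argument, as a minimal Python sketch. The function names and the numbers are made up for illustration. The first function evaluates the first component of pi*Q for the matrix above; the second returns the distribution that actually solves pi*Q = 0 for that Q.

```python
def first_component_of_pi_q(r01, r10, pi0):
    """First entry of pi*Q for the two-state Q with q_ij = r_ij * pi_j."""
    pi1 = 1.0 - pi0
    q = [[-r01 * pi1, r01 * pi1],
         [r10 * pi0, -r10 * pi0]]
    return pi0 * q[0][0] + pi1 * q[1][0]


def true_stationary(r01, r10, pi0):
    """Stationary distribution of the same Q: proportional to (q_10, q_01)."""
    pi1 = 1.0 - pi0
    q01, q10 = r01 * pi1, r10 * pi0
    total = q01 + q10
    return [q10 / total, q01 / total]


print(first_component_of_pi_q(1.0, 1.0, 0.3))  # symmetric r: zero, pi is stationary
print(first_component_of_pi_q(2.0, 1.0, 0.3))  # asymmetric r: equals pi_0 pi_1 (r10 - r01)
print(true_stationary(2.0, 1.0, 0.3))          # not (0.3, 0.7) when r01 != r10
```

With r01 != r10 the matrix is still a valid rate matrix, but the pi used to build it is no longer its stationary distribution, so the parameters stop meaning what their names suggest.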

For models with more than two states, you can have the q_ij = r_ij pi_j representation, with pi as the stationary distribution, only if r_ij and pi_j satisfy some additional conditions.
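
Here is a minimal Python sketch of a three-state case; the cyclic r, the frequency vectors, and the helper names are made up for illustration. With equal frequencies the asymmetric r below gives a Q for which pi is stationary even though detailed balance fails. With unequal frequencies the same r gives a Q for which pi is not stationary, which shows the kind of constraint that ties r and pi together.

```python
def build_q(r, pi):
    """Rate matrix with q_ij = r_ij * pi_j off the diagonal; each row sums to zero."""
    n = len(pi)
    q = [[r[i][j] * pi[j] if i != j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        q[i][i] = -sum(q[i])
    return q


def row_times_matrix(v, m):
    """Row vector v times matrix m."""
    n = len(v)
    return [sum(v[i] * m[i][j] for i in range(n)) for j in range(n)]


# Hypothetical asymmetric r: substitutions only go 0->1, 1->2, 2->0.
r_cyclic = [[0.0, 3.0, 0.0],
            [0.0, 0.0, 3.0],
            [3.0, 0.0, 0.0]]

pi_equal = [1.0 / 3.0] * 3
q_equal = build_q(r_cyclic, pi_equal)
print(row_times_matrix(pi_equal, q_equal))  # zero up to rounding: pi is stationary
print(pi_equal[0] * q_equal[0][1])          # flow 0->1 at equilibrium
print(pi_equal[1] * q_equal[1][0])          # flow 1->0 is zero: no detailed balance

pi_skewed = [0.5, 0.3, 0.2]
q_skewed = build_q(r_cyclic, pi_skewed)
print(row_times_matrix(pi_skewed, q_skewed))  # not zero: pi is not stationary here
```

In the equal-frequency case the surplus of 0->1 over 1->0 changes is balanced by the 2->0 and 1->2 flows, so the state frequencies stay constant even though the pairwise flows do not match.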

HTH,
Sergei






