Welcome, Guest. Please Login
YaBB - Yet another Bulletin Board
 
  HomeHelpSearchLogin  
 
specie-specific substitutions models (Read 2808 times)
gustavo
YaBB Newbies
*
Offline


I love YaBB 1G - SP1!

Posts: 12
specie-specific substitutions models
Jun 2nd, 2005 at 8:47pm
 
Hello,

We are studying a family of a protein that adopts two different quaternary structures in different species. We can define a substitution model specific for each quaternary structure. Is it possible to perform ML analysis of a given tree (that includes both types of species) combining our quaternary structure-specific substitution models? I know that HYPHY can use different models for different positions in the alignment, but I don't know if it is possible to use specie-specific models.

Many thanks,

gustavo
Back to top
 
 
IP Logged
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: specie-specific substitutions models
Reply #1 - Jun 2nd, 2005 at 9:32pm
 
Dear Gustavo,

Quote:
We are studying a family of a protein that adopts two different quaternary structures in different species. We can define a substitution model specific for each quaternary structure. Is it possible to perform ML analysis of a given tree (that includes both types of species) combining our quaternary structure-specific substitution models? I know that HYPHY can use different models for different positions in the alignment, but I don't know if it is possible to use specie-specific models.


It is indeed possible to do have different models in different parts of the tree. I am attaching a complete example (based on the standard p51.nex file which comes with the HyPhy distribution; the example will actually download the file from our server) to fit a rather meaningless mix of HKY85 and F81 to different parts of the tree; in this case HKY85 to subtype B viruses and F81 to subtype D viruses.

The key part of the example is the use of extended Newick syntax to define the tree and attach different models to different branches.

One caveat though: the models should have the same equilibrium frequencies; this limitation is NOT imposed by HyPhy - it will happily compute the likelihood, but the likelihood may actually be incorrect, because we can't infer the distribution of characters at the root.

Cheers,
Sergei

Code:
/* download the data, make data filter and count character frequencies */

GetURL (p51data, "http://www.hyphy.org/pubs/seqs/p51.nex");

DataSet 		ds 	= ReadFromString (p51data);
DataSetFilter	dsf	= CreateFilter (ds,1);

HarvestFrequencies (freqs, ds, 1, 1, 1);

/* define the F81 and HKY85 rate matrices */

F81Matrix = {{*,t,t,t}
		     {t,*,t,t}
		     {t,t,*,t}
		     {t,t,t,*}};

HKY85Matrix = {{*,tv,ts,tv}
			{tv,*,tv,ts}
			{ts,tv,*,tv}
			{tv,ts,tv,*}};

/* define the F81 and HKY85 models */

Model F81   = (F81Matrix, freqs, 1);
Model HKY85 = (HKY85Matrix, freqs, 1);

/* define the extended tree syntax; model name enclosed in curly braces following the name
of the sequence or a closing ) for internal nodes.

Note that we don't have to specify HKY85 explictly (although it's done below for clarity),
because HyPhy will use the last defined model (or the model set with UseModel), to define
all branches without an explicit model attachment.

*/

Tree p51_tree = ((((D_CD_83_ELI_ACC_K03454{F81},D_CD_83_NDK_ACC_M27323{F81}){F81},
				  D_UG_94_94UG114_ACC_U88824{F81}){F81},D_CD_84_84ZR085_ACC_U88822{F81}){F81},
				   B_US_83_RF_ACC_M17451{HKY85},((B_FR_83_HXB2_ACC_K03455{HKY85},B_US_86_JRFL_ACC_U63632{HKY85}){HKY85},
				   B_US_90_WEAU160_ACC_U21135{HKY85}){HKY85});

LikelihoodFunction lf = (dsf, p51_tree);

Optimize (res, lf);

LIKELIHOOD_FUNCTION_OUTPUT = 1;
fprintf (stdout, lf);

 

Back to top
 

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
WWW WWW  
IP Logged
 
Gustavo Parisi
Guest


Re: specie-specific substitutions models
Reply #2 - Jan 7th, 2006 at 10:10am
 
Dear Sergei,
I'm back with a subjet posted last june refering to lineages-specific models of evolution.
As you mentioned in your email, the models should have the same equilibrium frequencies to avoid problems in the calculation of the distribution of characters in the root. In this sense,  I would like to know if it is possible to assign a given model to the root. So, in this way for a tree with two linages-specific models (M1 and M2),  I can test the hypothesis for M1 or M2 at the root (for example using a LRT) and  then I can choose the best model representing the ancestral character distribution. Is this possible?
Thank you very much.
Regards,

gustavo
Back to top
 
 
IP Logged
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: specie-specific substitutions models
Reply #3 - Jan 10th, 2006 at 8:34am
 
Dear Gustavo,

Quote:
 I would like to know if it is possible to assign a given model to the root.


The root of the tree does not really have a model, only a distribution of character states. In earlier versions of HyPhy one could define this vector explicitly, but a few years back Spencer Muse and I decided that this option was too confusing and potentially dangerous to keep.

It would not be too difficult to restore if you really need it, but one must be very sure that the quantity computed by the pruning algorithm is indeed the likelihood function (of some sort).

Could you give me a bit more detail about your problem (perhaps by e-mail)?

Cheers,
Sergei
Back to top
 

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
WWW WWW  
IP Logged