HyPhy message board
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl
Methodology Questions >> How to >> specie-specific substitutions models
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl?num=1117770442

Message started by gustavo on Jun 2nd, 2005 at 8:47pm

Title: specie-specific substitutions models
Post by gustavo on Jun 2nd, 2005 at 8:47pm
Hello,

We are studying a protein family that adopts two different quaternary structures in different species. We can define a substitution model specific to each quaternary structure. Is it possible to perform ML analysis of a given tree (that includes both types of species) combining our quaternary structure-specific substitution models? I know that HyPhy can use different models for different positions in the alignment, but I don't know if it is possible to use species-specific models.

Many thanks,

gustavo

Title: Re: specie-specific substitutions models
Post by Sergei on Jun 2nd, 2005 at 9:32pm
Dear Gustavo,


gustavo wrote on Jun 2nd, 2005 at 8:47pm:
We are studying a protein family that adopts two different quaternary structures in different species. We can define a substitution model specific to each quaternary structure. Is it possible to perform ML analysis of a given tree (that includes both types of species) combining our quaternary structure-specific substitution models? I know that HyPhy can use different models for different positions in the alignment, but I don't know if it is possible to use species-specific models.


It is indeed possible to have different models in different parts of the tree. I am attaching a complete example (based on the standard p51.nex file which comes with the HyPhy distribution; the example will actually download the file from our server) that fits a rather meaningless mix of HKY85 and F81 to different parts of the tree: HKY85 to subtype B viruses and F81 to subtype D viruses.

The key part of the example is the use of extended Newick syntax to define the tree and attach different models to different branches.

One caveat though: the models should have the same equilibrium frequencies; this limitation is NOT imposed by HyPhy - it will happily compute the likelihood, but the likelihood may actually be incorrect, because we can't infer the distribution of characters at the root.

Cheers,
Sergei

[code]

/* download the data, make data filter and count character frequencies */

GetURL (p51data, "..."); /* the URL of p51.nex was not preserved in this copy of the thread */

DataSet            ds      = ReadFromString (p51data);
DataSetFilter      dsf      = CreateFilter (ds,1);

HarvestFrequencies (freqs, ds, 1, 1, 1);

/* define the F81 and HKY85 rate matrices */

F81Matrix = {{*,t,t,t}
             {t,*,t,t}
             {t,t,*,t}
             {t,t,t,*}};

HKY85Matrix = {{*,tv,ts,tv}
               {tv,*,tv,ts}
               {ts,tv,*,tv}
               {tv,ts,tv,*}};

/* define the F81 and HKY85 models */

Model F81   = (F81Matrix, freqs, 1);
Model HKY85 = (HKY85Matrix, freqs, 1);

/* define the extended tree syntax; model name enclosed in curly braces following the name
of the sequence or a closing ) for internal nodes.

Note that we don't have to specify HKY85 explicitly (although it's done below for clarity),
because HyPhy will use the last defined model (or the model set with UseModel), to define
all branches without an explicit model attachment.

*/

Tree p51_tree = ((((D_CD_83_ELI_ACC_K03454{F81},D_CD_83_NDK_ACC_M27323{F81}){F81},
                         D_UG_94_94UG114_ACC_U88824{F81}){F81},D_CD_84_84ZR085_ACC_U88822{F81}){F81},
                          B_US_83_RF_ACC_M17451{HKY85},((B_FR_83_HXB2_ACC_K03455{HKY85},B_US_86_JRFL_ACC_U63632{HKY85}){HKY85},
                          B_US_90_WEAU160_ACC_U21135{HKY85}){HKY85});
                       
LikelihoodFunction lf = (dsf, p51_tree);
                                         
Optimize (res, lf);

LIKELIHOOD_FUNCTION_OUTPUT = 1;
fprintf (stdout, lf);

[/code]
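
To illustrate the caveat about equilibrium frequencies, here is a minimal Python sketch (it is not HyPhy code and not part of the original example). It runs the pruning algorithm on a hypothetical four-taxon tree with F81 on one clade and HKY85 on the other, both built from the same frequency vector, and evaluates the site likelihood with the root placed at two different nodes. The taxon names, rate values, frequencies, and helper functions are all assumptions made for illustration. Because both models are reversible with respect to the same frequencies, the two rootings should give the same likelihood.

```python
STATES = "ACGT"

def mat_mul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def mat_exp(q, squarings=12, terms=10):
    """exp(q) via a truncated Taylor series with scaling and squaring."""
    scale = 2.0 ** squarings
    small = [[x / scale for x in row] for row in q]
    result = [[float(i == j) for j in range(4)] for i in range(4)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, small)]
        result = [[result[i][j] + term[i][j] for j in range(4)] for i in range(4)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def transition_probs(exch, freqs):
    """P = exp(Q), with q_ij = exch[i][j] * freqs[j] off the diagonal."""
    q = [[exch[i][j] * freqs[j] if i != j else 0.0 for j in range(4)]
         for i in range(4)]
    for i in range(4):
        q[i][i] = -sum(q[i])
    return mat_exp(q)

def f81(t):
    """All substitutions share one branch-local parameter t."""
    return [[t] * 4 for _ in range(4)]

def hky85(ts, tv):
    """Transitions (A<->G, C<->T) get ts, transversions get tv."""
    transitions = {(0, 2), (2, 0), (1, 3), (3, 1)}
    return [[ts if (i, j) in transitions else tv for j in range(4)]
            for i in range(4)]

def partial(node, site):
    """Conditional likelihoods at a node; a leaf is a name, an internal
    node is a list of (child, transition matrix) pairs."""
    if isinstance(node, str):
        return [float(s == site[node]) for s in STATES]
    out = [1.0] * 4
    for child, p in node:
        below = partial(child, site)
        out = [out[i] * sum(p[i][j] * below[j] for j in range(4))
               for i in range(4)]
    return out

def site_likelihood(root, root_freqs, site):
    """Weight the root's conditional likelihoods by the root distribution."""
    return sum(f * l for f, l in zip(root_freqs, partial(root, site)))

freqs = [0.35, 0.15, 0.2, 0.3]                 # shared by both models (assumed values)
p_f81 = transition_probs(f81(0.2), freqs)
p_hky = transition_probs(hky85(0.4, 0.1), freqs)

d_clade = [("D1", p_f81), ("D2", p_f81)]       # F81 branches
b_clade = [("B1", p_hky), ("B2", p_hky)]       # HKY85 branches
site = {"D1": "A", "D2": "G", "B1": "A", "B2": "C"}

# Root between the two clades, then root moved onto the D clade's ancestor.
rooted_between = [(d_clade, p_f81), (b_clade, p_hky)]
rooted_at_d = [("D1", p_f81), ("D2", p_f81), (b_clade, mat_mul(p_f81, p_hky))]

lik_between = site_likelihood(rooted_between, freqs, site)
lik_at_d = site_likelihood(rooted_at_d, freqs, site)
print(lik_between, lik_at_d)
```

The two printed values should agree up to rounding error. If the two models were built from different frequency vectors, this equality would no longer be guaranteed, which is one way to see why the root distribution becomes a problem.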

Title: Re: specie-specific substitutions models
Post by Gustavo Parisi on Jan 7th, 2006 at 10:10am
Dear Sergei,
I'm back with a subject posted last June, referring to lineage-specific models of evolution.
As you mentioned in your email, the models should have the same equilibrium frequencies to avoid problems in calculating the distribution of characters at the root. With this in mind, I would like to know if it is possible to assign a given model to the root. That way, for a tree with two lineage-specific models (M1 and M2), I could test the hypothesis of M1 or M2 at the root (for example, using an LRT) and then choose the model that best represents the ancestral character distribution. Is this possible?
Thank you very much.
Regards,

gustavo

Title: Re: specie-specific substitutions models
Post by Sergei on Jan 10th, 2006 at 8:34am
Dear Gustavo,


Gustavo Parisi wrote on Jan 7th, 2006 at 10:10am:
I would like to know if it is possible to assign a given model to the root.


The root of the tree does not really have a model, only a distribution of character states. In earlier versions of HyPhy one could define this vector explicitly, but a few years back Spencer Muse and I decided that this option was too confusing and potentially dangerous to keep.

It would not be too difficult to restore if you really need it, but one must be very sure that the quantity computed by the pruning algorithm is indeed the likelihood function (of some sort).

Could you give me a bit more detail about your problem (perhaps by e-mail)?

Cheers,
Sergei
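
The point that the root carries only a distribution of character states can be sketched in a few lines of Python (again, not HyPhy code). The example below assumes a hypothetical two-tip tree in which each branch follows an F81 model with its own frequency vector, using the closed form that follows from a rate matrix with off-diagonal entries t * freqs[j]. All numerical values and function names are illustrative assumptions. The root distribution is passed in as an explicit vector, so the effect of choosing M1's or M2's frequencies at the root is directly visible.

```python
import math

def f81_probs(t, freqs):
    """Closed-form F81 transition matrix:
    P_ij = exp(-t) * [i == j] + (1 - exp(-t)) * freqs[j]."""
    decay = math.exp(-t)
    return [[decay * (i == j) + (1.0 - decay) * freqs[j] for j in range(4)]
            for i in range(4)]

def two_tip_likelihood(root_dist, p_left, p_right, left_state, right_state):
    """Sum over root states, weighted by the chosen root distribution."""
    return sum(root_dist[r] * p_left[r][left_state] * p_right[r][right_state]
               for r in range(4))

# Two lineage-specific models with different equilibrium frequencies (assumed values).
freqs_m1 = [0.4, 0.1, 0.1, 0.4]
freqs_m2 = [0.1, 0.4, 0.4, 0.1]
p_m1 = f81_probs(0.3, freqs_m1)
p_m2 = f81_probs(0.5, freqs_m2)

A, C = 0, 1   # observed states at the two tips

# Same data, same branch models, different root distributions.
lik_root_m1 = two_tip_likelihood(freqs_m1, p_m1, p_m2, A, C)
lik_root_m2 = two_tip_likelihood(freqs_m2, p_m1, p_m2, A, C)
print(lik_root_m1, lik_root_m2)
```

In this sketch the two values differ, so the computed quantity depends on a root vector that neither branch model determines on its own. When both models share one frequency vector, there is only one natural choice and the ambiguity disappears.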
