HyPhy message board
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl
Methodology Questions >> How to >> How to measure the adequacy of a model
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl?num=1114509450
Message started by Federico Abascal on Apr 26th, 2005 at 2:57am

Title: How to measure the adequacy of a model
Post by Federico Abascal on Apr 26th, 2005 at 2:57am

Does HyPhy calculate a statistic to measure the adequacy of a model (e.g. the WAG model, the Dayhoff model, etc.)?
For example, by calculating the unconstrained (maximum) likelihood of the data and comparing it with the likelihood obtained under the model.

Or is there any other test to measure adequacy?

Thanks in advance,
Federico

Title: Re: How to measure the adequacy of a model
Post by Sergei on Apr 26th, 2005 at 10:15pm

Dear Federico,

HyPhy can estimate the multinomial upper bound (i.e. the non-parametric likelihood estimate) for a given dataset without ambiguities (it is under Standard Analyses->Miscellaneous). A recent paper by Waddell (MBE 2005) proposes a method for extending the estimate to data with ambiguous characters (e.g. deletions), but I have not implemented it yet.
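
For data without ambiguities, the multinomial upper bound has a simple closed form: if site pattern i occurs n_i times among N sites, the unconstrained log-likelihood is the sum over patterns of n_i * ln(n_i / N). A minimal Python sketch of that calculation follows. It is an illustration only, not HyPhy's own implementation; the function name and the toy alignment are hypothetical. It treats every character literally, so it does not handle ambiguous characters in the way Waddell proposes.

```python
from collections import Counter
from math import log


def multinomial_upper_bound(sequences):
    """Unconstrained (multinomial) log-likelihood of an alignment.

    `sequences` is a list of equal-length aligned strings. Each alignment
    column is one site pattern; the maximum-likelihood estimate of a
    pattern's probability is its observed frequency n_i / N.
    """
    lengths = {len(s) for s in sequences}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same length")

    columns = list(zip(*sequences))   # one tuple of characters per site
    n_sites = len(columns)
    pattern_counts = Counter(columns)

    return sum(n * log(n / n_sites) for n in pattern_counts.values())


if __name__ == "__main__":
    # Hypothetical toy alignment: 3 sequences, 6 sites, 4 distinct patterns.
    toy_alignment = ["ACGTAC", "ACGTAC", "ACGAAC"]
    print(multinomial_upper_bound(toy_alignment))
```

Because every distinct column is its own category, long alignments with many unique patterns give a bound far above the likelihood of any tree-based model.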

Did you have any particular measure in mind?

Cheers,
Sergei

Title: Re: How to measure the adequacy of a model
Post by Federico Abascal on Apr 27th, 2005 at 3:09am

Dear Sergei,

I already know Waddell's paper... I was thinking of exactly that, but I am also open to other ideas. I wrote to Waddell asking for an implementation of his algorithm, but got no response (implementing it myself seems too complicated). I would like to calculate the unconstrained likelihood WITH ambiguities.

What I want is to measure model adequacy... but I'm not convinced that comparing the likelihood under a given model with the unconstrained one gives you much information. The differences are always very large, and I don't believe there is any case in which that test would tell you "this model is adequate" (at least for proteins). Even so, I wanted to play around with it.

I'll keep an eye on HyPhy to see whether you someday include the unconstrained+ambiguities calculation.

Thanks a lot,
Federico

Title: Re: How to measure the adequacy of a model
Post by Sergei on Apr 27th, 2005 at 7:21am

Dear Federico,

I agree that the multinomial approximation is probably not a useful measure of model fit - since there will indeed always be a very large difference. I'll put the Waddell method on my list of things to add to the program - I can't recall how difficult it appeared to implement. The Standard Analyses->Miscellaneous->UpperBound.bf can also be used to get a rough idea of the upper bound on the likelihood even with ambiguities (if one considers a gap to be a proper 'state' for example).

One thing you could try in the meantime is to simply fit the general reversible model for proteins (which HyPhy implements) and compare that to the fit of named models. This will at least tell you whether the rates of substitution are modelled adequately.
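
Once both fits are in hand, the comparison itself is simple arithmetic. The Python sketch below assumes you already have the two maximized log-likelihoods from the same alignment and tree; the numbers and function names are hypothetical. It also assumes the named model has fixed exchangeabilities and that equilibrium frequencies are treated identically in both fits, in which case the general reversible model for 20 amino acids adds 189 free rate parameters (190 exchangeabilities, one fixed to set the scale).

```python
def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio statistic, 2 * (lnL_alt - lnL_null), for nested models."""
    return 2.0 * (lnl_alt - lnl_null)


def aic(lnl, n_params):
    """Akaike information criterion; smaller values indicate a better trade-off."""
    return 2.0 * n_params - 2.0 * lnl


if __name__ == "__main__":
    # Hypothetical log-likelihoods from fitting both models to one dataset.
    lnl_named = -12345.6   # e.g. an empirical model with fixed rates
    lnl_rev = -12200.1     # general reversible protein model

    # Assumption: 189 extra free exchangeabilities in the reversible model.
    extra_params = 189

    print("LRT statistic:", lrt_statistic(lnl_named, lnl_rev))
    # Parameters shared by both fits (branch lengths, etc.) cancel in the
    # AIC difference, so only the extra parameters are counted here.
    print("AIC difference (named - REV):",
          aic(lnl_named, 0) - aic(lnl_rev, extra_params))
```

The LRT statistic would be compared against a chi-square distribution with `extra_params` degrees of freedom. A significant result says the named model's rates are rejected in favour of the free rates; it says nothing about whether the reversible model is itself adequate.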

Cheers,
Sergei

Title: Re: How to measure the adequacy of a model
Post by Simon on May 3rd, 2005 at 6:00pm

Just to add to Sergei's comment: although the overall deviance from the multinomial may not be a good measure of goodness of fit, looking at the contribution of each site to the difference in likelihood between two models can give a reasonable impression of whether some sites are fitting particularly poorly.
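
A rough Python sketch of that per-site comparison follows. It assumes the per-site log-likelihoods under each model have already been exported as two lists in the same site order; the function names and values are hypothetical.

```python
def site_differences(site_lnl_a, site_lnl_b):
    """Per-site log-likelihood differences, model B minus model A."""
    if len(site_lnl_a) != len(site_lnl_b):
        raise ValueError("both models must report the same number of sites")
    return [b - a for a, b in zip(site_lnl_a, site_lnl_b)]


def worst_sites(diffs, k=3):
    """Return the k (site_number, difference) pairs with the largest
    differences, i.e. the sites where model A falls furthest behind model B.
    Site numbers are 1-based."""
    ranked = sorted(enumerate(diffs, start=1), key=lambda p: p[1], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical per-site log-likelihoods for a 4-site alignment.
    lnl_model_a = [-4.1, -3.9, -9.8, -4.0]
    lnl_model_b = [-4.0, -3.8, -5.2, -4.1]

    diffs = site_differences(lnl_model_a, lnl_model_b)
    print("total difference:", sum(diffs))
    print("worst-fitting sites under model A:", worst_sites(diffs, k=2))
```

The differences sum to the overall log-likelihood difference between the two models, so a handful of sites with very large values indicates that the gap is driven by a few positions rather than spread evenly across the alignment.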

Best,
Simon

Title: Re: How to measure the adequacy of a model
Post by Federico Abascal on May 4th, 2005 at 2:12am

Dear Simon and Sergei,

Just a note to thank you for your suggestions. I'll investigate those directions.

best,
Federico