Welcome, Guest. Please Login
YaBB - Yet another Bulletin Board
 
  HomeHelpSearchLogin  
 
beta distribution in codon models (Read 3654 times)
konrad
Junior Member
**
Offline


I love YaBB 1G - SP1!

Posts: 53
beta distribution in codon models
Jan 10th, 2006 at 1:02am
 
Hi Sergei,

On running model 8 (beta and omega) using code adapted from NielsenYang.bf (definition of the w distribution unchanged), with 10 components in the discretised beta, I frequently get values larger than 1 for one or more components, e.g:

Rate[1]=  0.47114976 (weight=0.0789979)
Rate[2]=  0.92748400 (weight=0.0789979)
Rate[3]=  0.99351373 (weight=0.0789979)
Rate[4]=  0.99959140 (weight=0.0789979)
Rate[5]=  0.99998337 (weight=0.0789979)
Rate[6]=  0.99999968 (weight=0.0789979)
Rate[7]=  1.00181077 (weight=0.0789979)
Rate[8]=  1.00000000 (weight=0.0789979)
Rate[9]=  2.54085271 (weight=0.0789979)
Rate[10]=  1.00000000 (weight=0.0789979)
Rate[11]=  1.61228244 (weight=0.2100211)

Any idea what could be going on here?

thanks,
Konrad
Back to top
 
WWW WWW  
IP Logged
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: beta distribution in codon models
Reply #1 - Jan 10th, 2006 at 7:40am
 
Dear Konrad,

The last component is the omega part (>=1, also notice different weight for it). So that is fine. The other values > 1 suggest that there is something funny going on with discretization. Could you post the P and Q values for this beta?

HTH,
Sergei
Back to top
« Last Edit: Jan 10th, 2006 at 8:50am by Sergei »  

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
WWW WWW  
IP Logged
 
konrad
Junior Member
**
Offline


I love YaBB 1G - SP1!

Posts: 53
Re: beta distribution in codon models
Reply #2 - Jan 11th, 2006 at 5:11am
 
Not sure about this - it's not in the res file. The MODEL_7.nex file has the following, which are not the initial values so I assume they are the final values?

global betaP=0.7763054513104841;
betaP:>0.05;
betaP:<85;
global betaQ=0.05027934440289945;
betaQ:>0.05;
betaQ:<85;

This is a bimodal beta with a weak peak at 0 and a strong peak at 1.
Back to top
 
WWW WWW  
IP Logged
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: beta distribution in codon models
Reply #3 - Jan 12th, 2006 at 11:30pm
 
Dear Konrad,

The discretization procedure fails to perform correctly because of the steep density gradient around 1. I'll fix the code (it needs to be in done in HyPhy source), and post a note here when it's done. In the meantime, try setting the minimum betaP and betaQ to 0.1

Cheers,
Sergei
Back to top
 

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
WWW WWW  
IP Logged
 
konrad
Junior Member
**
Offline


I love YaBB 1G - SP1!

Posts: 53
Re: beta distribution in codon models
Reply #4 - Jan 16th, 2006 at 7:32am
 
Thanks - this reduces the anomalous behaviour, but I still get some values > 1.
Back to top
 
WWW WWW  
IP Logged
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: beta distribution in codon models
Reply #5 - Jan 26th, 2006 at 11:05pm
 
Dear Konrad,

I have looked into how to resolve this discretization issue; unfortunately, there is not much that the current implementation lets you do - it discretizes the mixture, and because the gradient of the beta gets very steep around '1', the double precision is insufficient to resolve discretization intervals - one of them (incorrectly) spills over one, and you get silly values for the beta part. Note that if you remove the omega part, the beta distribution with same p and q will be discretized correctly, because of the [0,1] support coded into the definition of the category variable.

I'll actually add a new way to define mixtures (e.g. category mixed = (cat1,p1,cat2,p2,...,catN,(p_n)), which will allow to specify "hard" bounds on the components and avoid round-off error related issues.

In the meantime, I suggest inspecting the output for oddities like the one you found - this is a fairly rare case, and it suggests that beta+w is a poor choice to model the distribution of rates in your case.

Thanks for bringing this to my attention and stay tuned.

Cheers,
Sergei
Back to top
 

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
WWW WWW  
IP Logged