Welcome, Guest. Please Login
YaBB - Yet another Bulletin Board
 
  HomeHelpSearchLogin  
 
REV+I+G batchfile (Read 10688 times)
Jose_Patane
YaBB Newbies
*
Offline



Posts: 26
Brazil
Gender: male
REV+I+G batchfile
Aug 23rd, 2006 at 9:17pm
 
Dear all,

It took me some time figuring out how to build a customized batch file for HyPhy with a REV matrix, plus a proportion of invariable sites and the other ones following gamma... I hope this can be useful to someone else as well (tested with the same tree and same parameters estimated from PAUP*, same results were obtained):

-----------------------------------------------------------------
/* Basefreqs, 5 transition types, proportion of invariable sites
   and rate heterogeneity are global. */

#include "displayfunction.bf";

myTopology = "some_tree"

DataSet myData  = ReadDataFile("some_data");
DataSetFilter myFilter = CreateFilter(myData,1);
HarvestFrequencies(obsFreqs, myFilter, 1, 1, 1);

/* "invariant sites" rate and "gamma shape (alpha) for rate heterogeneity" categories */
global Pinvar;
global shape;

/* Five individual global transition rates of each type (the other one is free to vary on each branch). */
global r1;
global r2;
global r3;
global r4;
global r5;

/* Discrete probability distribution used to estimate # of invariant sites:
   2 classes, 1 with invariant sites (0), the other with variable ones (1).
   The value of "Pinvar" shall be within the interval [0,1]
   Estimate 'Pinvar' */
category invClass = (2,{{Pinvar}{1-Pinvar}},MEAN, ,{{0}{1}}, 0,1);

/* Gamma distributed site rate heterogenity:
   4 classes, gamma prob. distribution of rates.
   Estimate 'shape' (= alpha) */
category rateCat =(4, EQUAL, MEAN, GammaDist(_x_,shape,shape), CGammaDist(_x_,shape,shape),0,1e25);

/* REV+I+G matrix. (I let the "C-T" rate to freely vary, but could be any other as well) */
REV_I_G_rateMatrix =
{{         *          ,r1*m*invClass*rateCat,r2*m*invClass*rateCat,r3*m*invClass*rateCat}
{r1*m*invClass*rateCat,         *           ,r4*m*invClass*rateCat,   m*invClass*rateCat}
{r2*m*invClass*rateCat,r4*m*invClass*rateCat,         *           ,r5*m*invClass*rateCat}
{r3*m*invClass*rateCat,   m*invClass*rateCat,r5*m*invClass*rateCat,         *           }};

Model REV_I_G = (REV_I_G_rateMatrix, obsFreqs);

UseModel(REV_I_G);
Tree myTree = myTopology;
LikelihoodFunction theLikFun = (myFilter, myTree);
Optimize(results,theLikFun);
fprintf (stdout,theLikFun);

/* Log 'Pinvar' to screen. */
GetInformation (catInfo,invClass);
catInfo = echoinvClass(catInfo);

/* Log 'shape' to screen. */
GetInformation (catInfo,rateCat);
catInfo = echoinvClass(catInfo);

---------------------------------------------------------------
Back to top
 
 
IP Logged
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: REV+I+G batchfile
Reply #1 - Aug 24th, 2006 at 7:58am
 
Dear Jose,

Thanks! I haven't thought of this way to parameterize the G+I (using multiplicative category variables), it's actually a neat trick.

For the information of other users: 'displayfunction.bf' is located in Examples/Categories.

Cheers,
Sergei
Back to top
 

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
WWW WWW  
IP Logged
 
ofedrigo
YaBB Newbies
*
Offline


I love YaBB 1G - SP1!

Posts: 14
Re: REV+I+G batchfile
Reply #2 - Jan 31st, 2007 at 12:09pm
 
hi,

I tried to use this trick in my own code and I got the following error message:

ERROR: only one category variable can be handled by this code

did I miss something or did somethind change in the way hyphy handle multiple category variables?

thanks,

Olivier
Back to top
 
 
IP Logged
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: REV+I+G batchfile
Reply #3 - Jan 31st, 2007 at 1:08pm
 
Dear Olivier,

I think you may need to redefine the 'echoinvClass' function (in displayfunction.bf) to do this. It's not difficult to do, because all you have to do is add a '0' rate with weight Pinvar to the display, and then multiply all gamma weights by (1-Pinvar).

Cheers,
Sergei

Back to top
 

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
WWW WWW  
IP Logged
 
ofedrigo
YaBB Newbies
*
Offline


I love YaBB 1G - SP1!

Posts: 14
Re: REV+I+G batchfile
Reply #4 - Feb 1st, 2007 at 7:51am
 
Thanks for your email Sergei. Sorry, I looked more carefully and I located the problem. Actually I tried to plug in this G+I code into a modified Plato.bf. But the plato.bf does not handle multiple category variable (?). Is there an other way to programatically define REV+G+I?

Thanks,

Olivier
Back to top
 
 
IP Logged
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: REV+I+G batchfile
Reply #5 - Feb 1st, 2007 at 3:27pm
 
Dear Olivier,

You can define the G+I category variable (resp bins in Gamma) thus:

Code:
	global alpha   = .5;
	alpha:>0.01;alpha:<100;

	global beta = 1;
	beta:>0.01;
	beta:<200;

	global P;
	P :< 1-(1e-10);
	P = 1/(resp+1);

	beta := (1-P)*alpha;


	catFreqMatrix = {resp+1,1};
	catFreqMatrix [0] := P;

	for (h=1; h<=resp;h=h+1)
	{
		catFreqMatrix [h] := (1-P)/resp__;
	}

	category c = (resp+1, catFreqMatrix, MEAN,
					(1-P)*GammaDist(_x_,alpha,beta)*(_x_>0),
					(1-P)*CGammaDist(_x_,alpha,beta)*(_x_>0)+P,
					0,
					1e25,
					(1-P)*CGammaDist(_x_,alpha+1,beta)*(alpha/beta)*(_x_>0)
				   );
 



Cheers,
Sergei
Back to top
 

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
WWW WWW  
IP Logged
 
wayne
Global Moderator
*****
Offline


I love YaBB 1G - SP1!

Posts: 57
San Diego, CA
Gender: male
Re: REV+I+G batchfile
Reply #6 - Feb 8th, 2007 at 4:42am
 
Hi Sergei, this could deserve a new thread but is nonetheless related to this one. How is the conditional site likelihood matrix constructed when using multiplicative category variables and subsequently calling ConstructCategoryMatrix. I have two category variables (3 and 2 intervals respectively) which gives a (3*2)*sites matrix but some values are repeating. For instance
{ { 4.904 e-05, ...}
   { 4.904 e-05, ...}
   {3.480 e-05, ....}
   {3.480 e-05, ...}
    etc
}
thanks ./w
Back to top
 

Assistant Project Scientist&&Antiviral Research Center&&Department of Pathology&&University of California, San Diego
WWW WWW  
IP Logged
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: REV+I+G batchfile
Reply #7 - Feb 8th, 2007 at 12:18pm
 
Dear Wayne,

For multiple cat. variables, each column of 'ConstructCategoryMatrix' will be a flattened representation of a multi-dimensional matrix of all possible cat. var. values.

Each likelihood function has a ordering of category variables (usually in the order they were encountered during likelihood function construction), which can be retrieved at run time using

Code:
GetInformation (storageID, lfID);
 



Suppose for example, that there are two category variables: c1 (M classes) and c2 (N classes) and c1 is the first in the orderings.

Then the i-th entry (0-based indexing) of each column vector from 'ConstructCategoryMatrix' will correspond to i div N class in c1 and i mod N class in c2.

In your case, repeating values suggest that the second category variable does not affect that particular site (for whatever reason).

Cheers,
Sergei
Back to top
 

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
WWW WWW  
IP Logged
 
wayne
Global Moderator
*****
Offline


I love YaBB 1G - SP1!

Posts: 57
San Diego, CA
Gender: male
Re: REV+I+G batchfile
Reply #8 - Feb 8th, 2007 at 1:31pm
 
thanks Sergei, i suspected a problem with my 2nd cat variable. now confirmed. thanks ./w
Back to top
 

Assistant Project Scientist&&Antiviral Research Center&&Department of Pathology&&University of California, San Diego
WWW WWW  
IP Logged
 
kdang
YaBB Newbies
*
Offline


Feed your monkey!

Posts: 2
Re: REV+I+G batchfile
Reply #9 - Oct 16th, 2007 at 6:24pm
 
Hi Sergei,
On a related note, how do I specify an HKY85, beta-gamma (5 categories) model in a batch file (or for that matter, any standard model + beta-gamma)? I am trying to implement a sliding window NT model fit for command-line submission on a cluster. If there is an example or documentation that I missed, please feel free to point me there. Thanks in advance.

Regards,
Kristen Dang
Back to top
 
 
IP Logged
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: REV+I+G batchfile
Reply #10 - Oct 16th, 2007 at 10:57pm
 
Dear Kristen,

kdang wrote on Oct 16th, 2007 at 6:24pm:
On a related note, how do I specify an HKY85, beta-gamma (5 categories) model in a batch file (or for that matter, any standard model + beta-gamma)? I am trying to implement a sliding window NT model fit for command-line submission on a cluster. If there is an example or documentation that I missed, please feel free to point me there. Thanks in advance.


Do you want to embed the model definition in your file?
If not, just select

'Rate variation'
'Beta-Gamma'
5

when a standard analysis prompts you for model options.
You can define HKY85 with beta-gamma with this self contained code (replace < insert_your_filter_name_here> as appropriate):

Code:
        resp = 5; /* how many categories */
	global betaP = 1;
	global betaQ = 1;
	betaP:>0.05;betaP:<85;
	betaQ:>0.05;betaQ:<85;
	category pc = (resp-1, EQUAL, MEAN,
					_x_^(betaP-1)*(1-_x_)^(betaQ-1)/Beta(betaP,betaQ), /* density */
					IBeta(_x_,betaP,betaQ), /*CDF*/
					0, 				   /*left bound*/
					1, 			   /*right bound*/
					IBeta(_x_,betaP+1,betaQ)*betaP/(betaP+betaQ)
				   );

	global alpha = .5;
	alpha:>0.01;alpha:<100;
	category c = (resp, pc, MEAN,
					GammaDist(_x_,alpha,alpha),
					CGammaDist(_x_,alpha,alpha),
					0 ,
			  	    1e25,
			  	    CGammaDist(_x_,alpha+1,alpha)
			  	 );


	global kappa_inv = 1;
	HKY_Rate_Matrix = {{*,kappa_inv*t*c,t*c,kappa_inv*t*c}{kappa_inv*t*c,*,kappa_inv*t*c,t*c}{t*c,kappa_inv*t*c,*,kappa_inv*t*c}{kappa_inv*t*c,t*c,kappa_inv*t*c,*}};

    HarvestFrequencies (vectorOfFrequencies, <insert_your_filter_name_here> ,1,1,0);
    Model HKY85Model = (HKY_Rate_Matrix, vectorOfFrequencies, 1);

 



Cheers,
Sergei

Back to top
 

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
WWW WWW  
IP Logged
 
kdang
YaBB Newbies
*
Offline


Feed your monkey!

Posts: 2
Re: REV+I+G batchfile
Reply #11 - Oct 17th, 2007 at 5:29am
 
Yes, I do want to embed it in the file (I'm trying to remove the interactive feature so I can submit to the parallel queue on our cluster).

Thanks! Smiley
Back to top
 
 
IP Logged
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: REV+I+G batchfile
Reply #12 - Oct 17th, 2007 at 9:07am
 
Dear Kristen,

You can also use 'ExecuteAFile' with appropriately constructed options.
This saves a lot of copy and paste and makes the code more modular.

See Multimedia File Viewing and Clickable Links are available for Registered Members only!!  You need to Login Login

Cheers,
Sergei
Back to top
 

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
WWW WWW  
IP Logged