HyPhy message board
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl
HYPHY Package >> Contributions >> REV+I+G batchfile
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl?num=1156393053

Message started by Jose_Patane on Aug 23rd, 2006 at 9:17pm

Title: REV+I+G batchfile
Post by Jose_Patane on Aug 23rd, 2006 at 9:17pm
Dear all,

It took me some time figuring out how to build a customized batch file for HyPhy with a REV matrix, plus a proportion of invariable sites and the other ones following gamma... I hope this can be useful to someone else as well (tested with the same tree and same parameters estimated from PAUP*, same results were obtained):

-----------------------------------------------------------------
/* Basefreqs, 5 transition types, proportion of invariable sites
  and rate heterogeneity are global. */

#include "displayfunction.bf";

myTopology = "some_tree"

DataSet myData  = ReadDataFile("some_data");
DataSetFilter myFilter = CreateFilter(myData,1);
HarvestFrequencies(obsFreqs, myFilter, 1, 1, 1);

/* "invariant sites" rate and "gamma shape (alpha) for rate heterogeneity" categories */
global Pinvar;
global shape;

/* Five individual global transition rates of each type (the other one is free to vary on each branch). */
global r1;
global r2;
global r3;
global r4;
global r5;

/* Discrete probability distribution used to estimate # of invariant sites:
  2 classes, 1 with invariant sites (0), the other with variable ones (1).
  The value of "Pinvar" shall be within the interval [0,1]
  Estimate 'Pinvar' */
category invClass = (2,[Pinvar}{1-Pinvar],MEAN, ,[0}{1], 0,1);

/* Gamma distributed site rate heterogenity:
  4 classes, gamma prob. distribution of rates.
  Estimate 'shape' (= alpha) */
category rateCat =(4, EQUAL, MEAN, GammaDist(_x_,shape,shape), CGammaDist(_x_,shape,shape),0,1e25);

/* REV+I+G matrix. (I let the "C-T" rate to freely vary, but could be any other as well) */
REV_I_G_rateMatrix =
[         *          ,r1*m*invClass*rateCat,r2*m*invClass*rateCat,r3*m*invClass*rateCat}
{r1*m*invClass*rateCat,         *           ,r4*m*invClass*rateCat,   m*invClass*rateCat}
{r2*m*invClass*rateCat,r4*m*invClass*rateCat,         *           ,r5*m*invClass*rateCat}
{r3*m*invClass*rateCat,   m*invClass*rateCat,r5*m*invClass*rateCat,         *           ];

Model REV_I_G = (REV_I_G_rateMatrix, obsFreqs);

UseModel(REV_I_G);
Tree myTree = myTopology;
LikelihoodFunction theLikFun = (myFilter, myTree);
Optimize(results,theLikFun);
fprintf (stdout,theLikFun);

/* Log 'Pinvar' to screen. */
GetInformation (catInfo,invClass);
catInfo = echoinvClass(catInfo);

/* Log 'shape' to screen. */
GetInformation (catInfo,rateCat);
catInfo = echoinvClass(catInfo);

---------------------------------------------------------------

Title: Re: REV+I+G batchfile
Post by Sergei on Aug 24th, 2006 at 7:58am
Dear Jose,

Thanks! I haven't thought of this way to parameterize the G+I (using multiplicative category variables), it's actually a neat trick.

For the information of other users: 'displayfunction.bf' is located in Examples/Categories.

Cheers,
Sergei

Title: Re: REV+I+G batchfile
Post by ofedrigo on Jan 31st, 2007 at 12:09pm
hi,

I tried to use this trick in my own code and I got the following error message:

ERROR: only one category variable can be handled by this code

did I miss something or did somethind change in the way hyphy handle multiple category variables?

thanks,

Olivier

Title: Re: REV+I+G batchfile
Post by Sergei on Jan 31st, 2007 at 1:08pm
Dear Olivier,

I think you may need to redefine the 'echoinvClass' function (in displayfunction.bf) to do this. It's not difficult to do, because all you have to do is add a '0' rate with weight Pinvar to the display, and then multiply all gamma weights by (1-Pinvar).

Cheers,
Sergei


Title: Re: REV+I+G batchfile
Post by ofedrigo on Feb 1st, 2007 at 7:51am
Thanks for your email Sergei. Sorry, I looked more carefully and I located the problem. Actually I tried to plug in this G+I code into a modified Plato.bf. But the plato.bf does not handle multiple category variable (?). Is there an other way to programatically define REV+G+I?

Thanks,

Olivier

Title: Re: REV+I+G batchfile
Post by Sergei on Feb 1st, 2007 at 3:27pm
Dear Olivier,

You can define the G+I category variable (resp bins in Gamma) thus:


Code (]
     global alpha   = .5;
     alpha:>0.01;alpha:<100;

     global beta = 1;
     beta:>0.01;
     beta:<200;

     global P;
     P :< 1-(1e-10);
     P = 1/(resp+1);
     
     beta := (1-P)*alpha;
     
     
     catFreqMatrix = {resp+1,1};
     catFreqMatrix [0):

:= P;
     
     for (h=1; h<=resp;h=h+1)
     {
           catFreqMatrix [h] := (1-P)/resp__;
     }
                               
     category c = (resp+1, catFreqMatrix, MEAN,
                             (1-P)*GammaDist(_x_,alpha,beta)*(_x_>0),
                             (1-P)*CGammaDist(_x_,alpha,beta)*(_x_>0)+P,
                             0,
                             1e25,
                             (1-P)*CGammaDist(_x_,alpha+1,beta)*(alpha/beta)*(_x_>0)
                          );


Cheers,
Sergei

Title: Re: REV+I+G batchfile
Post by wayne on Feb 8th, 2007 at 4:42am
Hi Sergei, this could deserve a new thread but is nonetheless related to this one. How is the conditional site likelihood matrix constructed when using multiplicative category variables and subsequently calling ConstructCategoryMatrix. I have two category variables (3 and 2 intervals respectively) which gives a (3*2)*sites matrix but some values are repeating. For instance
{ { 4.904 e-05, ...}
  { 4.904 e-05, ...}
  {3.480 e-05, ....}
  {3.480 e-05, ...}
   etc
}
thanks ./w

Title: Re: REV+I+G batchfile
Post by Sergei on Feb 8th, 2007 at 12:18pm
Dear Wayne,

For multiple cat. variables, each column of 'ConstructCategoryMatrix' will be a flattened representation of a multi-dimensional matrix of all possible cat. var. values.

Each likelihood function has a ordering of category variables (usually in the order they were encountered during likelihood function construction), which can be retrieved at run time using

[code]
GetInformation (storageID, lfID);
[/code]

Suppose for example, that there are two category variables: c1 (M classes) and c2 (N classes) and c1 is the first in the orderings.

Then the i-th entry (0-based indexing) of each column vector from 'ConstructCategoryMatrix' will correspond to i div N class in c1 and i mod N class in c2.

In your case, repeating values suggest that the second category variable does not affect that particular site (for whatever reason).

Cheers,
Sergei

Title: Re: REV+I+G batchfile
Post by wayne on Feb 8th, 2007 at 1:31pm
thanks Sergei, i suspected a problem with my 2nd cat variable. now confirmed. thanks ./w

Title: Re: REV+I+G batchfile
Post by kdang on Oct 16th, 2007 at 6:24pm
Hi Sergei,
On a related note, how do I specify an HKY85, beta-gamma (5 categories) model in a batch file (or for that matter, any standard model + beta-gamma)? I am trying to implement a sliding window NT model fit for command-line submission on a cluster. If there is an example or documentation that I missed, please feel free to point me there. Thanks in advance.

Regards,
Kristen Dang

Title: Re: REV+I+G batchfile
Post by Sergei on Oct 16th, 2007 at 10:57pm
Dear Kristen,


kdang wrote on Oct 16th, 2007 at 6:24pm:
On a related note, how do I specify an HKY85, beta-gamma (5 categories) model in a batch file (or for that matter, any standard model + beta-gamma)? I am trying to implement a sliding window NT model fit for command-line submission on a cluster. If there is an example or documentation that I missed, please feel free to point me there. Thanks in advance.


Do you want to embed the model definition in your file?
If not, just select

'Rate variation'
'Beta-Gamma'
5

when a standard analysis prompts you for model options.
You can define HKY85 with beta-gamma with this self contained code (replace < insert_your_filter_name_here> as appropriate):

[code]
       resp = 5; /* how many categories */
     global betaP = 1;
     global betaQ = 1;
     betaP:>0.05;betaP:<85;
     betaQ:>0.05;betaQ:<85;
     category pc = (resp-1, EQUAL, MEAN,
                             _x_^(betaP-1)*(1-_x_)^(betaQ-1)/Beta(betaP,betaQ), /* density */
                             IBeta(_x_,betaP,betaQ), /*CDF*/
                             0,                           /*left bound*/
                             1,                     /*right bound*/
                             IBeta(_x_,betaP+1,betaQ)*betaP/(betaP+betaQ)
                          );
     
     global alpha = .5;
     alpha:>0.01;alpha:<100;
     category c = (resp, pc, MEAN,
                             GammaDist(_x_,alpha,alpha),
                             CGammaDist(_x_,alpha,alpha),
                             0 ,
                             1e25,
                             CGammaDist(_x_,alpha+1,alpha)
                          );


     global kappa_inv = 1;
     HKY_Rate_Matrix = [*,kappa_inv*t*c,t*c,kappa_inv*t*c}{kappa_inv*t*c,*,kappa_inv*t*c,t*c}{t*c,kappa_inv*t*c,*,kappa_inv*t*c}{kappa_inv*t*c,t*c,kappa_inv*t*c,*];
     
   HarvestFrequencies (vectorOfFrequencies, <insert_your_filter_name_here> ,1,1,0);
   Model HKY85Model = (HKY_Rate_Matrix, vectorOfFrequencies, 1);

[/code]

Cheers,
Sergei


Title: Re: REV+I+G batchfile
Post by kdang on Oct 17th, 2007 at 5:29am
Yes, I do want to embed it in the file (I'm trying to remove the interactive feature so I can submit to the parallel queue on our cluster).

Thanks! :)

Title: Re: REV+I+G batchfile
Post by Sergei on Oct 17th, 2007 at 9:07am
Dear Kristen,

You can also use 'ExecuteAFile' with appropriately constructed options.
This saves a lot of copy and paste and makes the code more modular.

See Multimedia File Viewing and Clickable Links are available for Registered Members only!!  You need to Login Login

Cheers,
Sergei

HyPhy message board » Powered by YaBB 2.5.2!
YaBB Forum Software © 2000-2024. All Rights Reserved.