Hi
I often get stuck trying to scale up my hyphy analyses from simple test cases onto my real datasets (where I'm working with multiple datasets over multiple trees, doing joint ML estimation over the set of all trees/datasets), so I've been trying to refactor my hyphy scripts a bit to use user-defined functions a bit more, to make it easier to reuse code in slightly different contexts.
I often use branch lengths of a given input tree, setting the "t" parameter to these branch lengths, and then optimizing the ML function while constraining these "t"s to be the values of the branch lengths in the input tree files.
To do that, I've tried writing a function that will take a treeString and a model, and return a Tree object with branch-lengths constrained in this way - see code at the end of this post that defines this and uses it in a simple case (associated with the current models I'm interested in, binary characters with rate-variation modeled using a gamma distribution).
My question - I seem to have problems with this when it comes to returning a tree object from a function (I get a "Bus error" when running the script below). Any ideas about what my mistakes are/whether this is feasible much appreciated.
Thanks
Aidan
Code:/* attempt to get gamma rate correction working for trees and datasets dsecribed within
the script i.e. not accessing data in external files */
fprintf (stdout, "\nBegin new analysis\n\n");
ACCEPT_BRANCH_LENGTHS = 1;
ACCEPT_ROOTED_TREES = 1;
/*Specify models */
global eqFreq0 = 0.5;
eqFreq1 := 1 - eqFreq0;
global alpha = 1;
global postDuplicationBranchRateScalar = 1;
eqFrequencies = {{eqFreq0, eqFreq1}};
category rateCat = (4,EQUAL, MEAN, GammaDist(_x_,alpha,alpha), CGammaDist(_x_,alpha,alpha),0,1e25,CGammaDist(_x_,alpha+1,alpha));
binaryGammaQ = {{*, rateCat*eqFreq1*t}
{rateCat*eqFreq0*t,*}};
/* "daughter" is the label given to the post-dupn branches in the trees provided by Claudia */
Model binaryGamma = (binaryGammaQ, eqFrequencies);
inputTreeString1 = "(((seqA:0.1,seqB:0.1):0.1,seqC:0.2):0.1,seqD:0.3);";
inputDataString1 = "$BASESET:01\n#seqA\n10\n#seqB\n11\n#seqC\n00\n#seqD\n11";
DataSet ds1 = ReadFromString(inputDataString1);
DataSetFilter df1 = CreateFilter(ds1,1);
myTree1 = getTreeWithFixedTs (inputTreeString1, "binaryGamma");
LikelihoodFunction lf = (df1,myTree1);
fprintf (stdout, "\nLikelihood function BEFORE optimisation, but AFTER fixing t\n",lf);
Optimize (parameterValues,lf);
fprintf (stdout, "\nLikelihood function AFTER optimisation\n",lf);
function getTreeWithFixedTs (treeString, model) {
fprintf (stdout, "using getTreeWithFixedTs\n");
UseModel (USE_NO_MODEL);
Tree dummyTree = treeString;
branchNames = BranchName (dummyTree, -1);
branchLengthsNow = BranchLength (dummyTree, -1);
fprintf (stdout, "branch names ", branchNames, "\n");
fprintf (stdout, "branch lengths ", branchLengthsNow, "\n");
setModelString = "UseModel(" + model + ");";
printAndExecute (setModelString);
Tree treeToBeReturned = treeString;
bn = BranchName (treeToBeReturned, -1);
for (k=0; k < Columns (bn) - 1; k=k+1) {
outString = "treeToBeReturned." + bn[k] + ".t := " + branchLengthsNow[k];
printAndExecute (outString);
}
return treeToBeReturned;
}
function printAndExecute (inString) {
fprintf(stdout, "executing:", inString, "\n");
ExecuteCommands (inString);
return 0;
}