Marissa
YaBB Newbies
Offline
Feed your monkey!
Posts: 6
|
Here is the code I've been using:
[code] /* Load data 1 */ DataSet seq1 = ReadDataFile ("Data/modeled/AT1_s_1.fa"); DataSetFilter filter1 = CreateFilter (seq1, 1);
/* Load data 2 */ DataSet seq2 = ReadDataFile ("Data/human/AT2_s_1.fa"); DataSetFilter filter2 = CreateFilter (seq2, 1);
/* Define general non-reversible model */ /* This file defines the transition matrix for the General Non-Reversible model without explicit frequency parameterization. 01/04/2006 by Sergei L. Kosakovsky Pond */
global AC = 1; global AG = 1; global AT = 1; global CA = 1; global CG = 1; global CT = 1; global GA = 1; global GC = 1; global GT = 1; global TA = 1; global TC = 1; global TG = 1;
global AC_2 = 1; global AG_2 = 1; global AT_2 = 1; global CA_2 = 1; global CG_2 = 1; global CT_2 = 1; global GA_2 = 1; global GC_2 = 1; global GT_2 = 1; global TA_2 = 1; global TC_2 = 1; global TG_2 = 1;
function PopulateModelMatrix (ModelMatrixName&) { ModelMatrixName = {{*,t*AC,t,t*AT} {t*CA,*,t*CG,t*CT} {t*GA,t*GC,*,t*GT} {t*TA,t*TC,t*TG,*}}; return 0; }
function PopulateModelMatrix_2 (ModelMatrixName&) { ModelMatrixName = {{*, t_2*AC_2, t_2, t_2*AT_2} {t_2*CA_2, *, t_2*CG_2, t_2*CT_2} {t_2*GA_2, t_2*GC_2, *, t_2*GT_2} {t_2*TA_2, t_2*TC_2, t_2*TG_2, *}}; return 0; }
/* Create two trees, one for seq1 and one for seq2. We set the anc branch to 0 by setting t to 0. */
/* NRM = Non-Reversible Model */ NRM = 0; MULTIPLY_BY_FREQS = PopulateModelMatrix ("NRM");
/* Calculate equilibrium frequencies */ dynamicFreqs = {4,1}; vectorOfFrequencies = {4,1}; vectorOfFrequencies [0] := computeEQF (AC,AT,CG,CT,GT,CA,GA,TA,GC,TC,TG); vectorOfFrequencies [1] := dynamicFreqs[1]; vectorOfFrequencies [2] := dynamicFreqs[2]; vectorOfFrequencies [3] := dynamicFreqs[3];
/* Create the model for the */ Model GRMModel = (NRM, vectorOfFrequencies, MULTIPLY_BY_FREQS); FREQUENCY_SENSITIVE = 0; ACCEPT_ROOTED_TREES = 1;
Tree tree1 = (anc,der); tree1.anc.t := 0;
NRM_2 = 0; MULTIPLY_BY_FREQS_2 = PopulateModelMatrix_2 ("NRM_2");
dynamicFreqs_2 = {4,1}; vectorOfFrequencies_2 = {4,1}; vectorOfFrequencies_2 [0] := computeEQF_2 (AC_2,AT_2,CG_2,CT_2,GT_2,CA_2,GA_2,TA_2,GC_2,TC_2,TG_2); vectorOfFrequencies_2 [1] := dynamicFreqs_2[1]; vectorOfFrequencies_2 [2] := dynamicFreqs_2[2]; vectorOfFrequencies_2 [3] := dynamicFreqs_2[3];
Model GRMModel_2 = (NRM_2, vectorOfFrequencies_2, MULTIPLY_BY_FREQS_2); FREQUENCY_SENSITIVE = 0; ACCEPT_ROOTED_TREES = 1;
Tree tree2 = (anc,der); tree2.anc.t_2 := 0;
/*---------- Unconstrained Model ----------*/
LikelihoodFunction theLnLik = (filter1, tree1, filter2, tree2);
Optimize (paramValues, theLnLik); unconstrLik = paramValues[1][0]; unConstrainedParamCount = paramValues[1][1]; fprintf (stdout, "unconstrained param count: ", unConstrainedParamCount,"\n");
fprintf (stdout, "NRM:\n", NRM, "\n");
/*----------- Constrained Model -----------*/
LikelihoodFunction theLnLik = (filter1, tree1, filter2, tree2);
/* Constraint */ t_2 := t*dynamicFreqs_2[2]/dynamicFreqs[2];
Optimize (paramValues, theLnLik); constrainedLnLik = paramValues[1][0]; constrainedParamCount = paramValues[1][1]; fprintf (stdout, "constrained param count: ", constrainedParamCount, "\n");
fprintf (stdout, "NRM_2:\n", NRM_2, "\n");
/* Compute and print the log ratio */ LRT = 2 (unconstrLik - constrainedLnLik); fprintf (stdout, "LRT: ", LRT, "\n");
/*--------------- function to compute equilibrium frequencies ---------------*/
ffunction computeEQF (dummy1,dummy2,dummy3,dummy4,dummy5,dummy6,dummy7,dummy8,dummy9,dummy10,dummy11) { t = 1; tempMatrix = NRM; for (k=0; k<4; k=k+1) { tempMatrix[k][3] = 1; } tempMatrix = Inverse(tempMatrix); for (k=0; k<4; k=k+1) { dynamicFreqs[k] = tempMatrix[3][k]; } return dynamicFreqs[0]; }
ffunction computeEQF_2 (dummy1,dummy2,dummy3,dummy4,dummy5,dummy6,dummy7,dummy8,dummy9,dummy10,dummy11) { t_2 = 1; tempMatrix = NRM_2; for (k=0; k<4; k=k+1) { tempMatrix[k][3] = 1; } tempMatrix = Inverse(tempMatrix); for (k=0; k<4; k=k+1) { dynamicFreqs_2[k] = tempMatrix[3][k]; } return dynamicFreqs_2[0]; } [/code]
The two data sets being loaded are modeled data with equal ACGT content (seq1) and double AT content (seq2), where the instantaneous substitution rate to any other base is 1%.
Marissa
|