Jose_Patane
YaBB Newbies
Offline
Posts: 26
Brazil
Gender:
|
Hi all,
I got stuck trying to MLE a tree with 2 partitions within the 12s, one for loops (4x4), and another for stems (8x8). The thing is patristic ML-distances are way too high, even > 1, so it may be a problem with the joint use of models with different parameters/dimensions. I assume a proportional model among partitions, which works properly whenever I use the same model type for different partitions (e.g.: GTR_1 for codon position 1, GTR_2 for position 2...). Here's the code I've been trying to use:
-------------------------------------------------------------------------------- ---------------------------------
IS_TREE_PRESENT_IN_DATA = 1;
DataSet gene_12s = ReadDataFile (USE_NEXUS_FILE_DATA);
DataSetFilter Loop_RNA = CreateFilter (gene_12s, 1, DATA_FILE_PARTITION_MATRIX[1][1]);
/* RNA doublets (with exclusion set) */ DataSetFilter Stem_RNA = CreateFilter (gene_12s, 2, "2,18,3,17,4,16,...", "" ,"AA,AG,CC,CT,GA,GG,TC,TT");
/* Single transition between doublets */ global S;
/* Double transition between canonical doublets */ global D;
/* Loop: TN_93 */ global P; global Q;
/* "Proportional model" among partitions */ global R;
/* +I, +G */ global P_invar; global alpha;
category Pinv = (2,{{P_invar}{1-P_invar}},MEAN, ,{{0}{1}}, 0,1); category gam =(4, EQUAL, MEAN, GammaDist(_x_,alpha,alpha), CGammaDist(_x_,alpha,alpha),0,1e25);
HarvestFrequencies (Loop_Freqs, Loop_RNA, 1, 1, 1);
/* Empirical frequencies of the 8 most common doublet states */ Doublet_Freqs = {{0.00854592}{0.174745}{0.012174}{0.304436}{0.245536}{0.0181973}{0.221259}{0.006 37755}};
/* Partition rate weights */ global rate_loop := (Loop_RNA.sites + Stem_RNA.sites) / (Loop_RNA.sites + R * Stem_RNA.sites); global rate_stem := R * rate_loop;
/* Loops: TN_93 + I + G */ RNA_loop_matrix = { { * ,( m * Pinv * gam),(P*m * Pinv * gam),( m * Pinv * gam)} {( m * Pinv * gam), * ,( m * Pinv * gam),(Q*m * Pinv * gam)} {(P*m * Pinv * gam),( m * Pinv * gam), * ,( m * Pinv * gam)} {( m * Pinv * gam),(Q*m * Pinv * gam),( m * Pinv * gam), * }
};
/* Model used:
a = double transversion between canonical pairs. S * a = single transition between doublets. D * a = double transition between canonical pairs.
Everything else not allowed in a single time step.
Sequence of states: AC, AT, CA, CG, GC, GT, TA, TG. */
RNA_stem_matrix = { { * ,S * a, 0 , 0 ,S * a, 0 , 0 , 0 } {S * a, * , 0 , a ,D * a,S * a, 0 , 0 } { 0 , 0 , * ,S * a, 0 , 0 ,S * a, 0 } { 0 , a ,S * a, * , 0 , 0 ,D * a,S * a} {S * a,D * a, 0 , 0 , * ,S * a, a , 0 } { 0 ,S * a, 0 , 0 ,S * a, * , 0 , 0 } { 0 , 0 ,S * a,D * a, a , 0 , * ,S * a} { 0 , 0 , 0 ,S * a, 0 , 0 ,S * a, * }
};
Model Loop_model = (RNA_loop_matrix, Loop_Freqs); Model Stem_model = (RNA_stem_matrix, Doublet_Freqs);
UseModel(Loop_model); Tree myTree_loop = DATAFILE_TREE;
UseModel(Stem_model); Tree myTree_stem = DATAFILE_TREE;
Tree myTree_12s = DATAFILE_TREE;
ReplicateConstraint ("this1.?.a := this2.?.m*R", myTree_stem, myTree_loop);
LikelihoodFunction theLikFun = (Loop_RNA, myTree_loop, Stem_RNA, myTree_stem); Optimize(results,theLikFun);
ReplicateConstraint ("this1.?.a := (this2.?.m + this3.?.a)/(rate_loop + rate_stem)", myTree_loop, myTree_stem, myTree_12s); -------------------------------------------------------------------------------- ---------------------------------
... So how can I correct the problem stated above?
|