Thanks Sergei!
I did the changes you told me (except last one) and it works.
There is also a couple of things I would like to comment. Please, let me know if I am wrong.
To constrain the model, I fixed the model to be (K2P with kappa=2) changing this line (#64):
Code:SelectTemplateModel(filteredData); /*Prompts the user to select a template model */
By:
Code:equalFreqs = {{0.25}{0.25}{0.25}{0.25}};
K2Pkappa2RateMatrix = {{*,b,b*2,b}{b,*,b,b*2}{b*2,b,*,b}{b,b*2,b,*}};
Model K2P = (K2Pkappa2RateMatrix, equalFreqs);
UseModel(K2P);
One of my worries is the position of UseModel(K2P) since, below in the batch file, there is the following:
Code:UseModel(USE_NO_MODEL);
Tree referenceTree = treeString;
refL = BranchLength(referenceTree,-1);
estL = BranchLength(givenTree,-1);
the estL is an array with each branch length of a tree calculated from the topology inputted and the whole alignment, right?
How is the "USE_NO_MODEL" affecting the estimation of the total length of the tree? Should I change the position of UseModel(K2P)?
I think now I understand better why we should apply the correction factor for each rate. Is this the only way that Hyphy can estimate rates for each site using the branch lengths of the input tree?
So, is optimizing the substitutions of a given tree topology and a model the only way to get a rate estimation for each site?
The approach is different for DNArates or Rate4site: based on a model, the program tries to find a factor X (mutation rate) that multiplies the branch length that maximizes the function (or in other words calculate the different probabilities of nucleotide changes along a branch of known length given a rate). Well, more or less. Our branch lengths are in time units (trees are chronograms) that that approach appears to work fine. The problem is the super high rates that we get with DNArates.
I really appreciate your help,
Francesc