Welcome, Guest. Please Login
YaBB - Yet another Bulletin Board
 
  HomeHelpSearchLogin  
 
Pages: 1 2 
ancestral sequence reconstruction for many sets (Read 8892 times)
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: ancestral sequence reconstruction for many sets
Reply #15 - Aug 14th, 2008 at 12:29pm
 
Dear Miguel,

Set

Code:
ACCEPT_ROOTED_TREES=1;
 



flag before you call Tree id = ... to tell HyPhy not to unroot the tree.
This is safer than trying to figure out where your node went after rerooting.

Cheers,
Sergei
Back to top
 

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
WWW WWW  
IP Logged
 
Miguel
Junior Member
**
Offline


Hi Hyphy!

Posts: 53
CBMSO, CSIC (Spain)
Gender: male
Re: ancestral sequence reconstruction for many sets
Reply #16 - Aug 15th, 2008 at 9:45am
 
Dear Sergei,

That code was good!  Cheesy

You told me that to recognize the internal node (MRCA of ingroup) the easiest way is to name this internal node in the tree.
Well, in my case the tree is estimated in the same BatchFile by a previous NJ, so the tree does not come from an input file (the only one input file is the alignment) and I have to repit this analysis thousands of times. Perhaps there is a way to introduce a name to this internal node in the BatchFile, although I did not find it in the documentation.

On the other hand, I have here 200 alignments of 11 sequences (including outgroup) each one and, for all the cases, the internal node MRCA of ingroup have the same name (Node1). Maybe I could consider this name always like the MRCA of ingroup..

Cheers,
Miguel
Back to top
 
WWW WWW  
IP Logged
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: ancestral sequence reconstruction for many sets
Reply #17 - Aug 15th, 2008 at 11:45am
 
Dear Miguel,

If you need to have access to node names in HyPhy batch language, you can try using the following example. HyPhy can convert a tree object into a list of nodes in post-order traversal (using the tree^0 operation). For example:

Code:
ACCEPT_ROOTED_TREES = 1;
Tree 				myTree = ((0,(1,3)),outgroup);
flatTree = 			myTree^0;

fprintf (stdout, flatTree);_hyphyAssociativeArray={};

 



produces the following output

Code:
(_hyphyAssociativeArray["1"])={};
(_hyphyAssociativeArray["1"])["Name"] = "0";
(_hyphyAssociativeArray["1"])["Length"] = -1;
(_hyphyAssociativeArray["1"])["Depth"] = 2;
(_hyphyAssociativeArray["1"])["Parent"] = 5;
(_hyphyAssociativeArray["2"])={};
(_hyphyAssociativeArray["2"])["Name"] = "1";
(_hyphyAssociativeArray["2"])["Length"] = -1;
(_hyphyAssociativeArray["2"])["Depth"] = 3;
(_hyphyAssociativeArray["2"])["Parent"] = 4;
(_hyphyAssociativeArray["3"])={};
(_hyphyAssociativeArray["3"])["Name"] = "3";
(_hyphyAssociativeArray["3"])["Length"] = -1;
(_hyphyAssociativeArray["3"])["Depth"] = 3;
(_hyphyAssociativeArray["3"])["Parent"] = 4;
(_hyphyAssociativeArray["4"])={};
(_hyphyAssociativeArray["4"])["Name"] = "Node3";
(_hyphyAssociativeArray["4"])["Length"] = -1;
(_hyphyAssociativeArray["4"])["Depth"] = 2;
(_hyphyAssociativeArray["4"])["Parent"] = 5;
((_hyphyAssociativeArray["4"])["Children"])={};
((_hyphyAssociativeArray["4"])["Children"])["0"] = 2;
((_hyphyAssociativeArray["4"])["Children"])["1"] = 3;
(_hyphyAssociativeArray["5"])={};
(_hyphyAssociativeArray["5"])["Name"] = "Node1";
(_hyphyAssociativeArray["5"])["Length"] = -1;
(_hyphyAssociativeArray["5"])["Depth"] = 1;
(_hyphyAssociativeArray["5"])["Parent"] = 7;
((_hyphyAssociativeArray["5"])["Children"])={};
((_hyphyAssociativeArray["5"])["Children"])["0"] = 1;
((_hyphyAssociativeArray["5"])["Children"])["1"] = 4;
(_hyphyAssociativeArray["6"])={};
(_hyphyAssociativeArray["6"])["Name"] = "outgroup";
(_hyphyAssociativeArray["6"])["Length"] = -1;
(_hyphyAssociativeArray["6"])["Depth"] = 1;
(_hyphyAssociativeArray["6"])["Parent"] = 7;
(_hyphyAssociativeArray["7"])={};
(_hyphyAssociativeArray["7"])["Name"] = "Node0";
(_hyphyAssociativeArray["7"])["Length"] = -1;
(_hyphyAssociativeArray["7"])["Depth"] = 0;
((_hyphyAssociativeArray["7"])["Children"])={};
((_hyphyAssociativeArray["7"])["Children"])["0"] = 5;
((_hyphyAssociativeArray["7"])["Children"])["1"] = 6;
(_hyphyAssociativeArray["0"])={};
(_hyphyAssociativeArray["0"])["Name"] = "myTree";
(_hyphyAssociativeArray["0"])["Root"] = 7;
 



It looks pretty ugly, but what the array is quite simple:
key "0" (i.e. flatTree[0]) stores the information about the tree itself and where the root is located in the array (in this case it's at index 7). Then all nodes are laid out in post-order traversal (left-right-parent).
For each node you get a series of attributes (its name, what node is its parent, what depth in the tree it is, whether or not it has children etc). If all you want to look up is the name of the left-most child of the root node, something like this should do the trick:

Code:
rootIndex =(flatTree [0])["Root"];
leftChildIndex = ((flatTree[rootIndex])["Children"])[0];
fprintf (stdout, "\nThe name of root's left child is ", (flatTree[leftChildIndex])["Name"], "\n");
 



Cheers.
Sergei
[code]
Back to top
 

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
WWW WWW  
IP Logged
 
Miguel
Junior Member
**
Offline


Hi Hyphy!

Posts: 53
CBMSO, CSIC (Spain)
Gender: male
Re: ancestral sequence reconstruction for many sets
Reply #18 - Aug 15th, 2008 at 5:36pm
 
Dear Sergei,

That was great!
Thank you very much!!
Cheesy


Back to top
 
WWW WWW  
IP Logged
 
Miguel
Junior Member
**
Offline


Hi Hyphy!

Posts: 53
CBMSO, CSIC (Spain)
Gender: male
Re: ancestral sequence reconstruction for many sets
Reply #19 - Aug 17th, 2008 at 10:18am
 
Dear Sergei,

Finally with all those improvements all seem to be very well.
So, I tried to analyze my simulated data and I found a bad surprise. I know the real MRCA and I see that when I write the command
ACCEPT_ROOTED_TREES=1;
before ASR, I found a big error (respect to the real MRCA) about 20% (that is for any MRCA node, Node1 (ingroup) or Node0 (outgroup)), however if I do not write that command the error is only of 5% (like other software of ASR).
Then, thinking about that, I found that the tree (with the active ACCEPT_ROOTED_TREES=1; command) of the LikelihoodFunction in the ASR has the length of branches very different, where the outgroup had a very very long branch.

For example, for the same alignment:
- Without the command:
Results from the likelihood optimization:
Log Likelihood = -6051.63159193931;
Tree givenTree=((((((seq00001:0.00140814,seq00008:0.00562907)Node6:0.0510105,seq00010
:0.0550151)Node5:0.00451872,((seq00002:0.0115897,s
eq00005:0.0157931)Node11:0.0124699,seq00004:0.0280265)Node10:0.0258714)Node4:0.0
468666,seq00007:0.136883)Node3:0.0368719,(seq00003:0.04
23708,seq00006:0.047428)Node16:0.0849871)Node2:0.222705,seq00009:0.350053,outgro
up:0.123559);
       [RECONSTRUCTING ANCESTORS]
       [WRITING ANCESTORS TO Results/sequences00001_ancestors.fas]


- With the command:
Results from the likelihood optimization:
Log Likelihood = -6642.04685957071;
Tree givenTree=(((((((seq00001:0.0013966,seq00008:0.00564641)Node6:0.0513962,seq00010
:0.0546931)Node5:0.00396752,((seq00002:0.0115815,s
eq00005:0.015796)Node11:0.0124217,seq00004:0.0280487)Node10:0.0263444)Node4:0.04
58819,seq00007:0.137755)Node3:0.0318315,(seq00003:0.041
6297,seq00006:0.0477905)Node16:0.0913618)Node2:0.0270367,seq00009:0.535481)Node1
:0.09,outgroup:7499.98);
       [RECONSTRUCTING ANCESTORS]
       [WRITING ANCESTORS TO Results/sequences00001_ancestors.fas]


I cannot understand why those trees are so different in the length of branches, perhaps the ASR does not work fine with rooted trees.  :-/
I attach you the complete Batch File on this.

Cheers,
Miguel

Back to top
 
WWW WWW  
IP Logged
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: ancestral sequence reconstruction for many sets
Reply #20 - Aug 17th, 2008 at 4:06pm
 
Dear Miguel,

The very long branch length is probably due to a numerical artifact.
If the sequence is sufficiently divergent (or if initial branch length guesses are bad), HyPhy may drive (incorrectly) the branch length to numerical infinity (as you see here) and then become stuck there. You can see that the log-likelihood for the rooted tree is worse than for the unrooted tree (which is incorrect) You can fix it by bounding the length from above, e.g.

Code:
givenTree.outGroup.t :< 10;
 



before you call optimize.

Also you can try

Code:
USE_DISTANCES = 0;
 



to turn off initial heuristics for guessing branch lengths.

Cheers,
Sergei
Back to top
 

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
WWW WWW  
IP Logged
 
Miguel
Junior Member
**
Offline


Hi Hyphy!

Posts: 53
CBMSO, CSIC (Spain)
Gender: male
Re: ancestral sequence reconstruction for many sets
Reply #21 - Aug 17th, 2008 at 9:08pm
 
Dear Sergei,

That was great again!   Cheesy
Thank you very much!!

Cheers,
Miguel     Smiley
Back to top
 
WWW WWW  
IP Logged
 
Pages: 1 2