HyPhy message board
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl?num=1168532900

Message started by Jeff on Jan 11th, 2007 at 8:28am

Title: Odd bootstrap output
Post by Jeff on Jan 11th, 2007 at 8:28am
Hi Sergei,

Sorry to bug you again.  I am delving into variance estimates using post_npbs.bf and have run into some odd output.

It seems like the output is fine for the first half, where most of the ML estimates are very similar to the Mean BS estimates.  Then halfway through it spits out a funny Mean, which looks more like a log likelihood value than a parameter Mean.  After that, most of the Means are very different from the ML estimates.  Some sample output is below.

After Spec6.nonSynRate (where the problems seem to start), I think maybe the Means are right but they are not associated with the proper parameter names.  For example, there are two parameters with 0.23-0.25 ML estimates and two different params with 0.23-0.25 Mean estimates.  There are also four params with ML estimates of 0 and four with Means of essentially 0.

This happens whether I am starting from AnalyzeCodonData.bf (MG94W9 w/local params) or AnalyzeNucProtData.bf (HKY85 w/local params), with multiple different data sets, with nonparametric and parametric analyses, and on the recent Windows and Unix builds.

Is it possibly a tree traversal issue, or maybe a loop indexing issue?  Any help would be greatly appreciated.  I can send you the appropriate files if you want them.


Thanks!
Jeff


______________READ THE FOLLOWING DATA______________
10 species:{Spec1,Spec2,Spec3,Spec4,Spec5,Spec6,Spec7,Spec8,Spec9,Spec10};
Total Sites:666;
Distinct Sites:210
______________RESULTS______________
Log Likelihood = -2518.56927129667;
Tree givenTree=(((Spec1:0.0804376,(Spec2:0.0706275,Spec3:0.0359743)Node5:0)Node3:7.05826e-31,Spec4:0.0303331)Node2:0.0142859,(Spec5:0.411042,Spec6:0.00961238)Node9:0.0177847,((Spec7:0.00570061,Spec8:0.0640759)Node13:0.0147856,(Spec9:0.0105448,Spec10:0.123455)Node16:0.00932126)Node12:7.05826e-31);

How many data replicates should be generated?:10

Iteration 1/10
Iteration 2/10
Iteration 3/10
Iteration 4/10
Iteration 5/10
Iteration 6/10
Iteration 7/10
Iteration 8/10
Iteration 9/10
Iteration 10/10

           BOOTSTRAPPING SUMMARY

+--------------------------------------------------------------------------+
| Parameter                  |    MLE      |     Mean     |    Variance    |
+--------------------------------------------------------------------------+
| givenTree.Spec8.nonSynRate |    0.030080 |     0.028633 |      0.0000643 |
| givenTree.Spec7.synRate    |    0.032956 |     0.033718 |      0.0006663 |
| givenTree.Spec8.synRate    |    0.277769 |     0.299612 |      0.0093143 |
| givenTree.Node13.nonSynRate|    0.014308 |     0.014675 |      0.0000638 |
| givenTree.Spec7.nonSynRate |    0.000000 |     0.000000 |      0.0000000 |
| givenTree.Spec6.synRate    |    0.006907 |     0.021542 |      0.0004916 |
| givenTree.Node9.nonSynRate |    0.015058 |     0.014620 |      0.0000307 |
| givenTree.Node9.synRate    |    0.056430 |     0.030523 |      0.0015137 |
| givenTree.Node13.synRate   |    0.041403 |     0.040092 |      0.0003577 |
| givenTree.Node16.synRate   |    0.043033 |     0.042696 |      0.0006757 |
| givenTree.Node16.nonSynRate|    0.003524 |     0.002781 |      0.0000066 |
| givenTree.Node12.nonSynRate|    0.000000 |     0.000000 |      0.0000000 |
| givenTree.Node12.synRate   |    0.000000 |     0.000000 |      0.0000000 |
| givenTree.Spec10.synRate   |    0.565417 |     0.548985 |      0.0027109 |
| givenTree.Spec9.nonSynRate |    0.013558 |     0.009385 |      0.0000165 |
| givenTree.Spec9.synRate    |    0.019195 |     0.019439 |      0.0002286 |
| givenTree.Spec10.nonSynRate|    0.048140 |     0.055572 |      0.0002249 |
| givenTree.Spec6.nonSynRate |    0.015797 | -2474.726445 |  10351.7719102 |
| givenTree.Node3.synRate    |    0.000000 |     0.093623 |      0.0014363 |
| givenTree.Spec1.nonSynRate |    0.071306 |     0.029711 |      0.0001926 |
| givenTree.Spec3.synRate    |    0.106497 |     0.000482 |      0.0000010 |
| givenTree.Node3.nonSynRate |    0.000000 |     0.000000 |      0.0000000 |
| givenTree.Node5.synRate    |    0.000000 |     0.232996 |      0.0049520 |
| givenTree.Spec4.synRate    |    0.073258 |     0.069984 |      0.0000960 |
| givenTree.Spec4.nonSynRate |    0.033145 |     0.251677 |      0.0038649 |
| givenTree.Spec1.synRate    |    0.245362 |     0.063980 |      0.0001488 |
| givenTree.Spec3.nonSynRate |    0.032941 |     0.000000 |     -0.0000000 |
| givenTree.Node2.synRate    |    0.026516 |     0.032334 |      0.0003458 |
| givenTree.Spec5.nonSynRate |    0.173676 |     0.166593 |      0.0008723 |
| givenTree.Spec5.synRate    |    1.841278 |     1.974571 |      0.1907162 |
| givenTree.Node2.nonSynRate |    0.018203 |     0.020235 |      0.0000209 |
| givenTree.Spec2.nonSynRate |    0.057648 |     0.000000 |      0.0000000 |
| givenTree.Spec2.synRate    |    0.230723 |     0.028381 |      0.0000396 |
| givenTree.Node5.nonSynRate |    0.000000 |     0.078498 |      0.0009763 |
| Ln-Lklhood                 |-2518.569271 |     0.000000 |      0.0000000 |
+--------------------------------------------------------------------------+

Title: Re: Odd bootstrap output
Post by Sergei on Jan 11th, 2007 at 9:15am
Dear Jeff,

I took a quick look at the simpleBootstrap.bf file (where the bootstrapping code is) and couldn't find any apparent indexing bugs.

Could you e-mail me the file and model settings I need to recreate the problem?

Cheers,
Sergei

Title: Re: Odd bootstrap output
Post by Sergei on Jan 12th, 2007 at 9:04am
Dear Jeff,

I did manage to introduce the indexing bug affecting the screen output when I added the code to output branch lengths to a CSV file for coding data. The offending lines are in the file simpleBootstrap.bf, lines 339 and 340, which should read:


Code:
dataMatrix[0][dataDimension]=dataMatrix[0][dataDimension]+temp;
dataMatrix[1][dataDimension]=dataMatrix[1][dataDimension]+temp*temp;
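
For readers following along, here is a minimal Python sketch (not HyPhy batch language) of the kind of running-sum bookkeeping those two lines appear to perform: row 0 accumulates the sum of each estimate over replicates and row 1 accumulates the sum of squares. The function names, the sample numbers, and the n - 1 divisor are assumptions made for illustration and are not taken from simpleBootstrap.bf. If a value were accumulated into the wrong column, a log likelihood could end up under a rate parameter's name, which would be consistent with the -2474.7 "Mean" in the table above.

```python
def accumulate(data_matrix, column, value):
    """Add one replicate's estimate to the running sum and sum of squares."""
    data_matrix[0][column] += value          # row 0: sum of values
    data_matrix[1][column] += value * value  # row 1: sum of squared values


def summarize(data_matrix, n):
    """Return a (mean, variance) pair per column from the running sums.

    The n - 1 divisor is an assumption; the real script may normalize
    differently.
    """
    out = []
    for s, ss in zip(data_matrix[0], data_matrix[1]):
        mean = s / n
        var = (ss - n * mean * mean) / (n - 1)
        out.append((mean, var))
    return out


# Hypothetical replicates: two rate parameters followed by the log likelihood.
replicates = [
    [0.015, 0.070, -2470.0],
    [0.017, 0.072, -2480.0],
    [0.016, 0.068, -2475.0],
]

matrix = [[0.0] * 3, [0.0] * 3]
for rep in replicates:
    for column, value in enumerate(rep):
        accumulate(matrix, column, value)

summary = summarize(matrix, len(replicates))
```

Because the variance is computed as a difference of two large sums, tiny negative values such as the -0.0000000 in the table can arise from floating-point rounding when the true variance is zero.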


Thanks for pointing this out yet again!
Cheers,
Sergei

Title: Re: Odd bootstrap output
Post by Sergei on Jan 12th, 2007 at 9:15am
Dear Jeff,

There is another indexing problem in the file ... stay tuned for an update.

Cheers,
Sergei

Title: Re: Odd bootstrap output
Post by Sergei on Jan 15th, 2007 at 10:58pm
Dear Jeff,

Download the updated file [link unavailable] and place it in the TemplateBatchFiles directory.

I'll roll this into the next build as well.

Thanks for the report!
Sergei

Title: Re: Odd bootstrap output
Post by Jeff on Jan 16th, 2007 at 6:28am
Hi Sergei,

Thanks for the quick update.  So far the means and variances look good for my smaller data sets, so it seems like the updated file takes care of the problem.  I'll let you know if I notice any issues with my larger data sets.

Thanks again!
Jeff

Title: Re: Odd bootstrap output
Post by Danny on Feb 23rd, 2009 at 1:01am
Dear Sergei,

I seem to be having similar strange results with nonparametric bootstrap estimates using simpleBootstrap.bf.  However, for the same dataset the parametric means and variances seem reasonable.  Below are some of the summary values that seem strange from the nonparametric bootstraps.

+------------------------------------------------------------------------------------+
| Parameter                    |      MLE      |     Mean      |      Variance       |
+------------------------------------------------------------------------------------+
| AC                           |      0.680938 |     26.357789 |       18220.0152105 |
| AT                           |      0.106453 |    101.102172 |      999795.7313688 |
| givenTree.Node114.nonSynRate |      0.517753 |    332.868276 |     2954236.9980986 |
| givenTree.Node114.synRate    |      1.072250 |     47.574958 |      135608.0128181 |
| Ln-Lklhood                   | -21483.516754 | -34435.874822 | 29173030835.8369141 |
+------------------------------------------------------------------------------------+

I am using the simpleBootstrap.bf that comes with 0.99beta and is identical to the one in 1.00beta.  I have modified the script to write the summary to a file and I have hardwired some of the #includes and ExecuteAFile arguments, but I don't see anything I changed that could have affected the results.  Again, the parametric bootstraps seemed fine with the same script.  I was using the MPI version.  I am using the same tree and dataset that I sent you a few days ago.  Before I try to debug this myself, I was hoping the difference in behaviour between the parametric and nonparametric runs might give you a clue as to what could be happening.  Below are links to my slightly modified simpleBootstrap.bf and the nonparametric results.

[links unavailable]

P.S. I haven't been able to connect to [link unavailable] for several days.

Title: Re: Odd bootstrap output
Post by Sergei on Feb 23rd, 2009 at 7:31am
Hi Danny,

Let me take a look; the nonparametric bootstrap file that you are using has not been updated in a long while.

Cheers,
Sergei

Title: Re: Odd bootstrap output
Post by Danny on Feb 25th, 2009 at 3:10pm
Dear Sergei,

I just noticed that Syn_sites + NS_sites != Total_sites in both the parametric and nonparametric csv files.  This is true for the MLE rows also.

-Danny

Title: Re: Odd bootstrap output
Post by Sergei on Feb 25th, 2009 at 3:32pm
Dear Danny,

That's unfortunate wording; I should change the column labels. "Total sites" simply reports the number of codon columns in the alignment. Synonymous + non-synonymous sites will actually always be less than the number of columns in the data, because the way HyPhy counts the number of expected syn and ns sites (see the box on page 14 of the manuscript [link unavailable]) does not give every codon 3 sites. For example, codon TGG will only have 7/9 sites because two of its one-mutation neighbors (TAG and TGA) are stop codons and contribute neither to syn nor to non-syn counts.
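
The TGG example can be checked with a short Python sketch. It assumes the universal genetic code (stop codons TAA, TAG, TGA) and only enumerates one-mutation neighbors; it is not HyPhy's actual site-counting routine, and the function names are made up for this illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # universal genetic code (assumption)
BASES = "ACGT"


def one_mutation_neighbors(codon):
    """All codons that differ from `codon` at exactly one position."""
    neighbors = []
    for pos, original in enumerate(codon):
        for base in BASES:
            if base != original:
                neighbors.append(codon[:pos] + base + codon[pos + 1:])
    return neighbors


def countable_fraction(codon):
    """Return (non-stop neighbors, total neighbors) for a codon."""
    neighbors = one_mutation_neighbors(codon)
    non_stop = [n for n in neighbors if n not in STOP_CODONS]
    return len(non_stop), len(neighbors)


print(countable_fraction("TGG"))  # (7, 9): TAG and TGA are excluded
```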

HTH,
Sergei

Title: Re: Odd bootstrap output
Post by Danny on Mar 5th, 2009 at 5:11pm
Dear Sergei,

I just noticed a problem with the simpleBootstrap.bf when run with HYPHYMPI.  The problem occurs with both parametric and nonparametric iterations.

The parameter values for the BS iterations are very scrambled with respect to the headers in the output.

I'm guessing that possibly the simulated MLEs  are getting a different index when

simulatedResults = simulatedLF_MLES;

is set in the MPI version.

In any case the GetString order of simulatedLF does not match the order in simulatedResults.  For example during the simulated variable mapping loop in ProcessIterate:

i   _i   var   res[0][_i]   simulatedResults[0][i]   simulatedResults[0][_i]
2   1    GT    0.295536     0.0143457                0.291832

It seems that with MPI the simulatedLF gets out of sync with simulatedResults.  Interestingly, simulatedResults[0][_i] gives the right values because it is ordered like the native lf.

Maybe the simulatedLF is getting passed by value to the MPI functions?

Although GetString seems to be able to step through simulatedLF, if I call:

LIKELIHOOD_FUNCTION_OUTPUT=5; fprintf(simulatedLF, simulatedLF, "\n");

just before the GetString stepping, only a zero is printed to the file.
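
To make the mismatch concrete: the values are stored in one parameter order while the names are read back in another. A generic defensive pattern is to pair values with names and realign by name rather than by position. The Python sketch below illustrates only that idea; it is not HyPhy batch language, the names and numbers are hypothetical, and it is not necessarily how the script should be fixed.

```python
def align_by_name(reference_names, result_names, result_values):
    """Reorder result_values so they follow the order of reference_names."""
    if sorted(reference_names) != sorted(result_names):
        raise ValueError("parameter name sets differ")
    position = {name: i for i, name in enumerate(result_names)}
    return [result_values[position[name]] for name in reference_names]


# Hypothetical names and values, for illustration only.
native_order = ["AC", "GT", "AT"]
simulated_order = ["AT", "AC", "GT"]
simulated_values = [0.11, 0.68, 0.29]

aligned = align_by_name(native_order, simulated_order, simulated_values)
print(aligned)  # [0.68, 0.29, 0.11]
```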

-Danny

Title: Re: Odd bootstrap output
Post by Sergei on Mar 5th, 2009 at 5:14pm
Dear Danny,

That's bad! I'll clean that up. Thanks for pointing out the scrambling behavior!

Sergei
