Danny
I thought it might be useful (e.g. for adding more bootstraps later), and probably faster, to split the bootstrapping into multiple jobs using the single-processor version of hyphy.
I wanted to avoid optimizing on the original dataset every time, so I needed to include the lf values and the optimization receptacle matrix for the optimized lf. These values can be printed out once and reused for different bootstrap runs. For example:
ExecuteAFile(lf_values_file);
where lf_values_file is the output of LIKELIHOOD_FUNCTION_OUTPUT=5; fprintf(lf_values_file, lf)
and
res = receptacle_matrix
where receptacle_matrix is the output of fprintf("receptacle_matrix", CLEAR_FILE, res);
If you do this, a problem occurs in the simpleBootstrap.bf script when the native global variables are reset for the next bootstrap iteration. If you have more than one global variable, their values may get scrambled, and the simulation will then create a dataset based on the wrong model. This seems to be caused by a reindexing of the lf array when Optimize is run. For example, before Optimize your nt rate indices may be ordered as:
AC, AT, GT
but after Optimize it might change to something like
AC, GT, AT
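To make the scrambling concrete, here is a small Python sketch (not HyPhy code). It uses the two orderings above with made-up rate values and compares restoring saved values by position against restoring them by name. The function names and numbers are only for illustration.

```python
# Toy model of the problem: values saved in one parameter order are
# written back after the parameter order has changed.

def restore_by_position(names_after, saved_values):
    """Assign saved values slot by slot, ignoring which name owns each slot."""
    return dict(zip(names_after, saved_values))

def restore_by_name(names_after, saved_by_name):
    """Look each value up by parameter name, so the order does not matter."""
    return {name: saved_by_name[name] for name in names_after}

order_before = ["AC", "AT", "GT"]   # order when the values were saved
order_after = ["AC", "GT", "AT"]    # order after the reindexing
saved_values = [0.5, 1.2, 2.0]      # hypothetical MLEs, in order_before
saved_by_name = dict(zip(order_before, saved_values))

scrambled = restore_by_position(order_after, saved_values)
correct = restore_by_name(order_after, saved_by_name)

print("by position:", scrambled)    # AT and GT end up swapped
print("by name:    ", correct)
```

Restoring by position hands the AT value to GT and vice versa, which is the same kind of mix-up that makes the simulation run under the wrong model. Looking values up by name is what the GetString-based change further down does.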
One solution is to get the optimized global variable values from the input lf instead of the res matrix.
In the simpleBootstrap.bf you can change:
if (SAVE_GLOBALS)
{
    globalSpoolMatrix = {1, SAVE_GLOBALS};
    for (bsCounter = 0; bsCounter < SAVE_GLOBALS; bsCounter = bsCounter + 1)
    {
        globalSpoolMatrix[bsCounter] = res[0][bsCounter];
    }
}
to
if (SAVE_GLOBALS)
{
    globalSpoolMatrix = {1, SAVE_GLOBALS};
    for (bsCounter = 0; bsCounter < SAVE_GLOBALS; bsCounter = bsCounter + 1)
    {
        GetString (_i, lf, bsCounter);
        globalSpoolMatrix[bsCounter] = valueGrab(_i);
    }
}
valueGrab should be defined somewhere before it's called:
function valueGrab (varName&) { return varName; }
Since the order of the values in the supplied res matrix will likely differ from the GetString order of the supplied lf, the MLEs in the tabulated and summary output files will not line up with the headers. This can be fixed by rewriting the res matrix in GetString order.
Change:
for (bsCounter = 0; bsCounter < dataDimension; bsCounter = bsCounter + 1)
{
    GetString (_i, lf, bsCounter);
    _i = _i ^ {{"givenTree\\.", ""}}; /* gets rid of the givenTree. from variables */
    _variableMap[_i] = Abs (_variableMap);
}
to
for (bsCounter = 0; bsCounter < dataDimension; bsCounter = bsCounter + 1)
{
    GetString (_i, lf, bsCounter);
    res[0][bsCounter] = valueGrab(_i);
    _i = _i ^ {{"givenTree\\.", ""}}; /* gets rid of the givenTree. from variables */
    _variableMap[_i] = Abs (_variableMap);
}
It would be nice if simpleBootstrap.bf didn't have to use the res matrix at all. I think all that is needed from it is the dataDimension and SAVE_GLOBALS. These can probably be obtained directly from the lf. Would
dataDimension = Columns (lf_summary["Global Independent"]) + Columns(lf_summary["Local Independent"]) + Columns(lf_summary["Global Constrained"]);
and
SAVE_GLOBALS = Columns (lf_summary["Global Independent"]);
be reliable?
-danny