This glossary provides definitions for terms in the HyPhy-Vision reports for each method. See here for descriptions of each method.

Shared Terms

  • log L - The log likelihood estimate of the respective model fit.
  • # par. - The number of estimated parameters in the respective model.
  • Time to fit - Total clock time to fit the respective model.
  • AICc - Small-sample Akaike Information Criterion score.
  • Ltree - The tree length under the respective model, where tree length represents the expected number of substitutions per site.
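
As an illustration of how the AICc entry relates to the log L and # par. entries, the sketch below applies the standard small-sample AICc formula. The function name and the `sample_size` argument are hypothetical; how the reports define the sample size is not stated in this glossary, so treat that input as an assumption.

```python
def aicc(log_l: float, n_params: int, sample_size: int) -> float:
    """Small-sample corrected AIC from a log likelihood and a parameter count.

    Standard formula: AICc = 2k - 2*logL + 2k(k + 1) / (n - k - 1),
    where k is the number of estimated parameters and n is the sample size.
    """
    if sample_size - n_params - 1 <= 0:
        raise ValueError("sample_size must be larger than n_params + 1")
    aic = 2.0 * n_params - 2.0 * log_l
    correction = (2.0 * n_params * (n_params + 1)) / (sample_size - n_params - 1)
    return aic + correction


# Hypothetical values, not taken from any real report.
score = aicc(log_l=-3500.2, n_params=45, sample_size=1200)
```

When comparing two model fits to the same data, the fit with the lower AICc is preferred.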

aBSREL Glossary

Summary: Tree Table

  • rate classes - Number of rate classes inferred.
  • # of branches - The number of branches inferred to have the respective number of rate classes.
  • % of branches - The percentage of branches inferred to have the respective number of rate classes.
  • % of tree length - The percentage of the total tree length inferred to have the respective number of rate classes.
  • # under selection - The number of branches inferred to have undergone positive selection (a rate class with ω > 1) at the designated p-value threshold, after correction for multiple testing.

Summary: Model Fits Table

  • MG94 - The baseline model fit of MG94xREV that infers a single ω per branch.
  • Full Model - The full aBSREL model fit to the tree, where the number of ω rate classes per branch is inferred adaptively.
  • See these Shared Terms for additional reported information.

Full Table

  • Name - Branch of interest, where bolded rows indicate that the branch shows evidence, at the designated p-value threshold, of positive selection.
  • B - Optimized (under the full aBSREL model) branch length for the branch of interest.
  • LRT - Likelihood Ratio Test statistic for selection. This LRT was calculated by comparing the fitted aBSREL model to the null model where rate classes with ω > 1 are disallowed.
  • Test p-value - P-value corrected for multiple testing using the Holm-Bonferroni correction. These values are only calculated for branches with uncorrected p-values less than 1.
  • Uncorrected p-value - Raw p-value before correction for multiple testing.
  • ω distribution over sites - Inferred ω estimates and the respective proportion of sites along the respective branch.
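
To make the LRT and Test p-value entries concrete, here is a minimal sketch of the two generic calculations they name: the likelihood ratio statistic and the Holm-Bonferroni step-down adjustment. The function names are hypothetical, and this is the textbook form of each calculation rather than the code the reports use. In particular, it does not cover how an uncorrected p-value is obtained from the LRT statistic.

```python
from typing import List


def lrt_statistic(log_l_alternative: float, log_l_null: float) -> float:
    """Likelihood ratio statistic: twice the log likelihood difference."""
    return 2.0 * (log_l_alternative - log_l_null)


def holm_bonferroni(p_values: List[float]) -> List[float]:
    """Holm-Bonferroni adjusted p-values, returned in the input order.

    The smallest p-value is multiplied by m, the next by m - 1, and so on.
    A running maximum keeps the adjusted values monotone, and each value
    is capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        scaled = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, scaled)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical uncorrected p-values for three tested branches.
corrected = holm_bonferroni([0.01, 0.04, 0.03])
```

A branch would then be reported as selected when its corrected value falls below the designated threshold.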

RELAX Glossary

Summary: Model Fits Table

  • Null - This model represents the null model used to test for selection relaxation. Here, the selection intensity parameter k is set to 1 for both branch sets (reference and test).
  • Alternative - This model represents the alternative model used to test for selection relaxation. Here, the selection intensity parameter k is allowed to vary on the "test" partition of branches. Note that a single k is inferred and shared across all test branches.
  • Partitioned MG94xREV - This baseline model fits a single ω value to each of the two branch sets (reference and test), respectively.
  • Partitioned Descriptive - This model infers three ω rate classes for each partition, respectively, without using the selection intensity parameter k (i.e. k=1 throughout).
  • General Descriptive - This model fits three ω rate classes to the entire phylogeny, i.e. shared across all branches. This model then infers a single selection intensity parameter k for each branch.
  • Branch set - The set of branches (all, test, or reference) for which the given set of rate classes were inferred.
  • ω - The inferred ω value for the respective rate class (ω1, ω2, or ω3). The value in parentheses indicates the proportion of sites inferred to belong to this rate class.
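
The role of the selection intensity parameter k can be sketched as follows. This assumes the published RELAX formulation, in which each test-branch ω is the corresponding reference ω raised to the power k. This glossary does not spell that relationship out, so treat it as an assumption here. The function names and the example ω values are hypothetical.

```python
from typing import List


def apply_intensity(reference_omegas: List[float], k: float) -> List[float]:
    """Map reference rate classes to test rate classes as omega ** k."""
    return [omega ** k for omega in reference_omegas]


def interpret_k(k: float) -> str:
    """Describe the direction of the effect implied by k."""
    if k < 1.0:
        return "relaxation"        # omegas move toward 1
    if k > 1.0:
        return "intensification"   # omegas move away from 1
    return "no change"             # k = 1, as in the Null model


# Hypothetical reference rate classes (omega1, omega2, omega3).
reference = [0.25, 1.0, 4.0]
relaxed = apply_intensity(reference, 0.5)      # every class moves toward 1
intensified = apply_intensity(reference, 2.0)  # every class moves away from 1
```

With k = 1 the test and reference distributions coincide, which is the constraint described for the Null model.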

BUSTED Glossary

Summary: Model Fits Table

  • Constrained model - This model represents the null model used in the BUSTED hypothesis test. For this model, the background and foreground branch partitions share all rate classes, but the ω value of the rate class used to test for selection is constrained to equal 1 (i.e. ω3 = 1 for both the background and foreground branches).
  • Unconstrained model - This model represents the alternative model used in the BUSTED hypothesis test. For this model, the ω value of the rate class used to test for selection is permitted to exceed 1 on the foreground branches (i.e. ω3 ≥ 1 on foreground branches).
  • ω - The inferred ω value for the respective rate class (ω1, ω2, or ω3). The value in parentheses indicates the proportion of sites inferred to belong to this rate class.
  • See these Shared Terms for additional reported information.

Summary: Model Evidence Ratios Per Site Table

  • Site Index - The codon site of interest.
  • Unconstrained Likelihood - The log likelihood score for the codon site of interest, calculated from the Unconstrained Model fit.
  • Constrained Likelihood - The log likelihood score for the codon site of interest, calculated from the Constrained Model fit.
  • Optimized Null Likelihood - The log likelihood score for the codon site of interest, calculated from the Optimized Null Model fit. The Optimized Null model is the Constrained model whose parameters have been re-optimized for this specific site.
  • Constrained Evidence Ratio - Evidence ratio for positive selection at the given codon site, using the Constrained Model as the null model.
  • Optimized Null Evidence Ratio - Evidence ratio for selection at the given codon site, using the Optimized Null model as the null model.
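
The relationship between the likelihood columns and the evidence ratio columns can be sketched as below. This assumes that the per-site likelihoods are natural-log values and that an evidence ratio is the site likelihood under the Unconstrained model divided by the site likelihood under the chosen null model. Whether the report displays the raw ratio or a log-transformed version is not stated here, so both forms are shown. The function names and example values are hypothetical.

```python
import math


def evidence_ratio(log_l_unconstrained: float, log_l_null: float) -> float:
    """Ratio of site likelihoods, computed from their log likelihoods."""
    return math.exp(log_l_unconstrained - log_l_null)


def two_log_evidence_ratio(log_l_unconstrained: float, log_l_null: float) -> float:
    """Log-scale alternative: twice the log likelihood difference."""
    return 2.0 * (log_l_unconstrained - log_l_null)


# Hypothetical log likelihoods for one codon site.
site_unconstrained = -12.4
site_constrained = -14.1      # Constrained model as the null
site_optimized_null = -13.0   # Optimized Null model as the null

constrained_er = evidence_ratio(site_unconstrained, site_constrained)
optimized_null_er = evidence_ratio(site_unconstrained, site_optimized_null)
```

Under these assumptions, a ratio above 1 means the site's data are better explained by the Unconstrained model than by the null model used for comparison.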