MEME: Mixed Effects Model of Evolution
The imprint of natural selection on protein-coding genes is often difficult to identify because selection is frequently transient or episodic, i.e., it affects only a subset of lineages. Existing computational techniques, which are designed to identify sites subject to pervasive selection, may fail to recognize sites where selection is episodic, even though such sites make up a large proportion of positively selected sites. We present a mixed effects model of evolution (MEME) that is capable of identifying instances of both episodic and pervasive positive selection at the level of an individual site. Using empirical and simulated data, we demonstrate the superior performance of MEME over older models under a broad range of scenarios. We find that episodic selection is widespread and conclude that the number of sites experiencing positive selection may have been vastly underestimated.
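To make the distinction between episodic and pervasive selection concrete, the following toy sketch shows how a site can carry strong positive selection on a minority of branches and still look conserved when summarized by a single site-wide value. It is an illustration only, not the MEME likelihood model. Here omega denotes the ratio of nonsynonymous to synonymous substitution rates (dN/dS), with values above 1 indicating positive selection. The function name, the two rate classes, and all numbers are hypothetical, and a weighted average is used only as a crude proxy for what a method assuming one omega per site would report.

```python
# Toy illustration only: the numbers below are invented, and a weighted
# average of omega is a crude stand-in for a single site-wide estimate.

def branch_averaged_omega(omega_by_class, branch_fraction_by_class):
    """Average omega (dN/dS) over branches, weighting each class by its share of branches."""
    if abs(sum(branch_fraction_by_class) - 1.0) > 1e-9:
        raise ValueError("branch fractions must sum to 1")
    return sum(w * f for w, f in zip(omega_by_class, branch_fraction_by_class))

# Hypothetical site: 10% of branches under strong positive selection (omega = 5),
# the remaining 90% under purifying selection (omega = 0.1).
omegas = [5.0, 0.1]
fractions = [0.1, 0.9]

avg = branch_averaged_omega(omegas, fractions)
has_positive_class = any(w > 1.0 for w in omegas)

print(f"branch-averaged omega: {avg:.2f}")
print(f"some branches have omega > 1: {has_positive_class}")
print(f"average alone suggests positive selection: {avg > 1.0}")
```

In this hypothetical case the branch-averaged value stays below 1 even though one class of branches has omega well above 1, which is the kind of episodic signal that a site-wide summary can miss.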
Note: In the section "Detecting individual branches subject to diversifying selection at a given site" of this paper, we stated the caveat that "we do not recommend using this type of inference other than for the purposes of data exploration". This was intended to warn users against overinterpreting the specific set of branches inferred to be under positive selection for a single episodically selected site, for which the data are, of necessity, not very informative. However, it has come to our attention that this caveat has been interpreted (incorrectly) as referring to MEME in general (inference at an individual site over multiple branches) rather than to inference at an individual branch-site combination, as intended. We would like to clarify that the available MEME analysis tools are intended and have been validated for hypothesis testing, not just exploratory analysis.