Spidermonkey questions
Curious HyPhy user

Aug 24th, 2012 at 3:01pm

I am potentially interested in using Spidermonkey to look for co-evolution between proteins. Before I dive in too deeply, I have a few questions:
(1) Are there any in-principle concerns with using Spidermonkey on multiple proteins instead of just one? Presumably I could input a concatenated alignment to do so?
(2) Is there a recommended minimum number of taxa needed to see a signal if there is one?
(3) Again on the topic of power - given the choice between closely related taxa or more distantly related taxa, is there a preference?

Any guidance would be much appreciated! Thanks,
Alex Wong
Re: Spidermonkey questions
Reply #1 - Aug 28th, 2012 at 9:48am
Hi Alex,

In response to your queries:

1). No concerns in principle, so long as the proteins share the same gene tree. Spidermonkey assumes a single phylogeny for all sites.
2). Spidermonkey derives power from co-occurring substitutions, the number of which depends on the number of taxa and on sequence divergence. I would say that you need at least 5 substitutions at a site to gain good power.
3). Basically you want to maximize the number of substitutions without losing homology. I would include a mix of distantly and closely related taxa -- the former give you power, the latter help map substitutions more accurately.
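
As a rough pre-check before running an analysis, the points above can be sketched in Python. This is only an illustration, not part of Spidermonkey or HyPhy: the function names and the toy sequences are made up, and alignments are assumed to be plain dictionaries mapping taxon name to aligned sequence. Spidermonkey maps substitutions onto the phylogeny, whereas this sketch only computes a tree-free lower bound (number of distinct residues in a column minus one), so a site that falls short of 5 here may still have enough substitutions on the actual tree.

```python
def concatenate_alignments(aln_a, aln_b):
    """Join two alignments column-wise (hypothetical helper).

    Each alignment is a dict: taxon name -> aligned sequence.
    Both must contain exactly the same taxa, since a single
    phylogeny is assumed for all sites.
    """
    if set(aln_a) != set(aln_b):
        raise ValueError("both alignments must contain the same taxa")
    return {taxon: aln_a[taxon] + aln_b[taxon] for taxon in aln_a}


def min_substitutions_per_site(alignment, gap_chars="-?X"):
    """Lower bound on substitutions per column (hypothetical helper).

    A column with k distinct residues needs at least k - 1
    substitutions on any tree. Gaps and ambiguity codes are ignored.
    """
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must all have the same aligned length")
    bounds = []
    for column in zip(*seqs):
        residues = {c.upper() for c in column if c.upper() not in gap_chars}
        bounds.append(max(len(residues) - 1, 0))
    return bounds


# Toy data, invented for illustration only.
protein_a = {"human": "MKV", "mouse": "MRV", "fish": "MQI"}
protein_b = {"human": "AL", "mouse": "AL", "fish": "-V"}

combined = concatenate_alignments(protein_a, protein_b)
bounds = min_substitutions_per_site(combined)
```

Sites whose lower bound is already at or above the threshold are safe candidates; for the rest, the substitution counts would have to come from a reconstruction on the tree itself.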

