GARD: whole vs partial dataset
Jun 15th, 2011 at 3:37pm
Dear Datamonkey population,
I am using GARD (and SCUEAL, but this post is about GARD) to analyze a dataset of ~130 HIV subtype B env sequences from individuals around the globe. When I run the entire dataset through GARD, there are no significantly supported breakpoints, BUT when I look at a distinct subpopulation (~35 taxa from the same geographic location), there are 4-6 significantly supported breakpoints (depending on the p-value threshold I use). In contrast, when I look at the remaining sequences (~95 taxa), there are 0 statistically supported breakpoints. Am I correct in thinking that, when the entire dataset is analyzed, the lack of breakpoints in the 95-taxon dataset "muffles" the signal I see when looking only at the distinct subpopulation? Or could there be an ascertainment bias where breakpoints are more likely to be found in a smaller dataset? Thank you!

-Crystal
Re: GARD: whole vs partial dataset
Reply #1 - Jun 15th, 2011 at 9:01pm
Hi Crystal,

Your intuition about "muffling" the signal is correct:

