HyPhy message board
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl?num=1308177463

Message started by CrystalH on Jun 15th, 2011 at 3:37pm

Title: GARD: whole vs partial dataset
Post by CrystalH on Jun 15th, 2011 at 3:37pm
Dear Datamonkey population,
I am using GARD (and SCUEAL, but this post is about GARD) to analyze a dataset of ~130 HIV subtype B env sequences from individuals around the globe. When I process the entire dataset through GARD, there are no significantly supported breakpoints, BUT when I look at a distinct subpopulation (~35 taxa from the same geographic location), there are 4-6 breakpoints (depending on the p-value I use) that are significantly supported. When I instead look at the remaining sequences (~95 taxa), there are 0 statistically supported breakpoints. Am I correct in thinking that, when the entire dataset is analyzed, the lack of breakpoints in the 95-taxa dataset "muffles" the signal I see when looking only at the distinct subpopulation? Or could there be an ascertainment bias where breakpoints are more likely to be found in a smaller dataset? Thank you!

-Crystal  8-)

Title: Re: GARD: whole vs partial dataset
Post by Sergei on Jun 15th, 2011 at 9:01pm
Hi Crystal,

Your intuition about "muffling" the signal is correct: see [link not preserved].

Sergei
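
One possible way to picture the "muffling" effect, offered only as a rough sketch and not as Sergei's explanation: assume GARD compares models by AICc and that every extra partition carries its own set of branch lengths (2N - 3 for an unrooted tree on N taxa). Under those assumptions, the cost of adding a breakpoint grows with the number of taxa, while the likelihood gain may come only from the recombinant subset. The alignment length (2500 sites), the count of substitution-model parameters (8), and all function names below are hypothetical choices for illustration.

```python
def aicc_penalty(k, n):
    """Penalty part of AICc for k parameters and n sites: 2k + 2k(k+1)/(n-k-1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n - k - 1 <= 0")
    return 2 * k + 2 * k * (k + 1) / (n - k - 1)


def branches(taxa):
    """Number of branch lengths in an unrooted binary tree on `taxa` tips."""
    return 2 * taxa - 3


def min_loglik_gain(taxa, sites, model_params=8):
    """Smallest log-likelihood improvement for which a model with one breakpoint
    (two trees) has a lower AICc than the single-tree model.

    Assumption: each partition adds a full set of branch lengths, and the
    substitution-model parameters (hypothetical count) are shared.
    """
    k0 = branches(taxa) + model_params        # one tree
    k1 = 2 * branches(taxa) + model_params    # two trees, one breakpoint
    # AICc = -2 lnL + penalty, so the gain must exceed half the penalty difference.
    return (aicc_penalty(k1, sites) - aicc_penalty(k0, sites)) / 2


if __name__ == "__main__":
    sites = 2500  # hypothetical alignment length
    for taxa in (35, 95, 130):
        gain = min_loglik_gain(taxa, sites)
        print(f"{taxa} taxa: log-likelihood gain needed per breakpoint > {gain:.1f}")
```

In this toy calculation the required gain is larger for the 130-taxon alignment than for the 35-taxon subset, which is consistent with a signal confined to 35 sequences being outweighed in the full analysis. It says nothing about the actual likelihoods in Crystal's data.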
