SCUEAL (Read 2432 times)
CrystalH
SCUEAL
Jun 1st, 2011 at 1:31pm
 
Hello,
I have created HIV envelope (env) and gag reference datasets from the Los Alamos subtype reference alignments, in the correct format for SCUEAL. From a previous posting, I understand that leaving gapped regions in the query sequences is okay, but would it be better to remove them, even though the same gapped regions are present in the reference dataset? Also, is SCUEAL likely to misidentify heterotachy as breakpoints, or is it similar to GARD, where KH testing is used to decide whether a breakpoint is strongly supported? Thank you!

-Crystal Hepp

Also, I can pass along the gag and env reference datasets I made, if you are interested.
Sergei
Re: SCUEAL
Reply #1 - Jun 1st, 2011 at 2:17pm
 
Hi Crystal,

Leaving gapped regions is indeed OK, but there could be some issues if there are very long gapped regions (e.g. over 100 bp in a given sequence). SCUEAL explicitly tests for topological incongruence, so heterotachy is not a problem. If the LANL reference data sets contain CRFs, they need to be processed differently, i.e. do NOT run MakeReferenceAlignment.bf on them, as this will assume that a single tree is adequate to explain all regions of the alignment (which is not the case with CRFs). If you used BuildupReferenceAlignment.bf, that will be taken care of. Finally, some of the reference sequences in LANL seemed to be intra-subtype recombinants, which could also be a problem if you used MakeReferenceAlignment.bf.

Sergei
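
[Editorial sketch] One way to check query sequences for the very long gapped regions mentioned above is to scan an aligned FASTA file for runs of gap characters. This is only a rough illustration in Python, not part of SCUEAL or HyPhy: the function names are hypothetical, `-` is assumed to be the gap character, and the 100-position cutoff is simply the ballpark figure from the reply, not a SCUEAL parameter.

```python
import re


def longest_gap_run(seq, gap_chars="-"):
    """Return the length of the longest run of gap characters in seq."""
    runs = re.findall("[" + re.escape(gap_chars) + "]+", seq)
    return max((len(r) for r in runs), default=0)


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def flag_long_gaps(lines, threshold=100):
    """Return names of sequences whose longest gap run exceeds threshold.

    threshold=100 is an assumed cutoff taken from the forum reply.
    """
    return [name for name, seq in read_fasta(lines)
            if longest_gap_run(seq) > threshold]
```

To use it on a file, pass an open file handle, e.g. `flag_long_gaps(open("queries.fas"))`, where `queries.fas` is a placeholder file name.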
Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
CrystalH
Re: SCUEAL
Reply #2 - Jun 1st, 2011 at 2:34pm
 
Thank you! The gapped areas are typically 10-50 nt (usually in the variable regions of env). I recently read the SCUEAL paper and plan to mimic your reference dataset generation method to make sure I have acceptable datasets. I am not interested in CRFs, so those are left out. Thanks again!
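
[Editorial sketch] Leaving CRFs out of a reference set could be done with a simple filter on sequence names. The Python below is a hedged illustration only: it assumes names are dot-separated with the subtype as the first field (optionally preceded by `Ref.`), and that CRF labels look like digits, an underscore, then letters (e.g. `01_AE`). The function names and the example names are hypothetical, so check them against the actual alignment before relying on this.

```python
import re

# Assumed CRF label shape: digits, underscore, alphanumerics (e.g. "01_AE").
CRF_LABEL = re.compile(r"^\d+_[A-Za-z0-9]+$")


def subtype_field(name):
    """Return the assumed subtype field of a dot-separated sequence name."""
    fields = name.split(".")
    if fields[0] == "Ref" and len(fields) > 1:
        return fields[1]
    return fields[0]


def is_crf(name):
    """True if the subtype field looks like a CRF label."""
    return bool(CRF_LABEL.match(subtype_field(name)))


def drop_crfs(names):
    """Keep only the names that do not look like CRFs."""
    return [n for n in names if not is_crf(n)]


# Hypothetical names, for illustration only.
example = ["Ref.B.XX.00.sample1.ACC0001", "Ref.01_AE.XX.00.sample2.ACC0002"]
kept = drop_crfs(example)
```

Note that a name filter like this would not catch the intra-subtype recombinants mentioned in the reply, since those carry an ordinary subtype label.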