Error while reading one of the coding frames?
Jose_Patane
YaBB Newbies
*
Offline



Posts: 26
Brazil
Gender: male
Error while reading one of the coding frames?
Aug 31st, 2006 at 6:45am
 
Hi guys,

I've been analysing a data set of 5 protein-coding genes (most of them incomplete sequences) and found some strange behaviour in HyPhy when reading some of them via a batch file. Sequences starting with reading frame "231", which in my data also happen to be ([divisible by 3] + 1) bases long, lose 1 bp when read into the 3 codon-position partitions, so that each partition ends up with exactly the same number of bases. As a result, the analyses return slightly different ML values, partition weights and branch length estimates compared to PAUP.

How can I fix that? Should I just discard one base from those genes and stick with that?

Regarding the other reading frames:

"312", ([div. by 3]+ 2 bases long), 1 gene - everything is normal.
"123", ([div. by 3] bases long), 2 genes - everything is normal.


Jose_Patane
« Last Edit: Aug 31st, 2006 at 1:15pm by Jose_Patane »
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: Bug while reading one of the coding frames
Reply #1 - Aug 31st, 2006 at 7:42am
 
Dear Jose,

How exactly are you specifying the partition? If you want to read all nucleotides from position 2 through the last one (recall that all indexing in HyPhy is 0-based), you could say

Code:
DataSetFilter filter231 = CreateFilter (myData,1,siteIndex>=1);
 



Or, split by codon position:

Code:
DataSetFilter filter231_1 = CreateFilter (myData,1,(siteIndex-1)%3==0);
DataSetFilter filter231_2 = CreateFilter (myData,1,(siteIndex-2)%3==0);
DataSetFilter filter231_3 = CreateFilter (myData,1,siteIndex%3==0);
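
To see which sites each of these three conditions picks up, here is a small Python sketch (a model of the index arithmetic only, not HyPhy code). The helper `select_sites` and the 10-site alignment length are hypothetical, chosen so that the length is ([divisible by 3] + 1) as in the original question.

```python
def select_sites(n_sites, predicate):
    """Return the 0-based site indices for which the predicate holds."""
    return [i for i in range(n_sites) if predicate(i)]

n_sites = 10  # hypothetical alignment length, (divisible by 3) + 1

# Same conditions as in the three filters above, with i standing in for siteIndex.
first = select_sites(n_sites, lambda i: (i - 1) % 3 == 0)
second = select_sites(n_sites, lambda i: (i - 2) % 3 == 0)
third = select_sites(n_sites, lambda i: i % 3 == 0)

print(first)   # [1, 4, 7]
print(second)  # [2, 5, 8]
print(third)   # [0, 3, 6, 9]
```

Because every index is tested individually, the trailing site (index 9 here) is kept, and the third partition ends up one site longer than the other two.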
 



What filter specification did you use?

Cheers,
Sergei
Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
Jose_Patane
Re: Bug while reading one of the coding frames
Reply #2 - Aug 31st, 2006 at 8:36am
 
As always, thanks again. Your code worked: branch lengths and ML estimates are OK now!

Anyway, for future analyses I'd like to know what I did wrong, so here's how I was doing it:

- Position of "gene_231" (meaning the 1st codon position starts on site #2) in the concatenated alignment of all the genes (raw alignment positions, not 0-indexed): 2741-3119 (379 bp).
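
As a quick check of the arithmetic, the following Python sketch converts that 1-based range to 0-based indices and computes the length and its remainder modulo 3. The function name `to_zero_based` is hypothetical and used only for illustration.

```python
def to_zero_based(first_site, last_site):
    """Convert an inclusive 1-based site range to inclusive 0-based indices."""
    return first_site - 1, last_site - 1

start, end = to_zero_based(2741, 3119)
length = end - start + 1

print(start, end)   # 2740 3118
print(length)       # 379
print(length % 3)   # 1
```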

Here's the code I had formerly used to partition the data into codon positions:

------------------------------------------------------------------
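/* Read the concatenated alignment of all genes */
DataSet All_seqs= ReadDataFile("my_concatenated_data");

/* gene_231: alignment sites 2741-3119, i.e. 0-based indices 2740-3118 */
DataSetFilter gene_231 = CreateFilter (All_seqs,1,"2740-3118");

/* Comb filters for the 1st, 2nd and 3rd codon positions */
DataSetFilter filteredData1 = CreateFilter (gene_231,1,"<010>");
DataSetFilter filteredData2 = CreateFilter (gene_231,1,"<001>");
DataSetFilter filteredData3 = CreateFilter (gene_231,1,"<100>");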

   .....
   .....
   .....

fprintf (stdout,"n_sites_1st = ", filteredData1.sites, "\n");
fprintf (stdout,"n_sites_2nd = ", filteredData2.sites, "\n");
fprintf (stdout,"n_sites_3rd = ", filteredData3.sites, "\n");

   .....
   .....
   .....

-------------------------------------------------------------------------

... with results:

"n_sites_1st = 126
n_sites_2nd = 126
n_sites_3rd = 126"


As I said, sequences starting at frames '312' and '123' return proper results with this kind of code. Can you help me figure out what went wrong?
Sergei
Re: Bug while reading one of the coding frames
Reply #3 - Aug 31st, 2006 at 8:55am
 
Dear Jose,

When you use a comb filter, HyPhy moves the comb (of length 3 in your case) by its full length each time. It starts at position 0 (relative to the filter), checks which sites to include, then moves to position 3, then to 6, and so on. Your filter has 379 sites (indices 0-378), and 379 = 3 × 126 + 1, so the comb is not moved past position 375: indices 375, 376 and 377 are checked, but the last nucleotide, at index 378, is not.
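
The stepping described above can be modelled with a short Python sketch. This is not HyPhy code; `comb_filter` is a hypothetical helper, and it assumes that only complete comb windows are examined, which is how the description above reads.

```python
def comb_filter(n_sites, comb):
    """Model a comb that advances by its full length over complete windows only.

    comb is a string such as "010"; a "1" marks an offset to keep.
    Returns the selected 0-based site indices.
    """
    width = len(comb)
    selected = []
    # The last window starts at the largest multiple of width that still fits.
    for start in range(0, n_sites - width + 1, width):
        for offset, flag in enumerate(comb):
            if flag == "1":
                selected.append(start + offset)
    return selected

n_sites = 379  # length of gene_231 in this thread
combs = ("010", "001", "100")
counts = {comb: len(comb_filter(n_sites, comb)) for comb in combs}

covered = set()
for comb in combs:
    covered.update(comb_filter(n_sites, comb))
missed = sorted(set(range(n_sites)) - covered)

print(counts)  # {'010': 126, '001': 126, '100': 126}
print(missed)  # [378]
```

Under this model, each partition gets 126 sites and index 378 is never visited, which matches the counts reported earlier in the thread.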

HTH,
Sergei
Jose_Patane
Re: Bug while reading one of the coding frames
Reply #4 - Aug 31st, 2006 at 1:14pm
 
Once more, thank you Sergei, problem resolved!

Jose_Patane

P.S.: I changed the title of my first post in this topic, since it was not a bug in the program after all. I'll be more careful next time!