Welcome, Guest. Please Login
YaBB - Yet another Bulletin Board
 
  HomeHelpSearchLogin  
 
LHT (Read 1552 times)
ofedrigo
YaBB Newbies
*
Offline


I love YaBB 1G - SP1!

Posts: 14
LHT
Nov 9th, 2006 at 8:23am
 
Hi,

I am implementing the Likelihood Heterogeneity Test (Huelsenbeck and Bull, 1996). Basically it is a LRT with H0= one tree for all dataset H1= a different tree for each dataset. So, for each partition I want to harvest the base frequency, I let the user choose the model (SelectTemplateModel) and I estimate the likelihood. I want to avoid to ask for a model every time, I just want to ask it once at the begining (I want to use the same model for all) and use a different basefreq for each partition. In the example files, "SelectTemplateModel" is after "createfilter". So If I want to ask for a model once at the begining, I will have to use the same base frequency for all my analyses. how can I do?

thanks,

Olivier

here is the relevant part of my code:

SetDialogPrompt ("Please locate an alignment file:");
DataSet ds  = ReadDataFile (PROMPT_FOR_FILE);
DataSetFilter filteredData = CreateFilter (ds,1);
fprintf (stdout, "\nLoaded a ", dsf.species, " sequence alignment with ", dsf.sites,"\n");

HarvestFrequencies (baseFreqs,ds,1,1,1);
fprintf (stdout, "\nBase composition:\n\tA: ", Format (baseFreqs[0],10,5),"\n\tC: ", Format (baseFreqs[1],10,5),"\n\tG: ", Format (baseFreqs[2],10,5),"\n\tT: ", Format (baseFreqs[3],10,5), "\n\n");

SelectTemplateModel(filteredData);

fprintf (stdout,"\n\n",Columns(DATA_FILE_PARTITION_MATRIX)," character sets found\n");

fprintf (stdout,"\n\n",Rows(NEXUS_FILE_TREE_MATRIX)," trees found\n");
ntree=Rows(NEXUS_FILE_TREE_MATRIX);

listOfTree={ntree,1};   
likList={Columns(DATA_FILE_PARTITION_MATRIX),1};
listOfLk={Columns(DATA_FILE_PARTITION_MATRIX),ntree};
bestLk={Columns(DATA_FILE_PARTITION_MATRIX),1};
bestTree={Columns(DATA_FILE_PARTITION_MATRIX),1};
bestLk2={Columns(DATA_FILE_PARTITION_MATRIX),1};
globalLk={ntree,1};

/*GET LIKELIHOOD FOR TREES FOR ALL PARTITIONS AND CHOOSE THE BEST TREE*/         
for (part=0; part<Columns(DATA_FILE_PARTITION_MATRIX); part=part+1)
    {
    DataSetFilter dsf=CreateFilter (ds,1,DATA_FILE_PARTITION_MATRIX[1][part]);
    fprintf(stdout,"Analysis for partition ",DATA_FILE_PARTITION_MATRIX[0][part]," given trees:\n");
    for (i=0; i<ntree; i=i+1)
       {
       Tree uniqueTree=NEXUS_FILE_TREE_MATRIX[i][1];
       LikelihoodFunction lf = (dsf,uniqueTree);
       Optimize (mles,lf);
       if (part==0) {fprintf(stdout,lf);}
       listOfLk[part][i]=mles[1][0];
       if (i==0)
           {
           bestLk[part]=mles[1][0];
           bestTree[part]=i;
           }
       else
           {
           if (bestLk[part]<mles[1][0])
               {
               bestLk[part]=mles[1][0];
               bestTree[part]=i;
               }
           }
       }
    }

/* GET LK1 and LK2 AND CHOOSE THE BEST TREE*/   
lk2=0;    /*H1: unconstrained different tree for different partition*/
for (part=0; part<Columns(DATA_FILE_PARTITION_MATRIX); part=part+1) {lk2=lk2+bestLk[part];}

lk1=0;    /*H0: constrained same tree for different partition*/
for (i=0; i<ntree; i=i+1)
    {
    globalLk[i]=0;
    for (part=0; part<Columns(DATA_FILE_PARTITION_MATRIX); part=part+1) {globalLk[i]=globalLk[i]+listOfLk[part][i];}
    if (i==0)
       {
       lk1=globalLk[i];
       theBestTree=i;
       }
    else
       {
       if (lk1<globalLk[i])
           {
           lk1=globalLk[i];
           theBestTree=i;
           }
       }
    }

fprintf (stdout,"\n|  Tree n.    |");
for (i=0;i<Columns(DATA_FILE_PARTITION_MATRIX);i=i+1) {fprintf(stdout," ",DATA_FILE_PARTITION_MATRIX[0][i],"\t|");}
fprintf (stdout," Total\n");
for (j=0; j<ntree; j=j+1)
    {
    fprintf (stdout,"|  ",NEXUS_FILE_TREE_MATRIX[j][0],"   |");
    for (i=0;i<Columns(DATA_FILE_PARTITION_MATRIX);i=i+1)
       {
       fprintf(stdout," ",Format(listOfLk[i][j],10,10));
       if (listOfLk[i][j]==bestLk[i]) {fprintf(stdout,"*\t|");} else {fprintf(stdout,"\t|");}
       }
    fprintf(stdout," ",Format(globalLk[j],10,10));
    if (globalLk[j]==lk1) {fprintf(stdout,"*\n");} else {fprintf(stdout,"\n");}
    }
fprintf(stdout,"\n");

fprintf(stdout,"L0= ",Format(lk1,10,10),"\n");
fprintf(stdout,"L1= ",Format(lk2,10,10),"\n");
lnlikDelta= 2*(lk2-lk1);
fprintf(stdout,"LRT= ",Format(lnlikDelta,10,10),"\n\n");
Back to top
 
 
IP Logged
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: LHT
Reply #1 - Nov 9th, 2006 at 9:19am
 
Dear Olivier,

Most definitions of standard models (e.g. all nucleotide models) will have a function:

Code:
PopulateModelMatrix (ModelMatrixName&, EFV)
 



defined. It returns 1 if the definition of the model does not explicitly incorporate stationary frequencies (e.g. HKY85, F1), or 0 if it does (e.g. F84). They will also set a few flags, such as FREQUENCY_SENSITIVE, which tell you whether the model uses empirical frequencies or not (e.g. F81 does, but JC69 does not).

So, for you task, you could have code like this:

Code:
SelectTemplateModel (filteredData);

if (FREQUENCY_SENSITIVE)
{
	  /* for each partition */
	  HarvestFrequencies (partFreqs, filter_k, 1,1,1);
	  MBF = PopulateModelMatrix ("filterModelMatrix", partFreqs);
	  Model filterModel = (filterModelMatrix, partFreqs, MBF);
	  Tree ....
	  Likelihood...
}
else
{
/* use the same model for all partitions */
}
 


Cheers,
Sergei
Back to top
 

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
WWW WWW  
IP Logged