Category Matrix (Read 2874 times)
Dr. Lulz
YaBB Newbies
Curious HyPhy user

Posts: 9
Category Matrix
Sep 6th, 2013 at 11:36am
 
If I have a category variable with 4 classes (0, 1, 2, 3) and I use ConstructCategoryMatrix, it will return the class value for each site.

I am interested in cases where I have 2 categories, each with 4 classes, and I want the site-specific class assignment for each category. How do I recover this information from a number like "12" when each of the original categories has classes 0-3?

I assume that it is category1*4 + category2, but how do I distinguish between category1 and category2? Is category1 always the one that has only the "alpha" variables in its category statement (the reference line), and not the one with "beta"?

I.e.:

category category1 = (4, EQUAL, MEAN, GammaDist(_x_,alpha,alpha), CGammaDist(_x_,alpha,alpha), 0, 1e25, CGammaDist(_x_, alpha+1, alpha));


category category2 = (4, EQUAL, MEAN, GammaDist(_x_,alpha,beta), CGammaDist(_x_,alpha,beta), 0, 1e25, CGammaDist(_x_, alpha+1, beta)*alpha/beta);
 
Sergei
YaBB Administrator
Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: Category Matrix
Reply #1 - Sep 6th, 2013 at 11:46am
 
Hi there,

Yes, indeed, if you have Category1 with D rate classes and Category2 with S rate classes, then the index for the (i,j) pair of rates [0≤i<D, 0≤j<S] is i*S + j.
To see which of the category variables comes first, use code like this:

Code:
GetInformation (categVarIDs,lf);
 



categVarIDs is a matrix of strings listing the ordering of category variables.
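
For illustration, the i*S + j arithmetic can be inverted with integer division and remainder. This is a minimal Python sketch for post-processing printed indices outside HyPhy, not HyPhy code; the function name decode_pair is made up for this example, and it assumes the first variable listed in categVarIDs plays the role of i.

```python
def decode_pair(index, S):
    """Split a combined index i*S + j into (i, j), where 0 <= j < S.

    S is the number of rate classes of the second category variable.
    """
    i, j = divmod(index, S)
    return i, j


# Two category variables with 4 classes each, as in the question.
print(decode_pair(12, 4))  # (3, 0)
```

So with 4 classes per variable, "12" would mean class 3 for the first variable and class 0 for the second.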

HTH,
Sergei

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
 
Dr. Lulz
Re: Category Matrix
Reply #2 - Sep 9th, 2013 at 10:12am
 
Hi,

Thanks very much for the information!

So, does it choose a different order for every site, or is the order consistent throughout the whole category matrix?

In other words, does "GetInformation" store a matrix of strings (of possibly varying category variable order) in "categVarIDs", so that to recover the information I would have to print every entry? Maybe like:

GetInformation (categVarIDs,lf);
for (k=0; k < filteredData.sites; k=k+1)
{
  fprintf (stdout, categVarIDs[k], " ");
}
 
Sergei
Re: Category Matrix
Reply #3 - Sep 9th, 2013 at 1:05pm
 
Hi there,

The order is consistent for all sites; you only need to call GetInformation once per likelihood function.
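
Because the ordering is the same at every site, one decomposition can be applied to the whole per-site index vector. A minimal Python sketch, assuming the per-site indices have been saved as whitespace-separated numbers; decode_sites and the labels first and second are hypothetical, and which HyPhy variable is actually "first" has to be read from the GetInformation output.

```python
def decode_sites(line, S, first="category1", second="category2"):
    """Decode whitespace-separated combined indices into per-site class pairs.

    S is the number of classes of the second category variable. The labels
    `first` and `second` are placeholders for the ordering HyPhy reports.
    """
    decoded = []
    for token in line.split():
        # int(float(...)) tolerates values printed as "12" or "12.0".
        i, j = divmod(int(float(token)), S)
        decoded.append({first: i, second: j})
    return decoded


# Hypothetical indices for three sites, 4 classes per variable.
for site, classes in enumerate(decode_sites("12 5 0", 4)):
    print(site, classes)
```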

Sergei
 
Dr. Lulz
Re: Category Matrix
Reply #4 - Sep 11th, 2013 at 9:42am
 
Hello Sergei,

Thanks again!

Another question. If category1 and category2 are both initialized in the same manner in 2 different HyPhy jobs, is my assumption correct that GetInformation will return the same ordering in all cases? Or does the program choose differently, for each job, which variable to use first?

I have run 2 different jobs that took significant server time and I don't want to re-run them; I want to run a shorter version to get the ordering information.
 
Dr. Lulz
Re: Category Matrix
Reply #5 - Sep 11th, 2013 at 10:40am
 
Hello again Sergei,

More significantly, the following code does not seem to work:

GetInformation (categVarIDs,lf)
fprintf (stdout, categVarIDs);

Please advise.
 
Sergei
Re: Category Matrix
Reply #6 - Sep 11th, 2013 at 10:42am
 
Hi there,

Yes, the ordering is deterministic (i.e. won't change between runs of the same code on different data).
What doesn't work in this example?

Code:
GetInformation (categVarIDs,lf)
fprintf (stdout, categVarIDs);

Are you using HyPhy v2.2 or an earlier version?

Sergei
 
Dr. Lulz
Re: Category Matrix
Reply #7 - Sep 13th, 2013 at 9:29am
 
Hi Sergei,

I would also like to know: if all the other code is different and only the category variable initialization is the same, but the variables are initialized in a different order, will the ordering still be the same, so long as each variable is initialized with the same code? I.e.:

Test 1:
code code code
Category1 = xyz
Category2 = abc
code code code

Test 2:
different code different code different code
Category2 = abc
Category1 = xyz
different code different code different code

My HYPHYMP Version: "/HYPHY 2.1120130515beta(MP) for Linux on x86_64\"; should I upgrade?

As for "GetInformation (categVarIDs,lf)", I literally just want my HyPhy results to include the printed order so that I can make sense of the category matrix numbers. Perhaps my "fprintf (stdout, categVarIDs);" statement is incorrect? I was under the impression that categVarIDs was an object that I could just print.

Thanks kindly,


Armen

PS. My code is as follows:

ConstructCategoryMatrix (perSiteIndex, lf, INDEX);

counter = 0;
for (k=0; k < filteredData.sites; k=k+1)
{
  fprintf (stdout, perSiteIndex[k], " ");
}

fprintf (stdout, "\n\n");

GetInformation (categVarIDs,lf)       /* <--- I just realized there was no semicolon here; will rerun with a semicolon */

/*I have tried:*/
fprintf (stdout, categVarIDs);
/*and*/
for (k=0; k < filteredData.sites; k=k+1)
{
  fprintf (stdout, categVarIDs[k], " ");
}
 
Dr. Lulz
Re: Category Matrix
Reply #8 - Sep 13th, 2013 at 10:47am
 
Update:

categVarIDs = "0, -1e+26, 1e+26"

These look like the numbers that were used to calculate the position (except that there are 3 of them with only 2 categories, which I also don't understand). What I'm actually trying to find out is which category each of these numbers refers to, i.e. is 0 category1 or category2?