YaBB - Yet another Bulletin Board
 
KH test error (Read 9622 times)
Sundy
YaBB Newbies
*
Offline


Feed your monkey!

Posts: 33
KH test error
Feb 13th, 2009 at 9:45am
 
Dear everyone,

Has anyone run into an error like the following when running the KH test for GARD with HyPhy?

Operation MAccess is not defined for 0
Current BL command: jvec[1][k]=vec2[k]
Current task has been terminated. Would you like to see the remaining error messages, if there are any?

I have hit this problem several times recently. The run always stops in the middle of the analysis, so I cannot get the final results of the KH test.
Could anyone tell me how to fix it? I would appreciate it!

Best regards,

Sundy
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: KH test error
Reply #1 - Feb 13th, 2009 at 11:31am
 
Dear Sundy,

Try using the attached file (drop it into TemplateBatchFiles).
There was a bug in the older versions of KHTest.bf that could result in the error that you are seeing.

Keep those bug reports coming:)

Cheers,
Sergei

Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
 
Sundy
Re: KH test error
Reply #2 - Feb 13th, 2009 at 1:41pm
 
Dear Sergei,

Thank you so much, but I am sorry to say that the error is still there.
The attachment contains my sequence and GARD.splits files. Could you please take a look and see whether something is wrong with them?

Thank you!

Best regards,

Sundy
 
Sergei
Re: KH test error
Reply #3 - Feb 13th, 2009 at 3:27pm
 
Dear Sundy,

Please try replacing GARDProcessor.bf in TemplateBatchFiles with the attached file. I was able to run your example through the script without problems.

Cheers,
Sergei
 
Sundy
Re: KH test error
Reply #4 - Feb 14th, 2009 at 7:20am
 
Dear Sergei,

Thank you so much! The problem is fixed.
Cheers,

Sundy
 
Sundy
Re: KH test error
Reply #5 - Feb 14th, 2009 at 8:30am
 
Dear Sergei,

I am sorry to trouble you again.
I have another question about the new "GARDProcessor.bf" you gave me.

After I replaced the old GARDProcessor.bf in TemplateBatchFiles with the new file, the error mentioned above went away. But when I reran some of my earlier data, which had several significant breakpoints with the old GARDProcessor.bf, there were no significant breakpoints anymore.

So I am a little confused: which GARDProcessor.bf gives the more credible results?

For example, for the attached file, the old GARDProcessor.bf reported 4/5 breakpoints as significant, but the new GARDProcessor.bf reports 0/5.

 
Sergei
Re: KH test error
Reply #6 - Feb 15th, 2009 at 1:37pm
 
Dear Sundy,

The results from the new script are correct; there was a bug in the older version. Some of the breakpoints in your alignment are still significant at the 0.05 level, just not at the 0.01 level, which is the default in the script. I can modify the script to let you select the p-value if you'd like.

Cheers,
Sergei
 
Sundy
Re: KH test error
Reply #7 - Feb 15th, 2009 at 4:33pm
 
Dear Sergei,

Thank you so much!
I would greatly appreciate it if you could modify the script to let me select the p-value.

I have another question: does the KH test in HyPhy already include the Bonferroni correction or not?

For the file above, the GARD and new KH test results are as follows:

Breakpoint location | LHS vs. RHS | RHS vs. LHS
                454 |       0.031 |       0.033
                729 |      <0.001 |       0.022
               1091 |       0.006 |       0.001
               1927 |       0.001 |       0.026
               3321 |       0.034 |       0.411

The KH test does not report site 1091 as significant even though both of its p-values are below 0.01, so I suspect this is due to the Bonferroni correction. After the correction, 1091 would be significant at the 0.05 level but not at the 0.01 level. Is my understanding right?

As I understand it, for this test with 5 breakpoints, p < 0.01 corresponds to significance at the 0.05 level (0.05/5), and p < 0.002 corresponds to significance at the 0.01 level (0.01/5). Is this right?

Thank you very much!!

Sundy
 
Sergei
Re: KH test error
Reply #8 - Feb 16th, 2009 at 10:32am
 
Dear Sundy,

I've modified GARDProcessor (attached) to summarize KH p-values (both raw and Bonferroni-corrected) at the end of the run and to also report how many KH-significant breakpoints there are at 3 different significance levels. This should make the interpretation of GARD results easier.

Example output follows:

Code:
Breakpoint | LHS Raw p | LHS adjusted p | RHS Raw p | RHS adjusted p
       454 |   0.46570 |        1.00000 |   0.00290 |        0.02900
       729 |   0.00020 |        0.00200 |   0.00010 |        0.00100
      1091 |   0.25400 |        1.00000 |   0.00580 |        0.05800
      1927 |   0.02710 |        0.27100 |   0.22930 |        1.00000
      3321 |   0.40520 |        1.00000 |   0.00010 |        0.00100

At p = 0.01 there are 1 significant breakpoints
At p = 0.05 there are 1 significant breakpoints
At p = 0.1 there are 1 significant breakpoints
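
The adjusted columns in the example output above are consistent with a plain Bonferroni correction over 10 tests (two KH tests for each of the 5 breakpoints), capped at 1. The significance counts match a rule in which a breakpoint counts only if both of its adjusted p-values are at or below the threshold. Neither rule is spelled out in this thread, so the Python sketch below is an inference from the printed numbers, not a description of what GARDProcessor.bf does internally. The helper names `bonferroni` and `count_significant` are hypothetical.

```python
def bonferroni(raw_p, n_tests):
    """Bonferroni-adjust one raw p-value for n_tests comparisons, capped at 1."""
    return min(1.0, raw_p * n_tests)


# Raw KH p-values (LHS, RHS) per breakpoint, copied from the example output.
raw = {
    454: (0.46570, 0.00290),
    729: (0.00020, 0.00010),
    1091: (0.25400, 0.00580),
    1927: (0.02710, 0.22930),
    3321: (0.40520, 0.00010),
}

# Assumption: two tests per breakpoint, so 5 breakpoints give 10 tests.
n_tests = 2 * len(raw)

adjusted = {
    bp: (bonferroni(lhs, n_tests), bonferroni(rhs, n_tests))
    for bp, (lhs, rhs) in raw.items()
}


def count_significant(adjusted_p, alpha):
    """Count breakpoints whose LHS and RHS adjusted p-values are both <= alpha.

    Assumption: requiring both sides to pass is inferred from the output,
    not stated anywhere in the thread.
    """
    return sum(
        1 for lhs, rhs in adjusted_p.values() if lhs <= alpha and rhs <= alpha
    )


for alpha in (0.01, 0.05, 0.1):
    n_sig = count_significant(adjusted, alpha)
    print(f"At p = {alpha} there are {n_sig} significant breakpoints")
```

Under these assumptions only breakpoint 729 passes at any of the three levels, which agrees with the counts printed by the script. Equivalently, a raw p-value has to be at or below the chosen level divided by 10 to survive the correction.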

 



Cheers,
Sergei
 
Sundy
Re: KH test error
Reply #9 - Feb 16th, 2009 at 11:21am
 
Dear Sergei,

Thank you very, very much!
This is really great. It is easy to use for someone like me.

But I found another problem:
I ran this dataset on my computer again and got results totally different from the ones you posted above (see below).
I thought maybe my HyPhy was not the latest version, so I downloaded the latest HyPhy (2008.5.8), but I still get these results.

Could you please help me figure out what the problem is?

Thank you so much!!
Best regards,

Sundy


Breakpoint | LHS Raw p | LHS adjusted p | RHS Raw p | RHS adjusted p
      454 |   0.03630 |        0.36300 |   0.03060 |        0.30600
      729 |   0.02100 |        0.21000 |   0.00010 |        0.00100
     1091 |   0.00100 |        0.01000 |   0.00780 |        0.07800
     1927 |   0.02540 |        0.25400 |   0.00050 |        0.00500
     3321 |   0.40440 |        1.00000 |   0.03530 |        0.35300

At p = 0.01 there are 0 significant breakpoints
At p = 0.05 there are 0 significant breakpoints
At p = 0.1 there are 1 significant breakpoints

Mean splits identify: 0.15

 
Sergei
Re: KH test error
Reply #10 - Feb 16th, 2009 at 12:42pm
 
Dear Sundy,

I am using a developmental version of HyPhy (not available for download yet). Let me confirm the results and get back to you.

Cheers,
Sergei
 
Sergei
Re: KH test error
Reply #11 - Feb 16th, 2009 at 1:29pm
 
Dear Sundy,

Your version is giving the correct p-values. Thanks for alerting me to this discrepancy - I found a bug in the developmental version because of it.

Cheers,
Sergei
 
Sundy
Re: KH test error
Reply #12 - Feb 16th, 2009 at 1:42pm
 
Dear Sergei,

Thank you so much!!
With your help, I have finally resolved all my questions about GARD so far.
I really appreciate your quick replies every time.
Thanks!!

Sundy
 
Sergei
Re: KH test error
Reply #13 - Feb 18th, 2009 at 12:10pm
 
Dear Sundy,

Thank you very much for bringing the discrepancy between your results and the ones I was getting with the prerelease version to my attention. This helped me identify a serious bug in the new likelihood evaluation routines. Thanks for helping make HyPhy a better product.

Best,
Sergei
 
Miguel Lacerda
YaBB Newbies
*
Offline


Feed your monkey!

Posts: 36
Natl Univ of Ireland, Galway
Gender: male
Re: KH test error
Reply #14 - Apr 28th, 2009 at 4:46am
 
Dear Sergei

I have performed a GARD analysis and have noticed a few idiosyncrasies when processing the results using the latest versions of GARDProcessor.bf and KHTest.bf posted in this thread.

Firstly, running GARDProcessor.bf multiple times on the same dataset and splits file produces (slightly) different results. For example, here are the KH tests for 3 runs with the same input files:

Code:
RUN 1:
----------------------------------------------------------------------
Fitting tree 1 to partition 1
Log Likelihood = -4655.72380305908;
Fitting tree 2 to partition 1
Log Likelihood = -4730.55612145537;
KH Testing partition 1
Tree 2 base LRT = 149.665. p-value = 0.0044


Fitting tree 1 to partition 2
Log Likelihood = -5043.76888590518;
Fitting tree 2 to partition 2
Log Likelihood = -4836.52025035318;
KH Testing partition 2
Tree 1 base LRT = 414.497. p-value = 0.0001


Breakpoint | LHS Raw p | LHS adjusted p | RHS Raw p | RHS adjusted p
       507 |   0.00010 |        0.00020 |   0.00440 |        0.00880
----------------------------------------------------------------------

RUN 2:
----------------------------------------------------------------------
Fitting tree 1 to partition 1
Log Likelihood = -4655.72380305908;
Fitting tree 2 to partition 1
Log Likelihood = -4730.55612145537;
KH Testing partition 1
Tree 2 base LRT = 149.665. p-value = 0.0038


Fitting tree 1 to partition 2
Log Likelihood = -5043.76888590518;
Fitting tree 2 to partition 2
Log Likelihood = -4836.52025035318;
KH Testing partition 2
Tree 1 base LRT = 414.497. p-value = 0.0001


Breakpoint | LHS Raw p | LHS adjusted p | RHS Raw p | RHS adjusted p
       507 |   0.00010 |        0.00020 |   0.00380 |        0.00760
----------------------------------------------------------------------

RUN 3:
----------------------------------------------------------------------
Fitting tree 1 to partition 1
Log Likelihood = -4655.31331109413;
Fitting tree 2 to partition 1
Log Likelihood = -4730.60159498021;
KH Testing partition 1
Tree 2 base LRT = 150.577. p-value = 0.0042


Fitting tree 1 to partition 2
Log Likelihood = -5043.72987443926;
Fitting tree 2 to partition 2
Log Likelihood = -4836.4869482817;
KH Testing partition 2
Tree 1 base LRT = 414.486. p-value = 0.0001

Breakpoint | LHS Raw p | LHS adjusted p | RHS Raw p | RHS adjusted p
       507 |   0.00010 |        0.00020 |   0.00420 |        0.00840

------------------------------------------------------------------------
 



You'll notice that the likelihoods and p-values differ slightly between runs. Could this just be due to different starting values in the optimisation? (I'm actually just curious, since the differences in the results are trivial!)
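
One hedged way to put a scale on these run-to-run differences is to treat the KH p-value as a proportion estimated from random resampling. The thread does not state that this is how the p-value is computed, although runs 1 and 2 report identical likelihoods and still different p-values. The replicate count of 10,000 is also a guess, based only on the 0.0001 resolution of the reported values. Under those two assumptions, the Python sketch below compares the spread of the three partition 1 p-values with the binomial standard error of such an estimate. `mc_standard_error` is a hypothetical helper name.

```python
import math


def mc_standard_error(p, n_replicates):
    """Standard error of a proportion estimated from n_replicates resamples."""
    return math.sqrt(p * (1.0 - p) / n_replicates)


# Assumption: 10,000 replicates, guessed from the 0.0001 resolution of the
# reported p-values; the thread does not give the actual number.
n_replicates = 10_000

# Partition 1 p-values from the three local runs.
reported = [0.0044, 0.0038, 0.0042]

mean_p = sum(reported) / len(reported)
se = mc_standard_error(mean_p, n_replicates)
spread = max(reported) - min(reported)

print(f"mean p          = {mean_p:.4f}")
print(f"standard error  = {se:.5f}")
print(f"observed spread = {spread:.4f}")
```

With these numbers the observed spread is about one standard error, so differences of this size would be expected from resampling noise alone if the assumptions hold. This says nothing about the small likelihood differences in run 3, which would need a separate explanation.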

Of more concern to me is that I get very different results when I run the processor file locally vs on my cluster with HYPHYMP_DEV SVN415 (probably due to different versions of HyPhy). I obtained the above results on my machine, but this is what I get from two runs on the cluster:

Code:
RUN 1:
---------------------------------------------------------------------
Fitting tree 1 to partition 1
Log Likelihood = -4655.74764230272;
Fitting tree 2 to partition 1
Log Likelihood = -4730.57601129128;
KH Testing partition 1
Tree 2 base LRT = 149.657. p-value = 0.3469

Fitting tree 2 to partition 2
Log Likelihood = -4836.49821145768;
KH Testing partition 2
Tree 1 base LRT = 414.467. p-value = 0.0001

Breakpoint | LHS Raw p | LHS adjusted p | RHS Raw p | RHS adjusted p
       507 |   0.00010 |        0.00020 |   0.34690 |        0.69380

---------------------------------------------------------------------


RUN 2:
---------------------------------------------------------------------
Fitting tree 1 to partition 1
Log Likelihood = -4655.74764230272;
Fitting tree 2 to partition 1
Log Likelihood = -4730.57601129128;
KH Testing partition 1
Tree 2 base LRT = 149.657. p-value = 0.354


Fitting tree 1 to partition 2
Log Likelihood = -5043.73191448845;
Fitting tree 2 to partition 2
Log Likelihood = -4836.49821145768;
KH Testing partition 2
Tree 1 base LRT = 414.467. p-value = 0.0001

Breakpoint | LHS Raw p | LHS adjusted p | RHS Raw p | RHS adjusted p
       507 |   0.00010 |        0.00020 |   0.35400 |        0.70800
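
As a sanity check on the logs above, the reported "base LRT" values match twice the difference between the two log likelihoods printed for each partition. The Python sketch below recomputes the partition 1 statistic for local run 1 and cluster run 1 from the logged likelihoods. `lrt` is a hypothetical helper name, and taking the absolute difference is an assumption made only so the result is positive regardless of tree order.

```python
def lrt(lnl_a, lnl_b):
    """Likelihood ratio statistic: twice the absolute log-likelihood difference."""
    return 2.0 * abs(lnl_a - lnl_b)


# Partition 1 log likelihoods (tree 1, tree 2), copied from the logs above.
local_run1 = (-4655.72380305908, -4730.55612145537)
cluster_run1 = (-4655.74764230272, -4730.57601129128)

local_lrt = lrt(*local_run1)
cluster_lrt = lrt(*cluster_run1)

print(f"local LRT   = {local_lrt:.3f}")
print(f"cluster LRT = {cluster_lrt:.3f}")
print(f"difference  = {abs(local_lrt - cluster_lrt):.3f}")
```

The two statistics agree to within about 0.01, while the p-values reported for them are 0.0044 locally and 0.3469 on the cluster. That only narrows the discrepancy down to the step that turns the statistic into a p-value. It does not say which result is correct.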

 



Again, I have copied the GARDProcessor.bf and KHTest.bf files from this thread into TemplateBatchFiles on the cluster. As you can see, the breakpoint is no longer significant.

Please could you advise as to which of the above results is correct. I have attached the relevant files.

Thanks a lot as always!

Miguel

PS: While I'm here... I have one more question.

I have a dataset with ~500 sequences and 1100 nucleotides, i.e. too few sites relative to the number of sequences to run GARD. So I have divided the dataset into random samples of 50 sequences and am running GARD on each of these smaller datasets. Is that what you would advise?

Thanks again....




