YaBB - Yet another Bulletin Board
about the performance of different PS test
zchou
YaBB Newbies
*
Offline


Feed your monkey!

Posts: 12
about the performance of different PS test
Feb 27th, 2008 at 11:15am
 
Hi All,

I would like to know how the different tests for lineage-specific positive selection perform. As far as I know, the branch model and branch-site model A implemented in PAML, and the GA implemented in HyPhy, can address this question. HyPhy also offers several alternatives to PAML's branch model, that is, models that allow the background omega (w) value to vary. However, there is much debate about the branch and branch-site model A tests, and only a few publications use the GA. Which is the best choice for testing lineage-specific selection, especially in a large-scale analysis? Is there a comparison among the branch, GA-branch and branch-site model A tests?

zhuocheng
 
Sergei
YaBB Administrator
*****
Offline


Datamonkeys are forever...

Posts: 1658
UCSD
Gender: male
Re: about the performance of different PS test
Reply #1 - Feb 27th, 2008 at 2:31pm
 
Dear zhuocheng,

I am not aware of a formal comparison; it would take a lot of work and not result in an important paper - hence one has to ration the time. I have found examples where the assumption of uniform background omega can really mess up PAML's branch (and branch-site) tests, even in small trees (4 sequences). Take a look at section 1.5.2 of [link not preserved].

It is difficult to say what might happen in a large-scale analysis, however. GA is not very well suited to that - it takes too long to run on >40 sequences, and the current version does not permit branch-site type tests. Frankly, there does not currently exist a test I would feel comfortable recommending for localized lineage-specific selection, especially for large-scale analyses.

What size dataset are you talking about?

Cheers,
Sergei
Associate Professor
Division of Infectious Diseases
Division of Biomedical Informatics
School of Medicine
University of California San Diego
 
zchou
Re: about the performance of different PS test
Reply #2 - Feb 28th, 2008 at 6:58am
 
Dear Sergei,

I have read the HyPhy handbook; it is an excellent introduction to both the theory and the applications. I opened this topic because I noticed that some reviewers would reject branch-site model test results in nature.

I agree that the branch model in HyPhy, with a variable w value for the background, is an improvement. I want to use the more robust branch model implemented in HyPhy (BranchAPriori.bf) to analyze 2000 gene families, each containing about 15 genes. Before the formal run, I want to be clear about the assumptions of the branch model. Is it possible to determine the best DNA substitution model, as ModelTest does, and then use that model when performing the local branch-specific selection test? If HyPhy can handle all of this, could you give us sample scripts for doing so?

Thanks,
zhuocheng
 
Sergei
Re: about the performance of different PS test
Reply #3 - Feb 28th, 2008 at 8:29am
 
Dear zhuocheng,

One thing to keep in mind about the current HyPhy implementation is that it does not permit a branch omega to vary from site to site. It's easy enough to relax this assumption while keeping background omegas variable, for example by fitting a mixture of 'local' models to the alignment. How many taxa do you have per alignment (on average)?

As far as the model of nucleotide substitution goes, you could always settle for codon model x GTR, because in this case you are not interested in estimating nucleotide biases, just the omega ratio, hence it is not of great concern that you may be overfitting the nucleotide component a bit.

I should have a script or two lying around for automated processing of multiple alignments in parallel, and it should be fairly trivial to add the analysis that you want to do to its repertoire. Let me dig it out and post a link here.

Cheers,
Sergei
 
zchou
Re: about the performance of different PS test
Reply #4 - Feb 28th, 2008 at 2:19pm
 
Dear Sergei,

That's great. In my current dataset, each analysis uses 15 taxa.

I look forward to your scripts when you post the links. Does HyPhy have any special requirements for the tree format when I mark the a priori branches to test?

Thanks very much,
Zhuocheng
 
zchou
Re: about the performance of different PS test
Reply #5 - Mar 2nd, 2008 at 1:00pm
 
Dear Sergei,

I want to use BranchAPriori.bf to test for selection on a specific branch, but I cannot find this .bf file. Where is BranchAPriori.bf?

Thanks,
zhuocheng
 
Sergei
Re: about the performance of different PS test
Reply #6 - Mar 3rd, 2008 at 9:49am
 
Dear zhuocheng,

It's downloaded from [link not preserved] at runtime.
The URL is [link not preserved].

Cheers,
Sergei
 
zchou
Re: about the performance of different PS test
Reply #7 - Mar 3rd, 2008 at 6:06pm
 
Hi Sergei,

I want to test variable dN/dS values for both the background and the foreground. I have many genes to test, so I need scripts for automated processing of multiple alignments. You said you had written such scripts. Can you find them?

Another question: how do I test a specific branch or node from within the automated processing script? I don't know whether I can provide a user-defined tree. In fact, I must provide a user-defined tree topology and test for selection on a specific branch/node.

A third question: how well does HyPhy perform when testing for purifying selection on a specific lineage, and how can I do that in a HyPhy script?

Best,
Zhuocheng
 
Sergei
Re: about the performance of different PS test
Reply #8 - Mar 3rd, 2008 at 10:24pm
 
Dear Zhuocheng,

Quote:
I want to test variable dN/dS values for both the background and the foreground. I have many genes to test, so I need scripts for automated processing of multiple alignments. You said you had written such scripts. Can you find them?

I'll post a link on Friday; I am currently away from my office.

Quote:
Another question: how do I test a specific branch or node from within the automated processing script? I don't know whether I can provide a user-defined tree. In fact, I must provide a user-defined tree topology and test for selection on a specific branch/node.

You could automate the process, because HyPhy prompts for the branch(es) to test, and you can also feed in the answers non-interactively. The only problem arises if you have different sequence (and tree leaf) names in different alignments, or if you want to test an interior lineage but topologies differ between trees. I need more information about your analysis to tell you how to proceed.
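
A minimal sketch of the name check this implies, assuming each gene's tree is available as a Newick string with the interior nodes of interest labelled. The gene ids, the species names and both helper functions are hypothetical, and this is Python rather than HyPhy batch language. It only reports which of the requested branch names are absent from each tree, so that alignments needing special handling can be found before a batch run.

```python
import re


def tree_names(newick):
    """Return the set of leaf and labelled interior node names in a Newick string."""
    names = set()
    # Split on Newick punctuation; what remains are "name" or "name:length" tokens.
    for token in re.split(r"[(),;]", newick):
        name = token.split(":")[0].strip()
        if name:
            names.add(name)
    return names


def missing_branches(trees, branches):
    """Map each gene id to the requested branch names that its tree lacks."""
    report = {}
    for gene, newick in trees.items():
        present = tree_names(newick)
        absent = [b for b in branches if b not in present]
        if absent:
            report[gene] = absent
    return report


# Hypothetical example: the second tree has no node labelled "hominid".
example_trees = {
    "gene1": "((human,chimp)hominid,(mouse,rat),dog);",
    "gene2": "((human,chimp),(mouse,rat),dog);",
}
print(missing_branches(example_trees, ["human", "hominid"]))
```

Matching by name cannot tell whether an interior node with the same label spans the same set of taxa in two different topologies; that would need a comparison of the leaf sets below the node.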

Quote:
A third question: how well does HyPhy perform when testing for purifying selection on a specific lineage, and how can I do that in a HyPhy script?


The same test can be used, except with a different one-sided constraint (dN<dS instead of dN>dS).
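
A rough sketch of how a one-sided test of this kind is often evaluated, assuming it is a likelihood ratio test with a single constrained parameter whose null value lies on the boundary of the alternative. The thread does not state how BranchAPriori.bf computes its p-values, so the function name, the 1 degree of freedom and the halving of the p-value are all assumptions made for illustration. The log-likelihoods would come from the constrained (null) and one-sided (alternative) fits; the direction of the constraint (dN>dS or dN<dS) is set when fitting and does not change the arithmetic.

```python
import math


def one_sided_lrt_pvalue(lnl_null, lnl_alt):
    """P-value for a one-sided likelihood ratio test with one constrained parameter.

    Assumption: the statistic follows a 50:50 mixture of a point mass at zero
    and a chi-square distribution with 1 degree of freedom.
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    if stat <= 0.0:
        return 1.0
    # Upper tail of chi-square with 1 d.f.: P(X > x) = erfc(sqrt(x / 2)).
    chi2_tail = math.erfc(math.sqrt(stat / 2.0))
    return 0.5 * chi2_tail


# Hypothetical log-likelihoods, not taken from any real analysis.
print(one_sided_lrt_pvalue(-1000.0, -998.0))
```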

Cheers,
Sergei
 
zchou
Re: about the performance of different PS test
Reply #9 - Mar 4th, 2008 at 7:32am
 
Hi Sergei,

Thanks for your patient reply and help.

As you know, in a genome-level analysis we always have to deal with gene families of different sizes, which means different sequence (and tree leaf) names in different alignments. I also want to test for selection on different branches of the same tree. So I have to supply a user-defined tree topology and specify which branch I want to test. I opened the BranchAPriori.bf file but could not work out how the program gets the "givetree".

I will read the HyPhy manual to see how to test for purifying selection.

Thanks,
zhuocheng
 
zchou
Re: about the performance of different PS test
Reply #10 - Mar 12th, 2008 at 12:33pm
 
Hi Sergei,

Have you found the script for running the branch-specific selection test, covering both purifying and positive selection?

I would also like to know whether I can provide a user-defined tree topology and mark specific lineages on it.

Thanks,
zhuocheng
 
Sergei
Re: about the performance of different PS test
Reply #11 - Mar 13th, 2008 at 9:06am
 
Dear zhuocheng,

Not yet; a very busy week here; sorry. Will dig them up today or tomorrow.
In terms of specifying branches, you can do that by simply listing their names as input to the analysis.
E.g. if your tree is ((a,b)parent,(c,d),e), you can pass the list of branches:

a
b
parent

to specify that you want to test for selection in the (a,b)parent clade.
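
A small sketch of how such a list could be produced automatically, assuming the tree is a Newick string in which the clade of interest has a labelled interior node. This is Python, not HyPhy batch language, and the parser and function names are hypothetical helpers that handle only plain names and optional branch lengths.

```python
def parse_newick(text):
    """Parse a Newick string into nested (name, children) tuples."""
    text = text.strip().rstrip(";")
    pos = 0

    def node():
        nonlocal pos
        children = []
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume "("
            children.append(node())
            while pos < len(text) and text[pos] == ",":
                pos += 1  # consume ","
                children.append(node())
            if pos >= len(text) or text[pos] != ")":
                raise ValueError("unbalanced parentheses in tree")
            pos += 1  # consume ")"
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        # Drop an optional ":length" suffix from the label.
        name = text[start:pos].split(":")[0].strip()
        return (name, children)

    return node()


def clade_branches(tree, clade_name):
    """List all named branches in the clade rooted at clade_name, descendants first."""

    def find(n):
        if n[0] == clade_name:
            return n
        for child in n[1]:
            hit = find(child)
            if hit is not None:
                return hit
        return None

    def collect(n):
        names = []
        for child in n[1]:
            names.extend(collect(child))
        if n[0]:  # unlabelled interior nodes are skipped
            names.append(n[0])
        return names

    top = find(tree)
    if top is None:
        raise KeyError(clade_name)
    return collect(top)


tree = parse_newick("((a,b)parent,(c,d),e)")
print("\n".join(clade_branches(tree, "parent")))
```

For the tree above this prints a, b and parent, one per line, which is the list given in the example.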

HTH,
Sergei
 
Sergei
Re: about the performance of different PS test
Reply #12 - Mar 18th, 2008 at 11:05pm
 
Dear zhuocheng,

Take a look at [link not preserved] for a general framework for building pipelines using HyPhy HBL. There is a 5-page PDF file that describes the basic components of the analysis. The branch-specific analysis ([link not preserved]) has to be modified to fit within the framework. If you are happy with what it already does, you can also consider writing a simple multiple-analysis wrapper (which requires no modification to the BranchAPriori.bf file), as outlined in [link not preserved].

HTH,
Sergei