Sergei
|
Dear Albert,
To add to what Simon had already said, you can also use the Standard Analysis approach if all you want to do is to fit the 'free-ratio' model (I don't particularly like this description, but it is the accepted one).
Choose Basic Analyses->AnalyzeCodonData.bf and follow the prompts until you get to the model parameter dialog.
Most models (e.g. MG94 and derivatives and GY94) will have the 'Local' option, which you need to select. HyPhy, unlike most other packages is able to fit an arbitrary number of model parameters per branch - take a look at the HyPhy walk-through document [url]http://www.hyphy.org/docs/HyphyDocs.pdf[/url] starting at page 12, for a detailed example in the context of GUI based nucleotide models. In the codon model context this means that dS (paramterized by 'synRate' in most models) and dN ('nonSynRate') will be fitted separately for every branch.
For instance, I used the MG94Custom codon modelwith 010010 model string, i.e. MG94xHKY85, on the p51.nex example file which is included with the distribution (in the 'data' directory).
[b]Global Model[/b] (dN/dS is the same for all branches)
[tt] ______________RESULTS______________ Log Likelihood = -3192.68505802982; Shared Parameters: AC=0.148414 R=0.23238 AT=AC=0.148414 CG=AC=0.148414 CT=1=1 GT=AC=0.148414
Tree givenTree=(((D_CD_83_ELI_ACC_K03454:0.0206494,D_CD_83_NDK_ACC_M27323:0.0109048)N ode3:0.0114611,D_UG_94_94UG114_ACC_U88824:0.0597732)Node2:0.00370895,D_CD_84_84Z R085_ACC_U88822:0.0313679,(B_US_83_RF_ACC_M17451:0.028637,((B_FR_83_HXB2_ACC_K03 455:0.0123754,B_US_86_JRFL_ACC_U63632:0.0194165)Node11:0.00196883,B_US_90_WEAU16 0_ACC_U21135:0.0228499)Node10:0.00563753)Node8:0.0264147); [/tt]
If I now go to 'Analyses->Results->View Results, and choose the 'Parameters+Values' option, HyPhy prints out the complete list of model parameters and their MLEs (in the command line version, include -p on the command line when you start HyPhy to call result processing at the end of an analysis):
[b]Local Model (every branch with own dN and dS)[/b]
[tt] ______________RESULTS______________ Log Likelihood = -3180.50428940175; Shared Parameters: AC=0.150083 AT=AC=0.150083 CG=AC=0.150083 CT=1=1 GT=AC=0.150083
Tree givenTree=(((D_CD_83_ELI_ACC_K03454:0.0195808,D_CD_83_NDK_ACC_M27323:0.0106406)N ode3:0.0130999,D_UG_94_94UG114_ACC_U88824:0.0582857)Node2:0.00383832,D_CD_84_84Z R085_ACC_U88822:0.0298677,(B_US_83_RF_ACC_M17451:0.0289752,((B_FR_83_HXB2_ACC_K0 3455:0.0124571,B_US_86_JRFL_ACC_U63632:0.0191141)Node11:0.00234226,B_US_90_WEAU1 60_ACC_U21135:0.023113)Node10:0.0047989)Node8:0.0283525); [/tt]
And the 'View Results' output
[tt] global AC=0.1500833478035055; givenTree.B_US_90_WEAU160_ACC_U21135.synRate=0.1849469616011374; givenTree.Node10.synRate=0.0229292435029055; givenTree.Node8.nonSynRate=0.03246789433625548; givenTree.Node8.synRate=0.2484940702663311; givenTree.Node11.synRate=0.02726373716021484; givenTree.B_US_86_JRFL_ACC_U63632.synRate=0.1275403099620458; givenTree.D_CD_83_NDK_ACC_M27323.nonSynRate=0.02818187734159924; givenTree.Node11.nonSynRate=0; givenTree.Node10.nonSynRate=0.01311409837601478; givenTree.B_US_90_WEAU160_ACC_U21135.nonSynRate=0.03348708338783461; givenTree.B_FR_83_HXB2_ACC_K03455.nonSynRate=0.01405335626654316; givenTree.B_US_86_JRFL_ACC_U63632.nonSynRate=0.03781224086019262; givenTree.B_US_83_RF_ACC_M17451.nonSynRate=0.04109613927773711; givenTree.D_CD_84_84ZR085_ACC_U88822.nonSynRate=0.06246557853601946; givenTree.D_CD_83_ELI_ACC_K03454.synRate=0.08830950094179651; givenTree.D_CD_83_NDK_ACC_M27323.synRate=0.05309109774992424; givenTree.D_CD_84_84ZR085_ACC_U88822.synRate=0.1908060683975823; givenTree.Node3.synRate=0.1424544294182528; givenTree.D_UG_94_94UG114_ACC_U88824.synRate=0.3941850584375284; givenTree.Node2.nonSynRate=0.0006161123625948533; givenTree.Node3.nonSynRate=0.003993412386792515; givenTree.B_US_83_RF_ACC_M17451.synRate=0.2340765380757457; givenTree.B_FR_83_HXB2_ACC_K03455.synRate=0.1097111783727254; givenTree.D_CD_83_ELI_ACC_K03454.nonSynRate=0.05559894940886865; givenTree.D_UG_94_94UG114_ACC_U88824.nonSynRate=0.1132037804037879; givenTree.Node2.synRate=0.04313067425617549; global AT:=AC; global CG:=AC; global CT:=1; global GT:=AC; [/tt]
Each branch now has two paramters, and compute the dN/dS for a branch you simply divide the two appropriate estimates. Note, that this parameterization allows for dS to be 0, and dN>0, which is the infinite ratio (and can lead to numerical issues) in PAML.
You can also call Analysis->Results->Syn and nonsyn trees to get the trees scaled on E[syn subs] and E[non syn subs].
Finally you can use the GUI (as described in the tutorial, pp13-15), to draw trees on dN/dS and do LRT as Simon suggested.
Cheers, Sergei
|