HyPhy message board
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl
http://www.hyphy.org/cgi-bin/hyphy_forums/YaBB.pl?num=1227722874

Message started by Miguel on Nov 26th, 2008 at 10:07am

Title: GARD for analyze multiple data files
Post by Miguel on Nov 26th, 2008 at 10:07am
Hi,

Is it possible to analyze multiple data files with GARD (the fastest case, with homogeneous rates), e.g. through Datamonkey?
Or do I have to modify the batch file "GARD.bf" for that? In that case, I find the batch file not very friendly; perhaps it was written to produce HTML and to run in an MPI environment. Is there another GARD batch file that is easier to work with? Thanks.

Regards,
Miguel


Title: Re: GARD for analyze multiple data files
Post by Sergei on Nov 26th, 2008 at 10:43am
Dear Miguel,

By multiple files, do you mean running a batch of files through GARD? It's probably easiest to modify the GARD.bf file - what kind of output were you looking for?

Cheers,
Sergei

Title: Re: GARD for analyze multiple data files
Post by Miguel on Nov 27th, 2008 at 1:39am
Dear Sergei,

I have 1200 data files (alignments of DNA sequences). The idea is to analyze each of them independently with GARD and obtain the corresponding 1200 outputs. Each output should contain the breakpoints (or fragment intervals) and a phylogenetic tree for each fragment, as GARD on Datamonkey produces for a single alignment (but here I have 1200). NEXUS format is the perfect kind of output for me (Datamonkey generates it). For example:

#NEXUS

[
Generated by HYPHY 1.0020080624beta(MPI) for Linux on x86_64 on Sat Nov 22 09:41:23 2008

]

BEGIN TAXA;
     DIMENSIONS NTAX = 11;
     TAXLABELS
           'seq00001' 'seq00002' 'seq00003' 'seq00004' 'seq00005' 'seq00006' 'seq00007' 'seq00008' 'seq00009' 'seq00010' 'outgroup' ;
END;

BEGIN CHARACTERS;
     DIMENSIONS NCHAR = 999;
     FORMAT
           DATATYPE = DNA

           GAP=-
           MISSING=?
           NOLABELS
     ;

MATRIX
"sequences"
END;

BEGIN TREES;
     TREE part_1 = (((seq00001:0.00989392,((seq00004:0,seq00005:0):0,seq00006:0):0.0312366):0.010171,(seq00003:0,seq00008:0.00630028):0.0189827):0,((seq00002:0.00631735,seq00009:0.0063189):0.00619616,seq00010:0.01929):0.0324627,(seq00007:0.0262537,outgroup:0.122788):0.0263472);
     TREE part_2 = ((((seq00001:0.00854775,seq00002:0.0171899):0,seq00009:0.0171894):0.00861809,seq00007:0.0172076):0.00859177,((((seq00003:0,(seq00005:0,seq00006:0):0.00854154):0,seq00008:0):0.0171795,seq00010:0):0.0171975,outgroup:0.0815803):0.0174034,seq00004:0.00863121);
     TREE part_3 = (((seq00001:0.0206147,(seq00002:0,seq00010:0.0204341):0.01012):0.0311044,((seq00004:0,seq00009:0):0,seq00007:0.0308687):0.0202623):0,(seq00003:0,seq00008:0):0.0206851,((seq00005:0.0101502,seq00006:0):0.0649942,outgroup:0.0876822):0.0306586);
     TREE part_4 = (((seq00001:0.0578165,((seq00004:0,seq00007:0):0.0344496,outgroup:0.157492):0.0307402):0.029442,(((seq00002:0.0102012,seq00010:0):0.0205504,seq00009:0):0.064412,(seq00005:0,seq00006:0):0.00971875):0.0332362):0.00913766,seq00003:0,seq00008:0.0101911);
     TREE part_5 = ((((seq00001:0.021233,((seq00002:0.005375,seq00010:0.00295404):0.0432586,outgroup:0.0940907):0.0156284):0.012343,(seq00003:0.00276942,seq00008:0):0.0143391):0.0135234,seq00004:0.030578):0.00384256,(seq00005:0,seq00006:0):0,(seq00007:0,seq00009:0):0);
     TREE part_6 = (((seq00001:0.0061545,(seq00009:0.0314923,seq00010:0):0.0188424):0.0189173,(seq00004:0.025081,(seq00005:0,seq00006:0):0):0.00620545):0.00620585,((seq00002:0,seq00007:0):0.0191061,outgroup:0.114216):0.0122447,(seq00003:0,seq00008:0):0);

END;

BEGIN ASSUMPTIONS;
     CHARSET span_1 = 1-159;
     CHARSET span_2 = 160-277;
     CHARSET span_3 = 278-376;
     CHARSET span_4 = 377-475;
     CHARSET span_5 = 476-837;
     CHARSET span_6 = 838-999;

END;

At the moment I am trying to modify GARD.bf, but this batch file is a bit complex: it has many functions, it expects an MPI environment, and it seems to be written to produce HTML (I'm not sure). So the question is: is there another GARD batch file that is easier to work with?
Thanks again!

Miguel
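
The fragment intervals in a result file like the one above can be read back from the ASSUMPTIONS block. The following is a minimal Python sketch, assuming the CHARSET lines follow the `CHARSET span_N = start-end;` pattern shown in the example. The names `read_spans` and `breakpoints` are hypothetical helpers, not part of HyPhy, and treating the last site of each span as the breakpoint is a convention chosen here.

```python
import re

# Matches lines such as "CHARSET span_2 = 160-277;"
# (pattern assumed from the example output above).
CHARSET_RE = re.compile(
    r"CHARSET\s+(\S+)\s*=\s*(\d+)\s*-\s*(\d+)\s*;", re.IGNORECASE
)

def read_spans(nexus_text):
    """Return (name, start, end) tuples; coordinates are 1-based and inclusive."""
    return [(name, int(start), int(end))
            for name, start, end in CHARSET_RE.findall(nexus_text)]

def breakpoints(spans):
    """Breakpoints taken as the last site of every span except the final one."""
    return [end for _, _, end in spans[:-1]]

example = """
BEGIN ASSUMPTIONS;
     CHARSET span_1 = 1-159;
     CHARSET span_2 = 160-277;
     CHARSET span_3 = 278-376;
END;
"""

spans = read_spans(example)
print(spans)
print(breakpoints(spans))  # [159, 277]
```

Applied to each of the 1200 result files, this would give one list of intervals per alignment.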




Title: Re: GARD for analyze multiple data files
Post by Miguel on Dec 4th, 2008 at 6:18am
I got it!! :)
Thanks anyway.
Cheers,

Miguel

Title: Re: GARD for analyze multiple data files
Post by Sergei on Dec 8th, 2008 at 4:41pm
Dear Miguel,

Were you able to work out the formatting? Sorry - I was away on holiday/business for a few weeks. Let me know if you still need my code.

Cheers,
Sergei

Title: Re: GARD for analyze multiple data files
Post by Miguel on Dec 9th, 2008 at 1:44am
Dear Sergei,

GARD.bf generates an output file in NEXUS format, so I just modified the batch file to run multiple data sets in a loop and removed the prompts (I set the parameter values directly inside the batch file). Now it is running well.
Do not worry about being away, that is normal. I just tried it step by step and got it working.
Thanks a lot for your message!  ;)

Cheers,
Miguel
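
Miguel looped inside the batch file itself. A different route to the same result is an external driver that starts one run per alignment. The following is a minimal Python sketch of that alternative, not the modification described above. The function names, the `*.fas` pattern and the `.gard.nex` suffix are hypothetical, and the actual HyPhy command line is deliberately left to the caller because it depends on the installed version.

```python
import subprocess
from pathlib import Path

def plan_jobs(input_dir, output_dir, pattern="*.fas"):
    """Pair every alignment in input_dir with its own output path."""
    out = Path(output_dir)
    return [(path, out / (path.stem + ".gard.nex"))
            for path in sorted(Path(input_dir).glob(pattern))]

def run_jobs(jobs, command_template):
    """Run one process per alignment, skipping results that already exist.

    command_template is a list of strings in which "{input}" and "{output}"
    are replaced by the paths of the current job. What goes into the template
    (executable, batch file, options) is an assumption left to the caller.
    """
    for alignment, result in jobs:
        if result.exists():
            continue  # allows an interrupted batch to be resumed
        result.parent.mkdir(parents=True, exist_ok=True)
        cmd = [part.format(input=alignment, output=result)
               for part in command_template]
        subprocess.run(cmd, check=True)
```

Skipping existing results means a batch of 1200 files can be restarted after a failure without repeating finished runs.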


Title: Re: GARD for analyze multiple data files
Post by copenhagen on Nov 12th, 2013 at 1:28am
Dear All,

I have an identical analysis goal. I have 1600 FASTA-format data files of DNA alignments. The idea is to detect recombination breakpoints in each alignment independently and to split the alignment using GARD. If an alignment has recombination breakpoints, the output should be the split alignment, written as two or more FASTA-format files.

I don't know how to prepare a proper batch file (GARD.bf) for running multiple data sets in HyPhy. I would greatly appreciate your help with my analysis!

I have another question, about running GARD.bf in HyPhy on Windows. I have downloaded the Windows version of HyPhy 2.2.0. When I selected Standard analysis, recombination, GARD.bf, it showed the following error:
[ERROR] This analysis requires an MPI environment to run

Is it possible to run GARD on Windows, or should I install HyPhy and MPI on Linux to run GARD?

Thanks in advance!

Best regards,
Zhuofei
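
Once the breakpoints for an alignment are known, the splitting step itself does not need HyPhy. The following is a minimal Python sketch, assuming each breakpoint is the 1-based position of the last column of the fragment to its left. The function names are hypothetical, and the toy alignment is illustrative only.

```python
def read_fasta(text):
    """Parse FASTA text into an ordered list of (name, sequence) pairs."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records

def split_alignment(records, breakpoints):
    """Cut every aligned sequence after each breakpoint column.

    Returns one list of (name, fragment) pairs per segment.
    """
    length = len(records[0][1])
    edges = [0] + sorted(breakpoints) + [length]
    return [[(name, seq[start:end]) for name, seq in records]
            for start, end in zip(edges, edges[1:])]

def to_fasta(records):
    """Format (name, sequence) pairs as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)

alignment = ">a\nAAAACCCCGG\n>b\nAAAATCCCGA\n"
parts = split_alignment(read_fasta(alignment), [4, 8])
for number, part in enumerate(parts, start=1):
    print(f"fragment {number}")
    print(to_fasta(part), end="")
```

An alignment without breakpoints (an empty list) comes back as a single fragment, so it can be written out unchanged.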
