***********************************************************************************
***********************************************************************************
******************************** RecPars ******************************************
***********************************************************************************
***********************************************************************************

Author:
    Kim Fisker

    The program is currently maintained by:

    Thomas Christensen
    Department of Computer Science
    University of Aarhus
    Ny Munkegade, Bldg. 540
    DK-8000 Aarhus C, Denmark
    phone: +45 89 3236

Copying:
    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 1, or (at your option)
    any later version.

Description:
    RecPars does a parsimony analysis of DNA sequences. It tries to find
    the best phylogenies for different regions of the sequences and
    thereby postulates a recombination event between these segments.
    A more thorough explanation can be found in:

    J. Hein. "A Heuristic Method to Reconstruct the History of Sequences
    subject to Recombination.", Journal of Molecular Evolution,
    36: 396--406, 1993.

    K. Fisker and J. Hein. "Reconstructing a Simulated History of
    Sequences Subject to Recombination.", not yet published.

Installation:
    Customize the Makefile to your environment; this should be very easy.
    Type 'make' and the program 'RecPars' should be ready.

How to use:
    RecPars -p <parameterfile> [-o <outputfile>]

    <parameterfile>:
        substfile = HIV-subst
        datafile = HIV-align
        debug = yes
        rekombcost = 100
        initial_phyl = best 0 100
        rounds = 3

    'substfile' is a file with substitution costs. It should have the
    following format:

           A  C  G  T  -
        A  0  1  1  1  1
        C  1  0  1  1  1
        G  1  1  0  1  1
        T  1  1  1  0  1
        -  1  1  1  1  0

    'datafile' contains the aligned DNA sequences. It should have the
    following format:

        % '%' is the comment sign.
        % it starts with the number of sequences and
        % their length
        4 11
        % sequence 1
        AACT TT__ GGC
        % sequence 2
        AACT TT__ GGC
        % sequence 3
        AACT TT__ GGC
        % sequence 4
        AACT TT__ GGC

    'debug' = yes or no. yes gives a little more information.
    Default value is yes.

    'rekombcost' is the cost of a recombination event in the parsimony
    analysis. Default value is 100. It should be customized to the
    actual sequences.

    'initial_phyl' tells which segment of the sequences should be used
    when finding the initial phylogeny. Default is the first 100
    positions (best 0 100).

    'rounds' is the number of passes over the sequences. Default value
    is 2, which means that the sequences are scanned first forward and
    then backward.

    'only_one_phyl' = yes or no. If set to yes, RecPars only finds the
    best phylogeny for the segment pointed to by 'initial_phyl' and then
    returns. Default value is no.

Results:
    Example outputfile (comments are indicated by ^^^^):

        SUBSTFILE = /users/kfisker/rekomb/simulation/data/subst
        DATAFILE = /users/kfisker/rekomb/simulation/sim/fileali
        DEBUG = YES
        REKOMBCOST = 10
        INITIAL PHYLOGENI = BEST 0 100
        ROUNDS = 1
        ONLY_ONE_PHYL = NO
        ^^^^ first the given parameters

        Minimal Tree cost = 543
        ^^^^ sum of necessary substitution costs in every position
        Removed positions has cost = 200
        ^^^^ RecPars ignores non-informative positions
        Read 11 sequences with length 1000
        Removed all positions(columns) with 11 or 10 equal bases
        ^^^^ definition of non-informative positions
        Remaining bases in each sequence = 408
        ^^^^ number of informative positions left.

        Tree 1: start = 1 , end = 443 , cost = 349 , rekomb = 1
        ^^^^ first phylogeny covers position 1 to 443

               |------------------------ 6
           |---|
           |   |------------------------ 8
        ---|
           |                       |---- 4
           |                   |---|
           |                   |   |---- 10
           |               |---|
           |               |   |-------- 0
           |           |---|
           |           |   |------------ 9
           |       |---|
           |       |   |   |------------ 5
           |       |   |---|
           |       |       |------------ 2
           |   |---|
           |   |   |-------------------- 7
           |---|
               |   |-------------------- 3
               |---|
                   |-------------------- 1
        ^^^^ sequences are numbered according to position in the inputfile.
        ^^^^ CAUTION: The root is placed arbitrarily

        Tree 2: start = 450 , end = 999 , cost = 253 , rekomb = 0

               |------------------------ 6
           |---|
           |   |------------------------ 8
        ---|
           |           |---------------- 5
           |       |---|
           |       |   |---------------- 2
           |   |---|
           |   |   |               |---- 4
           |   |   |           |---|
           |   |   |           |   |---- 10
           |   |   |       |---|
           |   |   |       |   |-------- 0
           |   |   |   |---|
           |   |   |   |   |------------ 9
           |   |   |---|
           |   |       |---------------- 7
           |---|
               |   |-------------------- 3
               |---|
                   |-------------------- 1

        Total Cost = 612
        ^^^^ total cost of entire solution
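
Illustration:
    The example output defines a non-informative position as a column
    in which all, or all but one, of the sequences carry the same base.
    The Python sketch below reads text in the 'datafile' layout and
    lists the columns that would remain under that definition. It is
    not part of RecPars: the function names read_alignment and
    informative_columns are hypothetical, and it assumes that comments
    occupy whole lines and that each sequence sits on a single line.

```python
from collections import Counter


def read_alignment(text):
    """Parse text in the 'datafile' layout (assumed: '%' comment lines,
    one header line 'count length', then one sequence per line)."""
    rows = [line.strip() for line in text.splitlines()]
    rows = [line for line in rows if line and not line.startswith("%")]
    count, length = (int(token) for token in rows[0].split())
    # Blanks inside a sequence line are only visual grouping; drop them.
    seqs = ["".join(line.split()) for line in rows[1:]]
    if len(seqs) != count or any(len(s) != length for s in seqs):
        raise ValueError("header does not match the sequences")
    return seqs


def informative_columns(seqs):
    """Return 0-based indices of columns in which the most common
    symbol occurs in fewer than n - 1 of the n sequences."""
    n = len(seqs)
    keep = []
    for index, column in enumerate(zip(*seqs)):
        most_common_count = Counter(column).most_common(1)[0][1]
        if most_common_count < n - 1:
            keep.append(index)
    return keep


example = """\
% four made-up sequences of length 6
4 6
AAC GTT
AAC GTA
ATC CTA
ATG CTT
"""

print(informative_columns(read_alignment(example)))
```

    With 11 sequences this corresponds to dropping the columns with
    "11 or 10 equal bases" mentioned in the example output. Whether
    RecPars treats the gap symbol like any other base when counting is
    not stated here, so the sketch simply counts it as a symbol.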