Module containing functionality for manipulating micro-satellite
genotype data.
Module containing functionality for manipulating micro-satellite
haplotype data.
Module containing functionality for manipulating SNP genotype data.
Module containing functionality for manipulating SNP haplotype data.
Module containing functionality common to all input types and MCMC algorithms supported by GeneRecon.
Creates a set of genotypes, split in affected and unaffected individuals.
(affected/unaffected-genotype-set region affected-list unaffected-list)
(define reg
(region (list
(marker 0.0 '(0.1 0.9))
(marker 0.1 '(0.2 0.8))
(marker 0.2 '(0.1 0.2 0.7)))))
(define affected
(list (genotype reg (list '(0 . 1) '(0 . 0) '(1 . 0)))
(genotype reg (list '(0 . 0) '(0 . 0) '(1 . 1)))
(genotype reg (list '(0 . 0) '(0 . 0) '(1 . 2)))
(genotype reg (list '(0 . 1) '(0 . 0) '(1 . 0)))
(genotype reg (list '(0 . 1) '(0 . 0) '(1 . 1)))
(genotype reg (list '(0 . 1) '(0 . 0) '(1 . 2)))))
(define unaffected
(list (genotype reg (list '(0 . 1) '(1 . 1) '(1 . 0)))
(genotype reg (list '(0 . 0) '(1 . 1) '(1 . 1)))
(genotype reg (list '(0 . 0) '(1 . 1) '(1 . 2)))
(genotype reg (list '(0 . 1) '(1 . 1) '(1 . 0)))
(genotype reg (list '(0 . 1) '(1 . 1) '(1 . 1)))
(genotype reg (list '(0 . 1) '(1 . 1) '(1 . 2)))))
(define au-genotype-set
(affected/unaffected-genotype-set reg affected unaffected))
Creates a set of genotypes, split into a set of affected genotypes and a set of unaffected genotypes. The function takes three arguments: the genomic region the genotypes are over, a list of affected genotypes, and a list of unaffected genotypes; the genotypes in the two lists must be built over the region given as the first argument.
Creates a set of haplotypes, split in affected and unaffected individuals.
(affected/unaffected-haplotype-set region affected-list unaffected-list)
(define reg
(region (list
(marker 0.0 '(0.1 0.9))
(marker 0.1 '(0.2 0.8))
(marker 0.2 '(0.1 0.2 0.7)))))
(define affected
(list (haplotype reg '(0 0 0))
(haplotype reg '(0 0 1))
(haplotype reg '(0 0 2))
(haplotype reg '(1 0 0))
(haplotype reg '(1 0 1))
(haplotype reg '(1 0 2))))
(define unaffected
(list (haplotype reg '(0 1 0))
(haplotype reg '(0 1 1))
(haplotype reg '(0 1 2))
(haplotype reg '(1 1 0))
(haplotype reg '(1 1 1))
(haplotype reg '(1 1 2))))
(define au-haplotype-set
(affected/unaffected-haplotype-set reg affected unaffected))
Creates a set of haplotypes, split into a set of affected haplotypes and a set of unaffected haplotypes. The function takes three arguments: the genomic region the haplotypes are over, a list of affected haplotypes, and a list of unaffected haplotypes; the haplotypes in the two lists must be built over the region given as the first argument.
Build a set of haplotypes or genotypes, split into a part evaluated in the coalescent tree, the "mutation-cluster," and a part considered in a "null-cluster."
(cluster dataset mutation-cluster-size [tree-builder])
(define au-haplotype-set (affected/unaffected-haplotype-set reg affected-haplotypes unaffected-haplotypes))
(define au-genotype-set (affected/unaffected-genotype-set reg affected-genotypes unaffected-genotypes))
(define au-h-cluster (cluster au-haplotype-set 100))
(define au-g-cluster (cluster au-genotype-set 100))
Build a set of haplotypes or genotypes, split into a part evaluated in the coalescent tree, the "mutation-cluster," and a part considered in a "null-cluster."
The first argument to the function is the data set to be analyzed; this argument determines the kind of cluster that is built.
The second argument determines the size of the mutation cluster, i.e. how many individuals will be included in the coalescent tree during the MCMC. Any individual not in the mutation cluster is considered to be in a null-cluster and is assumed to have been selected from the background distribution of haplotypes/genotypes rather than being related to the others in a tree.
If the data set provided to the cluster function is an affected/unaffected set, the mutation cluster is built from the affected individuals only. During the MCMC, only affected individuals can be moved between the two clusters; unaffected individuals are always considered part of the null-cluster.
An optional third argument is a symbol that determines the algorithm used to build the tree in the cluster. Supported algorithms are: 'random-tree (a random topology), 'distance-tree (a tree built using a distance method), and 'weighted-distance-tree (a distance-method algorithm that weights differences at markers close to a locus more than differences farther away).
If the 'weighted-distance-tree tree building method is chosen, an additional parameter is expected: the position to weight relative to.
Builds a coalescent tree from a set of haplotypes, based on the distance between the haplotypes.
(distance-tree haplotype-set)
(distance-tree au-haplotype-set)
(distance-tree au-genotype-set)
Builds a random coalescent tree from a haplotype set, using the distance
between the haplotypes to determine the topology.
The set can be either a haplotype set or a genotype set, built
using affected/unaffected-haplotype-set or
affected/unaffected-genotype-set respectively.
Creates a genotype object from a list of allele-pairs.
(genotype region allele-pairs)
(define reg (region (list (marker 0.0 '(0.1 0.9)) (marker 0.1 '(0.2 0.8)) (marker 0.2 '(0.1 0.2 0.7)))))
(genotype reg (list '(0 . 1) '(1 . 0) '(2 . 1)))
Creates a genotype object from a region and a list of allele pairs.
The allele pairs are given as a list of pairs; there must be one pair for each marker in the region, and both alleles in a pair must be valid alleles for the corresponding marker, i.e. between 0 and one less than the length of the marker's frequency list.
Creates a haplotype object from a list of alleles.
(haplotype region allele-list)
(define reg (region (list (marker 0.0 '(0.1 0.9)) (marker 0.1 '(0.2 0.8)) (marker 0.2 '(0.1 0.2 0.7)))))
(haplotype reg '(0 1 2))
Creates a haplotype object from a region and a list of alleles.
The alleles are given as a list; there must be one allele for each marker in the region, and each allele must be valid for the corresponding marker, i.e. between 0 and one less than the length of the marker's frequency list.
Creates a marker object from a position and a list of frequencies.
(marker position frequencies)
(marker 0.1 '(0.2 0.8))
Creates a marker object from a position and a list of frequencies. The position must be a non-negative real number, and the frequencies must be a list of real numbers in the range [0,1] that sum to 1. The frequency list must contain at least two elements.
The alleles at the created marker are afterward referred to using
indices from 0 to one less than the length of the frequency
list. The allele identified by 0 is the allele with
frequency (car freq-list), i.e. the first element
in the frequency list; the allele identified by 1 is the allele
with frequency (car (cdr freq-list)), i.e. the
second element in the frequency list; and so forth.
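The constraints on a frequency list and the index-based allele numbering can be illustrated with a small sketch in plain Scheme. The helpers valid-frequencies? and allele-frequency are hypothetical names used only for illustration; they are not part of GeneRecon, and the tolerance used for the sum check is an assumption.

```scheme
;; Hypothetical helpers for illustration; not part of the GeneRecon API.

(define (valid-frequencies? freqs)
  ;; At least two entries, each in [0,1], summing to 1.
  ;; The 1e-9 tolerance for the sum is an assumption.
  (and (>= (length freqs) 2)
       (let loop ((fs freqs))
         (or (null? fs)
             (and (<= 0 (car fs) 1) (loop (cdr fs)))))
       (< (abs (- (apply + freqs) 1.0)) 1e-9)))

(define (allele-frequency freqs allele)
  ;; Allele i is the allele whose frequency sits at index i of the list.
  (list-ref freqs allele))

(valid-frequencies? '(0.1 0.2 0.7))   ; #t
(valid-frequencies? '(0.5 0.6))       ; #f, sums to 1.1
(allele-frequency '(0.1 0.2 0.7) 2)   ; 0.7
```

With the frequency list '(0.1 0.2 0.7), the valid allele indices are 0, 1, and 2.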
Creates a Markov chain object from a list of tables.
(markov-chain . tables)
(markov-chain
(list (list 0.1 0.4) (list 0.3 0.3))
(list (list 0.1 0.4) (list 0.2 0.3))) FIXME
Build a parameter set for the MCMC calculation.
(parameter-set parameters)
(parameter-set region position population-size data-set tree)
(parameter-set region position population-size cluster)
(define au-h-tree (affected/unaffected-random-tree au-haplotype-set))
(define au-g-tree (affected/unaffected-random-tree au-genotype-set))
(define initial-pos 0.12)
(define initial-pop-size 1000)
(define h-ps (parameter-set reg initial-pos initial-pop-size au-haplotype-set au-h-tree))
(define g-ps (parameter-set reg initial-pos initial-pop-size au-genotype-set au-g-tree))
(define au-h-cluster (cluster au-haplotype-set 100))
(define au-g-cluster (cluster au-genotype-set 100))
(define h-c-ps (parameter-set reg initial-pos initial-pop-size au-h-cluster))
(define g-c-ps (parameter-set reg initial-pos initial-pop-size au-g-cluster))
Build a parameter set for the MCMC algorithm. Different kinds of parameter sets are built depending on the arguments to this function.
The first argument to the function is the genomic region being analyzed.
The second argument is the initial position of the disease locus and the third
is the initial population size. The fourth argument
is either a data set built using
affected/unaffected-haplotype-set or
affected/unaffected-genotype-set, or a cluster built with the
cluster function. If a data set is given, the next
argument is a corresponding tree.
Builds a random coalescent tree from an affected/unaffected set.
(random-tree au-set)
(random-tree au-haplotype-set)
(random-tree au-genotype-set)
Builds a random coalescent tree from an affected/unaffected set.
The set can be either a haplotype set or a genotype set, built
using affected/unaffected-haplotype-set or
affected/unaffected-genotype-set respectively.
Creates a genomic region from a list of markers.
(region kappa mu marker-list)
(region kappa mu
(list
(marker 0.0 '(0.1 0.9))
(marker 0.1 '(0.2 0.8))
(marker 0.2 '(0.1 0.2 0.7))))
Constructs a region from a recombination rate (kappa), a mutation rate (mu), and a list of markers. The region can afterward be used to create haplotypes or genotypes.
The markers in the region must all be at distinct positions.
After the region is created, the markers are referred to by their
order by position, not by their order in the input
list given to region. This means that, for
instance, when creating a haplotype, the order of the alleles
given to the haplotype function must match the order
of the markers on the region, ordered by their position.
An optional final argument, a Markov chain for modelling the background haplotypes, can be provided. The tables in this Markov chain must match the markers starting from index 1. The tables must follow the sorted ordering of the markers, not whatever order the markers were given in to this function.
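The effect of the position ordering can be sketched in plain Scheme. Markers are modelled here as (position . frequency-list) pairs, which is an assumption made for illustration only; real marker objects are created with the marker function, and order-by-position is a hypothetical helper, not a GeneRecon function.

```scheme
;; Illustration only: markers modelled as (position . frequencies) pairs.
;; This representation and order-by-position are assumptions, not GeneRecon API.

(define input-markers
  (list (cons 0.2 '(0.1 0.2 0.7))
        (cons 0.0 '(0.1 0.9))
        (cons 0.1 '(0.2 0.8))))

(define (order-by-position markers)
  ;; Guile's sort takes the list first and the comparison second.
  (sort markers (lambda (a b) (< (car a) (car b)))))

(map car (order-by-position input-markers))   ; (0.0 0.1 0.2)
```

Under this ordering, an allele list such as '(0 1 2) assigns allele 0 to the marker at 0.0, allele 1 to the marker at 0.1, and allele 2 to the marker at 0.2, regardless of the order in which the markers were listed.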
Run the MCMC algorithm on a parameter set.
(run-mcmc parameter-set sampler no-iterations)
(define ps (parameter-set reg initial-pos initial-pop-size au-genotype-set au-g-tree rho kappa mu))
(define s (sampler (list '(disease-locus 10) '(likelihood 10))))
(run-mcmc ps s 100000)
Run the MCMC algorithm on a parameter set. The parameter set is any
parameter set created by the parameter-set function. The
appropriate MCMC algorithm will be selected based on the parameter set.
The first argument to the function is the parameter set for the MCMC run. The second argument is a sampler object, determining how parameters should be sampled. The third argument determines the number of iterations to run.
Construct the sampling hooks for the MCMC iteration.
(sampler hook-list)
(sampler (list '(disease-locus 100)
'(likelihood 100 "likelihood.out")
'(population-size 100 "population-size.out")
'(coalescent-tree 1000 "tree.out")))
Construct the sampling hooks for the MCMC iteration.
The sampler is constructed from a list of hooks. The hooks are lists where the first element is a symbol that determines what is to be sampled and the second determines how often that parameter should be sampled. An optional third parameter specifies the filename to write the sampled values to.
The supported values to sample are:
Set a global MCMC option.
(set-mcmc-option option value)
(set-mcmc-option 'max-pop-size-change 50)
(set-mcmc-option 'max-allele-freq-change 0.25)
Sets an MCMC option. These options have reasonable default values and thus need not be explicitly set in most cases. They are therefore not passed as parameters to the MCMC functions, but can be set using this function.
The supported options are:
Maximum change of the disease locus.
A fraction of the total region size the locus is allowed to move in one step; i.e. if the region has size 2 and max-locus-change is 0.5, the locus can move at most 1.0 in each move.
The default value is 0.3.
The closest the disease locus is allowed to be to a marker.
The minimal distance allowed between the disease locus and any marker. If this is zero, the disease locus can be placed on a marker; if it is 0.1, the locus can never be closer than 0.1 to a marker; and so on.
By default, 10% of the total region size is removed by making areas around the markers where the locus cannot be placed.
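The minimal-distance rule can be sketched as a simple predicate in plain Scheme. The name locus-allowed? is hypothetical, and the sketch assumes the minimal distance is an absolute distance in the same units as the marker positions; the documentation does not say how GeneRecon implements the check.

```scheme
;; Hypothetical sketch; not GeneRecon's implementation.
;; Assumes min-distance is an absolute distance in marker-position units.

(define (locus-allowed? locus marker-positions min-distance)
  ;; The locus is allowed when no marker lies closer than min-distance.
  (let loop ((ps marker-positions))
    (or (null? ps)
        (and (>= (abs (- locus (car ps))) min-distance)
             (loop (cdr ps))))))

(locus-allowed? 0.15  '(0.0 0.1 0.2) 0.01)   ; #t
(locus-allowed? 0.105 '(0.0 0.1 0.2) 0.01)   ; #f, too close to the marker at 0.1
(locus-allowed? 0.1   '(0.0 0.1 0.2) 0)      ; #t, zero allows placement on a marker
```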
The maximal change of allele frequencies allowed.
The maximal amount the frequency for a single allele at a marker can change in a single step.
By default, 100%, i.e. the frequencies can change arbitrarily (within the constraint that they remain frequencies and sum to 1).
Minimal population size.
The minimal value the population size is allowed to reach. By default 500.
Maximal population size.
The maximal value the population size is allowed to reach. By default 100000.
Maximal population size change.
The maximal amount the population size is allowed to change in a single step. By default 1000.
Maximal waiting time change.
The maximal amount the waiting time in a coalescent tree can change. By default 0.4.
Temperature for accepting more proposed changes.
The acceptance probability is normally exp(L'-L) where L' is the log-likelihood of the proposed locus and L the log-likelihood of the current position. With a temperature, it will instead be exp((L'-L)/temp). To get a "flatter" curve, use a temp>1.
By default this is 1.0, corresponding to no temperature.
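The effect of the temperature can be illustrated with a short sketch in plain Scheme. The function name acceptance-probability is hypothetical, and capping the value at 1 (as in a standard Metropolis step) is an assumption; the text above only gives the exponential expression.

```scheme
;; Hypothetical sketch of the tempered acceptance probability described above.
;; The cap at 1.0 is an assumption borrowed from the standard Metropolis rule.

(define (acceptance-probability log-l-proposed log-l-current temp)
  ;; exp((L' - L) / temp), where L' and L are log-likelihoods.
  (min 1.0 (exp (/ (- log-l-proposed log-l-current) temp))))

(acceptance-probability -105.0 -100.0 1.0)   ; exp(-5), about 0.0067
(acceptance-probability -105.0 -100.0 5.0)   ; exp(-1), about 0.37
```

For the same drop in log-likelihood, a temperature of 5 makes the move far more likely to be accepted than a temperature of 1, which is the "flatter" behaviour described above.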
The smallest interval that can contain a recombination.
The likelihood for the locus can get very low around markers (for some choices of recombination and mutation rates) which results in poor mixing. This variable will set a minimal distance to use to calculate the probability of no recombination between two points (shorter intervals will be set to this point); setting it to a value above 0 (the default value) will smooth out the likelihood around markers.
Set the weight of a change proposal
(set-mcmc-weight change weight)
(set-mcmc-weight 'locus-change 10)
A function for explicitly setting the weighting of the proposed changes. By default they are initialized with default values (mainly) taken from Morris et al. 2002.
The supported changes are:
Weight for changing the disease locus
The default value is 1.
Weight for changing the allele frequency of markers.
The default value is the number of markers.
Weight for changing the effective population size.
The default value is 1.
Weight for changing the allele at a site where the allele is unknown.
The default value is the number of such sites.
Weight for changing the phase of a genotype.
The default value is the number of heterozygote sites.
Weight for changing the tree topology by moving a branch.
Default is the number of branches: 2*(no-leaves - 1).
Weight for changing the parent bit.
Default is the number of branches: 2*(no-leaves - 1).
Weight for changing the ordering of events.
Default is no-leaves - 2.
Weight for changing a waiting time.
Default is the number of waiting times: no-leaves - 1.
Weight for changing the mutation cluster.
Default is the size of the cluster.
Builds a coalescent tree from a set of haplotypes, based on the distance between the haplotypes, weighting alleles on markers close to the disease locus higher than markers farther away.
(weighted-distance-tree haplotype-set locus)
(weighted-distance-tree au-haplotype-set locus)
(weighted-distance-tree au-genotype-set locus)
Builds a random coalescent tree from a haplotype set, using the distance
between the haplotypes to determine the topology, weighting alleles on
markers close to the disease locus higher than markers farther away.
The first parameter is a set of haplotypes or genotypes, built
using affected/unaffected-haplotype-set or
affected/unaffected-genotype-set respectively. The second
parameter is the locus used to weight the distance.
Module containing functionality for manipulating micro-satellite
genotype data.
Calculates the allele frequencies for each marker in the parameter lists.
(calc-frequencies known-allele-lists . list-of-list-of-allele-lists)
(use-modules (generecon MS genotype))
(define known-allele-lists (collect-alleles-lists genotypes-1 genotypes-2))
(calc-frequencies known-allele-lists genotypes-1 genotypes-2)
Calculates the allele frequencies for each marker in the parameter lists.
Takes as input a number of lists of lists of micro-satellite alleles and calculates a list of frequencies for each allele in each position in the lists, discarding missing data (value -1).
The first argument is a list of the allowed alleles for each marker and is used to order the frequencies: the frequency lists follow the order in which the alleles appear in the known-allele-lists.
The input need not consist of a single list of alleles but can be any number of such lists, e.g. (calc-frequencies known-allele-lists genotypes-1 genotypes-2).
Collect the distinct alleles from a list of allele pairs.
(collect-alleles allele-pairs)
(use-modules (generecon MS genotype))
(define distinct-alleles (collect-alleles '((1 . 23) (43 . 2) (1 . 34) (23 . 43))))
Collect the distinct alleles from a list of allele-pairs and return them sorted.
Collect a list of the distinct alleles in each column of the input.
(collect-alleles-lists . list-of-list-of-allele-pairs)
(use-modules (generecon MS genotype))
(define distinct-alleles-lists
(collect-alleles-lists (list '((1 . 1) (34 . 23))
'((1 . 3) (14 . 23))
'((2 . 4) (2 . 65)))))
Collect a list of the distinct alleles in each column of the input. That is, the input lists must consist of lists of equal length, and the distinct alleles at each index are collected and listed.
The input need not consist of a single list but can be any number of lists.
Translates a list of lists of alleles into a list of genotype objects.
(genotype-list->genotype-list region index-tables genotype-list)
(use-modules (generecon MS genotype))
(define reg (region kappa mu markers))
(define genotype-list (read-genotype-data file))
(define index-tables (make-index-tables genotype-list))
(genotype-list->genotype-list reg index-tables genotype-list)
Translates a list of lists of alleles into a list of genotype objects.
This function takes a region, `reg', a list of mappings from alleles to frequency indices in the region, `index-tables', and a list of lists of alleles, `genotype-list', and translates the allele lists into genotype objects over the region.
Make a table of alleles to indices.
(make-index-table allele-pairs)
(use-modules (generecon MS genotype))
(define table (make-index-table '((1 . 23) (43 . 2) (1 . 34) (23 . 43))))
Make a table of alleles to indices, such that each distinct allele is mapped to its position in the sorted list of distinct alleles.
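The mapping can be sketched in plain Guile Scheme. The functions sketch-collect-alleles and sketch-make-index-table are hypothetical stand-ins, and representing the table as an association list is an assumption; the actual table representation used by GeneRecon is not described here. Missing data (-1) is not treated specially in this sketch.

```scheme
(use-modules (srfi srfi-1))   ; delete-duplicates, iota

;; Hypothetical stand-ins for illustration; not the GeneRecon implementation.

(define (sketch-collect-alleles allele-pairs)
  ;; Flatten the pairs, drop repeated alleles, and sort numerically.
  (sort (delete-duplicates
         (append (map car allele-pairs) (map cdr allele-pairs)))
        <))

(define (sketch-make-index-table allele-pairs)
  ;; Association list from each allele to its index in the sorted list.
  (let ((alleles (sketch-collect-alleles allele-pairs)))
    (map cons alleles (iota (length alleles)))))

(sketch-make-index-table '((1 . 23) (43 . 2) (1 . 34) (23 . 43)))
;; => ((1 . 0) (2 . 1) (23 . 2) (34 . 3) (43 . 4))
```

With such a table, the index of allele 23 can be looked up with (cdr (assv 23 table)), which gives 2 for the input above.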
Makes a list of alleles->indices tables.
(make-index-tables list-of-list-of-allele-lists)
(use-modules (generecon MS genotype))
(define genotype-list (read-genotype-data file))
(define tables (make-index-tables genotype-list))
Makes a list of alleles->index mappings.
From a list of allele-lists, this function makes a table for each marker, mapping the alleles at this marker to indices, such that the numerically least allele is at the first index (0) and the numerically largest allele at the largest index.
The input need not consist of a single list of alleles but can be any number of such lists, e.g. (make-index-tables allele-list-1 allele-list-2).
Read affected and unaffected genotypes and calculate their allele frequencies.
(read-affected/unaffected-data affected-file unaffected-file)
(use-modules (generecon MS genotype))
(read-affected/unaffected-data af-filename uaf-filename)
Read affected and unaffected genotypes and calculate their allele frequencies.
Read a list of affected genotypes from `affected-file' and a list of unaffected genotypes from `unaffected-file' and return them (as lists of lists of alleles) together with a list of the frequencies of each allele at each position. For the frequency list, only the unaffected genotypes are considered. The frequencies in the frequency list are sorted with relation to the numerical value of the alleles (least allele first) and missing data (-1) is ignored in the frequency calculations.
Read marker distances and affected and unaffected genotypes, calculate their allele frequencies, and build markers from it.
(read-distances-affected/unaffected-markers kappa mu
position-file
affected-file
unaffected-file)
(use-modules (generecon MS genotype))
(read-distances-affected/unaffected-markers 0.01 1e-5 dist-file af-filename uaf-filename)
Read marker distances and affected and unaffected genotypes, calculate their allele frequencies, and build markers from it.
Read a list of marker-distances from `position-file', a list of affected genotypes from `affected-file' and a list of unaffected genotypes from `unaffected-file'; calculate the allele frequencies in the controls and use these frequencies to construct a list of affected and a list of unaffected markers. The constructed region and the pair of marker-sets are returned as a list.
The frequencies in the frequency list are sorted with relation to the numerical value of the alleles (least allele first) and missing data (-1) is ignored in the frequency calculations. The alleles in the created genotypes are re-mapped as indices into the frequency lists.
The first two parameters, kappa and mu, are the recombination rate and mutation rate, respectively. These are used for making the region for the markers.
Reads a list of genotypes from a file.
(read-genotype-data file)
(use-modules (generecon MS genotype))
(read-genotype-data filename)
Reads a list of genotypes from a file.
The file must contain lines of white-space separated lists of micro-satellite alleles (non-negative numbers or -1, where -1 indicates missing data). Each line is interpreted as one genotype, and all lines must have the same number of alleles.
The function returns the parsed genotypes as a list of lists of alleles. This list can be translated into a list of genotype objects using `genotype-list->genotype-list'.
Read marker positions and affected and unaffected genotypes, calculate their allele frequencies, and build markers from it.
(read-positions-affected/unaffected-markers kappa mu
position-file
affected-file
unaffected-file)
(use-modules (generecon MS genotype))
(read-positions-affected/unaffected-markers 0.01 1e-5 pos-file af-filename uaf-filename)
Read marker positions and affected and unaffected genotypes, calculate their allele frequencies, and build markers from it.
Read a list of marker-positions from `position-file', a list of affected genotypes from `affected-file' and a list of unaffected genotypes from `unaffected-file'; calculate the allele frequencies in the controls and use these frequencies to construct a list of affected and a list of unaffected markers. The constructed region and the pair of marker-sets are returned as a list.
The frequencies in the frequency list are sorted with relation to the numerical value of the alleles (least allele first) and missing data (-1) is ignored in the frequency calculations. The alleles in the created genotypes are re-mapped as indices into the frequency lists.
The first two parameters, kappa and mu, are the recombination rate and mutation rate, respectively. These are used for making the region for the markers.
Remaps the alleles in a genotype into indices in a frequency list.
(remap-genotype index-tables genotype)
(use-modules (generecon MS genotype))
(define genotype-list (read-genotype-data file))
(define tables (make-index-tables genotype-list))
(map (lambda (h) (remap-genotype tables h)) genotype-list)
Remaps the alleles in a genotype into indices in a frequency list.
From a list of allele-lists, this function makes a table for each marker, mapping the alleles at this marker to indices, such that the numerically least allele is at the first index (0) and the numerically largest allele at the largest index.
The input need not consist of a single list of alleles but can be any number of such lists, e.g. (make-index-tables allele-pairs-1 allele-pairs-2).
Module containing functionality for manipulating micro-satellite
haplotype data.
Calculates the allele frequencies for each marker in the parameter lists.
(calc-frequencies known-allele-lists . list-of-list-of-allele-lists)
(use-modules (generecon MS haplotype))
(define known-allele-lists (collect-alleles-lists haplotypes-1 haplotypes-2))
(calc-frequencies known-allele-lists haplotypes-1 haplotypes-2)
Calculates the allele frequencies for each marker in the parameter lists.
Takes as input a number of lists of lists of micro-satellite alleles and calculates a list of frequencies for each allele in each position in the lists, discarding missing data (value -1).
The first argument is a list of the allowed alleles for each marker and is used to order the frequencies: the frequency lists follow the order in which the alleles appear in the known-allele-lists.
The input need not consist of a single list of alleles but can be any number of such lists, e.g. (calc-frequencies known-allele-lists haplotypes-1 haplotypes-2).
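The calculation described above can be sketched in plain Scheme for a single list of haplotypes. All three function names below are hypothetical and the code is only an illustration of the counting rule (frequencies ordered by the known alleles, with -1 excluded from the totals); it is not the GeneRecon implementation and does not handle a column where every value is missing.

```scheme
;; Hypothetical sketch of the frequency calculation; not GeneRecon code.

(define (count-allele allele column)
  ;; Number of times allele occurs in column.
  (let loop ((xs column) (n 0))
    (cond ((null? xs) n)
          ((= (car xs) allele) (loop (cdr xs) (+ n 1)))
          (else (loop (cdr xs) n)))))

(define (column-frequencies known-alleles column)
  ;; Missing data (-1) is left out of the total.
  (let ((total (- (length column) (count-allele -1 column))))
    (map (lambda (a) (/ (count-allele a column) total)) known-alleles)))

(define (sketch-ms-frequencies known-allele-lists haplotypes)
  ;; (apply map list ...) turns the rows (haplotypes) into columns (markers).
  (map column-frequencies known-allele-lists (apply map list haplotypes)))

(sketch-ms-frequencies '((1 2) (3 14))
                       (list '(1 3) '(1 14) '(2 -1) '(-1 3)))
;; => ((2/3 1/3) (2/3 1/3))
```

The sketch returns exact rationals; each inner list follows the order of the corresponding known-allele list.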
Collect the distinct alleles from a list of alleles.
(collect-alleles alleles)
(use-modules (generecon MS haplotype))
(define distinct-alleles (collect-alleles '(1 23 43 2 1 34 23 43)))
Collect the distinct alleles from a list of alleles and return them sorted.
Collect a list of the distinct alleles in each column of the input.
(collect-alleles-lists . list-of-list-of-allele-lists)
(use-modules (generecon MS haplotype))
(define distinct-alleles-lists
(collect-alleles-lists (list '(1 1 34 23 43)
'(1 3 14 23 3)
'(2 4 2 65 0))))
Collect a list of the distinct alleles in each column of the input. That is, the input lists must consist of lists of equal length, and the distinct alleles at each index are collected and listed.
The input need not consist of a single list but can be any number of lists.
Translates a list of lists of alleles into a list of haplotype objects.
(haplotype-list->haplotype-list region index-tables haplotype-list)
(use-modules (generecon MS haplotype))
(define reg (region kappa mu markers))
(define haplotype-list (read-haplotype-data file))
(define index-tables (make-index-tables haplotype-list))
(haplotype-list->haplotype-list reg index-tables haplotype-list)
Translates a list of lists of alleles into a list of haplotype objects.
This function takes a region, `reg', a list of mappings from alleles to frequency indices in the region, `index-tables', and a list of lists of alleles, `haplotype-list', and translates the allele lists into haplotype objects over the region.
Make a table of alleles to indices.
(make-index-table alleles)
(use-modules (generecon MS haplotype))
(define table (make-index-table '(1 23 43 2 1 34 23 43)))
Make a table of alleles to indices, such that each distinct allele is mapped to its position in the sorted list of distinct alleles.
Makes a list of alleles->indices tables.
(make-index-tables list-of-list-of-allele-lists)
(use-modules (generecon MS haplotype))
(define haplotype-list (read-haplotype-data file))
(define tables (make-index-tables haplotype-list))
Makes a list of alleles->index mappings.
From a list of allele-lists, this function makes a table for each marker, mapping the alleles at this marker to indices, such that the numerically least allele is at the first index (0) and the numerically largest allele at the largest index.
The input need not consist of a single list of alleles but can be any number of such lists, e.g. (make-index-tables allele-list-1 allele-list-2).
Read affected and unaffected haplotypes and calculate their allele frequencies.
(read-affected/unaffected-data affected-file unaffected-file)
(use-modules (generecon MS haplotype))
(read-affected/unaffected-data af-filename uaf-filename)
Read affected and unaffected haplotypes and calculate their allele frequencies.
Read a list of affected haplotypes from `affected-file' and a list of unaffected haplotypes from `unaffected-file' and return them (as lists of lists of alleles) together with a list of the frequencies of each allele at each position. For the frequency list, only the unaffected haplotypes are considered. The frequencies in the frequency list are sorted with relation to the numerical value of the alleles (least allele first) and missing data (-1) is ignored in the frequency calculations.
Read marker distances and affected and unaffected haplotypes, calculate their allele frequencies, and build markers from it.
(read-distances-affected/unaffected-markers kappa mu
position-file
affected-file
unaffected-file)
(use-modules (generecon MS haplotype))
(read-distances-affected/unaffected-markers 0.01 1e-5 pos-file af-filename uaf-filename)
Read marker distances and affected and unaffected haplotypes, calculate their allele frequencies, and build markers from it.
Read a list of marker-distances from `distances-file', a list of affected haplotypes from `affected-file' and a list of unaffected haplotypes from `unaffected-file'; calculate the allele frequencies in the controls and use these frequencies to construct a list of affected and a list of unaffected markers. The constructed region and the pair of marker-sets are returned as a list.
The frequencies in the frequency list are sorted with relation to the numerical value of the alleles (least allele first) and missing data (-1) is ignored in the frequency calculations. The alleles in the created haplotypes are re-mapped as indices into the frequency lists.
The first two parameters, kappa and mu, are the recombination rate and mutation rate, respectively. These are used for making the region for the markers.
Reads a list of haplotypes from a file.
(read-haplotype-data file)
(use-modules (generecon MS haplotype))
(read-haplotype-data filename)
Reads a list of haplotypes from a file.
The file must contain lines of white-space separated lists of micro-satellite alleles (non-negative numbers or -1, where -1 indicates missing data). Each line is interpreted as one haplotype, and all lines must have the same number of alleles.
The function returns the parsed haplotypes as a list of lists of alleles. This list can be translated into a list of haplotype objects using `haplotype-list->haplotype-list'.
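The line format can be illustrated with a small parsing sketch in Guile Scheme. The helpers parse-haplotype-line and same-length? are hypothetical names; they show how one line maps to one list of alleles and how the equal-length requirement could be checked, and are not a description of how read-haplotype-data itself is implemented.

```scheme
;; Hypothetical parsing sketch; not the GeneRecon reader.

(define (parse-haplotype-line line)
  ;; Guile's string-tokenize splits on whitespace by default.
  (map string->number (string-tokenize line)))

(define (same-length? rows)
  ;; True when every parsed line has the same number of alleles.
  (or (null? rows)
      (let ((n (length (car rows))))
        (let loop ((rs (cdr rows)))
          (or (null? rs)
              (and (= (length (car rs)) n) (loop (cdr rs))))))))

(parse-haplotype-line "12 7 -1 3")                ; (12 7 -1 3)
(same-length? (map parse-haplotype-line
                   '("12 7 -1 3" "12 8 4 3")))    ; #t
```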
Read marker positions and affected and unaffected haplotypes, calculate their allele frequencies, and build markers from it.
(read-positions-affected/unaffected-markers kappa mu
position-file
affected-file
unaffected-file)
(use-modules (generecon MS haplotype))
(read-positions-affected/unaffected-markers 0.01 1e-5 pos-file af-filename uaf-filename)
Read marker positions and affected and unaffected haplotypes, calculate their allele frequencies, and build markers from it.
Read a list of marker-positions from `position-file', a list of affected haplotypes from `affected-file' and a list of unaffected haplotypes from `unaffected-file'; calculate the allele frequencies in the controls and use these frequencies to construct a list of affected and a list of unaffected markers. The constructed region and the pair of marker-sets are returned as a list.
The frequencies in the frequency list are sorted with relation to the numerical value of the alleles (least allele first) and missing data (-1) is ignored in the frequency calculations. The alleles in the created haplotypes are re-mapped as indices into the frequency lists.
The first two parameters, kappa and mu, are the recombination rate and mutation rate, respectively. These are used for making the region for the markers.
Remaps the alleles in a haplotype into indices in a frequency list.
(remap-haplotype index-tables haplotype)
(use-modules (generecon MS haplotype)) (define haplotype-list (read-haplotype-data file)) (define tables (make-index-tables haplotype-list)) (map (lambda (h) (remap-haplotype tables h)) haplotype-list)
Remaps the alleles in a haplotype into indices in a frequency list.
The index tables are built with `make-index-tables': from a list of allele-lists, it makes a table for each marker, mapping the alleles at this marker to indices, such that the numerically smallest allele is at the first index (0) and the numerically largest allele is at the largest index.
The input to `make-index-tables' need not consist of a single list of alleles but can be any number of such lists, e.g. (make-index-tables allele-list-1 allele-list-2).
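As a rough illustration of the mapping, the sketch below rebuilds the idea in plain Scheme. The names `sketch-make-index-tables' and `sketch-remap-haplotype' are hypothetical, a table is represented simply as a sorted list of alleles, and keeping missing data as -1 after remapping is an assumption rather than documented behaviour.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical stand-in for make-index-tables: one sorted list of distinct
;; alleles per marker; an allele's index is its position in that list.
(define (sketch-make-index-tables . allele-list-lists)
  (map (lambda (column)
         (sort (delete-duplicates
                (filter (lambda (a) (not (= a -1))) column))
               <))
       ;; Pool all argument lists, then transpose haplotypes into columns.
       (apply map list (apply append allele-list-lists))))

;; Hypothetical stand-in for remap-haplotype.
;; Assumption: missing data (-1) is passed through unchanged.
(define (sketch-remap-haplotype tables haplotype)
  (map (lambda (table allele)
         (if (= allele -1)
             -1
             (list-index (lambda (a) (= a allele)) table)))
       tables haplotype))

(define haplotypes '((120 4 7) (118 4 9) (-1 6 7)))
(define tables (sketch-make-index-tables haplotypes))
(define remapped
  (map (lambda (h) (sketch-remap-haplotype tables h)) haplotypes))
;; tables   is ((118 120) (4 6) (7 9))
;; remapped is ((1 0 0) (0 0 1) (-1 1 0))
```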
Module containing functionality for manipulating SNP genotype data.
Calculates the allele frequencies for each marker in the parameter lists.
(calc-frequencies . list-of-list-of-allele-lists)
(use-modules (generecon SNP genotype))
(calc-frequencies affected unaffected)
Calculates the allele frequencies for each marker in the parameter lists.
Takes as input a number of lists of lists of SNPs and calculates a list of frequencies for 0 and 1 at each position in the lists, discarding missing-data SNPs (value -1). For example, (calc-frequencies (list '(0 1 2) '(0 1 1) '(-1 0 0))) will evaluate to a list of three lists, one for each position in the input set: ((1 0) (0.33 0.66) (0.5 0.5)). This indicates that at the first position all alleles are 0 (since -1 is not counted), at the second position one third are 0 and two thirds are 1, and at the third position one half are 0 and one half are 1. The counts follow from the genotype encoding: 0 counts as two 0 alleles, 1 counts as two 1 alleles, and 2 counts as one 0 and one 1.
The input need not consist of a single list of SNPs but can be any number of such lists, e.g. (calc-frequencies affected-haplotypes unaffected-haplotypes).
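The counting rule for genotypes can be made concrete with a stand-alone sketch. The function names below are hypothetical and the code is not GeneRecon's implementation; it only applies the encoding described above (0 and 1 are homozygotes, 2 is a heterozygote, -1 is missing) and returns exact rationals instead of rounded decimals.

```scheme
;; Number of 0-alleles and 1-alleles contributed by one genotype code.
(define (sketch-allele-counts snp)
  (case snp
    ((0) '(2 0))     ; homozygote 0
    ((1) '(0 2))     ; homozygote 1
    ((2) '(1 1))     ; heterozygote 0/1
    (else '(0 0))))  ; missing data (-1) contributes nothing

;; Hypothetical stand-in for calc-frequencies on SNP genotypes.
(define (sketch-genotype-frequencies . genotype-lists)
  (map (lambda (column)
         (let* ((counts (map sketch-allele-counts column))
                (zeros (apply + (map car counts)))
                (ones (apply + (map cadr counts)))
                (total (+ zeros ones)))
           (list (/ zeros total) (/ ones total))))
       ;; Pool all argument lists, then transpose genotypes into columns.
       (apply map list (apply append genotype-lists))))

(define freqs
  (sketch-genotype-frequencies (list '(0 1 2) '(0 1 1) '(-1 0 0))))
;; freqs is ((1 0) (1/3 2/3) (1/2 1/2))
```

The third position shows the heterozygote rule at work: the genotypes 2, 1 and 0 contribute three 0 alleles and three 1 alleles in total.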
Translates a list of lists of SNPs into a list of genotype objects.
(genotype-list->genotype-list region genotype-list)
(use-modules (generecon SNP genotype))
(define reg (region kappa mu markers))
(define genotype-list (read-genotype-data file))
(genotype-list->genotype-list reg genotype-list)
Translates a list of lists of SNPs into a list of genotype objects.
This function takes a region, `reg', and a list of lists of SNPs, `genotype-list', and translates the SNP lists into genotype objects over the region.
Read affected and unaffected genotypes and calculate their SNP frequencies.
(read-affected/unaffected-data affected-file unaffected-file)
(use-modules (generecon SNP genotype))
(read-affected/unaffected-data af-filename uaf-filename)
Read affected and unaffected genotypes and calculate their SNP frequencies.
Read a list of affected genotypes from `affected-file' and a list of unaffected genotypes from `unaffected-file' and return them (as lists of lists of SNPs) together with a list of the frequencies of each SNP at each position. For the frequency list, only the unaffected genotypes are considered.
Read marker distances and affected and unaffected genotypes, calculate their SNP frequencies, and build markers from it.
(read-distances-affected/unaffected-markers kappa mu
distances-file
affected-file
unaffected-file)
(use-modules (generecon SNP genotype))
(read-distances-affected/unaffected-markers 0.01 1e-5 dist-file af-filename uaf-filename)
Read marker distances and affected and unaffected genotypes, calculate their SNP frequencies, and build markers from it.
Read a list of marker distances from `distances-file', a list of affected genotypes from `affected-file', and a list of unaffected genotypes from `unaffected-file'; calculate the SNP frequencies in the controls (the unaffected genotypes) and use these frequencies to construct a list of affected and a list of unaffected markers. The constructed region and the pair of marker sets are returned as a list.
The first two parameters, kappa and mu, are the recombination rate and mutation rate, respectively. These are used for making the region for the markers.
Reads a list of genotypes from a file.
(read-genotype-data file)
(use-modules (generecon SNP genotype))
(read-genotype-data filename)
Reads a list of genotypes from a file.
The file must contain lines of white-space separated SNPs (0, 1, 2, and -1, where 0 indicates homozygote 0, 1 indicates homozygote 1, 2 indicates heterozygote 0/1, and -1 indicates missing data). Each line is interpreted as one genotype, and all lines must have the same number of SNPs.
The function returns the parsed genotypes as a list of lists of SNPs. This list can be translated into a list of genotype objects using `genotype-list->genotype-list'.
Read marker positions and affected and unaffected genotypes, calculate their SNP frequencies, and build markers from it.
(read-positions-affected/unaffected-markers kappa mu
position-file
affected-file
unaffected-file)
(use-modules (generecon SNP genotype))
(read-positions-affected/unaffected-markers 0.01 1e-5 pos-file af-filename uaf-filename)
Read marker positions and affected and unaffected genotypes, calculate their SNP frequencies, and build markers from it.
Read a list of marker positions from `position-file', a list of affected genotypes from `affected-file', and a list of unaffected genotypes from `unaffected-file'; calculate the SNP frequencies in the controls (the unaffected genotypes) and use these frequencies to construct a list of affected and a list of unaffected markers. The constructed region and the pair of marker sets are returned as a list.
The first two parameters, kappa and mu, are the recombination rate and mutation rate, respectively. These are used for making the region for the markers.
Module containing functionality for manipulating SNP haplotype data.
Calculates the allele frequencies for each marker in the parameter lists.
(calc-frequencies . list-of-list-of-allele-lists)
(use-modules (generecon SNP haplotype))
(calc-frequencies affected unaffected)
Calculates the allele frequencies for each marker in the parameter lists.
Takes as input a number of lists of lists of SNPs and calculates a list of frequencies for 0 and 1 at each position in the lists, discarding missing-data SNPs (value -1). For example, (calc-frequencies (list '(0 1 0) '(0 1 1) '(-1 0 0))) will evaluate to a list of three lists, one for each position in the input set: ((1 0) (0.33 0.66) (0.66 0.33)). This indicates that at the first position all SNPs are 0 (since -1 is not counted), at the second position one third are 0 and two thirds are 1, and at the third position two thirds are 0 and one third are 1.
The input need not consist of a single list of SNPs but can be any number of such lists, e.g. (calc-frequencies affected-haplotypes unaffected-haplotypes).
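The effect of passing several lists can be sketched in plain Scheme. `sketch-snp-frequencies' is a hypothetical stand-in, not GeneRecon code, and it assumes that multiple argument lists are simply pooled before counting, which is how the sentence above reads.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical stand-in for calc-frequencies on SNP haplotypes.
(define (sketch-snp-frequencies . haplotype-lists)
  (map (lambda (column)
         (let* ((zeros (count (lambda (s) (= s 0)) column))
                (ones (count (lambda (s) (= s 1)) column)))
           ;; -1 matches neither predicate, so it is not counted.
           (list (/ zeros (+ zeros ones)) (/ ones (+ zeros ones)))))
       ;; Pool all argument lists, then transpose haplotypes into columns.
       (apply map list (apply append haplotype-lists))))

(define affected '((0 1 0) (0 1 1)))
(define unaffected '((-1 0 0)))

(define pooled (sketch-snp-frequencies affected unaffected))
(define single (sketch-snp-frequencies (append affected unaffected)))
;; Both are ((1 0) (1/3 2/3) (2/3 1/3))
```

Under that assumption, two argument lists give the same result as one concatenated list.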
Translates a list of lists of SNPs into a list of haplotype objects.
(haplotype-list->haplotype-list region haplotype-list)
(use-modules (generecon SNP haplotype))
(define reg (region kappa mu markers))
(define haplotype-list (read-haplotype-data file))
(haplotype-list->haplotype-list reg haplotype-list)
Translates a list of lists of SNPs into a list of haplotype objects.
This function takes a region, `reg', and a list of lists of SNPs, `haplotype-list', and translates the SNP lists into haplotype objects over the region.
Read affected and unaffected haplotypes and calculate their SNP frequencies.
(read-affected/unaffected-data affected-file unaffected-file)
(use-modules (generecon SNP haplotype))
(read-affected/unaffected-data af-filename uaf-filename)
Read affected and unaffected haplotypes and calculate their SNP frequencies.
Read a list of affected haplotypes from `affected-file' and a list of unaffected haplotypes from `unaffected-file' and return them (as lists of lists of SNPs) together with a list of the frequencies of each SNP at each position. For the frequency list, only the unaffected haplotypes are considered.
Read marker distances and affected and unaffected haplotypes, calculate their SNP frequencies, and build markers from it.
(read-distances-affected/unaffected-markers kappa mu
distances-file
affected-file
unaffected-file)
(use-modules (generecon SNP haplotype))
(read-distances-affected/unaffected-markers 0.01 1e-5 dist-file af-filename uaf-filename)
Read marker distances and affected and unaffected haplotypes, calculate their SNP frequencies, and build markers from it.
Read a list of marker distances from `distances-file', a list of affected haplotypes from `affected-file', and a list of unaffected haplotypes from `unaffected-file'; calculate the SNP frequencies in the controls (the unaffected haplotypes) and use these frequencies to construct a list of affected and a list of unaffected markers. The constructed region and the pair of marker sets are returned as a list.
The first two parameters, kappa and mu, are the recombination rate and mutation rate, respectively. These are used for making the region for the markers.
Reads a list of haplotypes from a file.
(read-haplotype-data file)
(use-modules (generecon SNP haplotype))
(read-haplotype-data filename)
Reads a list of haplotypes from a file.
The file must contain lines of white-space separated SNPs (0, 1, and -1, where -1 indicates missing data). Each line is interpreted as one haplotype, and all lines must have the same number of SNPs.
The function returns the parsed haplotypes as a list of lists of SNPs. This list can be translated into a list of haplotype objects using `haplotype-list->haplotype-list'.
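To make the file format concrete, here is a minimal parser sketch for it. It reads from a port so that it can be tried on a string; the function names are hypothetical, and skipping blank lines and raising an error on malformed input are assumptions, since the text does not say how GeneRecon handles either.

```scheme
(use-modules (ice-9 rdelim) (srfi srfi-1))

;; Parse one line of white-space separated SNPs (0, 1 or -1).
(define (sketch-parse-snp-line line)
  (map (lambda (token)
         (let ((snp (string->number token)))
           (if (member snp '(0 1 -1))
               snp
               (error "not a haplotype SNP value:" token))))
       (string-tokenize line)))

;; Hypothetical port-based stand-in for read-haplotype-data.
(define (sketch-read-haplotypes-from-port port)
  (let loop ((rows '()))
    (let ((line (read-line port)))
      (cond ((eof-object? line)
             (let ((result (reverse rows)))
               ;; All haplotypes must have the same number of SNPs.
               (unless (or (null? result)
                           (every (lambda (row)
                                    (= (length row) (length (car result))))
                                  result))
                 (error "lines differ in number of SNPs"))
               result))
            ((null? (string-tokenize line)) (loop rows)) ; assumption: skip blank lines
            (else (loop (cons (sketch-parse-snp-line line) rows)))))))

(define parsed
  (call-with-input-string "0 1 -1\n1 1 0\n"
    sketch-read-haplotypes-from-port))
;; parsed is ((0 1 -1) (1 1 0))
```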
Read marker positions and affected and unaffected haplotypes, calculate their SNP frequencies, and build markers from it.
(read-positions-affected/unaffected-markers kappa mu
position-file
affected-file
unaffected-file)
(use-modules (generecon SNP haplotype))
(read-positions-affected/unaffected-markers 0.01 1e-5 pos-file af-filename uaf-filename)
Read marker positions and affected and unaffected haplotypes, calculate their SNP frequencies, and build markers from it.
Read a list of marker positions from `position-file', a list of affected haplotypes from `affected-file', and a list of unaffected haplotypes from `unaffected-file'; calculate the SNP frequencies in the controls (the unaffected haplotypes) and use these frequencies to construct a list of affected and a list of unaffected markers. The constructed region and the pair of marker sets are returned as a list.
The first two parameters, kappa and mu, are the recombination rate and mutation rate, respectively. These are used for making the region for the markers.
Module containing functionality common to all input types and MCMC algorithms supported by GeneRecon.
Creates a list of positions from a list of distances.
(distances->positions distances)
(use-modules (generecon common))
(define distances (read-distances distances-file))
(define positions (distances->positions distances))
Creates a list of positions from a list of distances between the positions. The first position is placed at 0.
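The conversion amounts to a running sum. The sketch below uses a hypothetical function, `sketch-distances->positions', and assumes that n distances describe the gaps between n+1 markers, so the result has one more element than the input.

```scheme
;; Hypothetical stand-in for distances->positions: a running sum that
;; starts at 0. Assumption: n distances yield n+1 positions.
(define (sketch-distances->positions distances)
  (let loop ((rest distances) (current 0) (acc '(0)))
    (if (null? rest)
        (reverse acc)
        (let ((next (+ current (car rest))))
          (loop (cdr rest) next (cons next acc))))))

(define positions (sketch-distances->positions '(1 2 3)))
;; positions is (0 1 3 6)
```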
Translates a list of positions and a list of frequency-lists into a list of markers.
(make-markers positions frequency-lists)
(use-modules (generecon common))
(make-markers '(0.2 0.5) (list '(0.4 0.6) '(0.2 0.8)))
Translates a list of positions and a list of frequency-lists into a list of markers.
The `positions' list should be a list of numbers, and `frequency-lists' should be a list of the same length as `positions', where each element is, in turn, a list of frequencies summing to 1.
The markers are created by mapping each position to the corresponding list of frequencies, making the frequencies the allele frequencies for the marker at the given position.
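The pairing and the two input requirements can be sketched without GeneRecon. Here `sketch-make-markers' is a hypothetical name, a marker is represented as a plain list (position followed by its frequencies) rather than a real marker object, and the tolerance of 1e-9 on the sum is an arbitrary choice for the illustration.

```scheme
;; Hypothetical stand-in for make-markers. A "marker" here is just
;; (position . frequencies), not a GeneRecon marker object.
(define (sketch-make-markers positions frequency-lists)
  (unless (= (length positions) (length frequency-lists))
    (error "positions and frequency-lists differ in length"))
  (map (lambda (pos freqs)
         ;; Each frequency list should sum to 1 (up to rounding).
         (unless (< (abs (- (apply + freqs) 1)) 1e-9)
           (error "frequencies do not sum to 1:" freqs))
         (cons pos freqs))
       positions frequency-lists))

(define markers
  (sketch-make-markers '(0.2 0.5) (list '(0.4 0.6) '(0.2 0.8))))
;; markers is ((0.2 0.4 0.6) (0.5 0.2 0.8))
```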
Read the distances between marker positions from a file and return them as a list.
(read-distances filename)
(use-modules (generecon common))
(read-distances distances-file-name)
Read the distances between marker positions from a file and return them as a list.
Read the distances between marker positions from a file and return them as a list of positions (placing the first marker at 0).
(read-distances->positions filename)
(use-modules (generecon common))
(read-distances->positions distances-filename)
Read the distances between marker positions from a file and return them as a list of positions (placing the first marker at 0).
Read the distances between marker positions from a port and return them as a list of positions (placing the first marker at 0).
(read-distances->positions-from-port port)
(use-modules (generecon common))
(read-distances->positions-from-port (current-input-port))
Read the distances between marker positions from a port and return them as a list of positions (placing the first marker at 0).
Read the distances between marker positions from a port and return them as a list.
(read-distances-from-port port)
(use-modules (generecon common))
(read-distances-from-port (current-input-port))
Read the distances between marker positions from a port and return them as a list.
Read a list of space or newline separated numbers from a file.
(read-numbers filename)
(use-modules (generecon common))
(read-numbers numbers-file-name)
Read a list of space or newline separated numbers from a file.
Read a list of space or newline separated numbers from a port.
(read-numbers-from-port port)
(use-modules (generecon common))
(read-numbers-from-port (current-input-port))
Read a list of space or newline separated numbers from a port.
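One simple way to read such a list is to call the Scheme reader repeatedly until end of file. The sketch below takes that approach with a hypothetical function name; it is not necessarily how GeneRecon parses its input, and rejecting non-numeric tokens with an error is an assumption.

```scheme
;; Hypothetical stand-in for read-numbers-from-port, built on `read'.
(define (sketch-read-numbers-from-port port)
  (let loop ((acc '()))
    (let ((datum (read port)))
      (cond ((eof-object? datum) (reverse acc))
            ((number? datum) (loop (cons datum acc)))
            (else (error "not a number:" datum))))))

;; Try it on a string port instead of a file.
(define numbers
  (call-with-input-string "0.1 0.25\n3 4.5"
    sketch-read-numbers-from-port))
;; numbers is (0.1 0.25 3 4.5)
```

Because the reader treats spaces and newlines alike as separators, the layout of the input does not matter.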
Read the marker positions from a file and return them as a list.
(read-positions filename)
(use-modules (generecon common))
(read-positions positions-file-name)
Read the marker positions from a file and return them as a list. The list of positions will be sorted.
Read the marker positions from a port and return them as a list.
(read-positions-from-port port)
(use-modules (generecon common))
(read-positions-from-port (current-input-port))
Read the marker positions from a port and return them as a list. The list of positions will be sorted.
Run the function `program' in `no-calls' parallel processes.
(run-in-subprocesses no-calls program)
(use-modules (generecon common))
(define (run-markov-chain id)
  (let* ((au-set (affected/unaffected-haplotype-set
                  reg affected-haplotypes unaffected-haplotypes))
         (au-tree (distance-tree au-set))
         (s (sampler (list (list 'disease-locus 1 locus-file)
                           (list 'likelihood 1 likelihood-file)))))
    (run-mcmc ps s 10000)))
(run-in-subprocesses 10 run-markov-chain)
Creates a number of processes running in parallel, executing the supplied function in each of them. The supplied function is called with a single argument, a number between 0 and `no-calls' - 1, that can be used to uniquely identify the process.
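The general pattern of running one function in several child processes can be sketched with Guile's POSIX interface (`primitive-fork', `waitpid', `status:exit-val', `primitive-exit'). This is an illustration under assumptions, not GeneRecon's implementation: `sketch-run-in-subprocesses' is a hypothetical name, and waiting for every child and returning their exit codes is a choice made for the example.

```scheme
;; Hypothetical sketch: fork `no-calls' children, call (program id) in
;; each with id = 0 .. no-calls - 1, then wait for all of them.
(define (sketch-run-in-subprocesses no-calls program)
  (let ((pids
         (map (lambda (id)
                (let ((pid (primitive-fork)))
                  (if (= pid 0)
                      ;; Child: run the program and exit without
                      ;; returning into the parent's code.
                      (catch #t
                        (lambda () (program id) (primitive-exit 0))
                        (lambda args (primitive-exit 1)))
                      ;; Parent: remember the child's process id.
                      pid)))
              (iota no-calls))))
    ;; Parent: collect one exit code per child.
    (map (lambda (pid) (status:exit-val (cdr (waitpid pid)))) pids)))

(define exit-codes
  (sketch-run-in-subprocesses 3 (lambda (id) (* id id))))
```

Since each child is a separate process, anything it computes has to be written to a file or port to be seen by the parent; the `id' argument is convenient for choosing distinct output file names.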