Programming language: Haskell

License: GNU General Public License v3.0 only

Tags: Bioinformatics Specific Industries

Latest version: v0.0.0.2

README

generalized Algebraic Dynamic Programming Homepage

Gene-CluEDO: Gene Cluster Evolution Determined Order

The first paper describes the biological problem. The second and third papers provide the algorithmic background.

1. Prohaska, Sonja J. and Berkemer, Sarah and Externbrink, Fabian and Gatter, Thomas and Retzlaff, Nancy and The Students of the Graphs and Biological Networks Lab 2017 and Hoener zu Siederdissen, Christian and Stadler, Peter F.
   Expansion of Gene Clusters and the Shortest Hamiltonian Path Problem
   2017
   preprint: http://www.bioinf.uni-leipzig.de/~choener/pdfs/pro-ber-2017.pdf

2. Hoener zu Siederdissen, Christian and Prohaska, Sonja J. and Stadler, Peter F.
   Algebraic Dynamic Programming over General Data Structures
   2015, BMC Bioinformatics
   oa: https://doi.org/10.1186/1471-2105-16-S19-S2

3. Hoener zu Siederdissen, Christian and Prohaska, Sonja J. and Stadler, Peter F.
   Dynamic Programming for Set Data Types
   2014, Lecture Notes in Bioinformatics, 8826
   preprint: http://www.bioinf.uni-leipzig.de/~choener/pdfs/hoe-pro-2014.pdf

This program accepts a matrix of distances between nodes (see below for an example). It then calculates the shortest Hamiltonian path between each pair of nodes, where a path travels from its start node through all other nodes and stops at its last node.
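The search described above can be illustrated for small inputs by exhaustive enumeration. The following is only a minimal sketch and not the algorithm Gene-CluEDO uses (the papers listed above describe dynamic programming over sets). The names Matrix, pathLength, pathsBetween, shortestBetween, shortestOverall and exampleMatrix are hypothetical and introduced for illustration; exampleMatrix copies the artificial matrix shown further down in this README.

```haskell
import Data.List (minimumBy, permutations)
import Data.Ord (comparing)

-- | Symmetric distance matrix as nested lists (hypothetical representation);
-- entry (i, j) is the distance between node i and node j.
type Matrix = [[Double]]

-- | Total length of a path given as a list of node indices.
pathLength :: Matrix -> [Int] -> Double
pathLength m path = sum [ m !! i !! j | (i, j) <- zip path (drop 1 path) ]

-- | All Hamiltonian paths on n nodes that start at s and stop at t (s /= t).
pathsBetween :: Int -> Int -> Int -> [[Int]]
pathsBetween n s t =
  [ [s] ++ middle ++ [t]
  | middle <- permutations [ k | k <- [0 .. n - 1], k /= s, k /= t ] ]

-- | Shortest Hamiltonian path from s to t by brute force.
-- Only feasible for small inputs, since (n - 2)! paths are enumerated.
shortestBetween :: Matrix -> Int -> Int -> ([Int], Double)
shortestBetween m s t =
  minimumBy (comparing snd)
    [ (p, pathLength m p) | p <- pathsBetween (length m) s t ]

-- | Shortest Hamiltonian path over all pairs of end nodes (needs n >= 2).
shortestOverall :: Matrix -> ([Int], Double)
shortestOverall m =
  minimumBy (comparing snd)
    [ shortestBetween m s t | s <- [0 .. n - 1], t <- [s + 1 .. n - 1] ]
  where
    n = length m

-- | The artificial example matrix from this README (A..E as indices 0..4).
exampleMatrix :: Matrix
exampleMatrix =
  [ [0, 2, 3, 5, 7]
  , [2, 0, 11, 13, 17]
  , [3, 11, 0, 19, 23]
  , [5, 13, 19, 0, 27]
  , [7, 17, 23, 27, 0]
  ]
```

Loaded into GHCi, shortestOverall exampleMatrix evaluates to ([3,1,2,0,4],34.0), that is the path D, B, C, A, E with total length 34.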

We further calculate all neighbour probabilities via Inside/Outside. This means that for any two nodes we calculate the weight of the edge between them. The weight lies in the interval [0, 1], where 0 denotes that the nodes are almost surely not direct neighbours on a weighted-randomly drawn path, while 1 denotes that they almost surely are.

Finally, we calculate the probability that a node is one of the terminal nodes in the Hamiltonian path, i.e. either the first or the last node.
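The meaning of the neighbour and terminal probabilities can be illustrated by enumerating every Hamiltonian path instead of running Inside/Outside. The sketch below assumes that a path is drawn with probability proportional to exp(-length / temperature). This Boltzmann weighting and the temperature parameter are assumptions of the sketch, not something this README specifies, and all function names are hypothetical.

```haskell
import Data.List (permutations)

-- | Symmetric distance matrix as nested lists (hypothetical representation).
type Matrix = [[Double]]

-- | Total length of a path given as a list of node indices.
pathLength :: Matrix -> [Int] -> Double
pathLength m path = sum [ m !! i !! j | (i, j) <- zip path (drop 1 path) ]

-- | Every Hamiltonian path paired with its normalised probability.
-- ASSUMPTION: Boltzmann weights exp(-length / temperature).
-- Each undirected path appears in both directions, which cancels
-- out when dividing by the partition function z.
pathEnsemble :: Double -> Matrix -> [([Int], Double)]
pathEnsemble temperature m = [ (p, w / z) | (p, w) <- weighted ]
  where
    weighted =
      [ (p, exp (negate (pathLength m p) / temperature))
      | p <- permutations [0 .. length m - 1] ]
    z = sum (map snd weighted)

-- | Probability that nodes i and j are direct neighbours on a drawn path.
edgeProbability :: Double -> Matrix -> Int -> Int -> Double
edgeProbability temperature m i j =
  sum [ pr | (p, pr) <- pathEnsemble temperature m, adjacent p ]
  where
    adjacent p =
      any (\(a, b) -> (a, b) == (i, j) || (a, b) == (j, i))
          (zip p (drop 1 p))

-- | Probability that node i is the first or the last node of a drawn path.
terminalProbability :: Double -> Matrix -> Int -> Double
terminalProbability temperature m i =
  sum [ pr | (p, pr) <- pathEnsemble temperature m, isTerminal p ]
  where
    isTerminal p = take 1 p == [i] || take 1 (reverse p) == [i]
```

Two sanity checks follow from the definitions. Every path on n nodes (n at least 2) has exactly two terminal nodes and n - 1 edges, so the terminal probabilities sum to 2 over all nodes and the edge probabilities sum to n - 1 over all unordered node pairs. Enumeration costs n! paths, which is why a dynamic programming scheme is needed for realistic inputs.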

Installation / Pre-compiled Binaries

Binaries for Linux x86-64 are available from GitHub. They can be downloaded here: https://github.com/choener/Gene-CluEDO/releases
Installation from source is possible using the Haskell stack tool, as described at the bottom of this page: http://www.bioinf.uni-leipzig.de/~choener/software/Gene-CluEDO.html
Another installation option is cabal new-install (preferred for development, but more involved to set up).

Input data used for the Expansion of Gene Clusters paper

The data sets are available together with the sources or the binary release. Check the data folder. The run-all.sh script runs the four examples.

The Biological Problem We Solve

Wikipedia on Hox clusters.

Hox clusters are a set of genes that are linearly ordered. The genes are assumed to descend from a single originating gene, and repeated duplication has produced the cluster, whose duplication tree is unknown.

The long time scales involved make it hard to produce a tree that can be trusted. This program therefore produces summary information in the form of edge path probabilities.

Example matrix:

In this artificial distance matrix, we have prime numbers as distances between nodes. Store the matrix in a file, say mat.dat.

#   A   B   C   D   E
A   0   2   3   5   7
B   2   0  11  13  17
C   3  11   0  19  23
D   5  13  19   0  27
E   7  17  23  27   0

Now run the algorithm with ./GeneCluEDO -o output.run ./mat.dat. After the program has run, output.run contains a wealth of information about the input: the maximum likelihood path, the edge weights, the end probabilities, and the maximum expected accuracy path.

Two additional files, here output.boundary.svg and output.edge.svg, are produced. The boundary plot shows the probability that a node (or gene) is the start or end node. The edge probability plot shows the probability of each edge (i,j) between nodes. This reveals the most likely neighbors, and therefore genetic relationships, over all possible gene orders.
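For readers who want to prepare or check their own input, the layout of the example matrix can be read with a few lines of Haskell. This is a sketch that assumes the format is exactly what the example shows: a header line starting with # followed by column labels, then one whitespace-separated row per node, beginning with its label. The actual parser in Gene-CluEDO may accept more or less than this. DistanceMatrix, parseMatrix and parseRow are hypothetical names.

```haskell
import Text.Read (readMaybe)

-- | Node labels and the rows of distances, in header order.
data DistanceMatrix = DistanceMatrix
  { labels :: [String]
  , rows   :: [[Double]]
  } deriving (Eq, Show)

-- | Parse the whitespace-separated layout of the example matrix.
-- ASSUMPTION: the header is "#" followed by labels, and every row
-- starts with its label; blank lines are ignored.
parseMatrix :: String -> Either String DistanceMatrix
parseMatrix input =
  case filter (not . null) (map words (lines input)) of
    (("#" : names) : body) -> do
      parsed <- mapM (parseRow (length names)) body
      if map fst parsed == names
        then Right (DistanceMatrix names (map snd parsed))
        else Left "row labels do not match the header"
    _ -> Left "expected a header line starting with '#'"

-- | Parse one row: a label followed by exactly n numeric entries.
parseRow :: Int -> [String] -> Either String (String, [Double])
parseRow _ [] = Left "empty row"
parseRow n (name : cells)
  | length cells /= n = Left ("wrong number of entries in row " ++ name)
  | otherwise =
      case mapM readMaybe cells of
        Just values -> Right (name, values)
        Nothing     -> Left ("non-numeric entry in row " ++ name)
```

Applying parseMatrix to the contents of mat.dat, for example via fmap parseMatrix (readFile "mat.dat") in GHCi, yields either an error message or the labels together with the numeric rows.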

Contact

Christian Hoener zu Siederdissen
Leipzig University, Leipzig, Germany
http://www.bioinf.uni-leipzig.de/~choener/