Abstract:
Recent years have seen enormous growth in “omics” data, of which high-throughput
gene expression data are arguably the most important from a functional perspective.
Despite huge improvements in computational techniques for the functional classification
of gene sequences, common similarity-based methods often fall short of providing full
and reliable functional information. Recently, the combination of comparative genomics
with approaches in functional genomics has received considerable interest for gene
function analysis, leveraging both gene-expression-based guilt-by-association methods
and annotation efforts in closely related model organisms. Besides the identification
of missing genes in pathways, these methods also typically enable the discovery of
biological regulators (i.e., transcription factors or signaling genes). One such
guilt-by-association method is MORPH, which has been shown to be an efficient algorithm
that performs particularly well in identifying and prioritizing missing genes in plant
metabolic pathways. Here, we present MorphDB, a resource where MORPH-based
candidate genes for large-scale functional annotations (Gene Ontology, MapMan bins)
are integrated across multiple plant species. Besides a gene-centric query utility,
we present a comparative network approach that enables researchers to efficiently
browse MORPH predictions across functional gene sets and species, facilitating
gene discovery and candidate gene prioritization. MorphDB is available at
http://bioinformatics.psb.ugent.be/webtools/morphdb/morphDB/index/. We also provide a
toolkit, named “MORPH bulk” (https://github.com/arzwa/morph-bulk), for running
MORPH in bulk mode on novel data sets, enabling researchers to apply MORPH to their
own species of interest.