Integrating genomic data across species boundaries is
critical to the successful exploitation of previous investment in this area.
Systematic attempts to do this have thus far had a single-species focus,
e.g. annotating the genome of one species using functional data
from a second. Due to the multiple potential views that could be applied
to the combined data set, a generalised ‘warehousing’ approach
We will develop a new GRID-based system to capture the details
of relationships between genomic data either within or across species in
a way that will enable complex ad-hoc queries to be run, and will demonstrate that
the underlying raw data can be combined to draw maximum benefit from those
data for all genomic communities.
- To define controlled vocabularies relevant to comparative genomics, describing:
  - Containment relationships
  - Nomenclature relationships
- To develop drop-in wrappers for primary and comparative
data sources, across a number of animal, plant and microbial species.
- To implement a Web/GRID middleware layer that will support
operations over the wrapped databases
including the integration of data by reference to controlled vocabularies.
- To demonstrate practical applications, based on those web services, that
address biologically relevant questions, e.g. to assist in identifying
underlying QTL in farm animal or crop plant species.
- To use existing comparative genomics knowledge to infer further comparative