Event

February 9-12, 2014 Arlington, Virginia

Genomic Sciences Program (GSP) 2014

Contractors-Grantees Meeting XII
Our phenotypic knowledgebase will complement the DOE KBase by providing a reference set of phenotypic data for nearly all published type strains of Bacteria and Archaea.
Our phenotypic knowledgebase will complement the DOE KBase by providing a reference set of phenotypic data for nearly all published type strains of Bacteria and Archaea.

Charles Parker and George Garrity will be presenting poster 170 (“Semantic Index of Phenotypic and Genotypic Data”, Abstract Book, pages 297-298) during Tuesday evening’s mixer (5:00pm-7:00pm) in Independence Center. We will be highlighting our team’s recent research on Information Extraction (IE), reasoning and ontology query.

This project has presented technical challenges that require creative solutions across several areas of information science.

Many ontologies consist of a large thesaurus of terms in a narrowly-defined domain and do not contain any reasoning capability beyond the taxonomic structure of the vocabulary and relations among concepts. Our objective is to develop an ontology that covers many broad feature domains and contains axioms encoded in first order logic that enable reasoning and inference over sparse phenotypic data, even in feature domains that contain partially-overlapping concepts and terms that map to undefined ranges of environmental conditions. In order to accomplish this, we have developed a core ontology model that maps between imprecise phenotypic features and precise environmental data.

In our current work, we are applying these novel modeling techniques to encode Tbox axioms for automatically resolving ambiguity attributed to the semantic equivalence and imprecision of phenotypic terms arising in literature. These axioms will enable reasoners to make appropriate inferences over the ontology and phenotypic data. We are also developing a query and retrieval service linked to the ontology that will provide researchers with consistent, accurate interpretations of these data that are usable for predictive modeling and in other research and commercial applications.

Several additional software components were developed to overcome technical barriers that arose during this project. Originally implemented as command-line utilities for vocabulary extraction, annotation and document analysis, we are now developing these into a commercial semantic desktop application for document/corpus analysis and for bootstrapping terminology/ontology development.

Parker et al., Semantic Index of Phenotypic and Genotypic Data

Download Abstract (46kB PDF) Download Poster (5.8MB PDF)

[permalink] Posted January 15, 2014.

Back to top