
NIEHS workshop on defining language standards for environmental health

This week Monarch team members co-chaired and attended a National Institute of Environmental Health Sciences (NIEHS) workshop on Development of a Framework for an Environmental Health Science Language (agenda & report). From Love Canal to Chernobyl, from the Clean Water Act to pending regulation of dietary supplements, what we breathe and what we eat are known to contribute to human health outcomes. Consistent capture, transmission, and analysis of these environmental data for use across multiple research and clinical environments depend upon standardization and integration of the data across multiple disciplines.

Because we need to compare phenotypes based upon both genotypes and environmental variables over time, Monarch is very interested in understanding ways to represent and integrate these data. We currently have a great diversity of model and human environmental data: reagents targeting specific gene products, physiological perturbations such as exposure to light, drug treatments, and environmental exposures to complex toxicological mixtures.

The goal of the workshop was to initiate a new working group that will focus on requirements and implementation of environmental vocabulary standards for describing these environments. We had an amazing keynote from Elaine Faustman, who discussed metagenomic profiling of antibiotic resistance determinants in Puget Sound to assess impacts on both human health and the oceans. Now that is large-scale (global) data integration! We also had the pleasure of hearing Alexa McCray discuss her group's work on combining a large number of autism clinical instruments using an ontological approach. The aim is to better support analysis and reuse of clinical autism diagnostic data in combination with genomic data, and so to help uncover elusive genetic and environmental correlations in autism patients.

And then there was the amusing example of how hard it is simply to find relevant specimens in the NCBI BioSample database due to the lack of standardized language:
# records
Stool NOT faeces
Stool NOT feces
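
The queries above return different records only because the same specimen type is written three different ways. As a minimal sketch of what a shared vocabulary buys you, the Python below maps each variant to one canonical label before matching. The synonym table, the function names, and the sample records are all hypothetical and made up for illustration; this is not how BioSample search works.

```python
# Hypothetical synonym table: every spelling variant maps to one canonical label.
SYNONYMS = {
    "stool": "feces",
    "faeces": "feces",
    "feces": "feces",
}


def normalize(term: str) -> str:
    """Return the canonical label for a term, or the cleaned term if unknown."""
    key = term.strip().lower()
    return SYNONYMS.get(key, key)


def matches(record_label: str, query: str) -> bool:
    """True when the record label and the query share a canonical label."""
    return normalize(record_label) == normalize(query)


# Made-up specimen labels, standing in for free-text database fields.
records = ["Stool", "faeces", "Feces", "blood"]
hits = [label for label in records if matches(label, "stool")]
print(hits)
```

With the variants collapsed to one label, a single query for any of the three spellings finds all three specimens, which is the behavior a standardized language is meant to provide.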

The outcome of the workshop was a new team with expertise across many disciplines, from biodiversity to ontologies, computer science, model organism biology, and the human exposome. We predict that the group will have a long and interesting history of solving what may be one of the hardest, yet most interesting, data integration problems facing biological science today.

If you are interested in following this work, you can subscribe to the new working group list.
