Wednesday, September 17, 2014

NIEHS workshop on defining language standards for environmental health

This week Monarch team members co-chaired and attended a National Institute of Environmental Health Sciences (NIEHS) workshop on Development of a Framework for an Environmental Health Science Language (agenda & report). From Love Canal to Chernobyl, from the Clean Water Act to pending regulation of dietary supplements, what we breathe and what we eat are known to contribute to human health outcomes. Consistent capture, transmission, and analysis of environmental exposure data for use across research and clinical settings depend upon standardizing and integrating those data across multiple disciplines.

Because we need to compare phenotypes based upon both genotypes and environmental variables over time, Monarch is very interested in understanding ways to represent and integrate these data. We currently have a great diversity of model organism and human environmental data: reagents targeting specific gene products, physiological perturbations such as exposure to light, drug treatments, and environmental exposures to complex toxicological mixtures.

The goal of the workshop was to initiate a new working group that will focus on the requirements for, and implementation of, vocabulary standards for describing these environments. We had an amazing keynote from Elaine Faustman, in which she discussed metagenomic profiling of antibiotic resistance determinants in Puget Sound to assess impacts on both human health and the ocean. Now that is large-scale (global) data integration! We also had the pleasure of hearing Alexa McCray discuss her group's work on combining a large number of autism clinical instruments using an ontological approach. The aim is to better support analysis and reuse of clinical autism diagnostic data in combination with genomic data, and so help uncover elusive genetic and environmental correlations in autism patients.

And then there was the amusing example of how hard it is to simply find relevant specimens in the NCBI BioSample database due to the lack of standardized language:
Query               # records
Feces               22,592
Faeces              1,750
Ordure              2
Dung                19
Manure              154
Excreta             153
Stool               22,756
Stool NOT faeces    21,798
Stool NOT feces     18,314
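
To make the size of the gap concrete, the counts in the table above can be combined with a little set arithmetic. This is only a rough sketch: it assumes that a "NOT" query behaves as a plain set difference over matching records, and the helper names (`overlap`, `union`) are made up for illustration rather than part of any NCBI tool.

```python
# Record counts copied from the table above (workshop example, 2014).
counts = {
    "feces": 22592,
    "faeces": 1750,
    "stool": 22756,
    "stool NOT faeces": 21798,
    "stool NOT feces": 18314,
}


def overlap(term_a, term_b):
    """Records matching both terms: |A| minus |A NOT B|.

    Assumes the NOT query is a set difference (hypothetical helper).
    """
    return counts[term_a] - counts[f"{term_a} NOT {term_b}"]


def union(term_a, term_b):
    """Records matching either term: |A NOT B| plus |B|."""
    return counts[f"{term_a} NOT {term_b}"] + counts[term_b]


stool_and_feces = overlap("stool", "feces")
stool_and_faeces = overlap("stool", "faeces")
stool_or_feces = union("stool", "feces")

# Share of "stool" records that a search for "feces" alone would not return.
missed_fraction = counts["stool NOT feces"] / counts["stool"]

print(f"stool AND feces:  {stool_and_feces}")
print(f"stool AND faeces: {stool_and_faeces}")
print(f"stool OR feces:   {stool_or_feces}")
print(f"'stool' records missed by 'feces': {missed_fraction:.0%}")
```

Under that assumption, only 4,442 records match both "stool" and "feces", so roughly 80% of the "stool" records would be invisible to someone searching for "feces" alone.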

The outcome of the workshop was a new team with expertise in many disciplines, from biodiversity and ontologies to computer science, model organism biology, and the human exposome. We expect the group to have a long and interesting history of solving what may be one of the hardest, yet most interesting, data integration problems facing biological science today.

If you are interested in following this work, you can subscribe to the new working group list.