How Monarch Integrates and Curates Biological Data

As with most biomedical databases, the first step is to identify relevant data from the research community. The Monarch Initiative is focused primarily on phenotype-related resources. We bring in data associated with those phenotypes so that our users can begin to make connections among other biological entities of interest, such as:
- genes
- genotypes
- gene variants (including SNPs, SNVs, QTLs, CNVs, and other rearrangements big and small)
- models (including cell lines, animal strains, species, breeds, as well as targeted mutants)
- pathways
- orthologs
- phenotypes
- publications

We import data from a variety of sources, in formats including databases, spreadsheets, delimited text files, XML, JSON, and Web APIs. Imports run on a monthly schedule, and the data are placed into a Postgres database (hosted by the NIF). Our curation team semantically maps each resource into our data model, primarily using ontologies. This involves both typing relevant columns, mappings between columns (such as between identifier and lab…