
Building a Better Database

November 23, 2010

[Image: Neighbor-joining tree]

In 1895, Rudyard Kipling penned the classic line, “Yours is the Earth, and Everything in it,” a lyrical admonition to young men to hold their heads high, chase their dreams, and make good in the world.  Over the years, this famous line took on a life of its own, eventually morphing into a popular catchphrase for farmers and gardeners.  In this earth-friendly iteration, Kipling’s lyricism now celebrated the benefits of toil and process.  Before an autumn harvest “and everything in it” is possible, tillers of the earth must acquire the source materials, integrate them, and carefully cultivate the fruits of their hard labor.

The same holds true for the curators of online databases who, in collaboration with laboratories around the world, are assembling the first encyclopedic lists of the thousands of microbes that make up the human body’s various biofilms, also referred to as microbiomes.*  How so?  Before a full harvest of each newly submitted microbe and “everything in it” is possible, database curators must know what they’ve received in their inbox.  The problem is that many of the incoming clones and sequences arrive bearing only an automatically generated registration number from GenBank.  Like the temporary tags slapped onto a brand-new car in some states, the registration number says nothing about the microbe’s make, model, and growth patterns.  The curators are often left to double back, where possible, and work out on their own the taxonomy of these otherwise cryptically numbered discoveries.

Take the Human Oral Microbiome Database, or HOMD.  Launched two years ago, the database is a NIDCR-supported effort to assemble a range of scientific data on the estimated 1,000 predominant microorganisms that inhabit the oral cavity and adjacent tissues.  In a recent analysis of 35,000 oral clone sequences, scientists noted that roughly 35 percent originated from unnamed and uncultured phylotypes.  What’s a phylotype, you ask?  The word is roughly akin to “species.”  The difference is that a phylotype defines a species according to the encoded information within its genes, primarily the sequence of the highly conserved 16S ribosomal RNA gene.  For the past two decades, scientists have used this gene like a taxonomic fingerprint to group organisms into an evolutionary family tree.

In the October issue of the Journal of Bacteriology, HOMD curators take an important step forward in cultivating their database “and everything in it.”  First, the authors created a taxonomic framework for HOMD that is anchored on 16S ribosomal RNA gene data.  This phylogeny-based framework consists of 619 taxa (species) in 13 phyla (divisions).  The authors hope this more standardized structure will help researchers to communicate better about the inhabitants and diversity of the human oral microbiome.  Second, the curators analyzed approximately 35,000 clone sequences that their laboratories had obtained over the past 20 years.  The analysis allowed them to validate the initial 619 species in their original taxonomic framework.  They also identified 434 additional named and unnamed candidate taxa that, upon further validation, will be added to HOMD.  As the authors noted, “The HOMD is the first example of a curated human body site-specific microbiome resource . . . One can foresee the development of additional body-site specific curated microbiome resources based on the HOMD model or the framework of the HOMD to be expanded to include the entire human microbiome.”

  • Dewhirst FE, Chen T, Izard J, Paster BJ, Tanner ACR, Yu WH, Lakshmanan A, Wade WG.  The Human Oral Microbiome.  J Bacteriol, Oct 2010;192(19):5002-5017.



* Both terms refer to the complex polymicrobial communities that form throughout the human body.  In the mouth, these communities were primarily known as plaque in decades past.  






This page last updated: February 26, 2014