The most interesting scientific papers, like the best detective novels, always come with unexpected twists and turns. A great example is an article that NIDCR scientists and colleagues published online recently in the journal Human Mutation that offers the first comprehensive look at normal and disease-causing sequence variations in the DSPP gene. If the acronym doesn’t ring a bell, the gene encodes dentin sialophosphoprotein, the major non-collagen protein in the bone-like dentin that forms the inner core of a tooth. In the 1990s, researchers determined that the DSPP gene is frequently altered in families with histories of dominantly inherited dentin malformations. Or, more accurately, they discovered that some family members had alterations in the gene’s protein-encoding regions called exons 2 through 4. Largely missing was an analysis of protein-encoding exon five, a bewilderingly repetitive stretch of sequence that some considered beyond the reach of current cloning and sequencing techniques. But, as reported in the Human Mutation paper, the NIDCR scientists succeeded in cloning exon five, cracking its repetitive genetic code, and gaining truly unexpected insights into the evolutionary biology of the gene and the genetics of inherited dentin malformations. Interestingly, their story begins not with a tooth, but a kidney cell. The Inside Scoop recently spoke with two of the paper’s authors: NIDCR’s Dee McKnight, the lead author, and Dr. Larry Fisher, also an NIDCR scientist and the senior author.
Let's start at the beginning. What do the DSPP gene and its protein do?
Fisher: We don’t know their exact function. But I can tell you that DSPP belongs to a family of genes called SIBLINGs that encode small, soluble proteins that are abundant in tooth and bone. There are five SIBLING genes stitched into our DNA and, interestingly, all are clustered in tandem on chromosome four. My laboratory studies three SIBLINGs expressed in bone, and we were obviously interested in studying DSPP, too. Its protein is unique structurally, and it’s extremely important in forming dentin.
Fisher: Well, two points. Dentin is made by specialized cells in the tooth called odontoblasts. The odontoblast expresses the DSPP gene at levels that are many times higher than those of other tissues and organs. So it’s clearly important in making dentin. Secondly - and further supporting this point - the DSPP gene is frequently altered in people with dominantly inherited forms of dentinogenesis imperfecta (DGI) as well as dentin dysplasia (DD).
These are the two most commonly inherited dentin disorders, correct?
Fisher: That’s right. People with DGI typically have amber-brown, opalescent baby and adult teeth that readily fracture and shed their enamel. This exposes the dentin and leads to its erosion, often all the way down to the gums. People with DD have primary teeth that look similar to those of people with DGI, but they often have almost normal adult teeth with only mild discoloration and some unusual root structures that show up on standard dental X-ray films.
The clinical overlap is quite interesting.
Fisher: It is very interesting, and that’s another reason that we wanted to study the gene. But our group faced an ethical dilemma. The odontoblast was long thought to be the only cell in the body that produced the DSPP protein. That meant to clone the gene and study its protein, we would have to remove odontoblasts from a child’s growing tooth. Obviously, we couldn’t do that. Removing odontoblasts would leave the donor permanently without one of his or her natural teeth.
But you forged ahead with an ethically neutral Plan B?
Fisher: Our Plan B was just basic biology. The DSPP gene, although it might not be expressed throughout the body, is written into and present in all DNA-containing cells. For a couple of summers, I’d had a pair of talented students in the lab trying to capture all five pieces of the gene. They tried to stitch the pieces together in the lab, produce a useful gene transcript from the pieces, and ultimately translate the gene transcript to produce the DSPP protein to study.
They were having some success - and then the ballgame suddenly changed.
Fisher: My lab and others discovered that DSPP is expressed at low levels elsewhere in the body, including bone, salivary gland, and kidney. You might ask why a protein expressed primarily in dentin would also be expressed in a soft tissue like the kidney. We don’t know the answer. A real possibility is that nature frequently uses the same protein twice for completely different reasons. For example, the crystallin protein is expressed at high levels in the lens of the eye and functions there as a bricks-and-mortar structural protein; in other parts of the body, crystallin is produced at lower levels and acts as an enzyme. So, it is entirely possible that nature has stumbled onto the fact that, when produced in large amounts, DSPP makes a better tooth. But, for now, that remains an open question.
How did this discovery change things?
Fisher: Well, we could get kidney material from a commercial source, and that helped in our cloning efforts. Even so, DSPP is really like no other protein structurally, particularly the part encoded by exon five. It repeats the same sequence of nine nucleotides, or units of DNA, over and over, leading to an equally repetitive protein. Normally, a long stretch of repeats is not translated into protein. DSPP is a rare exception to the rule.
What do these repeats encode?
Fisher: They encode seven or eight hundred amino acid residues that are actually tandem repeats of a group of three amino acids: serine-serine-aspartic acid. Because most of the serines have a phosphate molecule added to them in the cell, the final protein is extremely water-soluble. In fact, DSPP could very well be the most hydrophilic protein in the body.
And trying to clone exon five is where things got interesting, right?
Fisher: Exactly. We had to worry constantly about errors cropping up in the cloned portions of exon five. The repeats run for about 2,000 nucleotides and, for a variety of technical reasons, long stretches of repeat sequence are very difficult to copy. For example, at some point, the cloning bacteria can rearrange the DNA and various numbers of repeats can be lost. That’s when Dee [McKnight] arrived, and she took on the project.
Exon five became your beast of burden?
McKnight: It did. I noticed that they had been getting cloned sequences that were a little different from those published previously, and Larry asked me to see if these variations were really differences in the underlying DSPP genes. Or were they due to cloning errors caused by the bacteria? So, I started all over again and found clear differences between the sequences. That raised the obvious question: How many differences are there in this gene within humans?
How many are there?
McKnight: We started with DNA samples from 25 Caucasians and just saw tons of differences. That is, people inherit two copies of the DSPP gene – one from their mother, one from their father. When we looked at the DSPP locus in each patient sample, the two copies almost invariably contained two different nucleotide sequences in exon five. We talked about it with some of our NIH colleagues who have expertise in this area, and they told us that these rates of sequence variation were far higher than normal. By that, I mean normally you’d have to look at the two sequences from many people to see even a single difference. That further piqued our interest, and our analyses just expanded. We looked at a range of different geographic populations to see if we could see any useful patterns of change.
McKnight: Right, we looked for a string of single nucleotide changes, or SNPs, that pop up periodically along the gene sequence. A good analogy is the occasional spelling error or difference that you sometimes notice in a novel or newspaper article. Each difference – say between “color purple” and “colon purple” – would be a SNP-like single change. A grouping of related SNPs is called a gene haplotype.
When I compared the DNA samples, I did notice common haplotypes. Some were unique to certain populations. Back to our analogy for a second. An American publisher would use the spelling “color purple,” while a printer in the U.K. would opt for “colour purple.” In much the same way, we saw alleles, or sequence variations, that were spelled in ways that were specific to groups of Asian descent. We then noticed the same or related ones in those who originally migrated out of Asia into North America. We also saw some clusters that were more specific to Africa.
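The spelling analogy can be made concrete with a rough sketch. The snippet below finds the positions where two equal-length sequences differ and then groups samples by the letters they carry at those positions, which is a simplified stand-in for a haplotype. The function names and the toy sequences are hypothetical and invented for illustration; they are not data from the study, and real exon five alleles differ in length, so they would need to be aligned before any comparison like this.

```python
from collections import defaultdict


def snp_positions(seq_a, seq_b):
    """Return 0-based positions where two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


def group_by_haplotype(samples, positions):
    """Group sample names by the letters they carry at the given positions."""
    groups = defaultdict(list)
    for name, seq in samples.items():
        haplotype = "".join(seq[i] for i in positions)
        groups[haplotype].append(name)
    return dict(groups)


# The interview's analogy: one changed letter is a SNP-like difference.
print(snp_positions("color purple", "colon purple"))

# Hypothetical toy sequences, invented for illustration only.
samples = {
    "sample_1": "AGCAGCGACAGCAGTGAC",
    "sample_2": "AGCAGCGACAGCAGTGAC",
    "sample_3": "AGCAGTGACAGCAGCGAC",
}
variable_sites = snp_positions(samples["sample_1"], samples["sample_3"])
print(variable_sites)
print(group_by_haplotype(samples, variable_sites))
```

In this toy case, samples that share the same letters at every variable site fall into the same group, which is the sense in which a set of related SNPs defines a haplotype.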
What’s interesting here is most studies in this area to date have tracked SNPs located throughout the genome. You’re tracking SNPs within a single gene.
Fisher: That’s right, most SNPs that are identified and studied for tracking diseases or traits are found randomly throughout the genome. They are very exciting new tools; but, for studying our ancestry and human migrations, the projects to date have usually limited the searches to SNPs found only in certain types of DNA. Let me explain. To study the genetics of human migration, you must track a DNA sequence that changes sufficiently in different populations over just a few thousand or tens of thousands of years. These sequences typically will not be under selection pressure. That is, they are usually extra DNA, meaning they are not essential to life itself and have the selective freedom to change without killing off the species. These genetic factors have led scientists to study specific parts of mitochondrial DNA, which is maternally inherited, or various genes on the sex-linked Y chromosome, which is paternally inherited.
But there was another consideration?
Fisher: Yes, a second - and very important - consideration is that your DNA of interest must not recombine, or exchange between the two copies of each chromosome that we inherited from our mother and father. Mitochondrial and Y chromosomal DNA do not undergo recombination, but most autosomal genes on our 22 numbered and paired chromosomes do. The DSPP gene is somewhat different for a gene on a numbered chromosome. It changes quite quickly and resides on the part of chromosome four that does not appear to recombine measurably over the time periods relevant to recent human migration.
Does the DSPP gene have the potential to become a tool in tracking human migration?
McKnight: We hope so. We’re totally open to helping other groups that might be interested in giving it a try. The gene is extremely data rich, but the data come with a technical downside. Subcloning and sequencing DSPP is quite labor intensive. For example, it wouldn’t be practical to include DSPP in a study that involves processing thousands of DNA samples. It would be way too cumbersome. But, again, the gene is extremely data rich and, for some specific studies, it may be worth the effort.
So you set out to clone the gene and ended up knee deep in evolutionary genetics. Meanwhile, you pursued another investigative angle to the story.
Fisher: Right, about the time that Dee arrived in my lab, another group reported that differences in exon five were one cause of the form of DGI that is seen in the much-studied Brandywine isolate, a tri-racial group of people from southern Maryland. As Dee looked at her preliminary results, we saw differences in people with normal teeth that were more dramatic than those that had been proposed in the previous paper to cause DGI in the Brandywine isolate.
What do you mean by differences?
Fisher: Well, people had two possible differences in this repeat. The first is Dee noticed the repeats always occurred in multiples of nine DNA bases that correspond directly to the serine-serine-aspartic acid repeat in the protein. But the total number of these repeats can vary from person to person by multiples of nine, such as 18, 27, 90, or even more. The second is variations in the individual SNPs. Dee went back and looked at sample DNA from one of those Brandywine patients with DGI, and she verified that the previous report was correct. The specific changes in the total number of repeats in DSPP did indeed track with the disease. However, we concluded that it probably did not actually cause DGI.
Fisher: Although the variations certainly are there, we now know that’s to be expected in this exon. As I mentioned, Dee saw differences that were even more dramatic among people who have completely normal teeth. So, something else probably is causing DGI in the Brandywine population.
And that’s where fortune again stepped in.
Fisher: Fortune arrived in the person of our NIDCR colleague Dr. Tom Hart. Dee gave a seminar at NIDCR a while ago and presented some of our preliminary data, and Tom mentioned to us afterwards that a lot of DGI and DD patients have alterations that track to that region of chromosome 4. But, for the technical reasons that we discussed earlier, nobody had taken the next step and actually sequenced the entire repeat part of DSPP. Tom graciously offered to share with us his collection of DNA samples from various DGI and DD patients, families, and kindreds, and that really opened the gates for us.
What did you find?
McKnight: There were frameshift mutations within the repeat. In other words, the DSPP gene often lost not nine or a multiple of nine bases but a single nucleotide from one repeat. That, in turn, threw the coding sequence out of alignment. Unlike our language, in which words can be of different lengths and all words are separated by spaces, the language of DNA is written in words of three letters with no spaces to delineate each word. So, a DNA sequence of AGCAGCGAC repeated many times will be read by the cell’s protein-making machinery as AGC is serine, then the second AGC is serine, and finally GAC aspartic acid. Lose the first A, and the protein-making machinery will read the remaining sequence of letters as GCA for alanine, GCG as alanine, and then ACA (the last A coming from the beginning of the next repeat) as threonine. The loss of A shifts the frame of repeats, and you end up with a completely changed string of amino acids and ultimately a different protein. In people with normal dentin, we never encountered these frame shifts. Because they typically have insertions or deletions in multiples of nine, they always stayed in frame. It’s just the total number of serine-serine-aspartic acid repeats that is changed, and that does not seem to change the DSPP function in any currently noticeable way.
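The reading-frame argument can be checked with a minimal sketch. The code below uses the nine-nucleotide unit quoted in the interview (AGCAGCGAC) and a deliberately tiny codon table that covers only the codons mentioned here. The four-repeat toy allele and the function name are assumptions made for illustration, since the real exon five runs to roughly 2,000 nucleotides.

```python
# Minimal codon table covering only the codons used in this illustration.
CODON_TABLE = {
    "AGC": "Ser",
    "GAC": "Asp",
    "GCA": "Ala",
    "GCG": "Ala",
    "ACA": "Thr",
}

REPEAT = "AGCAGCGAC"  # the nine-nucleotide unit quoted in the interview


def translate(dna):
    """Read dna three letters at a time, ignoring an incomplete trailing codon."""
    usable = len(dna) - len(dna) % 3
    return [CODON_TABLE.get(dna[i:i + 3], "???") for i in range(0, usable, 3)]


normal = REPEAT * 4      # toy allele with four repeats (hypothetical length)
in_frame = normal[9:]    # lose one whole repeat, i.e. nine bases
frameshift = normal[1:]  # lose only the first A

print(translate(normal))      # Ser-Ser-Asp, four times
print(translate(in_frame))    # Ser-Ser-Asp, three times: shorter, same pattern
print(translate(frameshift))  # Ala-Ala-Thr repeating: a different protein
```

Deleting nine bases shortens the protein but leaves every remaining codon intact, while deleting a single base changes every codon downstream of the loss.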
The gene encodes a proprotein that gets snipped in two, right?
Fisher: Yes, after the entire DSPP protein is made, a special enzyme then comes along and clips the proprotein into two protein fragments. Most people think the two proteins are the functional end products of the process. But since the functions of these proteins are unknown, we don’t really know whether the clipping initiates the activity or stops their functionality.
How does all of this relate back to making dentin?
Fisher: That’s the ultimate question. What is this protein – or its fragments – actually doing? We don’t know, but we do think we can answer why people with one of their DSPP genes out of frame have poorly formed dentin. Odontoblasts make large amounts of the protein. The frame shift causes DSPP to precipitate in the odontoblast and, because the protein is now very hydrophobic, it just gums up the cell’s machinery. By analogy, think about the mess that would soon happen on a busy highway if suddenly every tenth car stopped running. It would be nice actually to see what these odontoblasts look like under a microscope. We predict that the cell’s transport highways are blocked. But that takes us back to our ethical problem. We can’t easily work in the living odontoblast on a growing tooth.
Given the variability of exon five, you’d think DGI and DD would be extremely common.
Fisher: Well, we’ve been discussing two different types of DNA losses. Over the last few tens of thousands of years, the total number of the serine-serine-aspartic acid repeats has changed a lot among the human population. But this seems to change little or nothing about the quality of the dentin that those of us with “normal” teeth make. Our ancestors passed on such changes without any effect on our health or longevity. Losing a single DNA unit – an event that results in the production of a very different protein – causes the teeth to be very fragile.
Therefore, before modern medical and dental care, people with DD and DGI probably were less likely to survive and have children to carry on their defective gene. So with time, we see many different numbers of repeats in the DSPP gene among people and very few people with losses of only one DNA unit. But it is still an interesting question as to whether people with more repeats have dentin that is in some subtle way different from that of people with fewer repeats. Typically, it’s the enamel that one notices because of its visibility. What about the dentin? Right now, we’re not studying this issue per se, but it’s in the back of our minds. To get at that question from a different angle, however, Dee is now looking at many different mammals. Mice and elephants, for example, both have very good, functional teeth. But they have significantly fewer repeats in exon five than humans. So, do you know the source of any good pangolin DNA? We’d be interested.
Thanks for talking about your paper.