New Gene Tool May Unlock Root Causes of Disease

Copyright 2005 Daily News Central

Genetic researchers have made substantial advances in understanding the root causes of common diseases and the history of human evolution, according to a series of reports published in scientific journals this week.

Chief among these accomplishments is the work of an international consortium of more than 200 scientists from Canada, China, Japan, Nigeria, the United Kingdom and the United States published in the October 27 issue of the journal Nature. The team studied DNA samples from four different parts of the world and concluded that genetic variants located physically close to each other are inherited collectively as groups, called haplotypes. The comprehensive catalog of all of these blocks is known as the "HapMap."

"Built upon the foundation laid by the human genome sequence, the HapMap is a powerful new tool for exploring the root causes of common diseases," says David Altshuler, MD, PhD, director of the program in Medical and Population Genetics at the Broad Institute of Harvard and MIT.

"Such understanding is required for researchers to develop new and much-needed approaches to understand the still-elusive root causes of common diseases, such as diabetes, bipolar disorder, cancer and many others," he adds. Altshuler and Peter Donnelly, PhD, of the University of Oxford in England are the corresponding authors of the Nature paper.

Greatest Information in Most Efficient Manner

It has been known for a long time that diseases run in families, with perhaps half the risk of any given common disease explained by genetic differences inherited from one's parents. Inheritance also can play a role in different responses to a drug or to an environmental factor.
Because the underlying causes of these common diseases and therapeutic responses remain largely unknown -- and because knowing this information is necessary for successful development of new approaches to prevention, diagnosis and treatment -- identifying the genetic contributors to human health is a fundamental goal of biomedicine.

A new genomics-based approach to human genetics was proposed nearly a decade ago to catalog common human DNA sequence variations comprehensively and to test them systematically for their association with disease in human populations. Although it is theoretically possible to capture all of this information by sequencing every individual human genome, this is neither technically nor financially feasible.

"The data from the HapMap project allows scientists to select the particular DNA variants that provide the greatest information in the most efficient manner, lowering the costs and increasing the power of genetic research to identify the origin of disease," says Mark Daly, an associate member of the Broad Institute of Harvard and MIT. Daly led the Boston team's statistical and analytical work, and was a member of the writing group for the Nature paper.

Millions of SNPs a Day

Moreover, the HapMap project helped spur a remarkable advance in the technology for testing genetic variations in DNA, making it possible to undertake comprehensive studies in large patient samples. A single nucleotide polymorphism, or SNP (pronounced "snip"), is a small genetic change, or variation, that can occur within a person's DNA sequence.

"When we started doing this work a number of years ago, determining the genotype of a SNP in a patient cost nearly a dollar, and we could do hundreds a day," notes Stacey Gabriel, director of the Broad Institute's Genetic Analysis platform and an author of the Nature paper. "Today the prices have dropped in many cases to a fraction of a penny per genotype, and we can do millions a day," she adds. "This is the difference between not being able to do the studies, and getting them done rapidly and well."

Tag SNPs

The HapMap provides excellent power to capture most human variation and link it to disease or other traits, according to a related paper published in the November issue of Nature Genetics. Paul de Bakker, Roman Yelensky and their colleagues demonstrated this finding by developing and evaluating methods to select "tag SNPs" that capture the genetic variation in each neighborhood with a minimum amount of work. Using these tags, scientists can compare the SNP patterns of people affected by a disease with those of unaffected people far more efficiently than was previously possible.

"Compared to directly genotyping all common SNPs in the genome in all individuals of a disease study, we observe that selected tag SNPs based on HapMap can save genotyping costs by almost an order of magnitude without losing much power to detect a true association," says de Bakker, a postdoctoral fellow in Altshuler and Daly's group at the Broad Institute. De Bakker and colleagues also developed a widely used tool for tag SNP selection.

Previous Computer Models Too Simplistic

Another important observation revealed by the availability of the HapMap data is that previous computer models of human genetics are too simplistic and can lead to false conclusions about the role of genes or genetic loci in different diseases. Stephen Schaffner, Altshuler and their colleagues at the Broad Institute describe the limitations of these prior models in a paper published in the November issue of Genome Research. They also provide the entire scientific community with updated models that more closely approximate reality, based on the empirical data generated by the HapMap Consortium.

"Better computer models can be valuable tools in understanding the nature of human DNA variation, past changes in human population size, and evolutionary selection," says Schaffner, a computational biologist in Broad's program in Medical and Population Genetics.

Candidates for Natural Selection

The public availability of HapMap's genome-wide variation data set also makes it possible for scientists to make systematic examinations of potential natural selection sites in the human genome, as well as to re-evaluate previous claims for such selection. Pardis Sabeti, Eric Lander and their colleagues at the Broad Institute, together with Stephen O'Brien and his colleagues at the National Cancer Institute, used the HapMap data to examine a prominent reported case of natural selection related to HIV infection.

CCR5-Δ32, a genetic variant of a receptor found on T cells that confers strong resistance to infection by HIV and has been implicated in resistance to the bubonic plague, did not arise recently in the human population, they report in the November issue of PLoS Biology.

"With the benefit of greater genotyping and empirical comparisons from the HapMap, we were able to show that the pattern of genetic variation seen at CCR5-Δ32 does not stand out as exceptional relative to other loci across the genome and is consistent with neutral evolution," says Sabeti, a postdoctoral fellow at the Broad Institute. "In fact, the CCR5-Δ32 allele is likely to have arisen more than 5,000 years ago, rather than during the last 1,000 years as was previously thought," Sabeti adds.

In addition to allowing the re-examination of previous claims of selection, the HapMap data give scientists a new way to identify novel candidates for natural selection.

Attainment of Goal

The successful completion of the HapMap has its roots not only in the completion of the human genome sequence in 2001, but also in the massive effort to characterize and catalog the millions of SNPs across the genome. Based on these initial data, the haplotype structure of the human genome was recognized as early as 2001, leading directly to the formation of the International HapMap Consortium. Finally, methods for identifying the influence of natural selection on the human genome were described in 2003.

Altshuler, Lander, Gabriel, Daly and many other Broad Institute scientists led or contributed significantly to all of these efforts, in addition to their role in the completion of the HapMap and demonstrations of its utility, as outlined above.

In October 2002, the International HapMap Consortium set the ambitious goal of creating the HapMap within three years. The Nature paper marks the attainment of that goal with its detailed description of the Phase I HapMap, consisting of more than 1 million SNPs. The consortium is also nearing completion of the Phase II HapMap, which will contain nearly three times as many SNPs as the initial version and will enable researchers to focus their gene searches even more precisely on specific regions of the genome.

In line with the Broad Institute's commitment to building critical resources for the scientific community, HapMap data are freely available in several public databases, including the HapMap Data Coordination Center (http://www.hapmap.org), the NIH-funded National Center for Biotechnology Information's dbSNP (http://www.ncbi.nlm.nih.gov/SNP/index.html) and the JSNP Database (http://snp.ims.u-tokyo.ac.jp) in Japan.