Genomics

On November 15, 2009, in Genomics

There have been so many articles on genomics-related topics in recent months that this issue of the Digest would need to be very long indeed to present even the tightest summary. However, the articles seemed to fit more or less nicely into one of three categories: The accelerating development of genomics technologies and initiatives, the impact this has on genomics understanding, and the impact that has in terms of genetic therapies for disease. Therefore, we have divided this issue into three parts, to be published serially. We begin the series with…

The Acceleration of Genomics Technologies and Initiatives

Genome Projects

The International HapMap Project resulted in a high-level map of more than 100 regions of the genome containing genetic variants associated with risk of diabetes, age-related macular degeneration, prostate and breast cancer, and coronary artery disease, among other conditions. More detailed maps are now needed to build on and extend that knowledge, and an international effort called the 1000 Genomes Project is under way to do just that.

Using new sequencing technologies and novel computational methods, the project will sequence the genomes of at least 1,000 people to produce a catalog of the genetic variations present at 1 percent or greater frequency in the human population, along with their impacts. Project data will be made available free of charge to the worldwide scientific community.
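To see why a sample of about 1,000 people is a reasonable fit for a 1 percent frequency threshold, here is a rough sketch in Python. It is not the project's own method. It assumes volunteers are sampled at random, that each person carries two copies of the genome, and that the copies are independent draws. All function names are hypothetical.

```python
# Hypothetical back-of-the-envelope model; NOT the 1000 Genomes Project's method.
# Assumptions: random sampling, two genome copies per person (diploid),
# and every copy is an independent draw from the population.

def chromosomes_sampled(people: int) -> int:
    """Number of genome copies in a sample of diploid individuals."""
    return 2 * people

def expected_copies(people: int, allele_frequency: float) -> float:
    """Expected number of copies of a variant in the sample."""
    return chromosomes_sampled(people) * allele_frequency

def prob_variant_in_sample(people: int, allele_frequency: float) -> float:
    """Probability that at least one sampled copy carries the variant."""
    return 1.0 - (1.0 - allele_frequency) ** chromosomes_sampled(people)

PEOPLE = 1_000      # minimum sample size quoted in the article
FREQUENCY = 0.01    # the 1 percent threshold quoted in the article

copies = expected_copies(PEOPLE, FREQUENCY)
p_seen = prob_variant_in_sample(PEOPLE, FREQUENCY)
print(f"Expected copies of the variant in the sample: {copies:.0f}")
print(f"Chance the variant appears at least once: {p_seen:.6f}")
```

Under these assumptions a 1 percent variant would be expected to turn up about 20 times among 1,000 volunteers. Being present in the sample is not the same as being reliably detected by the sequencing itself, which this sketch does not model.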

A similar initiative, the Personal Genome Project (PGP), seems both more ambitious and further ahead. The PGP aims to advance sequencing technology, better understand human health and disease, and better understand the ethical and societal issues surrounding personal genomics. The genomes of the first ten of potentially as many as 100,000 volunteers have already been sequenced and analyzed, and cell lines have been created from their tissue samples. Both the database and the cell lines will be made available to researchers.

The project website will carry stripped-down, de-identified versions of each volunteer’s medical record. Together with the genome data, this will facilitate research into links between genotype and phenotype. In contrast, the 1000 Genomes Project collects genetic information only.

Thousands have volunteered for the next phase of PGP. They will be asked to donate $1,000 to the project, to help cover costs.

Data Deluge

Projects such as these generate a data tsunami. Will we be able to analyze it? Probably, if the new supercomputer at Arizona State University is any indication. It can crunch data at a rate of 50 teraflops (50 trillion floating-point operations per second). That is modest compared to the 3 petaflops (3,000 trillion flops) about to come on stream in IBM’s Blue Gene/P computer, but still quite respectable for the moment.

It will be used to crunch up to 1.5 petabytes (1.5 x 10¹⁵ bytes) of data gushing from DNA sequencing, genotyping, and bioinformatics technologies, all in the cause of “translational biomedicine,” meaning targeted therapies for individual patients suffering from Alzheimer’s, autism, diabetes, coronary heart disease, melanoma, pancreatic cancer, prostate cancer, colon cancer, multiple myeloma, and breast cancer. The supercomputer’s ability to simulate systems at the molecular level eliminates much of the expense, time, and difficulty of experimentation in molecular medicine and biology.

The technologies feeding the voracious data appetites of biomedical supercomputers are likely to include DNA sequencers under development at Pacific Biosciences. The company has used US$178 million of funding to produce prototype sequencers that can read single strands of DNA in real time. The innovation in these sequencers is a biochip with several thousand wells, each holding an enzyme. The strand of DNA to be read is added to each well along with fluorescently labeled bases (A, C, G, T). A camera then captures the ensuing sequencing reaction. The machines can currently sequence 12 million bases of DNA per hour, or about one-third of a percent of a human genome.
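The throughput figure above can be checked with a few lines of Python. This is only a sketch: it assumes a human genome of 3 gigabases (the round figure used elsewhere in this article), and the function names are illustrative.

```python
# Assumption: one human genome is taken as 3 gigabases, a round figure.
GENOME_BASES = 3_000_000_000
PROTOTYPE_BASES_PER_HOUR = 12_000_000  # rate quoted for the prototype sequencers

def fraction_of_genome_per_hour(bases_per_hour, genome_bases=GENOME_BASES):
    """Share of one genome read in an hour at the given rate."""
    return bases_per_hour / genome_bases

def hours_for_one_pass(bases_per_hour, genome_bases=GENOME_BASES):
    """Hours needed to read every base of the genome once."""
    return genome_bases / bases_per_hour

fraction = fraction_of_genome_per_hour(PROTOTYPE_BASES_PER_HOUR)
hours = hours_for_one_pass(PROTOTYPE_BASES_PER_HOUR)
print(f"{fraction:.2%} of a genome per hour")
print(f"{hours:.0f} hours (about {hours / 24:.1f} days) for a single pass")
```

With the 3-gigabase assumption the rate works out to roughly 0.4 percent of a genome per hour, in the same range as the figure quoted above. A single uninterrupted pass over the genome would take on the order of ten days.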

The most advanced sequencers currently on the market stop the sequencing reaction after the addition of each base, wash away extra bases, snap a picture, and then repeat. Real-time sequencing is much faster and, because it uses fewer chemicals, much cheaper. Pacific Biosciences’ machine may also “detect rare mutations with unprecedented accuracy, orders of magnitude better than others,” according to the company. The sequencer’s accuracy and speed make it ideal for diagnostics, including cancer and infectious disease detection.

Future versions of the sequencer, with higher resolution cameras, faster enzymes, and more densely packed chips (with up to 10 million wells), could result in an ability to read 100 gigabases an hour—which would mean a complete human genome (3 gigabases verified by multiple passes) in about 15 minutes. And that, in turn, will result in cheaper sequencing.
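The 15-minute estimate implies a certain number of repeat passes over the genome, which a short Python sketch makes explicit. It uses only the figures quoted in this article (12 million bases an hour for the prototype, 100 gigabases an hour projected, a 3-gigabase genome). The function names are illustrative, and the pass count is an inference from those figures, not a company specification.

```python
GENOME_BASES = 3_000_000_000              # genome size used in the article
PROTOTYPE_BASES_PER_HOUR = 12_000_000     # current prototype rate
FUTURE_BASES_PER_HOUR = 100_000_000_000   # projected rate for future versions
MINUTES_PER_GENOME = 15                   # projected time per genome

def bases_read(bases_per_hour, minutes):
    """Total bases read in the given number of minutes."""
    return bases_per_hour * minutes / 60

def passes_over_genome(bases_per_hour, minutes, genome_bases=GENOME_BASES):
    """How many times the whole genome could be read in that time."""
    return bases_read(bases_per_hour, minutes) / genome_bases

total = bases_read(FUTURE_BASES_PER_HOUR, MINUTES_PER_GENOME)
passes = passes_over_genome(FUTURE_BASES_PER_HOUR, MINUTES_PER_GENOME)
speedup = FUTURE_BASES_PER_HOUR / PROTOTYPE_BASES_PER_HOUR

print(f"Bases read in {MINUTES_PER_GENOME} minutes: {total:,.0f}")
print(f"Implied passes over a 3-gigabase genome: {passes:.1f}")
print(f"Speedup over the prototype: about {speedup:,.0f}x")
```

At the projected rate, 15 minutes yields 25 gigabases, or roughly eight passes over the genome. That is consistent with the "verified by multiple passes" qualifier and represents a speedup of several thousand times over the prototype.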

The Holy Grail

In fact, cheaper sequencing is already well on its way. Backed by US$46 million in venture capital and with help from two innovations and several superstars of genomics and systems biology, a startup called Complete Genomics expects to have a $5,000 complete human genome sequencing service on the market by Spring 2009.

The two innovations are a way to densely pack DNA into “nanoballs” that can be sequenced quickly with few (expensive) reagent chemicals, and a method to randomly read DNA letters.

The company will not sell its sequencing machine. Instead, it will sell sequencing services and data for pharmacogenomics and other research. It has already signed a deal to sequence 100 genomes in 2009 and 2,000 genomes in 2010 for the Institute for Systems Biology.

But wait, there’s more. Backed by $20 million in funding and an innovation called nanopore sequencing, Oxford Nanopore Technologies hopes to reach the Holy Grail of a $1,000 genome in two years. Nanopore sequencing reads single molecules of DNA, as do the methods of competitors Helicos BioSciences and Pacific Biosciences, and eliminates the time and expense of fluorescent labeling and amplification of DNA samples in preparation for sequencing. Sample preparation for nanopore sequencing takes hours, compared to days for traditional sequencing.

Oxford’s breakthrough is a method to sequence using an enzyme that slices DNA into individual bases and directs them through the nanopore. Though the company has demonstrated the individual pieces of a system, it still has to integrate them into a whole, and that is acknowledged to be “a big challenge.” Ultimately, the system could be integrated on a chip, with artificial nanopores replacing the current biological ones.

Still, for some customers, price is no object. In August this year, an unidentified Knome customer received a USB drive holding his genome sequence, an analysis of his genetic risks for disease, and software for browsing the sequence data. It cost the customer US$350,000, which is not a bad price for now, and cheaper than most since the sequencing is outsourced to China. (Still, the customer could have saved $349,000 by waiting a couple of years for something like Oxford’s technology to work. You’d have to be a gazillionaire not to care.)

As part of the service, a team of informatics experts and medical consultants analyzes the customer’s sequence data and presents the results at a personalized symposium, where scientists explain the process, the results, and their limitations.

Knome aims to sequence 20 genomes this year and expects to lower the price “soon.” Customers can choose, or not, to contribute their data to a public research database.

Unlike Knome, other personal-genomics companies such as 23andMe, Navigenics, and deCODE analyze only those parts of the genome known to contain disease markers, not the whole genome. 23andMe, which is backed by Google and Genentech, has cut the price of its test from $999 to $399 as a result, it says, of cheaper DNA analysis chips. The company hopes its price cut will result in more customers and therefore a bigger genomic database for research.

Genomics Regulation

Are legislation and regulation keeping up with these accelerating developments in personal genomics?

To some extent, yes. The Genetic Information Nondiscrimination Act (GINA) was signed into US law earlier this year. It prohibits health insurance companies from using genetic information to set premiums or determine enrollment eligibility, and prohibits employers from using genetic information in hiring, firing, or promotion decisions. By reducing the fear of repercussions, the act should encourage more people to undergo genetic testing, leading to a bigger genetics research database, then to more and better research, and ultimately to better disease diagnoses, prophylactics, and therapies.

And in April this year, the New York State Department of Health told 23 companies offering genetic tests that they needed a permit to sell their wares. The New England Journal of Medicine had earlier sounded the alarm about the validity and ethics of such tests. In June, California sent similar letters to 13 companies including Navigenics, 23andMe, and Iceland’s deCODE Genetics warning them not to provide tests to consumers without a physician’s order. At least two of the 13—Navigenics and 23andMe—subsequently received licenses to continue to do business in California.

According to professor Misha Angrist, commenting in MIT’s Technology Review, such regulation is a futile attempt to stem the rising tide of demand evidenced by surveys and by the 250,000 people who each paid the National Geographic Society’s Genographic Project $100 for a genetic ancestry test.

The validity of the tests is certainly questionable and their results can generate undue anxiety.

But Angrist argues that the more people are “allowed—encouraged, even” to experiment with genetic tests, the sooner the promise of personalized medicine can be realized.

There remain, however, significant issues to be addressed. Partners HealthCare CIO John Glaser has listed four:

The EHR will need to be modified to incorporate patient genetic information—the entire genome—as well as decision support. There are no standards in place to ease this modification.

Reimbursement for genetic testing is unlikely without sufficient evidence of clinical validity and beneficial outcomes.

GINA does not put an end to privacy concerns. Privacy is a complex issue because genetic data can be used not only to predict the likelihood of a disease in an individual but also, for example, to reveal information about the individual’s family members.

Neither providers nor patients can keep up with the tsunami of tests well enough to know how they work, whether they are accurate, when they are appropriate, or how to interpret the results.

Where is all this accelerating development in genomic technologies and initiatives taking us? We’ll tell you in the next issue.