Next-generation sequencing technologies are fast and effective. This is exactly the reason why the International Cancer Genome Consortium, of which the German Cancer Research Center is part, has set itself the goal of sequencing thousands of individual cancer genomes. Private companies also offer the possibility of sequencing genomes for private clients. It looks as if the problematic vision of “googling your genes” might become reality.
The Human Genome Project (HGP) stated in a 2001 publication that it had sequenced around 90 per cent of the three billion base pairs (bp) of the human genome over a period of around seven years and at a cost of around 3 billion US dollars. Since then, the costs and time required to sequence genomes have been falling.
Towards the thousand-dollar genome
In June 2007, 454 Life Sciences announced the sequencing of James Watson's entire genome. At the same time, Watson's rival, the highly controversial gene entrepreneur and researcher Craig Venter, announced the decoding of his own genome, the sequence of which was published and made freely available in the journal PLoS Biology. It is estimated that the sequencing of Watson's and Venter's genomes cost around one million dollars each. The following year, the American author Richard Powers (R. Powers: The Book Me #9) was one of 11 people to have their genomes sequenced as part of the "Personal Genome Project" (PGP). By that time, the price for sequencing a genome was only a third of what it had originally been, despite the work requiring around "2,000 work hours and 9,000 hours of premium computing time". According to an article in the German magazine "SPIEGEL" on 7th October 2008, the Californian company Complete Genomics, Inc. was offering to sequence entire human genomes for as little as US$ 5,000.
The drastic price reductions over the last few years are down to enormous technological progress. Whilst the Human Genome Project was based on a technology developed by Frederick Sanger several decades ago that sequences around 100 stretches of DNA, each no more than 1,000 nucleotides long, “next-generation sequencing” involves devices with a capacity of hundreds, thousands or even millions of nucleotides. Progress continues, and third-generation sequencing technologies based on single molecules are already on the way, requiring even less work time and money. It is envisaged that individuals will be able to afford the sequencing of their own genome in a few years' time. This is expected to cost around 1,000 dollars and to be in high demand.
The field of medicine has already started to use the knowledge obtained from the genome of patients with the objective of finding treatments that are adapted to the genetic constitution of individual patients. "Personalised medicine" promises greater efficacy and fewer undesired side effects. Genetic testing is already mandatory in Germany, for example, for cancers such as breast cancer, metastasising colon cancer and non-small cell lung cancer, before treatment involving targeted (and extremely costly) drugs is initiated. Those working in the field of oncology in particular expect that comprehensive systematic genome analyses will lead to progress in the treatment of cancer patients.
Cancer is a disease of the genome. American studies into colon and cancer patients have shown that no two tumours are alike: all cancer patients differ from one another, and tumours are often characterised by 10,000 or more mutations. Even for one and the same type of cancer, the mutation profile differs from one individual to another. These findings led to the launching of the International Cancer Genome Project, the largest and most ambitious biomedical research effort since the Human Genome Project.
After several years of preliminary work carried out by the International Cancer Genome Consortium (ICGC), which currently involves 22 countries, the institutions involved have now started to generate the data for the project. The ICGC’s goal is to “obtain a comprehensive description of genomic, transcriptomic and epigenomic changes in 50 different tumour types and/or subtypes which are of clinical and societal importance across the globe” (ICGC, March 2010). Around 500 cancer samples of each cancer type will be analysed, with slightly fewer of rare and homogeneous cancers and a larger number of large and heterogeneous cancer types. A total of around 25,000 or more cancer genomes of all the important organ systems (and the corresponding healthy tissues) will be investigated. The ICGC envisages that the results will lead to new strategies for developing targeted therapies with few side effects.
A large number of German scientists are participating in the ICGC. Their work is being coordinated by the German Cancer Research Center (DKFZ) and focuses on paediatric brain tumours, which are the main cause of cancer-related deaths in children. The most important childhood brain tumours are medulloblastoma, which is diagnosed in approximately 100 children in Germany every year, and pilocytic astrocytoma, which is diagnosed in around 200 children every year. The DKFZ has already done comprehensive preliminary work on these two types of paediatric cancer and has compiled comprehensive tumour sample collections, explains Professor Peter Lichter, head of Molecular Genetics at the DKFZ and coordinator of the German ICGC consortium "PedBrainTumor".
The PedBrainTumor project is divided into several subprojects led by internationally acclaimed experts. Besides the DKFZ, the PedBrainTumor project also involves the University and University Hospital of Heidelberg, the National Centre for Tumour Diseases and the European Molecular Biology Laboratory. The Institute of Neuropathology at the University Hospital in Düsseldorf, directed by Professor Guido Reifenberger, will focus on the pathology, quality control and analysis of small RNAs. The transcriptome will be analysed in the laboratory of Professor Hans Lehrach at the Max Planck Institute for Molecular Genetics in Berlin. The DKFZ will sequence the genomic DNA, but "some of the sequencing work will be outsourced to specialised companies," explains Peter Lichter. "Rapid technological progress is causing the price of DNA sequencing to fall continuously. Therefore, we always contract out for six months only in order to reduce the costs of the project as a whole," Lichter adds. The DKFZ also analyses the methylation state (i.e. epigenetic mutations) of specific genes.
Preliminary estimates indicate that approximately 30-fold genome coverage of the cancer samples will be required to ensure the quality of the results. A particular challenge is the storage and analysis of the huge quantity of data. Under ICGC regulations, the generated data will be classified either as general data, to which the international community has free access, or as personal, sensitive data, access to which is strictly controlled. The latter category includes individual genome sequences, gene expression data and all phenotypic data. The data generated by the German ICGC projects will be collected by Professor Roland Eils, deputy spokesperson of the consortium. Eils, head of the Department of Theoretical Bioinformatics at the DKFZ, is currently setting up one of the world's largest life science storage units, with a capacity of six million gigabytes, at the BioQuant Centre at the University of Heidelberg. The project is financed with funds provided by the German and Baden-Württemberg governments.
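To give a feel for the scale involved, the figures quoted in this article can be combined in a rough back-of-envelope calculation. The Python sketch below is only an illustration and not the consortium's own estimate: it assumes one byte per sequenced base (ignoring quality scores, compression and all derived data), and it assumes that every cancer genome is paired with exactly one healthy-tissue sample. Comparing the result with the six million gigabytes in Heidelberg is meant purely as a yardstick, since that storage unit is not intended to hold the data of the entire ICGC.

```python
# Back-of-envelope estimate of raw sequence volume.
# Values marked "article" are quoted in the text; values marked
# "assumption" are illustrative and not taken from the ICGC.

GENOME_SIZE_BP = 3_000_000_000   # article: ~3 billion base pairs
COVERAGE = 30                    # article: ~30-fold genome coverage
CANCER_GENOMES = 25_000          # article: ~25,000 cancer genomes
STORAGE_CAPACITY_GB = 6_000_000  # article: six million gigabytes
SAMPLES_PER_CASE = 2             # assumption: tumour plus one healthy-tissue sample
BYTES_PER_BASE = 1               # assumption: one byte per base, no quality scores


def sequenced_bases(genome_size_bp: int, coverage: int) -> int:
    """Number of bases that must be read to cover the genome `coverage` times."""
    return genome_size_bp * coverage


def raw_gigabytes(bases: int, bytes_per_base: int = BYTES_PER_BASE) -> float:
    """Raw size in gigabytes (10**9 bytes) for the given number of bases."""
    return bases * bytes_per_base / 1e9


bases_per_sample = sequenced_bases(GENOME_SIZE_BP, COVERAGE)
gb_per_sample = raw_gigabytes(bases_per_sample)
total_gb = gb_per_sample * SAMPLES_PER_CASE * CANCER_GENOMES
share_of_storage = total_gb / STORAGE_CAPACITY_GB

print(f"Bases read per sample:     {bases_per_sample:,}")
print(f"Raw data per sample:       {gb_per_sample:,.0f} GB")
print(f"Raw data for all samples:  {total_gb:,.0f} GB")
print(f"Relative to 6 million GB:  {share_of_storage:.0%}")
```

Under these simplifying assumptions, a single sample at 30-fold coverage already amounts to about 90 gigabytes of raw sequence, and the project as a whole to several million gigabytes. Real data volumes depend heavily on file formats, quality information and compression, so the numbers should be read as orders of magnitude only.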
If it becomes possible for people around the world to have their individual genome sequenced for around 1,000 dollars and stored on an electronic chip, for example, the major problem in the future will be the handling and meaningful analysis of this huge amount of data. Conventional computer technologies will not be up to the task. Back in 2005, Craig Venter contacted Sergey Brin, the co-founder and President of Technology of Google, whose search engine owes its enormous success to a computing capacity that is unique worldwide. Venter described his vision ("googling your genes") as follows: "People will be able to log on to a Google site using the search engine and they will be able to understand things about themselves as they change in real time. What does it mean to have this particular gene variation? What else is known? And instead of having a few elitist scientists doing this and dictating to the world what it means, Google would create several million scientists" (quote from Googling Your Genes - Chapter 26 of "The Google Story" by David Vise).
Sergey Brin was more than happy to team up with Venter. In 2007, Brin married the biotechnologist Anne Wojcicki, co-founder of the Silicon Valley company "23andMe" (named after the number of human chromosome pairs), a "personal genomics company" that develops methods that enable customers to analyse and understand their genetic information. Brin, whose mother suffered from Parkinson's disease, used services offered by 23andMe to discover that he has inherited from his mother a mutation of the LRRK2 gene that predisposes carriers to familial Parkinson's. This mutation raises Brin's statistical risk of developing Parkinson's himself to somewhere between 20 and 80 per cent. The British magazine "The Economist" wrote on 6th December 2008: "Mr. Brin regards his LRRK2 mutation as a bug in his personal code, which is no different from the bugs in computer code that Google's engineers fix every day. By helping himself, he can therefore help others as well. He sees himself as one of the lucky ones. Isn't knowledge always good, and certainly always better than ignorance?" Brin hopes that one day everybody will know their own genetic code and that this will help doctors, patients and researchers to analyse the data and repair the "bugs".
Not everyone, particularly sceptical Europeans, shares Brin’s belief in progress and his optimism. For many people, the names Craig Venter and Google are synonymous with unrestricted and potentially also unscrupulous use of all the different possibilities offered by genetic engineering and information technology. Many people are afraid of becoming a “transparent human”; in other words, they are afraid of data about the most personal possession a human individual can own – his or her individual genes – falling into the wrong hands and being misused. In addition, they are afraid of the arrogant attitude that holds that everything thinkable is also feasible and will one day come about. The question also arises: Do I really want to know my genome, information that might tell me that I am predisposed to developing a disease for which no therapy is available? Nobody should give away his or her “right not to know”. Mistrust in what might happen to the data is not completely unfounded. In the modern era of systems biology, of which bioinformatics is an integral part, the generation of huge amounts of data has become a major research business. The differentiation between knowledge and application and between science and technology has become obsolete, although some researchers say otherwise. Professor Helga Nowotny, Vice President of the European Research Council, and Professor Giuseppe Testa from the European Institute of Oncology wrote in their book “Die gläsernen Gene – Die Erfindung des Individuums im molekularen Zeitalter”: Nowadays, knowing about life means changing it.