PLoS Genetics | Year: 2010
Despite the recent rapid growth in genome-wide data, much of human variation remains entirely unexplained. A significant challenge in the pursuit of the genetic basis for variation in common human traits is the efficient, coordinated collection of genotype and phenotype data. We have developed a novel research framework that facilitates the parallel study of a wide assortment of traits within a single cohort. The approach takes advantage of the interactivity of the Web both to gather data and to present genetic information to research participants, while taking care to correct for the population structure inherent to this study design. Here we report initial results from a participant-driven study of 22 traits. Replications of associations (in the genes OCA2, HERC2, SLC45A2, SLC24A4, IRF4, TYR, TYRP1, ASIP, and MC1R) for hair color, eye color, and freckling validate the Web-based, self-reporting paradigm. The identification of novel associations for hair morphology (rs17646946, near TCHH; rs7349332, near WNT10A; and rs1556547, near OFCC1), freckling (rs2153271, in BNC2), the ability to smell the methanethiol produced after eating asparagus (rs4481887, near OR2M7), and photic sneeze reflex (rs10427255, near ZEB2, and rs11856995, near NR2F2) illustrates the power of the approach.
"The response to ResearchKit has been fantastic. Virtually overnight, many ResearchKit studies became the largest in history and researchers are gaining insights and making discoveries that weren't possible before," said Jeff Williams, Apple's chief operating officer. "Medical researchers around the world continue to use iPhone to transform what we know about complex diseases, and with continued support from the open source community, the opportunities for iPhone in medical research are endless." ResearchKit turns iPhone into a powerful tool for medical research by helping doctors, scientists and other researchers gather data more frequently and more accurately from participants anywhere in the world using iPhone apps. Participants enrolled in these app-based studies can review an interactive informed consent process, easily complete active tasks or submit survey responses, and choose how their health data is shared with researchers, making contributions to medical research easier than ever. By delivering ResearchKit as open source, any developer can quickly design a research study for iPhone. They can also build on the available software code and contribute their tasks back to the community to help other researchers do more with the framework. Using a new module just released to the open source community, researchers are now able to incorporate genetic data into their studies in a seamless, simple and low cost way. Designed by 23andMe, the module allows study participants to easily contribute their genetic data to medical research. Researchers are also working with the National Institute of Mental Health to deliver "spit kits" to study participants based on a series of survey results. 
"There's so much we still need to learn about postpartum depression and it may be DNA that provides the key to better understanding why some women experience symptoms and others do not," said Samantha Meltzer-Brody, MD, MPH, director of the Perinatal Psychiatry Program at the UNC Center for Women's Mood Disorders. "With ResearchKit, and now the ability to incorporate genetic data, we're able to engage women with postpartum depression from a wide geographic and demographic range and can analyze the genomic signature of postpartum depression to help us find more effective treatments." "Collecting this type of information will help researchers determine genomic indicators for specific diseases and conditions," said Eric Schadt, PhD, the Jean C. and James W. Crystal Professor of Genomics at the Icahn School of Medicine at Mount Sinai, and Founding Director of the Icahn Institute for Genomics and Multiscale Biology. "Take asthma, for example. ResearchKit is allowing us to study this population more broadly than ever before and through the large amounts of data we're able to gather from iPhone, we're understanding how factors like environment, geography and genes influence one's disease and response to treatment." Researchers continue to adapt ResearchKit and build on the framework by contributing new modules that bring exam room medical tests to iPhone apps. Key contributions include the ability to study tone audiometry; measure reaction time through delivery of a known stimulus to a known response; assess the speed of information processing and working memory; use the mathematical puzzle Tower of Hanoi for cognition studies; and conduct a timed walk test. ResearchKit studies continue to expand internationally and are available in Australia, Austria, China, Germany, Hong Kong, Ireland, Japan, Netherlands, Switzerland, the UK and the US. ResearchKit apps are available on the App Store for iPhone 5 and later, and the latest generation of iPod touch. 
Misha Angrist is not worried about strangers discovering his personal genetic information, even though it was made public in 2007 and has his name attached. Angrist was the fourth person to submit his genetic sequence to the Personal Genome Project, an effort led by George Church, a geneticist at Harvard Medical School in Boston, Massachusetts, to advance medicine by publicly sharing genomic and health data. “It was kind of a political statement,” says Angrist, a geneticist who studies bioethics and science policy at Duke University's Social Science Research Institute in Durham, North Carolina. He had become frustrated that privacy considerations prohibited scientists involved in genetic studies from interacting with the people those genes belonged to. “We were not allowed to talk to the people we studied, and that always struck me as silly and wrong-headed,” he says. The restrictions prevented researchers from gathering additional information, such as recent medical histories or health-related habits, that might give them more insight into disease risk — and stopped them developing a trusting relationship with the DNA donors. The Personal Genome Project aims to share DNA sequences, medical histories and other personal information with researchers looking to link gene variants, environment and lifestyle habits to disease risk. The project explicitly does not promise anonymity, and warns that the data will be shared publicly. Each participant is put through an online, questionnaire-based screening process to ensure that they understand both the benefits and the risks of making such information available. The US Precision Medicine Initiative, meanwhile, is seeking to collect the genomic information and medical records of 1 million participants, and the UK 100,000 Genomes Project is gathering similar data through the National Health Service, raising concerns among privacy advocates that too much personal information could become public. 
Both projects promise to remove information that identifies participants from the data, and store the data on secure servers that are accessible only to authorized personnel, and they prohibit people from re-identifying the sequences. They concede, however, that anonymity cannot be absolutely guaranteed, and computer scientists have shown that at least some participants can be re-identified fairly easily. Scientists and policymakers are trying to work out exactly what the harm of such disclosures could be, and how they can reduce the risks, but any solutions are more likely to be policy-based than technological. Anonymous data are not as unidentifiable as the term suggests. Not all participants in the Personal Genome Project are identified by name like Angrist, but the project does not guarantee anonymity. In 2013, Latanya Sweeney, a computer scientist who heads Harvard's Data Privacy Lab, was able to put names to many of the profiles simply by comparing them with available public records. More than half of the nameless profiles available at the time contained the person's date of birth, gender and postal zip code. By cross-checking against public records such as voter registrations, she was able to attach a name and address to 241 of the 579 profiles. Staff at the Personal Genome Project confirmed that she was correct in all but 7 cases. The Personal Genome Project is not the only database that is vulnerable to re-identification. Yaniv Erlich, a computer scientist at Columbia University in New York City, looked at repeating patterns of nucleotides, known as short tandem repeats (STRs), on the Y chromosomes of men whose DNA had been made publicly available by the international 1000 Genomes Project. He then compared them with data found on two public genealogy databases. 
The project had not collected names or other identifying information, such as birth date or social security number, and because it stored more samples than were used, there was no way to tell if a given sample was even part of the database. As the project's consent form reassuringly put it: “Because of these measures, it will be very hard for anyone who looks at any of the scientific databases to know which information came from you, or even that any information in the scientific databases came from you.” Despite that promise, however, Erlich was able to put names to nearly 50 people who had donated their DNA. Because the Y chromosome is inherited only by males, it is often linked to family surnames. This means that even if participants in the genome study had not also given their DNA to a genealogy website, people with matching STRs were probably relatives, allowing the researchers to infer more surnames. When his study was published in 2013, Erlich estimated that 12% of US males were vulnerable to this kind of breach. Three years later, with genome databases growing and algorithms for comparing data improving, that figure could be as high as 20%. “It definitely gets easier and easier,” he says. “With some knowledge and some dedicated effort, you can identify people from genomic data.” Even those who agree to make their data public may have some information that they would rather keep from other people — or even from themselves. One participant in the Public Genome Project was James Watson, co-discoverer of the double helix structure of DNA. Watson asked that information about his apolipoprotein E gene be redacted — a variant of that gene can indicate a heightened risk for developing Alzheimer's disease, and he did not want to know his risk. But researchers from the Queensland Institute of Medical Research in Australia and the University of Washington School of Medicine pointed out that merely removing the gene from the database would not hide the information. 
Other changes to the genome, some in fairly distant parts of the DNA, are correlated with the higher-risk mutation. Watson responded by deleting an even larger swathe of his genome from the database. But that could be a losing battle, the researchers warned. As our understanding of the genome improves, it will be easier to estimate risks for various diseases from different points along the genome. If privacy cannot be guaranteed, the next question is whether this is a problem. Some risks seem relatively minor, such as the potential embarrassment of having people find out that you participated in a particular study. But some adoptees have used genetic data to find birth parents who had not expected their identity to be revealed. Others might discover that someone they thought to be a parent or grandparent is not actually related to them. Include someone's medical history and the potential for awkward revelations grows. If a name can be attached to a genome, and the genome is attached to medical records, then treatments for sexually transmitted diseases, alcoholism or mental illness could be revealed. Some people worry that they may face job discrimination — or health-insurance discrimination in the United States — if a risk of debilitating and expensive diseases is made public. Some privacy advocates worry that despite the general guidelines developed for the Precision Medicine Initiative, the project lacks legal protections. The World Privacy Forum, a non-profit organization based in San Diego, California, says that data collected by the project are not covered by the main US health-privacy law, the Health Insurance Portability and Accountability Act of 1996. It also fears that courts may decide that when participants volunteer information to researchers, they give away their right to doctor–patient confidentiality. 
Courts have, after all, previously ruled that police do not need a warrant to collect mobile-phone location data because callers have already shared that information with telephone companies. “People are still worried about discrimination in health insurance and jobs,” says Robert Cook-Deegan, a biologist who studies genomics policy at Duke University's Sanford School of Public Policy. In the United States, the Genetic Information Nondiscrimination Act of 2008 is supposed to prohibit that, but it does not cover long-term care or disability insurance, so people who discover that they may need extensive care for a late-onset disease such as Alzheimer's could still face ruinous expenses. The Canadian government recently debated a similar law, and the European Union has a general mandate against genetic discrimination. There is no specific UK law against it, however, although the Association of British Insurers agreed to a moratorium until 2019 on using predictive genetic tests to inform insurance policies. Some of the concerns are speculative, such as the possibility that someone's DNA could be planted at a crime scene. Indeed, the trouble with figuring out how to handle privacy, Erlich says, is that “we really don't understand the concept of harm due to privacy loss.” If anything, the risk of personal information being revealed is probably no greater than that from other sources where people willingly provide information, Erlich says. He points to a 2013 study by researchers at the University of Cambridge, UK, and Microsoft Research that identified people's sexual orientation, political affiliation and race with high degrees of accuracy just by examining their 'likes' on Facebook. That is much more information than you could glean from a genome at present. “There is not a single genetic marker in the genome that can predict homosexuality,” Erlich says. 
Privacy may not even be the right focus, argues Jenny Reardon, a sociologist at the Center for Biomolecular Science and Engineering at the University of California, Santa Cruz, who in May chaired a conference focusing on the fraught issue of personal data in the age of precision medicine. “Privacy doesn't get us to what is more fundamental: what as a society should we be doing with this data,” she says. She would like to see more focus on how these large data sets can improve people's lives. But “no one wants to discuss this”, she says. Whatever the problem with privacy, the solution is unlikely to be technological, Erlich says. Techniques to encrypt data or disguise it with statistical noise are of limited value, he explains, because the more they protect privacy, the less useful they make the data. He thinks that a better approach is to rethink how privacy and consent are handled, and to treat the people who hand over their DNA with respect and honesty. In an example of this approach, Erlich and colleagues at the New York Genome Center, in collaboration with the National Breast Cancer Coalition in Washington DC, have created a project called DNALand to study the genetic risks of breast cancer. People donate the genetic information that they get from DNA-testing companies such as 23andMe, Family Tree DNA and Ancestry.com. In return, DNALand offers users free information about their genome and the possibility of identifying relatives based on genetic matches, as well as the chance to contribute to improving medical knowledge. The consent form spells out the risks and benefits of participating and allows people to withdraw at any time. It also promises to seek further consent before sharing data with a third party. One problem in obtaining consent is that, once collected, genomic data can be stored indefinitely and used in ways that the original researchers did not foresee. “That's the whole idea of research. 
You don't know what you're going to find,” Cook-Deegan says. The people who set up databases need to take a long view when making promises and asking for consent as they collect the data, he says. The Precision Medicine Initiative has a set of general guidelines about transparency and respect for participants' wishes, and these will be used to inform the future development of more concrete privacy protocols. “The problem we're going to have is to make sure we have a system that respects the rights and interests that were set up at the front end,” Cook-Deegan says. Not being clear about how participation in a study could lead to privacy breaches creates the risk that any problems that arise may make potential donors less willing to have their DNA sequenced. “We can't do research on human beings and look people in the eye and promise them that nothing bad will ever happen,” Angrist says. “If we reassure people and something bad happens, then it's that much worse.” Instead, he argues, engaging with donors and spelling out the risks and benefits can change the privacy equation. “If you talk to people who have children with undiagnosed diseases, they would tell you: 'We would gladly forgo privacy in the interest of accelerated research'.”
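The cross-checking described in this article, in which profiles carrying a date of birth, gender and zip code were matched against public records such as voter registrations, amounts to a join on a handful of quasi-identifiers. The sketch below illustrates that idea only; it is not Sweeney's actual procedure or data. The function name `reidentify`, the field names and every record are hypothetical and invented for the example.

```python
from collections import defaultdict

def reidentify(profiles, public_records, keys=("dob", "gender", "zip")):
    """Map profile ids to names when the quasi-identifiers match exactly one public record."""
    # Index the public records by their quasi-identifier tuple.
    index = defaultdict(list)
    for rec in public_records:
        index[tuple(rec[k] for k in keys)].append(rec["name"])

    matches = {}
    for p in profiles:
        # A profile missing any quasi-identifier cannot be linked this way.
        if any(p.get(k) is None for k in keys):
            continue
        names = index.get(tuple(p[k] for k in keys), [])
        # Only a unique match counts as a re-identification; ties stay ambiguous.
        if len(names) == 1:
            matches[p["id"]] = names[0]
    return matches

# Invented records, for illustration only.
profiles = [
    {"id": "P1", "dob": "1970-01-02", "gender": "F", "zip": "02138"},
    {"id": "P2", "dob": "1985-06-30", "gender": "M", "zip": "27701"},
    {"id": "P3", "dob": None, "gender": "M", "zip": "10027"},
]
voters = [
    {"name": "Alice Example", "dob": "1970-01-02", "gender": "F", "zip": "02138"},
    {"name": "Bob Sample", "dob": "1985-06-30", "gender": "M", "zip": "27701"},
    {"name": "Rob Sample", "dob": "1985-06-30", "gender": "M", "zip": "27701"},
]

linked = reidentify(profiles, voters)
share_named = 241 / 579  # the figures reported in the article
print(linked)
print(f"{share_named:.0%}")
```

In this toy data only P1 is linked: P2 matches two voters and P3 lacks a date of birth. The article's figures, 241 names attached to 579 profiles, work out to roughly 42% of the profiles examined.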
News Article | August 21, 2016
After decades of inconclusive results, researchers backed by Pfizer and Massachusetts General Hospital revealed earlier this month that they had identified several genetic markers associated with depression. It was the largest study of its kind, using data from more than 120,000 people. In February, a new paper explored the role that genetics plays in whether an individual is a morning person or a night owl, and in April another study looked at resilience to Mendelian childhood diseases, such as cystic fibrosis. Each of these studies used insights gathered from customers of 23andMe, the Google-backed company that makes a direct-to-consumer genetic test kit. Perhaps best known for its battles with regulators over its consumer genetics test in 2013, 23andMe has quietly expanded its business to include brokered access to its database of more than 1 million people’s DNA. Everyone who uses the company's $199 test kit receives a request to participate in research. If they agree, their health data is added to a separate database. With 80% of customers consenting, the company has amassed a health data gold mine, and researchers are eager to study it. 23andMe has now hired a team just shy of 70 academics who collaborate with researchers on their studies, many of which are published in top scientific journals. The researchers get access to genetic data coupled with "phenotypic" characteristics or traits, as well as feedback from online surveys. That's a juicy prospect for researchers. 23andMe is one of a growing number of companies that are developing consumer-friendly tools for researchers, although it is one of a small number focused on genomics. Large academic hospitals like Stanford Medicine and Duke are currently using Apple's ResearchKit to collect health information via iPhones. Fitbit is also investing in this area: Researchers are increasingly incorporating its step and heart rate data into large population health studies. 
Traditionally, clinical trials require raising a large sum of money and recruiting participants to get their genome sequenced, followed by in-person surveys. If a person can't get to one of the research sites, they won't be included. That means it's difficult, and costly, to get large numbers of participants to take part. 23andMe, in contrast, has partially sequenced some 1.2 million genomes already. And it conducts surveys via mobile phone, which can be done anytime and almost anywhere. But despite those advantages, many researchers are still skeptical about the tools used by 23andMe. Not everyone will answer truthfully when asked about their weight or alcohol consumption, for instance, even in the privacy of a mobile phone survey. "Serious academic researchers, when they have money available, almost always gravitate toward more expensive scientifically advanced tools," says Matthew Amsden, CEO of ProofPilot, a startup that helps researchers conduct clinical trials. Maxine Mackintosh, a health data researcher at University College London in the U.K., adds that there's historically been a "discomfort and distrust" among academics with industry collaborations, but that's starting to change. 23andMe's research director Joyce Tung admits that it was an uphill battle to convince researchers of the merits of the data, even though consumers have answered some 350 million survey questions so far. Yet her team's efforts are starting to bear fruit. "Academics were worried that the quality of the self-reported data wouldn't be good," she says, although the company has been experimenting with new ways to improve the accuracy. For a recent research study related to celiac disease, an autoimmune disorder where eating gluten leads to damage in the small intestine, participants responded in unusually large numbers to say that they had been diagnosed with the disease. Tung chalked that up to the growing trend around gluten-free diets. 
She didn't throw out the survey responses, but instead asked a follow-up question (something that 23andMe can easily do, since it doesn't require another on-site visit): "Have you ever been diagnosed with celiac disease through an intestinal biopsy?" The numbers then dropped to far more realistic levels. Tung says her team now can hardly keep up with the demand from academics. 23andMe had 25 applications from researchers in the fall of 2013, and that number jumped to 45 in the fall of 2014. The numbers aren't yet available for 2015 and 2016, but a company spokesman did share that the team received nearly 20 requests from academics to study the data in the wake of the depression study, which was published just two weeks ago. The company holds biannual meetings to review the applications and determine how many they will collaborate on in a given year. The team tends to favor studies on factors that many would be surprised to learn have a genetic component, like taste preferences. The company also works with an external institutional review board, or "IRB," agency to determine any potential risks associated with the research and ensure that the participants understand what they're agreeing to. The database is now large enough that it's not all Caucasians with a European background, a group that is typically overrepresented in research. "We are at the point where we can run reasonable genome-wide association studies in non-European groups," says Tung. That's a big deal. Studying one ethnic group limits our knowledge about diseases that impact one ethnic group more than another, such as sickle cell anemia. The company can now also start to research how various factors will impact how an individual is likely to respond to a drug. For this kind of research, known as pharmacogenetics, researchers need a massive cohort. Out of 1 million people, only a small number are likely to be taking a certain drug. 
Of those, a smaller percentage still will be experiencing a side effect. Ideally, researchers would then want to study that by gender, ethnicity, and so on. Pharma will pay big bucks for that kind of information. While 23andMe is still making most of its money through sales of its consumer testing kit—the price of which increased $100 in October of 2015—its collaborations with drug makers like Genentech (to study Parkinson's disease) and Pfizer (inflammatory bowel disease and lupus) are a big part of its future growth. As 23andMe looks to monetize the health data gathered from its consumers, some patient advocates suggest that the company should consider sharing revenues with its participants. Tung says she has considered some kind of financial incentive in the past, but has some ethical concerns. Her biggest is the potential for coercion: "You can imagine that the level of pressure is unevenly distributed based on economic status," says Tung. But she hasn't ruled it out entirely. "We would consider everything that's ethical and what consumers want."
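The reasoning in this article about why pharmacogenetics needs a massive cohort is a chain of multiplications: a fraction of the cohort takes a given drug, a fraction of those report a side effect, and the remainder is then split by gender, ethnicity and so on. The sketch below works through that arithmetic. The article gives only the cohort size of 1 million; the drug rate, side-effect rate and number of strata are assumed values chosen for illustration, and `cohort_funnel` is a hypothetical helper.

```python
def cohort_funnel(cohort, drug_rate, side_effect_rate, strata):
    """Return (people on the drug, people with the side effect, average per stratum)."""
    on_drug = round(cohort * drug_rate)
    with_effect = round(on_drug * side_effect_rate)
    # Assumes participants are spread evenly across strata, which real cohorts are not.
    per_stratum = with_effect / strata
    return on_drug, with_effect, per_stratum

# Assumed rates: 1% take the drug, 5% of those report the side effect,
# and the analysis is split into 2 genders x 5 ancestry groups = 10 strata.
on_drug, with_effect, per_stratum = cohort_funnel(
    cohort=1_000_000, drug_rate=0.01, side_effect_rate=0.05, strata=10
)
print(on_drug, with_effect, per_stratum)
```

Under these assumed rates, a million participants shrink to 10,000 people on the drug, 500 with the side effect, and about 50 per subgroup.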
Clues to novel treatments could be gleaned from people who aren’t sick, but should be. The hunt is on for people who are healthy, even though their genes say they shouldn’t be. A massive search through genetic databases has found evidence for more than a dozen “genetic superheroes,” people whose genomes contain serious DNA errors that cause devastating childhood illnesses but who say they aren’t sick. The new study is part of a trend toward studying the DNA of unusually healthy people to determine if there’s something about them that can be discovered and bottled up as a treatment for everyone else. There’s already evidence from large families afflicted by genetic disease that some members are affected differently, or not at all. The current study took a different approach, scouring DNA data collected on 589,306 mostly unrelated individuals, and is “the largest genome study to date,” according to Mount Sinai’s Icahn School of Medicine in New York. “There hasn’t been nearly enough attention paid to looking at healthy people’s genomes,” says Eric Topol, a cardiologist and gene scientist at the Scripps Institute. “This confirms that there are many people out there that should be manifesting disease but aren’t. It’s a lesson from nature.” The researchers, led by Stephen Friend, president of Sage Bionetworks, a nonprofit based in Seattle, and genome scientist Eric Schadt of Mount Sinai, reported today in Nature Biotechnology how they looked for people with mutations in any of 874 genes that should doom them to a childhood of pain or misery, but whose medical records or self-reports didn’t indicate any problem. In the end, they found 13 people who qualify as genetic “superheroes” but, under medical privacy agreements, were unable to contact them. That meant the scientists weren’t able to learn what’s actually different about them. “It’s like you got the box and couldn’t take the wrapping off,” Friend said during a media teleconference last week. 
The team consulted DNA data from nearly 400,000 people provided by 23andMe, the direct-to-consumer testing company. The team also used more detailed genome information contributed by BGI, a large genome center in China, and the Ontario Institute for Cancer Research. “The best approach to discovering large numbers of resilient individuals will involve data sharing on a global scale, involving many sequencing projects,” says Daniel MacArthur, who developed a pooled DNA database at the Broad Institute in Cambridge, Massachusetts, which he says also holds evidence of resilient individuals. Some companies, including the biotechnology company Regeneron (see “The Search for Exceptional Genomes”), have already started large searches for people whose genes seem to protect them against disease. Regeneron's focus is on common illnesses like heart disease and diabetes. Mayana Zatz, a geneticist in Sao Paulo, Brazil, who studies large families affected by inherited disease, says she’s found instances where people seem to dodge genetic destiny. For example, she located two Brazilian half-brothers with the same mutation that causes muscular dystrophy, but while one was in a wheelchair at age nine, the other is 16 and has no symptoms. Zatz says the reason could be some other gene that “rescues” the patient, or perhaps environmental factors. She says women are more often found to be resilient than men, though the reason isn’t clear. Friend says his “extraordinarily large pilot” study is meant to determine if the same sort of discoveries made by looking at affected families could be made by dredging large DNA databases. “The purpose was to see if the technology is ready, and worth the effort, and we think the answer is yes,” he says.