Li H., Cambridge Broad Institute
Bioinformatics | Year: 2015
Summary: FermiKit is a variant calling pipeline for Illumina whole-genome germline data. It de novo assembles short reads and then maps the assembly against a reference genome to call SNPs, short insertions/deletions and structural variations. FermiKit takes about one day to assemble 30-fold human whole-genome data on a modern 16-core server with 85 GB RAM at the peak, and calls variants in half an hour to an accuracy comparable to the current practice. FermiKit assembly is a reduced representation of raw data while retaining most of the original information. Availability and implementation: https://github.com/lh3/fermikit. © The Author 2015. Published by Oxford University Press. All rights reserved.
Yuan Y., Cambridge Broad Institute
Cell death & disease | Year: 2013
The histone methyltransferase G9a is overexpressed in a variety of cancer types, including pancreatic adenocarcinoma, and promotes tumor invasiveness and metastasis. We recently reported the discovery of BRD4770, a small-molecule inhibitor of G9a that induces senescence in PANC-1 cells. We observed that the cytotoxic effects of BRD4770 were dependent on genetic background, with cell lines lacking functional p53 being relatively resistant to compound treatment. To understand the mechanism of genetic selectivity, we used two complementary screening approaches to identify enhancers of BRD4770. The natural product and putative BH3 mimetic gossypol enhanced the cytotoxicity of BRD4770 in a synergistic manner in p53-mutant PANC-1 cells but not in immortalized non-tumorigenic pancreatic cells. The combination of gossypol and BRD4770 increased LC3-II levels and the autophagosome number in PANC-1 cells, and the compound combination appears to act in a BNIP3 (B-cell lymphoma 2 19-kDa interacting protein)-dependent manner, suggesting that these compounds act together to induce autophagy-related cell death in pancreatic cancer cells.
Li H., Cambridge Broad Institute
Bioinformatics | Year: 2012
Motivation: Eugene Myers in his string graph paper suggested that in a string graph or equivalently a unitig graph, any path spells a valid assembly. As a string/unitig graph also encodes every valid assembly of reads, such a graph, provided that it can be constructed correctly, is in fact a lossless representation of reads. In principle, every analysis based on whole-genome shotgun sequencing (WGS) data, such as SNP and insertion/deletion (INDEL) calling, can also be achieved with unitigs. Results: To explore the feasibility of using de novo assembly in the context of resequencing, we developed a de novo assembler, fermi, that assembles Illumina short reads into unitigs while preserving most of the information in the input reads. SNPs and INDELs can be called by mapping the unitigs against a reference genome. By applying the method to 35-fold human resequencing data, we showed that, in comparison to the standard pipeline, our approach yields similar accuracy for SNP calling and better results for INDEL calling. It has higher sensitivity than other de novo assembly-based methods for variant calling. Our work suggests that variant calling with de novo assembly can be a beneficial complement to the standard variant calling pipeline for whole-genome resequencing. On the methodological side, we propose the FMD-index for forward-backward extension of DNA sequences, a fast algorithm for finding all super-maximal exact matches, and one-pass construction of unitigs from an FMD-index. © The Author 2012. Published by Oxford University Press. All rights reserved.
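As a rough illustration of what a unitig is in the abstract above, the following Python sketch collapses maximal non-branching paths of a toy directed overlap graph. It is not fermi's method: fermi builds unitigs in one pass from an FMD-index and handles both DNA strands, whereas this sketch assumes a plain list of `(from_read, to_read)` overlap edges with no duplicates. The function name `unitigs` and the input format are hypothetical.

```python
from collections import defaultdict


def unitigs(edges):
    """Group nodes of a directed overlap graph into maximal non-branching paths.

    Assumption: `edges` is a duplicate-free list of (u, v) pairs; strands
    and overlap lengths are ignored in this toy model.
    """
    succ, pred = defaultdict(list), defaultdict(list)
    nodes = set()
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)
        nodes.update((u, v))

    def joins(u, v):
        # u -> v is mergeable only if it is u's sole exit and v's sole entry.
        return succ[u] == [v] and pred[v] == [u]

    paths, seen = [], set()
    for n in sorted(nodes):
        # Skip nodes that sit inside a path; they are reached from its start.
        if len(pred[n]) == 1 and joins(pred[n][0], n):
            continue
        path = [n]
        seen.add(n)
        while len(succ[path[-1]]) == 1 and joins(path[-1], succ[path[-1]][0]):
            path.append(succ[path[-1]][0])
            seen.add(path[-1])
        paths.append(path)

    # Isolated cycles have no natural start node; break each one arbitrarily.
    for n in sorted(nodes):
        if n in seen:
            continue
        path = [n]
        seen.add(n)
        while succ[path[-1]][0] not in seen:
            path.append(succ[path[-1]][0])
            seen.add(path[-1])
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Reads a, b, c overlap unambiguously; c then branches to d and e.
    print(unitigs([("a", "b"), ("b", "c"), ("c", "d"), ("c", "e")]))
```

In this toy graph, reads `a`, `b` and `c` form one unitig because each junction between them is unambiguous, while the branch after `c` leaves `d` and `e` as separate unitigs.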
Polz M.F., Massachusetts Institute of Technology |
Alm E.J., Massachusetts Institute of Technology |
Alm E.J., Cambridge Broad Institute |
Hanage W.P., Harvard University
Trends in Genetics | Year: 2013
Many bacterial and archaeal lineages have a history of extensive and ongoing horizontal gene transfer and loss, as evidenced by the large differences in genome content even among otherwise closely related isolates. How ecologically cohesive populations might evolve and be maintained under such conditions of rapid gene turnover has remained controversial. Here we synthesize recent literature demonstrating the importance of habitat and niche in structuring horizontal gene transfer. This leads to a model of ecological speciation via gradual genetic isolation triggered by differential habitat-association of nascent populations. Further, we hypothesize that subpopulations can evolve through local gene-exchange networks by tapping into a gene pool that is adaptive towards local, continuously changing organismic interactions and is, to a large degree, responsible for the observed rapid gene turnover. Overall, these insights help to explain how bacteria and archaea form populations that display both ecological cohesion and high genomic diversity. © 2012 Elsevier Ltd.
Yaffe M.B., American Association for the Advancement of Science |
Yaffe M.B., Cambridge Broad Institute
Science Signaling | Year: 2013
The massive resources devoted to genome sequencing of human tumors have produced important data sets for the cancer biology community. Paradoxically, however, these studies have revealed very little new biology. Despite this, additional resources in the United States are slated to continue such work and to expand similar efforts in genome sequencing to mouse tumors. It may be that scientists are "addicted" to the large amounts of data that can be relatively easily obtained, even though these data seem unlikely, on their own, to unveil new cancer treatment options or result in the ultimate goal of a cancer cure. Rather than using more tumor genetic sequences, a better strategy for identifying new treatment options may be to develop methods for analyzing the signaling networks that underlie cancer development, progression, and therapeutic resistance at both a personal and systems-wide level. © 2013 American Association for the Advancement of Science.