Smith J.J.,Institute for Systems Biology | Aitchison J.D.,Seattle Biomedical Research Institute
Nature Reviews Molecular Cell Biology | Year: 2013
Peroxisomes carry out various oxidative reactions that are tightly regulated to adapt to the changing needs of the cell and varying external environments. Accordingly, they are remarkably fluid and can change dramatically in abundance, size, shape and content in response to numerous cues. These dynamics are controlled by multiple aspects of peroxisome biogenesis that are coordinately regulated with each other and with other cellular processes. Ongoing studies are deciphering the diverse molecular mechanisms that underlie biogenesis and how they cooperate to dynamically control peroxisome utility. These important challenges should lead to an understanding of peroxisome dynamics that can be capitalized upon for bioengineering and the development of therapies to improve human health. © 2013 Macmillan Publishers Limited. All rights reserved.
Huang S.,Institute for Systems Biology
BioEssays | Year: 2012
The Neo-Darwinian concept of natural selection is plausible when one assumes a straightforward causation of phenotype by genotype. However, such simple 1:1 mapping must now give way to the modern concepts of gene regulatory networks and gene expression noise. Both can, in the absence of genetic mutations, jointly generate a diversity of inheritable randomly occupied phenotypic states that could also serve as a substrate for natural selection. This form of epigenetic dynamics challenges Neo-Darwinism, which needs to incorporate the non-linear, stochastic dynamics of gene networks. A first step is to consider the mathematical correspondence between gene regulatory networks and Waddington's metaphoric 'epigenetic landscape', which actually represents the quasi-potential function of global network dynamics. It explains the coexistence of multiple stable phenotypes within one genotype. The landscape's topography with its attractors is shaped by evolution through mutational re-wiring of regulatory interactions - offering a link between genetic mutation and sudden, broad evolutionary changes. © 2012 WILEY Periodicals, Inc.
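The coexistence of several stable phenotypes within one genotype can be illustrated with a toy model. The sketch below is not taken from the article: it assumes a hypothetical two-gene circuit in which each gene represses the other (Hill-type repression), with arbitrarily chosen parameters `a` and `n`, and integrates it with a simple Euler scheme. The same equations, started from different expression levels, settle into different attractors.

```python
def simulate_toggle(x0, y0, a=4.0, n=2, dt=0.01, steps=5000):
    """Integrate a hypothetical mutual-repression circuit with Euler steps.

    dx/dt = a / (1 + y**n) - x
    dy/dt = a / (1 + x**n) - y

    x and y are the expression levels of two genes that repress each other.
    The parameters a (maximal production) and n (Hill coefficient) are
    illustrative assumptions, not values from the article.
    """
    x, y = x0, y0
    for _ in range(steps):
        dx = a / (1.0 + y ** n) - x
        dy = a / (1.0 + x ** n) - y
        x, y = x + dt * dx, y + dt * dy
    return x, y


if __name__ == "__main__":
    # Same "genotype" (same equations and parameters), different starting states.
    print(simulate_toggle(2.0, 1.0))  # ends in the state where x dominates
    print(simulate_toggle(1.0, 2.0))  # ends in the state where y dominates
```

In landscape terms, the two end states correspond to two valleys separated by an unstable ridge along x = y; adding noise to the update step would allow occasional transitions between them.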
Shteynberg D.,Institute for Systems Biology
Molecular & cellular proteomics : MCP | Year: 2011
The combination of tandem mass spectrometry and sequence database searching is the method of choice for the identification of peptides and the mapping of proteomes. Over the last several years, the volume of data generated in proteomic studies has increased dramatically, which challenges the computational approaches previously developed for these data. Furthermore, a multitude of search engines have been developed that identify different, overlapping subsets of the sample peptides from a particular set of tandem mass spectrometry spectra. We present iProphet, the new addition to the widely used open-source suite of proteomic data analysis tools, the Trans-Proteomics Pipeline. Applied in tandem with PeptideProphet, it provides a more accurate representation of the multilevel nature of shotgun proteomic data. iProphet combines the evidence from multiple identifications of the same peptide sequences across different spectra, experiments, precursor ion charge states, and modified states. It also allows accurate and effective integration of the results from multiple database search engines applied to the same data. The use of iProphet in the Trans-Proteomics Pipeline increases the number of correctly identified peptides at a constant false discovery rate as compared with both PeptideProphet and another state-of-the-art tool, Percolator. As the main outcome, iProphet permits the calculation of accurate posterior probabilities and false discovery rate estimates at the level of sequence-identical peptide identifications, which in turn leads to more accurate probability estimates at the protein level. Fully integrated with the Trans-Proteomics Pipeline, it supports all commonly used MS instruments, search engines, and computer platforms.
The performance of iProphet is demonstrated on two publicly available data sets: data from a human whole cell lysate proteome profiling experiment representative of typical proteomic data sets, and from a set of Streptococcus pyogenes experiments more representative of organism-specific composite data sets.
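The link between posterior probabilities and false discovery rate estimates can be made concrete with a small sketch. This is not iProphet's code or its model. It only assumes that each accepted identification carries a posterior probability p of being correct, so that the expected number of false identifications in an accepted set is the sum of (1 - p). The function names and example values are hypothetical.

```python
def estimated_fdr(probabilities, threshold):
    """Estimate the FDR among identifications with probability >= threshold.

    Each identification with posterior probability p contributes (1 - p)
    expected false positives; the FDR estimate is their mean.
    """
    accepted = [p for p in probabilities if p >= threshold]
    if not accepted:
        return 0.0
    return sum(1.0 - p for p in accepted) / len(accepted)


def threshold_for_fdr(probabilities, target_fdr):
    """Return the lowest probability threshold whose estimated FDR is within
    target_fdr, or None if no threshold qualifies."""
    best = None
    for candidate in sorted(set(probabilities), reverse=True):
        if estimated_fdr(probabilities, candidate) <= target_fdr:
            best = candidate
    return best


if __name__ == "__main__":
    # Hypothetical posterior probabilities for five peptide identifications.
    probs = [0.99, 0.95, 0.90, 0.50, 0.20]
    print(estimated_fdr(probs, 0.90))
    print(threshold_for_fdr(probs, 0.06))
```

Because the estimate depends directly on the probabilities, any bias in them carries over to the FDR, which is why accurate posterior probabilities matter at both the peptide and protein level.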
Farrah T.,Institute for Systems Biology
Molecular & cellular proteomics : MCP | Year: 2011
Human blood plasma can be obtained relatively noninvasively and contains proteins from most, if not all, tissues of the body. Therefore, an extensive, quantitative catalog of plasma proteins is an important starting point for the discovery of disease biomarkers. In 2005, we showed that different proteomics measurements using different sample preparation and analysis techniques identify significantly different sets of proteins, and that a comprehensive plasma proteome can be compiled only by combining data from many different experiments. Applying advanced computational methods developed for the analysis and integration of very large and diverse data sets generated by tandem MS measurements of tryptic peptides, we have now compiled a high-confidence human plasma proteome reference set with well over twice the identified proteins of previous high-confidence sets. It includes a hierarchy of protein identifications at different levels of redundancy following a clearly defined scheme, which we propose as a standard that can be applied to any proteomics data set to facilitate cross-proteome analyses. Further, to aid in development of blood-based diagnostics using techniques such as selected reaction monitoring, we provide a rough estimate of protein concentrations using spectral counting. We identified 20,433 distinct peptides, from which we inferred a highly nonredundant set of 1929 protein sequences at a false discovery rate of 1%. We have made this resource available via PeptideAtlas, a large, multiorganism, publicly accessible compendium of peptides identified in tandem MS experiments conducted by laboratories around the world.
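The abstract does not say how spectral counts were converted into concentration estimates. As an illustration of the general idea only, the sketch below uses a different, commonly described normalization, the normalized spectral abundance factor (NSAF), in which each protein's spectral count is divided by its sequence length and then scaled so that all values sum to 1. The protein names, counts, and lengths are hypothetical.

```python
def nsaf(spectral_counts, lengths):
    """Compute normalized spectral abundance factors.

    spectral_counts: dict mapping protein id -> number of matched spectra
    lengths:         dict mapping protein id -> sequence length (residues)

    Returns a dict mapping protein id -> relative abundance (sums to 1).
    """
    # Length-normalize first: longer proteins yield more peptides per molecule.
    saf = {prot: spectral_counts[prot] / lengths[prot] for prot in spectral_counts}
    total = sum(saf.values())
    if total == 0:
        raise ValueError("no spectra were counted for any protein")
    return {prot: value / total for prot, value in saf.items()}


if __name__ == "__main__":
    # Hypothetical example: two proteins with equal counts per residue.
    counts = {"protein_A": 100, "protein_B": 50}
    lengths = {"protein_A": 500, "protein_B": 250}
    print(nsaf(counts, lengths))
```

Values of this kind are relative; turning them into absolute concentrations would additionally require calibration against proteins of known concentration.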
Huang S.,Institute for Systems Biology
Cancer and Metastasis Reviews | Year: 2013
Genetic instability is invoked in explaining the cell phenotype changes that take place during cancer progression. However, the coexistence of a vast diversity of distinct clones, most prominently visible in the form of non-clonal chromosomal aberrations, suggests that Darwinian selection of mutant cells is not operating at maximal efficacy. Conversely, non-genetic instability of cancer cells must also be considered. Such mutation-independent instability of cell states is most prosaically manifest in the phenotypic heterogeneity within clonal cell populations or in the reversible switching between immature "cancer stem cell-like" and more differentiated states. How are genetic and non-genetic instability related to each other? Here, we review basic theoretical foundations and offer a dynamical systems perspective in which cancer is the inevitable pathological manifestation of modes of malfunction that are immanent to the complex gene regulatory network of the genome. We explain in an accessible, qualitative, and permissively simplified manner the mathematical basis for the "epigenetic landscape" and how the latter relates to the better known "fitness landscape." We show that these two classical metaphors have a formal basis. By combining these two landscape concepts, we unite development and somatic evolution as the drivers of the relentless increase in malignancy. Herein, the cancer cells are pushed toward cancer attractors in the evolutionarily unused regions of the epigenetic landscape that encode more and more "dedifferentiated" states as a consequence of both genetic (mutagenic) and non-genetic (regulatory) perturbations - including therapy. This would explain why for the cancer cell, the principle of "What does not kill me makes me stronger" is as much a driving force in tumor progression and development of drug resistance as the simple principle of "survival of the fittest." © 2013 Springer Science+Business Media New York.
Deutsch E.W.,Institute for Systems Biology
Molecular & cellular proteomics : MCP | Year: 2012
Targeted proteomics via selected reaction monitoring is a powerful mass spectrometric technique affording higher dynamic range, increased specificity and lower limits of detection than other shotgun mass spectrometry methods when applied to proteome analyses. However, it involves selective measurement of predetermined analytes, which requires more preparation in the form of selecting appropriate signatures for the proteins and peptides that are to be targeted. There is a growing number of software programs and resources for selecting optimal transitions and the instrument settings used for the detection and quantification of the targeted peptides, but the exchange of this information is hindered by a lack of a standard format. We have developed a new standardized format, called TraML, for encoding transition lists and associated metadata. In addition to introducing the TraML format, we demonstrate several implementations across the community, and provide semantic validators, extensive documentation, and multiple example instances to demonstrate correctly written documents. Widespread use of TraML will facilitate the exchange of transitions, reduce time spent handling incompatible list formats, increase the reusability of previously optimized transitions, and thus accelerate the widespread adoption of targeted proteomics via selected reaction monitoring.
Agency: NSF | Branch: Standard Grant | Program: | Phase: ENERGY FOR SUSTAINABILITY | Award Amount: 298.70K | Year: 2016
PI Name: Nitin Baliga
Proposal Number: 1606206
Microscopic algae are a promising future platform for the sustainable production of biofuels. These organisms use sunlight, atmospheric carbon dioxide, and nutrients such as nitrogen and phosphorus dissolved in liquid medium to make lipids, which can be processed into liquid transportation fuel. Most algal biofuel processes require two stages. In the first stage, the algae consume nutrients and grow. In the second stage, the lipids used to make biofuel accumulate within the biomass, but only when all the nutrients are consumed so that the biomass does not grow any more. In order to improve the economic viability of algae-based biofuels, it is necessary to develop strains of algae that can generate lipids for biofuel while producing more biomass that sustains the process. The goal of this project is to re-program the gene networks in a model strain of algae named Chlamydomonas reinhardtii to enhance both biofuel and biomass production at the same time. The research will attempt to make the systems biology platform generic enough that it can be extended to other organisms. The educational activities associated with the project will develop high school curricular materials for renewable green biotechnology topics.
A critical challenge with algal biofuel production is that nutrient starvation is required to induce lipid accumulation. The proposed research will develop a systems biology strategy to predictably manipulate regulatory and metabolic networks using the model photosynthetic green microalga Chlamydomonas reinhardtii in an effort to significantly enhance lipid accumulation without growth arrest. A predictive Environment and Gene Regulatory Influence Network (EGRIN) model will be developed to rationally identify gene targets for systems level re-engineering. The EGRIN model will be built upon a compendium of transcriptomes from cultures of C. reinhardtii grown in a diverse set of nutritional and environmental conditions. Accuracy of the EGRIN model will be improved by incorporating experimentally mapped open chromatin structure. Further, the EGRIN model will be integrated with a metabolic network model to identify gene targets for engineering. Finally, model-guided genome engineering of algae using CRISPR/Cas9 technology will be used to enhance biomass or lipid production. The proposed research will generate a fundamental, model-guided strategy for predictably manipulating regulatory and metabolic networks in algae through a generalized approach that can be customized to meet similar strain engineering objectives in other organisms.
Agency: NSF | Branch: Standard Grant | Program: | Phase: Systems and Synthetic Biology | Award Amount: 1.20M | Year: 2016
The process of gene expression, encompassing transcription and translation, is closely interconnected with cellular physiology, and is carried out by all organisms on Earth. Yet, the process of translation and its impact on cellular physiology is generally thought to be subordinate to transcriptional regulation and post-translational signaling. This notion continues despite the fact that both transcription and signaling depend on proteins, and hence on translation. This project seeks to identify a universal aspect of gene expression and physiology that has been overlooked. This project will test the novel idea that proteins needed under particular environmental conditions are preferentially translated when cells are exposed to such conditions. The proposed studies may uncover a new molecular mechanism for gene regulation and will greatly advance our understanding of the unifying, central role of translation in controlling myriad life processes. In addition, the project will contribute to the training of K-12 students and teachers, and to high school science curriculum development.
The proposed research will address whether functional diversity of components of the translation system and their transcriptionally generated modular expression regulate large-scale physiological state transitions in organisms. The underlying hypothesis is that environment-dependent physiological cell states are generated by the conditional production, assembly, and activity of distinct ribosomal complexes with variable subunit compositions. This hypothesis is based on the intriguing organization of translational machinery genes within gene regulatory networks of phylogenetically diverse organisms. Specifically, ribosomal subunits and other translation system proteins are conditionally co-regulated as multiple distinct, yet overlapping modules with uncorrelated expression patterns across environmental shifts. The proposed research will attempt to observe conditional association of certain ribosomal subunits, and to determine whether this association directs the translation complex to preferentially translate transcripts encoding functions for a particular environment-relevant physiological state. Protein (using SWATH mass spectrometry) and mRNA (with RNA-seq) compositions of ribosomal complexes will be characterized across environmental shifts (e.g., aerobic to anaerobic) that effect large physiological state transitions. Additionally, the proposed research will predictably manipulate the physiological state of each organism by engineering environment-responsive regulation or knockouts of conditional ribosomal subunits. Altered regulation of specific conditionally expressed ribosomal subunits should manifest an inappropriate physiological state transition relative to the environmental shift. Generality of the hypothesis will be assessed by performing studies using model microorganisms from the three domains of life - H. salinarum (archaeon), E. coli (bacterium), and S. cerevisiae (eukaryote).
These proposed activities will demonstrate whether variable translation complexes drive environment-dependent physiological transitions.
Agency: NSF | Branch: Standard Grant | Program: | Phase: BIOLOGICAL OCEANOGRAPHY | Award Amount: 752.00K | Year: 2016
Prochlorococcus is a photosynthetic organism that is tremendously abundant in the ocean and influences biogeochemical cycles on global scales. This project aims to link Prochlorococcus community structure to primary productivity in situ. The twelve known Prochlorococcus ecotypes exhibit extensive diversity. It is thought that this diversity allows the Prochlorococcus collective to maintain numerical dominance across gradients in light, nutrients, and temperature that accompany changes in depth, season, and latitude. A large gap in our understanding lies in whether we should assess the ecosystem value of Prochlorococcus by its abundance or by its community structure or both. Ecosystem models assign all ecotypes the same role. However, genomic and physiological evidence from cultivated isolates and wild populations suggests tentatively that distinct genotypes may contribute differently to the ecosystem through variation in light and nutrient physiologies and interactions with other microorganisms. The consequences of these molecular-level differences to primary productivity in situ are unknown. This project tests whether absolute abundance, or community structure, determines the contributions of Prochlorococcus to biogeochemical dynamics by measuring the contributions of different ecotypes to primary productivity. The results of this project will inform ecosystem models towards better representation of how shifts in climate and Prochlorococcus diversity will affect global nutrient cycles, trophic cascades, and interactions with other bacteria, viruses, and grazers. The insights and approaches delineated by this work will be generally applicable to the ecology of abundant microbial populations in the open ocean such as pigmented and non-pigmented eukaryotes, heterotrophic bacteria, and other cyanobacterial lineages. 
A basic understanding of differences between coexisting ecotypes will provide inroads into understanding mechanisms of cooperation, competition, and collaboration among ecotypes in all microbial ecosystems. The investigators will build a teaching module to expose high school students to microbial oceanography, big data, and systems biology through virtual ocean exploration. The primary objective will be to impress upon students the importance of an invisible forest of microorganisms in the ocean. Students will examine the distribution patterns of abundant microbial groups in the context of oceanographic data from large publicly available databases. High school teachers and student interns, a graduate student, the investigators, and an educational specialist will design, implement, and test the module for classrooms nationwide. This effort will follow a successful education model (Systems Education Experience - SEE) developed previously.
The investigators will address an overarching hypothesis that Prochlorococcus ecotypes vary in their contribution to the ecosystem as primary producers. More specifically, the investigators hypothesize that patterns of cell division and carbon fixation vary between coexisting ecotypes, and these differences are a function of genome content, gene expression, environmental conditions, and community composition. The technical approach will involve two field-based experiments, each applied at three depths at the oceanographic Station ALOHA that differ in Prochlorococcus community composition. Experiment 1 will examine whether coexisting ecotypes vary in cell division, using 16S rRNA sequencing to quantify ecotype abundance in G1, S, and G2 cells. Experiment 2 will examine how carbon fixation varies between coexisting ecotypes using RNA-stable isotope probing and 16S rRNA sequencing of RNA enriched in 13C after incubation with 13C-bicarbonate. These experiments will be performed with Prochlorococcus communities under native in situ conditions and under shifted conditions that mimic the light and temperature of other depths. In both experiments, the temporal gene expression of a selected set of carbon fixation and cell division genes will be examined to link gene expression patterns to primary productivity. All data will be related to the oceanographic environment including its physical, chemical, and biological features.
Agency: NSF | Branch: Continuing grant | Program: | Phase: ADVANCES IN BIO INFORMATICS | Award Amount: 383.50K | Year: 2016
Living organisms have to adjust to changes in their environment in order to optimize their use of resources, minimize stress and maintain stability. This proposal aims to uncover the basic principles that direct how organisms tailor their physiological responses to environmental changes. Understanding why certain responses occur requires a precise map showing which genes are regulated in a given response and how they are coordinated. Using prior NSF ABI support, the Baliga Laboratory at the Institute for Systems Biology developed an approach to create precise maps of gene regulation for any microbial species. Using this approach, they mapped gene regulation in a set of important organisms including uranium-reducing Desulfovibrio vulgaris, lipid-accumulating Chlamydomonas reinhardtii, yeast, diatoms, and Mycobacterium tuberculosis. The next goal is to see how well the maps work as tools to predict the results when manipulating complex behaviors. Effective tools have wide-ranging implications for biotechnology, agriculture and medicine. Initially, new algorithms and software will be developed to further refine the gene regulation maps and identify factors affecting gene regulation during environmental changes. The focus will be understanding how organisms switch on or off specific behaviors in response to environmental cues. The work will be performed in two organisms - E. coli, a well-known, widely studied bacterium, and Halobacterium salinarum, an extremophile that thrives in high salt environments; it will be readily applicable to all sequenced microorganisms that are of significant industrial, agricultural, and medical importance. Part of the project is to develop and disseminate a new high school curriculum to introduce the importance of computational modeling in solving real-world problems such as food scarcity and climate change.
Diverse populations of students and teachers from a variety of backgrounds, including those currently underrepresented in science, technology, engineering and math (STEM), will receive training and sustained support as they learn this interdisciplinary science. This curriculum and training is part of a program called Systems Education Experiences (SEE) that reaches thousands of students and teachers each month. SEE works toward cultivating systems thinkers who can tackle problems, contribute to a STEM-literate citizenry, and help build a more diverse population of STEM professionals.
The primary objective of this project is to develop a framework to elucidate and predictably manipulate the gene regulatory program of any microbe. In previous ABI-funded research, the Baliga lab developed a systems approach to reverse engineer the environment and gene regulatory influence network v2.0 (EGRIN 2.0) model directly from a compendium of transcriptome profiles. The EGRIN 2.0 model elucidates mechanisms for environment-specific transcriptional regulation of all genes with unprecedented nucleotide-level resolution, at canonical promoter locations and even within coding sequences and inside operons. Here, an approach will be developed to elucidate transcription factor interactions within EGRIN 2.0 and characterize how the topology of these interactions (i.e., network motifs) generates genome-wide, temporally coordinated transcriptional responses. These studies will be performed in the context of understanding how two phylogenetically distant organisms, Escherichia coli (a bacterium) and Halobacterium salinarum (an archaeon), use distinct regulators to mediate physiologically different yet phenotypically similar transitions from aerobic growth to anaerobic quiescence. First, an approach will be developed to precisely map conditional binding of transcription factors to sequence elements within promoters of all genes in the genome (EGRIN 3.0). Next, EGRIN 3.0 will be used to identify, characterize, and manipulate topologies of transcription factor interactions (i.e., network motifs) to predictably alter oxygen (O2)-responsive state transitions in H. salinarum and E. coli. In addition to developing a generalized framework for manipulating a microbial gene regulatory program within any organism, the activities will test the hypothesis that similar environmental forcing drives convergent evolution of topologically similar network motifs in phylogenetically distant organisms.
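To make the notion of a network motif concrete, the sketch below searches a small regulatory network for feed-forward loops, one widely studied motif in which a regulator X controls a second regulator Y and both control a target Z. The abstract does not state which motifs the project will examine, so this choice is an assumption made for illustration, and the network and gene names are hypothetical rather than taken from any EGRIN model.

```python
def find_feed_forward_loops(network):
    """Find feed-forward loops (X -> Y, X -> Z, Y -> Z) in a directed network.

    network: dict mapping each regulator to the set of genes it regulates.
    Returns a sorted list of (X, Y, Z) tuples with three distinct genes.
    """
    loops = []
    for x, x_targets in network.items():
        for y in x_targets:
            if y == x:
                continue  # skip autoregulation
            # Genes without an entry in the dict regulate nothing.
            for z in network.get(y, ()):
                if z != x and z != y and z in x_targets:
                    loops.append((x, y, z))
    return sorted(loops)


if __name__ == "__main__":
    # Hypothetical network: tfA regulates tfB and geneC; tfB also regulates geneC.
    example = {
        "tfA": {"tfB", "geneC"},
        "tfB": {"geneC"},
        "tfC": {"geneD"},
    }
    print(find_feed_forward_loops(example))
```

Comparing the motif lists obtained from two organisms' networks is one simple way to ask whether topologically similar motifs recur, although a real analysis would also need a null model for how often such motifs arise by chance.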
The high-level thinking and process used by this interdisciplinary group will be translated into curriculum and training experiences in the form of real-world case studies for high school teachers and students. One of the goals will be for students to use experimentation and modeling to better understand the influence of environmental parameters (such as oxygen, nitrates, pH, light, etc.) on productivity and stability of food systems, such as aquaponic systems. Students, teachers, and STEM professionals will work together to iteratively develop and test curriculum and experiences through a modified Dick and Carey Instructional Design model. All curricula will be integrated with published national education standards. All needed technology, software, lesson plans and learning aids will be provided to teachers and students through multiple online sources, resource centers, and in-person and online trainings. For further information about this project and its products, visit the Baliga Laboratory's websites at http://baliga.systemsbiology.net and http://see.systemsbiology.net.