Minnesota Supercomputing Institute
Guhlin J.,University of Minnesota |
Silverstein K.A.T.,Minnesota Supercomputing Institute |
Zhou P.,95 Borlaug Hall |
Tiffin P.,University of Minnesota |
Young N.D.,95 Borlaug Hall
BMC Bioinformatics | Year: 2017
Background: Rapid generation of omics data in recent years has resulted in vast amounts of disconnected datasets without systemic integration and knowledge building, while individual groups have made customized, annotated datasets available on the web with few ways to link them to in-lab datasets. With so many research groups generating their own data, the ability to relate it to the larger genomic and comparative genomic context is becoming increasingly crucial to make full use of the data. Results: The Omics Database Generator (ODG) allows users to create customized databases that utilize published genomics data integrated with experimental data which can be queried using a flexible graph database. When provided with omics and experimental data, ODG will create a comparative, multi-dimensional graph database. ODG can import definitions and annotations from other sources such as InterProScan, the Gene Ontology, ENZYME, UniPathway, and others. This annotation data can be especially useful for studying new or understudied species for which transcripts have only been predicted, and can rapidly give additional layers of annotation to predicted genes. In better studied species, ODG can perform syntenic annotation translations or rapidly identify characteristics of a set of genes or nucleotide locations, such as hits from an association study. ODG provides a web-based user-interface for configuring the data import and for querying the database. Queries can also be run from the command-line and the database can be queried directly through programming language hooks available for most languages. ODG supports most common genomic formats as well as a generic, easy-to-use tab-separated value format for user-provided annotations. Conclusions: ODG is a user-friendly database generation and query tool that adapts to the supplied data to produce a comparative genomic database or multi-layered annotation database.
ODG provides rapid comparative genomic annotation and is therefore particularly useful for non-model or understudied species. For species for which more data are available, ODG can be used to conduct complex multi-omics, pattern-matching queries. © 2017 The Author(s).
Jagtap P.D.,University of Minnesota |
Blakely A.,Hamline University |
Murray K.,University of Minnesota |
Stewart S.,Carleton College |
And 5 more authors.
Proteomics | Year: 2015
Metaproteomics characterizes proteins expressed by microorganism communities (microbiome) present in environmental samples or a host organism (e.g. human), revealing insights into the molecular functions conferred by these communities. Compared to conventional proteomics, metaproteomics presents unique data analysis challenges, including the use of large protein databases derived from hundreds or thousands of organisms, as well as numerous processing steps to ensure high data quality. These challenges limit the use of metaproteomics for many researchers. In response, we have developed an accessible and flexible metaproteomics workflow within the Galaxy bioinformatics framework. Via analysis of human oral tissue exudate samples, we have established a modular Galaxy-based workflow that automates a reduction method for searching large sequence databases, enabling comprehensive identification of host proteins (human) as well as "meta-proteins" from the nonhost organisms. Downstream, automated processing steps enable basic local alignment search tool analysis and evaluation/visualization of peptide sequence match quality, maximizing confidence in results. Outputted results are compatible with tools for taxonomic and functional characterization (e.g. Unipept, MEGAN5). Galaxy also allows for the sharing of complete workflows with others, promoting reproducibility and also providing a template for further modification and enhancement. Our results provide a blueprint for establishing Galaxy as a solution for metaproteomic data analysis. All MS data have been deposited in the ProteomeXchange with identifier PXD001655 (http://proteomecentral.proteomexchange.org/dataset/PXD001655). © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
News Article | December 2, 2016
International research involving a Monash University scientist is using new computer models and evidence from meteorites to show that a low-mass supernova triggered the formation of our solar system. The research is published in the most recent issue of leading scientific journal Nature Communications. About 4.6 billion years ago, a cloud of gas and dust that eventually formed our solar system was disturbed. The ensuing gravitational collapse formed the proto-Sun with a surrounding disc where the planets were born. A supernova--a star exploding at the end of its life-cycle--would have enough energy to induce the collapse of such a gas cloud. "Before this model there was only inconclusive evidence to support this theory," said Professor Alexander Heger from the Monash School of Physics and Astronomy. The research team, led by University of Minnesota School of Physics and Astronomy Professor Yong-Zhong Qian, decided to focus on short-lived radioactive nuclei only present in the early solar system. Due to their short lifetimes, these nuclei could only have come from the triggering supernova. Their abundances in the early solar system have been inferred from their decay products in meteorites. As the debris from the formation of the solar system, meteorites are comparable to the leftover bricks and mortar in a construction site. They tell us what the solar system is made of and in particular, what short-lived nuclei the triggering supernova provided. "Identifying these 'fingerprints' of the final supernova is what we needed to help us understand how the formation of the solar system was initiated," Professor Heger said. "The fingerprints uniquely point to a low-mass supernova as the trigger. The findings in this paper have opened up a whole new direction of research focusing on low-mass supernovae," he said.
In addition to explaining the abundance of Beryllium-10, this low-mass supernova model would also explain the short-lived nuclei Calcium-41, Palladium-107, and a few others found in meteorites. Professor Qian said the group would like to examine the remaining mysteries surrounding short-lived nuclei found in meteorites. The research is funded by the US Department of Energy Office of Nuclear Physics. Professor Heger and a new Monash Future Fellow, Dr Bernhard Mueller, also study such supernovae using computational facilities at the Minnesota Supercomputing Institute. To read the full paper, titled "Evidence from stable isotopes and Be-10 for solar system formation triggered by a low-mass supernova," visit the Nature Communications website.
Cambiotti G.,University of Milan |
Wang X.,CAS Institute of Geology and Geophysics |
Sabadini R.,University of Milan |
Yuen D.A.,University of Minnesota |
Yuen D.A.,Minnesota Supercomputing Institute
Geophysical Journal International | Year: 2016
We challenge the perspective that seismicity could contribute to polar motion by arguing quantitatively that, in first approximation and on the average, interseismic deformations can compensate for it. This point is important because what we must simulate and observe in Earth Orientation Parameter time-series over intermediate timescales of decades or centuries is the residual polar motion resulting from the two opposing processes of coseismic and interseismic deformations. In this framework, we first simulate the polar motion caused by only coseismic deformations during the longest period available of instrumental seismicity, from 1900 to present, using both the CMT and ISC-GEM catalogues. The instrumental seismicity covering a little longer than one century does not represent yet the average seismicity that we should expect on the long term. Indeed, although the simulation shows a tendency to move the Earth rotation pole towards 133°E at the average rate of 16.5 mm yr⁻¹, this trend is still sensitive to individual megathrust earthquakes, particularly to the 1960 Chile and 1964 Alaska earthquakes. In order to further investigate this issue, we develop a global seismicity model (GSM) that is independent from any earthquake catalogue and that describes the average seismicity along plate boundaries on the long term by combining information about present-day plate kinematics with the Anderson theory of faulting, the seismic moment conservation principle and a few other assumptions. Within this framework, we obtain a secular polar motion of 8 mm yr⁻¹ towards 112.5°E that is comparable with that estimated from 1900 to present using the earthquake catalogues, although smaller by a factor of 2 in amplitude and different by 20° in direction.
Afterwards, in order to reconcile the idea of a secular polar motion caused by earthquakes with our simplest understanding of the seismic cycle, we adapt the GSM in order to account for interseismic deformations and we use it to quantify, for the first time ever, their contribution to polar motion. Taken together, coseismic and interseismic deformations make the rotation pole wander around the north pole with maximum polar excursions of about 1 m. In particular, the rotation pole moves towards about Newfoundland when the interseismic contribution dominates over the coseismic ones (i.e. during phases of low seismicity or, equivalently, when most of the fault system associated with plate boundaries is locked). When megathrust earthquakes occur, instead, the rotation pole is suddenly shifted in an almost opposite direction, towards about 133°E. © The Authors 2016.
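The comparison quoted in this abstract (catalogue-based drift of 16.5 mm/yr towards 133°E versus the GSM drift of 8 mm/yr towards 112.5°E) can be checked with a few lines of arithmetic. The sketch below is only an illustration of that check, not the authors' computation; the choice of axes in `rate_vector` (components along the 0° and 90°E meridians) is an assumption made here for convenience.

```python
import math

# Rates (mm/yr) and directions (degrees east) quoted in the abstract above.
CATALOGUE = (16.5, 133.0)  # coseismic simulation from the earthquake catalogues
GSM = (8.0, 112.5)         # catalogue-independent global seismicity model


def rate_vector(rate_mm_yr, direction_deg_east):
    """Split a drift rate into components along the 0° and 90°E meridians.

    The axis convention is an assumption made for this illustration only.
    """
    angle = math.radians(direction_deg_east)
    return rate_mm_yr * math.cos(angle), rate_mm_yr * math.sin(angle)


amplitude_ratio = CATALOGUE[0] / GSM[0]    # 16.5 / 8
direction_gap_deg = CATALOGUE[1] - GSM[1]  # 133 - 112.5

cx, cy = rate_vector(*CATALOGUE)
gx, gy = rate_vector(*GSM)
# Magnitude of the vector difference between the two estimated drifts.
mismatch_mm_yr = math.hypot(cx - gx, cy - gy)

print(f"amplitude ratio: {amplitude_ratio:.2f}")
print(f"direction gap:   {direction_gap_deg:.1f} deg")
print(f"vector mismatch: {mismatch_mm_yr:.1f} mm/yr")
```

The ratio (16.5 / 8 ≈ 2.06) and the gap (133° − 112.5° = 20.5°) match the abstract's rounded "factor of 2" and "20°".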
Jagtap P.,Minnesota Supercomputing Institute |
Bandhakavi S.,Bio Rad Laboratories Inc. |
Higgins L.,University of Minnesota |
Mcgowan T.,University of Minnesota |
And 6 more authors.
Proteomics | Year: 2012
LTQ Orbitrap data analyzed with ProteinPilot can be further improved by MaxQuant raw data processing, which utilizes precursor-level high mass accuracy data for peak processing and MGF creation. In particular, ProteinPilot results from MaxQuant-processed peaklists for Orbitrap data sets resulted in improved spectral utilization due to an improved peaklist quality with higher precision and high precursor mass accuracy (HPMA). The output and postsearch analysis tools of both workflows were utilized for previously unexplored features of a three-dimensional fractionated and hexapeptide library (ProteoMiner) treated whole saliva data set comprising 200 fractions. ProteinPilot's ability to simultaneously predict multiple modifications showed an advantage from ProteoMiner treatment for modified peptide identification. We demonstrate that complementary approaches in the analysis pipeline provide comprehensive results for the whole saliva data set acquired on an LTQ Orbitrap. Overall our results establish a workflow for improved protein identification from high mass accuracy data. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Jagtap P.,Minnesota Supercomputing Institute |
Goslinga J.,University of Minnesota |
Kooren J.A.,University of Minnesota |
Mcgowan T.,University of Minnesota |
And 3 more authors.
Proteomics | Year: 2013
Large databases (>10⁶ sequences) used in metaproteomic and proteogenomic studies present challenges in matching peptide sequences to MS/MS data using database-search programs. Most notably, strict filtering to avoid false-positive matches leads to more false negatives, thus constraining the number of peptide matches. To address this challenge, we developed a two-step method wherein matches derived from a primary search against a large database were used to create a smaller subset database. The second search was performed against a target-decoy version of this subset database merged with a host database. High confidence peptide sequence matches were then used to infer protein identities. Applying our two-step method for both metaproteomic and proteogenomic analysis resulted in twice the number of high confidence peptide sequence matches in each case, as compared to the conventional one-step method. The two-step method captured almost all of the same peptides matched by the one-step method, with a majority of the additional matches being false negatives from the one-step method. Furthermore, the two-step method improved results regardless of the database search program used. Our results show that our two-step method maximizes the peptide matching sensitivity for applications requiring large databases, especially valuable for proteogenomics and metaproteomics studies. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
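The control flow of the two-step method described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. Every name here is hypothetical (`two_step_search`, `add_decoys`, `toy_search`, the `DECOY_` prefix), and `toy_search` is a stand-in for a real database-search program. Decoys are built by reversing sequences, which is one common scheme the abstract does not specify. The sketch also reads "target-decoy version of this subset database merged with a host database" as decoys generated for the merged database.

```python
from typing import Callable, Dict, List, Tuple

Database = Dict[str, str]       # protein ID -> amino acid sequence
Match = Tuple[str, str, float]  # (peptide, protein ID, score)
SearchFn = Callable[[List[str], Database], List[Match]]

DECOY_PREFIX = "DECOY_"  # hypothetical naming convention


def add_decoys(db: Database) -> Database:
    """Return db plus reversed-sequence decoys (an assumed decoy scheme)."""
    decoys = {DECOY_PREFIX + pid: seq[::-1] for pid, seq in db.items()}
    return {**db, **decoys}


def two_step_search(peptides: List[str], large_db: Database, host_db: Database,
                    search: SearchFn, min_score: float):
    # Step 1: search the large database and keep every protein that was hit.
    first_pass = search(peptides, large_db)
    subset_db = {pid: large_db[pid] for _, pid, _ in first_pass}

    # Step 2: merge the subset with the host database, add decoys, search again.
    second_db = add_decoys({**host_db, **subset_db})
    second_pass = search(peptides, second_db)

    # Keep confident target matches. Decoy hits are simply dropped here; a real
    # pipeline would use them to estimate a false discovery rate.
    matches = [m for m in second_pass
               if m[2] >= min_score and not m[1].startswith(DECOY_PREFIX)]
    return subset_db, matches


def toy_search(peptides: List[str], db: Database) -> List[Match]:
    """Stand-in search engine: exact substring match, score = peptide length."""
    return [(pep, pid, float(len(pep)))
            for pep in peptides for pid, seq in db.items() if pep in seq]


# Toy data, invented purely for illustration.
large_db = {"m1": "AAAPEPTIDEKAAA", "m2": "CCCCCCCC", "m3": "GGGSAMPLERGGG"}
host_db = {"h1": "TTTHOSTPEPKTTT"}
peptides = ["PEPTIDEK", "HOSTPEPK", "SAMPLER"]

subset_db, matches = two_step_search(peptides, large_db, host_db,
                                     toy_search, min_score=8.0)
print(sorted(subset_db))
print(matches)
```

The design point is that the second search runs against a much smaller database than the first, so the strict score filter is applied where fewer random matches compete with true ones.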
Sun T.,University of Minnesota |
Wentzcovitch R.M.,University of Minnesota |
Wentzcovitch R.M.,Minnesota Supercomputing Institute
Chemical Physics Letters | Year: 2012
We introduce a new approach to calculate directly the electric current in Born-Oppenheimer molecular dynamics. In this approach the electric current is computed from the adiabatic variations of the Kohn-Sham eigenstates between consecutive time steps. This conceptually straightforward method is fairly efficient and can be easily implemented into existing electronic structure programs. We test the method in two representative systems: liquid D₂O and crystalline MgO. The polarization change and the electric current density computed from the present approach are in excellent agreement with those from the Berry phase method and explicit density functional perturbation theory calculations of Born-effective charges. © 2012 Elsevier B.V. All rights reserved.
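For orientation, the quantities compared in this abstract are linked by standard textbook relations, written out below. This is not the authors' derivation, and the notation is an assumption introduced here: $\mathbf{P}$ is the polarization, $\Delta t$ the molecular dynamics time step, $\Omega$ the cell volume, $\Delta\mathbf{u}_i$ the displacement of ion $i$ over one step, and $Z^{*}_{i,\alpha\beta}$ its Born effective charge tensor.

```latex
\[
\mathbf{J}(t) \;=\; \frac{d\mathbf{P}}{dt}
\;\approx\; \frac{\mathbf{P}(t+\Delta t) - \mathbf{P}(t)}{\Delta t},
\qquad
\Delta P_{\alpha} \;\approx\; \frac{e}{\Omega} \sum_{i,\beta} Z^{*}_{i,\alpha\beta}\, \Delta u_{i,\beta}
\]
```

The first relation defines the current density as the time derivative of the polarization, approximated by a finite difference between consecutive steps. The second is the linear-order estimate of the polarization change from Born effective charges, the kind of reference value the abstract compares against.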
Vermillion K.L.,University of Minnesota |
Jagtap P.,University of Minnesota |
Johnson J.E.,Minnesota Supercomputing Institute |
Griffin T.J.,University of Minnesota |
Andrews M.T.,University of Minnesota
Journal of Proteome Research | Year: 2015
This study uses advanced proteogenomic approaches in a nonmodel organism to elucidate cardioprotective mechanisms used during mammalian hibernation. Mammalian hibernation is characterized by drastic reductions in body temperature, heart rate, metabolism, and oxygen consumption. These changes pose significant challenges to the physiology of hibernators, especially for the heart, which maintains function throughout the extreme conditions, resembling ischemia and reperfusion. To identify novel cardioadaptive strategies, we merged large-scale RNA-seq data with large-scale iTRAQ-based proteomic data in heart tissue from 13-lined ground squirrels (Ictidomys tridecemlineatus) throughout the circannual cycle. Protein identification and data analysis were run through Galaxy-P, a new multiomic data analysis platform enabling effective integration of RNA-seq and MS/MS proteomic data. Galaxy-P uses flexible, modular workflows that combine customized sequence database searching and iTRAQ quantification to identify novel ground squirrel-specific protein sequences and provide insight into molecular mechanisms of hibernation. This study allowed for the quantification of 2007 identified cardiac proteins, including over 350 peptide sequences derived from previously uncharacterized protein products. Identification of these peptides allows for improved genomic annotation of this nonmodel organism, as well as identification of potential splice variants, mutations, and genome reorganizations that provides insights into novel cardioprotective mechanisms used during hibernation. © 2015 American Chemical Society.
Li Y.,Masonic Cancer Center |
Chan S.C.,Masonic Cancer Center |
Brand L.J.,Graduate Program in Microbiology |
Brand L.J.,Masonic Cancer Center |
And 5 more authors.
Cancer Research | Year: 2013
Persistent androgen receptor (AR) transcriptional activity underlies resistance to AR-targeted therapy and progression to lethal castration-resistant prostate cancer (CRPC). Recent success in retargeting persistent AR activity with next generation androgen/AR axis inhibitors such as enzalutamide (MDV3100) has validated AR as a master regulator during all stages of disease progression. However, resistance to next generation AR inhibitors limits therapeutic efficacy for many patients. One emerging mechanism of CRPC progression is AR gene rearrangement, promoting synthesis of constitutively active truncated AR splice variants (AR-V) that lack the AR ligand-binding domain. In this study, we show that cells with AR gene rearrangements expressing both full-length and AR-Vs are androgen independent and enzalutamide resistant. However, selective knock-down of AR-V expression inhibited androgen-independent growth and restored responsiveness to androgens and antiandrogens. In heterogeneous cell populations, AR gene rearrangements marked individual AR-V-dependent cells that were resistant to enzalutamide. Gene expression profiling following knock-down of full-length AR or AR-Vs showed that AR-Vs drive resistance to AR-targeted therapy by functioning as constitutive and independent effectors of the androgen/AR transcriptional program. Further, mitotic genes deemed previously to be unique AR-V targets were found to be biphasic targets associated with a proliferative level of signaling output from either AR-Vs or androgen-stimulated AR. Overall, these studies highlight AR-Vs as key mediators of persistent AR signaling and resistance to the current arsenal of conventional and next generation AR-directed therapies, advancing the concept of AR-Vs as therapeutic targets in advanced disease. © 2012 AACR.
PubMed | University of Minnesota and Minnesota Supercomputing Institute
Type: Journal Article | Journal: Journal of proteome research | Year: 2016
Mammalian hibernation is a strategy employed by many species to survive fluctuations in resource availability and environmental conditions. Hibernating mammals endure conditions of dramatically depressed heart rate, body temperature, and oxygen consumption yet do not show the typical pathological response. Because of the high abundance and metabolic cost of skeletal muscle, not only must it adjust to the constraints of hibernation, but also it is positioned to play a more active role in the initiation and maintenance of the hibernation phenotype. In this study, MS/MS proteomic data from thirteen-lined ground squirrel skeletal muscles were searched against a custom database of transcriptomic and genomic protein predictions built using the platform Galaxy-P. This proteogenomic approach allows for a thorough investigation of skeletal muscle protein abundance throughout their circannual cycle. Of the 1563 proteins identified by these methods, 232 were differentially expressed. These data support previously reported physiological transitions, while also offering new insight into specific mechanisms of how their muscles might be reducing nitrogenous waste, preserving mass and function, and signaling to other tissues. Additionally, the combination of proteomic and transcriptomic data provides unique opportunities for estimating post-transcriptional regulation in skeletal muscle throughout the year and improving genomic annotation for this nonmodel organism.