Seymour S.L.,AB SCIEX |
Farrah T.,Institute for Systems Biology |
Binz P.-A.,Swiss Institute of Bioinformatics |
Binz P.-A.,University of Lausanne |
And 12 more authors.
Proteomics | Year: 2014
Inferring which protein species have been detected in bottom-up proteomics experiments has been a challenging problem for which solutions have been maturing over the past decade. While many inference approaches now function well in isolation, comparing and reconciling the results generated across different tools remains difficult. It presently stands as one of the greatest barriers in collaborative efforts such as the Human Proteome Project and public repositories such as the PRoteomics IDEntifications (PRIDE) database. Here we present a framework for reporting protein identifications that seeks to improve capabilities for comparing results generated by different inference tools. This framework standardizes the terminology for describing protein identification results, associated with the HUPO-Proteomics Standards Initiative (PSI) mzIdentML standard, while still allowing for differing methodologies to reach that final state. It is proposed that developers of software for reporting identification results will adopt this terminology in their outputs. While the new terminology does not require any changes to the core mzIdentML model, it represents a significant change in practice, and, as such, the rules will be released via a new version of the mzIdentML specification (version 1.2) so that consumers of files are able to determine whether the new guidelines have been adopted by export software. © 2014 WILEY-VCH Verlag GmbH & Co.
Cologna S.M.,U.S. National Institutes of Health |
Crutchfield C.A.,U.S. National Institutes of Health |
Searle B.C.,Proteome Software Inc. |
Blank P.S.,U.S. National Institutes of Health |
And 7 more authors.
Journal of Proteome Research | Year: 2015
Protein quantification, identification, and abundance determination are important aspects of proteome characterization and are crucial in understanding biological mechanisms and human diseases. Different strategies are available to quantify proteins using mass spectrometric detection, and most are performed at the peptide level and include both targeted and untargeted methodologies. Discovery-based or untargeted approaches oftentimes use covalent tagging strategies (e.g., iTRAQ, TMT), where reporter ion signals collected in the tandem MS experiment are used for quantification. Herein we investigate the behavior of the iTRAQ 8-plex chemistry using MALDI-TOF/TOF instrumentation. The experimental design and data analysis approach described is simple and straightforward, which allows researchers to optimize data collection and proper analysis within a laboratory. iTRAQ reporter ion signals were normalized within each spectrum to remove peptide biases. An advantage of this approach is that missing reporter ion values can be accepted for purposes of protein identification and quantification without the need for ANOVA. We investigate the distribution of reporter ion peak areas in an equimolar system and a mock biological system and provide recommendations for establishing fold-change cutoff values at the peptide level for iTRAQ data sets. These data provide a unique data set available to the community for informatics training and analysis. © 2015 American Chemical Society.
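The within-spectrum normalization mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes each spectrum is a mapping from reporter channel to peak area, with missing reporters stored as None. The channel labels, the function names, and the choice to divide by the sum of the observed areas are illustrative assumptions, not the authors' published procedure.

```python
from math import log2

def normalize_spectrum(areas):
    """Scale the observed reporter areas of one spectrum so they sum to 1.

    Missing channels (None) are left as None rather than imputed.
    """
    observed = [a for a in areas.values() if a is not None]
    total = sum(observed)
    if total <= 0:
        raise ValueError("spectrum has no positive reporter signal")
    return {ch: (a / total if a is not None else None) for ch, a in areas.items()}

def log2_fold_change(normalized, channel, reference):
    """log2 ratio of two channels, or None if either value is missing or zero."""
    a, b = normalized.get(channel), normalized.get(reference)
    if a is None or b is None or a <= 0 or b <= 0:
        return None
    return log2(a / b)

# Hypothetical spectrum: channel 115 was not detected.
spectrum = {"113": 1200.0, "114": 1150.0, "115": None, "116": 2400.0}
norm = normalize_spectrum(spectrum)
fold_116_vs_113 = log2_fold_change(norm, "116", "113")
fold_115_vs_113 = log2_fold_change(norm, "115", "113")  # None: reporter missing
```

Because each spectrum is scaled on its own, a spectrum with a missing reporter still contributes ratios for the channels that were observed.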
Searle B.C.,University of Washington |
Searle B.C.,Proteome Software Inc. |
Egertson J.D.,University of Washington |
Bollinger J.G.,University of Washington |
And 2 more authors.
Molecular and Cellular Proteomics | Year: 2015
Targeted mass spectrometry is an essential tool for detecting quantitative changes in low-abundance proteins throughout the proteome. Although selected reaction monitoring (SRM) is the preferred method for quantifying peptides in complex samples, the process of designing SRM assays is laborious. Peptides have widely varying signal responses dictated by sequence-specific physicochemical properties; one major challenge is in selecting representative peptides to target as a proxy for protein abundance. Here we present PREGO, a software tool that predicts high-responding peptides for SRM experiments. PREGO predicts peptide responses with an artificial neural network trained using 11 minimally redundant, maximally relevant properties. Crucial to its success, PREGO is trained using fragment ion intensities of equimolar synthetic peptides extracted from data-independent acquisition experiments. Owing to similarities in instrumentation and the nature of data collection, relative peptide responses from data-independent acquisition experiments are a suitable substitute for SRM experiments because they both make quantitative measurements from integrated fragment ion chromatograms. Using an SRM experiment containing 12,973 peptides from 724 synthetic proteins, PREGO exhibits a 40–85% improvement over previously published approaches at selecting high-responding peptides. These results also represent a dramatic improvement over the rules-based peptide selection approaches commonly used in the literature. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
Epstein J.A.,U.S. National Institutes of Health |
Blank P.S.,U.S. National Institutes of Health |
Searle B.C.,Proteome Software Inc |
Catlin A.D.,U.S. National Institutes of Health |
And 5 more authors.
Proteomics | Year: 2016
Current approaches to protein identification rely heavily on database matching of fragmentation spectra or precursor peptide ions. We have developed a method for MALDI TOF-TOF instrumentation that uses peptide masses and their measurement errors to confirm protein identifications from a first pass MS/MS database search. The method uses MS1-level spectral data that have heretofore been ignored by most search engines. This approach uses the distribution of mass errors of peptide matches in the MS1 spectrum to develop a probability model that is independent of the MS/MS database search identifications. Peptide mass matches can come from both precursor ions that have been fragmented as well as those that are tentatively identified by accurate mass alone. This additional corroboration enables us to confirm protein identifications to MS/MS-based scores that are otherwise considered to be only of moderate quality. Straightforward and easily applicable to current proteomic analyses, this tool termed “ProteinProcessor” provides a robust and invaluable addition to current protein identification tools. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim
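The following Python sketch illustrates the kind of MS1 mass-error statistic this abstract refers to. It computes parts-per-million errors for peptide mass matches, summarizes them with a normal distribution, and evaluates the density for a new candidate match. The Gaussian form, the function names, and the example masses are all assumptions made for illustration; the abstract does not specify the form of the ProteinProcessor probability model.

```python
from statistics import NormalDist, mean, stdev

def ppm_error(observed_mass, theoretical_mass):
    """Signed mass measurement error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6

def fit_error_model(matches):
    """Fit a normal distribution to the ppm errors of (observed, theoretical) pairs."""
    errors = [ppm_error(obs, theo) for obs, theo in matches]
    return NormalDist(mean(errors), stdev(errors))

# Hypothetical peptide mass matches: (observed, theoretical) in Da.
matches = [
    (1000.010, 1000.000),
    (1500.020, 1500.005),
    (2000.018, 2000.000),
    (1200.015, 1200.006),
]
model = fit_error_model(matches)

# Density of a new candidate match under the fitted error distribution.
candidate_density = model.pdf(ppm_error(1800.016, 1800.000))
```

A candidate whose error falls far from the bulk of the fitted distribution receives a low density, which is the intuition behind using mass errors as independent corroboration.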
Jones A.R.,University of Liverpool |
Eisenacher M.,Ruhr University Bochum |
Mayer G.,Ruhr University Bochum |
Kohlbacher O.,University of Tübingen |
And 16 more authors.
Molecular and Cellular Proteomics | Year: 2012
We report the release of mzIdentML, an exchange standard for peptide and protein identification data, designed by the Proteomics Standards Initiative. The format was developed by the Proteomics Standards Initiative in collaboration with instrument and software vendors, and the developers of the major open-source projects in proteomics. Software implementations have been developed to enable conversion from most popular proprietary and open-source formats, and mzIdentML will soon be supported by the major public repositories. These developments enable proteomics scientists to start working with the standard for exchanging and publishing data sets in support of publications and they provide a stable platform for bioinformatics groups and commercial software vendors to work with a single file format for identification data. © 2012 by The American Society for Biochemistry and Molecular Biology, Inc.
PubMed | Proteome Software Inc, Johns Hopkins University, Brock University, University of Illinois at Chicago and U.S. National Institutes of Health
Type: Journal Article | Journal: Proteomics | Year: 2016
Current approaches to protein identification rely heavily on database matching of fragmentation spectra or precursor peptide ions. We have developed a method for MALDI TOF-TOF instrumentation that uses peptide masses and their measurement errors to confirm protein identifications from a first pass MS/MS database search. The method uses MS1-level spectral data that have heretofore been ignored by most search engines. This approach uses the distribution of mass errors of peptide matches in the MS1 spectrum to develop a probability model that is independent of the MS/MS database search identifications. Peptide mass matches can come from both precursor ions that have been fragmented as well as those that are tentatively identified by accurate mass alone. This additional corroboration enables us to confirm protein identifications to MS/MS-based scores that are otherwise considered to be only of moderate quality. Straightforward and easily applicable to current proteomic analyses, this tool termed ProteinProcessor provides a robust and invaluable addition to current protein identification tools.
News Article | August 24, 2016
No statistical methods were used to predetermine sample size. The experiments were not randomized and the investigators were not blinded to allocation during experiments and outcome assessment.
The DNA coding sequences (CDSs) of the human APC/C subunits (wild-type, mutant Apc2ΔWHB, Apc11–UbcH10 fusion and ΔApc15) were assembled by USER cloning into a modified version of the insect cell-baculovirus MultiBac expression system (refs 32, 51, 52). All APC/C subunit CDSs were distributed in two recombinant vectors that were used for recombinant baculovirus generation. For APC/C expression, Hi-5 cells at a density of 2 × 10⁶ cells ml⁻¹ were co-infected with two pre-cultures of Sf9 cells each pre-infected with one of the two recombinant APC/C baculoviruses. APC/C expression (unphosphorylated) was performed for 30 h. To obtain APC/COA (phosphorylated APC/C), okadaic acid at a final concentration of 0.1 μM was added after 24 h of infection. Cells were collected after 5 h of treatment. The CDSs of the human MCC subunits (Mad2, Cdc20, BubR1 and Bub3) used for structural analysis were cloned into a pU2 plasmid (ref. 52) using the same method as for the APC/C. BubR1 was fused in frame with an N-terminal 3×Flag tag. Cdc20 for individual expression was cloned into a pFastbac1HTA in frame with the His-tag. In addition, a maltose-binding protein (MBP) tag, followed by a TEV site between the starting codon of Cdc20 and the N-terminal His tag, was added by the restriction-free cloning method (RF-cloning; ref. 53). To obtain a vector containing Mad2, Cdc20 and BubR1 (residues 1–569) CDSs (miniMCC construct), a Mad2- and Cdc20-containing expression cassette from a pU1 vector was shuttled (by the AvrII and PmeI sites) into a pFastbacDual vector (BstZ17I and SpeI sites) that contained 3×Flag–BubR1(1–569) under the control of the p10 promoter. A C-terminal StrepIIx2 tag was added by RF-cloning into the BubR1 constructs used in ubiquitination assays.
Expression of either the MCC or Cdc20 constructs was performed similarly to the APC/C (unphosphorylated) to avoid CDK-dependent inhibition of APC/C-Cdc20 interactions (refs 54, 55). Moreover, cells were collected 48 h after infection. To express MCC complexes with the tagged versions of BubR1, virus containing the BubR1-StrepII constructs was co-infected with MCC virus. To express the MCC complex with the Cdc20K485R,K490R mutations, viruses containing the individual MCC subunits were used for co-infection. Apc15∆NTH, a mutant form of Apc15 with a (Gly-Ser-Ala) linker substitution of the N-terminal helix (NTH: residues 23–57) was cloned into an Escherichia coli pOPIN expression vector and purified using a C-terminal StrepIIx2 tag.
To generate mitotic phosphorylated APC/C (APC/COA) we incubated APC/C-expressing insect cells with the phosphatase inhibitor okadaic acid (OA) (as described above). The extent of APC/C phosphorylation was monitored by assessing the migration of the Apc3 subunit on SDS–PAGE (ref. 56; Extended Data Fig. 1a, f). The recombinant APC/COA was phosphorylated on ~110 sites (Extended Data Table 3), correlating closely with those previously identified in endogenous APC/C isolated from HeLa cells arrested by the mitotic checkpoint (refs 56, 57, 58), and with sites phosphorylated in vitro by the mitotic APC/C activating kinases Cdk2-cyclinA2-Cks2 and Plk1 (ref. 22) (Extended Data Table 3). Compared with APC/C from untreated insect cells, and using Cdc20 as the coactivator, APC/COA readily ubiquitinates securin (Extended Data Fig. 1g, h). The APC/CMCC complex was reconstituted by co-lysing APC/COA-expressing cells with insect cells expressing separately MBP-tagged Cdc20 and the MCC (BubR1, Bub3, Mad2 and untagged Cdc20).
Hi-5 cell pellets expressing either APC/COA or MBP–Cdc20 or MCC were mixed together in reconstitution buffer containing 50 mM HEPES (pH 8.2), 150 mM NaCl, 5% glycerol, 0.5 mM TCEP, 1 mM EDTA, 0.1 mM PMSF, 2 mM benzamidine, 5 U ml⁻¹ benzonase (Novagen), Complete EDTA-free protease inhibitors (Roche), 50 mM NaF, 20 mM β-glycerophosphate and 0.1 μM okadaic acid. After complete mixing the cells were co-lysed by sonication and the lysate was centrifuged for 60 min at 17,000g. The soluble fraction was loaded onto a Strep-Tactin Superflow Cartridge (Qiagen) for purification using the StrepIIx2 tag on Apc4 as described previously (ref. 21). The eluate was then applied to an anti-Flag M2 Affinity Gel (A220, Sigma) column (directed against the N-terminal Flag tag on BubR1) and incubated overnight. The APC/CMCC complex was eluted with a 3×Flag peptide at a concentration of 50 μg ml⁻¹. The resulting elution was concentrated to around 1.4 mg ml⁻¹ and run on a Superose 6 3.2/300 (GE Healthcare Life Sciences) gel-filtration column pre-equilibrated with gel-filtration buffer containing 20 mM HEPES (pH 8.0), 150 mM NaCl and 0.5 mM TCEP. The gel filtration was run on an ÄKTAmicro (GE Healthcare Life Sciences) with a flow rate of 50 μl min⁻¹. An SDS–PAGE of purified APC/CMCC showed both versions of Cdc20, consistent with the incorporation of two distinct subunits of Cdc20 into APC/CMCC (refs 2, 20) (Extended Data Fig. 1j). Reconstituted APC/CMCC is stable and homogeneous as shown by size-exclusion chromatography (Extended Data Fig. 2a). The APC/CApc15∆NTH complex was reconstituted by incubating recombinant APC/C∆Apc15 with Apc15∆NTH, at concentrations of 200 nM and 1 μM, respectively, followed by size-exclusion chromatography. Anti-Apc15 antibodies were from Santa Cruz Biotechnology (sc-398448).
To examine APC/C activity towards securin, the ubiquitination assay was performed with 60 nM of recombinant human APC/C, 150 nM UBA1, 300 nM UbcH10, 300 nM Ube2S, 20 μM ubiquitin, 2 μM securin, 5 mM ATP, 0.25 mg ml⁻¹ BSA and 7 nM of recombinant human Cdc20. The ubiquitination products of securin were detected by western blot with either an anti-His antibody (631212; Clontech) or an anti-securin antibody (700791; Invitrogen). To test the activity of a pre-assembled APC/CMCC complex towards Cdc20MCC (Fig. 5c), ubiquitination reactions were performed with 250 nM of recombinant human APC/CCdc20-MCC and 10 μM of UbcH10 (40× excess). To test the activity of APC/C towards the Cdc20MCC from individually purified wild-type and mutant MCCBubR1-StrepII (purification by StrepIIx2 affinity and gel-filtration columns), ubiquitination reactions were performed with 200 nM of recombinant human APC/COA, 200 nM of recombinant human Cdc20 and either 300 or 600 nM of recombinant human MCCBubR1-StrepII (Fig. 5d, e). Either with a pre-assembled APC/CMCC complex or with a molar excess of MCC complex over free Cdc20 and APC/C, only Cdc20MCC ubiquitination is promoted (data not shown; ref. 20). Cdc20 and the ubiquitination products of Cdc20MCC were detected by western blot with an anti-Cdc20 antibody (Cdc20 H-175 sc-8358; Santa Cruz Biotechnology).
Freshly purified APC/CMCC samples were analysed by negative-stain EM to check the sample quality and to obtain a low-resolution reconstruction. Micrographs were collected on a 2k×2k CCD camera fitted to a FEI Spirit electron microscope at an accelerating voltage of 120 kV, operated at a nominal magnification of 42,000 with a resulting pixel size of 2.46 Å per pixel at specimen level. Defocuses were set at approximately −2 μm. Particles were automatically selected using the autoboxer program implemented in EMAN (ref. 59). About 150 micrographs per sample were collected yielding ~10,000 particles.
After 3D classification performed with RELION (ref. 60) only the prominent best class (30–40% of total amount of particles) was used for auto-refinement and final low-resolution structure determination.
Grid preparation for both negative-stain EM and cryo-EM was performed as described previously (refs 32, 51). Cryo-EM micrographs were collected with an FEI Tecnai Polara electron microscope at an acceleration voltage of 300 kV and Falcon III direct detector. Micrographs were taken using EPU software (FEI) at a nominal magnification of 78,000, yielding a pixel size of 1.36 Å per pixel at specimen level. A total exposure time of 1.6 s was used at a dose rate of 27 electrons per pixel. Defocus range was set at −2.0 to −4.0 μm. Movie frames were recorded as described (ref. 32).
Image processing was performed with RELION 1.4 (ref. 60). The initial steps including motion correction, CTF estimation, particle picking and particle sorting by Z-score and 2D classification were performed as described (ref. 32). Selected particles were used for a first round of 3D classification with global search and a sampling angular interval of 7.5°, using a 60 Å low-pass filtered APC/CCdh1.Emi1 EM map as a reference (ref. 32). Poorly characterized 3D classes, with poorly recognizable features, were discarded at this stage and the remaining particles were refined and corrected for beam-induced particle motion using particle polishing in RELION (ref. 61). Polished particles were used for another round of 3D classification with a local search within 15° and a smaller angular sampling interval of 3.7° (Extended Data Figs 4 and 7). The reconstruction generated from all the polished particles, low-pass filtered at 40 Å, was used as reference. To isolate particles for the APC/CMCC-closed state, classes showing closed-like features for the MCC–Cdc20APC/C module (for example, proximity to Apc2, Apc4 and Apc10; Extended Data Fig. 4, classes 1–3) were combined and refined.
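As a rough check on the cryo-EM acquisition parameters quoted above, the sketch below converts the dose rate, exposure time and pixel size into a total dose per square ångström and the Nyquist limit of the sampling. It assumes the quoted dose rate of 27 electrons per pixel is per second, which the text does not state; the function names are illustrative.

```python
def total_dose_per_A2(dose_rate_e_per_px_s, exposure_s, pixel_size_A):
    """Accumulated dose in electrons per square angstrom.

    Assumes the dose rate is given in electrons per pixel per second.
    """
    return dose_rate_e_per_px_s * exposure_s / pixel_size_A ** 2

def nyquist_limit_A(pixel_size_A):
    """Finest resolvable spacing for a given pixel size (two pixels)."""
    return 2.0 * pixel_size_A

dose = total_dose_per_A2(27.0, 1.6, 1.36)   # values quoted in the text
nyquist = nyquist_limit_A(1.36)
print(f"total dose ~ {dose:.1f} e/A^2, Nyquist limit = {nyquist:.2f} A")
```

If the dose rate were instead quoted for the whole exposure, the total dose would be lower by the factor of the exposure time.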
The resultant map was used as reference for a subsequent 3D classification performed with a soft edge mask on the MCC–Cdc20APC/C module (Extended Data Fig. 4). The mask was created from a map converted from the fitted coordinates of the MCC–Cdc20 module, with three pixel extension and five pixels soft edge width. The MCC–Cdc20 module coordinates were created by fitting the MCC core coordinates and isolated Cdc20 (PDB code 4AEZ; ref. 18), on the best MCC–Cdc20APC/C module density map (Extended Data Fig. 4, class 1). To isolate particles for the APC/CMCC-open state, classes showing open-like features for the MCC–Cdc20 module (for example, proximity to TPR lobe and loss of contact with Apc2, Apc4 and Apc10; Extended Data Fig. 4, classes 4–5) were refined together. The obtained averaged class was used as a reference for a subsequent 3D classification performed with a larger mask (6 pixel extension and 6 pixel soft edge) created with the MCC–Cdc20APC/C module coordinates fitted into the corresponding density in the APC/CUbcH10-MCC reconstruction described below (Extended Data Figs 4 and 7). To obtain the APC/C∆Apc15-MCC structure, the best classes from the 3D classification with local searches step were refined together (Extended Data Fig. 7a, classes 1–3). To isolate the particles for the APC/CUbcH10-MCC reconstruction, instead of performing the 3D classification with local search steps, an initial classification with a large mask (similar to APC/CMCC-open) was performed. The latter allowed the identification of a class that features both the MCC–Cdc20 module and the UbcH10-Apc11-Apc2WHB-Apc2α/β domain assembly (ref. 32). A large mask including the latter regions was created by fitting the MCC–Cdc20APC/C module coordinates and the UbcH10-Apc11-Apc2WHB-Apc2α/β domain assembly (PDB code 5A31; ref. 32) in the preliminary APC/CUbcH10-MCC reconstruction.
The latter mask was used for a re-classification of the initial particles and allowed the isolation of the final APC/CUbcH10-MCC particles (Extended Data Fig. 7c).
All resolution estimates were based on the gold standard Fourier shell correlation (FSC) = 0.143 criterion (ref. 62). Final FSC curves were calculated using a soft mask (five pixel extension and three pixel soft edge) of the two independent reconstructions. To visualize high-resolution details, all density maps were corrected for the modulation transfer function of the detector and sharpened by applying negative B-factors, estimated using automated procedures. Local resolution maps for all the cryo-EM reconstructions were calculated with RESMAP (ref. 63) using a resolution range between 3.5 and 15 Å and displayed with Chimera (ref. 64). For comparing structural features among the cryo-EM reconstructions, shown in Fig. 4 and Extended Data Fig. 3, which have different overall resolutions, a common filter of 8.5 Å was applied. This was selected based on the local resolution of the APC/CUbcH10-MCC map in the region assigned to Apc15 (the main region of relative comparison). APC/CUbcH10-MCC is the APC/C reconstruction with the lowest overall resolution. Filtering all the reconstructions to 8.5 Å resolution allowed a clear definition of the structural details of Apc15 and other regions without the appearance of noise. To visualize the connecting density between UbcH10 and Cdc20 the APC/CUbcH10-MCC map was filtered to 12 Å resolution based on the local resolution of this area and the threshold was slightly lowered.
Initial fitting and superposition of coordinates was performed with Chimera (ref. 64). Model building of APC/CMCC was performed in COOT (ref. 65). APC/C platform, TPR lobe, Apc10 and accessory subunit coordinates from the atomic structure of APC/CCdh1.Emi1 (PDB code 4UI9; ref. 32) were individually rigid body fit into the APC/CMCC-closed cryo-EM density. A few regions such as Apc4HBD, Apc5NTD and Apc11 were also modified by flexible fitting.
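The FSC = 0.143 criterion used above for the resolution estimates can be illustrated with a short Python sketch. Given an FSC curve sampled at increasing spatial frequencies, it finds the first crossing below the threshold by linear interpolation between shells and reports the corresponding resolution. The curve values and the function name are hypothetical; this is not the RELION implementation, only the reading of a curve that such a criterion implies.

```python
def resolution_at_threshold(spatial_freqs, fsc, threshold=0.143):
    """Resolution (in angstroms) where the FSC curve first drops below threshold.

    spatial_freqs: shell frequencies in 1/angstrom, ascending.
    fsc: correlation between the two half-maps in each shell.
    Returns None if the curve never crosses the threshold from above.
    """
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= threshold > fsc[i]:
            # Linear interpolation between the two shells that bracket the crossing.
            frac = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            freq = spatial_freqs[i - 1] + frac * (spatial_freqs[i] - spatial_freqs[i - 1])
            return 1.0 / freq
    return None

# Hypothetical FSC curve between two independently refined half-maps.
freqs = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
curve = [0.99, 0.95, 0.80, 0.45, 0.20, 0.05]
resolution = resolution_at_threshold(freqs, curve)
```

The same function with a different threshold (for example 0.5) gives a more conservative estimate from the same curve.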
The Apc2WHB domain (PDB code 4YII; ref. 44) was rigid body fit into the corresponding density. Cdc20APC/C IR tail and NTD were rigid body fit from the coordinates of APC/CCdc20-Hsl1 cryo-EM structure (ref. 22). The Cdc20MCC IR tail was modelled by superposing the TPR domain of Apc3 including Cdc20IR from APC/CCdc20-Hsl1 to the TPR domain of APC/CMCC Apc8A. Two copies of the human Cdc20WD40 domain (PDB code 4GGA; ref. 66), human C-Mad2 (PDB code 2V64; ref. 8) and the human BubR1TPR domain (PDB code 3SI5; ref. 67) were rigid body fit on the MCC–Cdc20 module density. Cdc20MCC CRY box, included in the human Cdc20WD40 domain crystal structure (PDB code 4GGA; ref. 66), was modelled by flexible fitting. In addition, the Cdc20 KILR motif was modelled by rigid body fit of the MCC core crystal structure (PDB code 4AEZ; ref. 18) into the corresponding density. A similar procedure was applied to model the first KEN1 and helix–loop–helix region of BubR1. BubR1 D1 and D2 were modelled by rigid body fit of Acm1 D-box 3 (PDB code 3BH6; ref. 38). Similarly BubR1 A1 and K2 were modelled by flexible fitting of the Acm1 region spanning the A-motif and KEN box as explained in the main text. BubR1 A2 was modelled as a rigid body fit of the Acm1 A-motif. Loop extensions were modelled as idealized polyalanine. Model refinement was performed with REFMAC 5.8 (ref. 68). A REFMAC weight of 0.04 was defined by cross-validation using half reconstructions (ref. 69). A resolution limit of 4.0 Å was used. All available crystal structures or NMR structures were used for secondary structure restraints. The refinement statistics are summarized in Extended Data Table 2b. Figures were generated using Pymol and Chimera (ref. 70). Structural conservation figures were generated using ConSurf (ref. 71).
Purified proteins were prepared for mass spectrometric analysis by in-solution enzymatic digestion, without prior reduction and alkylation. Protein samples were digested with trypsin or elastase (Promega), both at an enzyme to protein ratio of 1:20.
The resulting peptides were analysed by nano-scale capillary LC-MS/MS using an Ultimate U3000 HPLC (ThermoScientific Dionex) to deliver a flow of approximately 300 nl min⁻¹. A C18 Acclaim PepMap100 5 μm, 100 μm × 20 mm nanoViper (ThermoScientific Dionex), trapped the peptides before separation on a C18 Acclaim PepMap100 3 μm, 75 μm × 250 mm nanoViper (ThermoScientific Dionex). Peptides were eluted with a 90-min gradient of acetonitrile (2% to 50%). The analytical column outlet was directly interfaced via a nano-flow electrospray ionization source, with a hybrid quadrupole orbitrap mass spectrometer (Q-Exactive Plus Orbitrap, ThermoScientific). LC–MS/MS data were then searched against an in-house LMB database using the Mascot search engine (Matrix Science; ref. 72), and the peptide identifications validated using the Scaffold program (Proteome Software Inc.; ref. 73). All data were additionally interrogated manually.
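Database search engines match MS/MS spectra against peptides generated by in-silico digestion of the protein database. The Python sketch below shows the commonly used trypsin rule (cleave C-terminal to lysine or arginine unless the next residue is proline) with an optional number of missed cleavages. The function name and the example sequence are hypothetical, and the less specific elastase digestion also used here is not modelled.

```python
def tryptic_peptides(sequence, missed_cleavages=0):
    """In-silico tryptic digest: cleave after K or R, but not before P."""
    # Positions where the chain is cut (index of the first residue after the cut).
    sites = [
        i + 1
        for i, residue in enumerate(sequence[:-1])
        if residue in "KR" and sequence[i + 1] != "P"
    ]
    bounds = [0] + sites + [len(sequence)]
    peptides = []
    for start in range(len(bounds) - 1):
        # Extend each peptide across up to `missed_cleavages` uncut sites.
        last = min(start + 1 + missed_cleavages, len(bounds) - 1)
        for end in range(start + 1, last + 1):
            peptides.append(sequence[bounds[start]:bounds[end]])
    return peptides

# Hypothetical sequence; the R-P bond at positions 6-7 is not cleaved.
fully_cleaved = tryptic_peptides("MKTAYRPLGKSDER")
with_one_missed = tryptic_peptides("MKTAYRPLGKSDER", missed_cleavages=1)
```

Allowing missed cleavages enlarges the candidate peptide list, which matters for samples digested without prior reduction and alkylation, where cleavage may be less complete.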