Hinxton, United Kingdom

No statistical methods were used to predetermine sample size. The investigators were not blinded to allocation during experiments and outcome assessment. Stable cell lines expressing affinity-tagged bait proteins were created according to protocols described previously in detail4. In brief, C-terminally HA–Flag-tagged clones targeting human bait proteins were constructed from clones included in version 8.1 of the human ORFeome (http://horfdb.dfci.harvard.edu)14. All expression clones used in this study are available from the Dana Farber/Harvard Cancer Center DNA Resource Core Facility (http://dnaseq.med.harvard.edu/). After sequence validation, clones were introduced into HEK293T, HCT116, or MCF10A cells (all from American Type Culture Collection) via lentiviral transfection. Cells were expanded under puromycin selection to obtain five 10-cm dishes per cell line before AP–MS. Bait proteins were selected from the ORFeome for high-throughput AP–MS analysis in batches corresponding to individual 96-well plates. Plates were selected for processing in random order. For AP–MS experiments in MCF10A cells, 1.15 × 10^6 cells per 15 cm dish were collected after 3 days (sub-confluent) or after 14 days in culture (contact inhibited) to allow for expulsion of YAP1 from the nucleus and Hippo pathway activation. MCF10A cells were grown in DMEM/F12 media supplemented with 5% horse serum, 20 ng ml−1 EGF, 10 μg ml−1 insulin, 0.5 μg ml−1 hydrocortisone, 100 ng ml−1 cholera toxin, 50 U ml−1 penicillin, and 50 μg ml−1 streptomycin. All cell lines were found to be free of mycoplasma using Mycoplasma Plus PCR assay kit (Agilent). Karyotyping (GTG-banded karyotype) of HeLa, HCT116, and HEK293T cells for cell line validation was performed by Brigham and Women’s Hospital Cytogenomics Core Laboratory. All AP–MS experiments were performed as presented previously in full4. 
In brief, cell pellets were lysed in the presence of 50 mM Tris-HCl pH 7.5, 300 mM NaCl, 0.5% (v/v) NP40, followed by centrifugation and filtration to remove debris. Immunoprecipitation was achieved using immobilized and pre-washed mouse monoclonal anti-HA agarose resin (Sigma-Aldrich, clone HA-7) that was incubated with clarified lysate for 4 h at 4 °C before removal of supernatant and four washes with lysis buffer followed by two washes with PBS (pH 7.2). Complexes were eluted in two steps using HA peptide in PBS at 37 °C and subsequently underwent TCA precipitation. Baits were processed in batches corresponding to 96-well plates in the ORFeome collection; plates were processed in random order. In preparation for LC–MS analysis, protein samples were reduced and digested with sequencing-grade trypsin (Promega). Peptides were then de-salted using homemade StageTips30 and approximately 1 μg of peptides were loaded onto C18 reversed-phase microcapillary columns and analysed on Thermo Fisher Q-Exactive mass spectrometers. Data acquisition methods were approximately 70 min long, including sample loading, gradient, and column re-equilibration. Tandem mass spectrometry (MS/MS) spectra were acquired in data-dependent fashion targeting the top 20 precursors for MS2 analysis. Unless noted otherwise, a single biological replicate of each bait was subjected to affinity purification followed by technical duplicate LC–MS analysis. For a complete description of data acquisition parameters, see ref. 4. A brief synopsis of our methods for identifying peptides and proteins from LC–MS data and distinguishing bona fide interacting proteins from background is provided here. For full details, refer to ref. 4. The BioPlex 2.0 network was generated by reanalysing Sequest search results from the BioPlex 1.0 dataset, combined with additional new AP–MS datasets. 
Sequest31 was used to match MS/MS spectra with peptide sequences from the Uniprot20 human protein database supplemented with sequences of green fluorescent protein (GFP) (our negative control), our Flag–HA affinity tag, and common contaminant proteins. This version of the UniProt database includes both SwissProt and Trembl entries and was current in 2013, at the outset of this project when the first AP–MS data were collected and searched. All protein sequences were included in forward and reversed orientations. Only fully tryptic peptides with two or fewer missed cleavages were considered, and precursor and product ion mass tolerances were set to 50 p.p.m. and 0.05 Da, respectively. The sole variable modification considered was oxidation of methionine (+15.9949). Target-decoy filtering32 was applied to control FDRs, using a linear discriminant function for peptide filtering and probabilistic scoring at the protein level33. Linear discriminant analysis considered Xcorr, D-Cn, peptide length, charge state, fractions of ions matched, and precursor mass error to distinguish correct from incorrect identifications. Peptide-spectral matches from each run were filtered to a 1% protein-level FDR with additional entropy-based filtering4 to reduce the final dataset protein-level FDR to well under 1%. Protein identifications supported by only a single peptide were discarded as well. These additional post-search filters further reduced the dataset-level FDR by over 100-fold. Scoring to identify HCIPs was performed in multiple stages after combining technical duplicate analyses of each AP–MS experiment and mapping all protein identifiers to Entrez Gene identifiers to minimize technical issues due to protein isoforms. Protein abundances in each immunoprecipitation were quantified using spectral counts averaged across technical replicates. 
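The target-decoy filtering mentioned above can be illustrated with a minimal sketch. It is not the pipeline used in the study: it assumes each match already carries a single score (standing in for the linear discriminant score), estimates the FDR as decoys divided by targets among accepted matches (one common convention), and works at the level of individual matches rather than proteins. The function name `filter_to_fdr` and the toy data are hypothetical.

```python
from typing import List, Tuple

# A match is (score, is_decoy); higher scores are assumed to be better.
Match = Tuple[float, bool]


def filter_to_fdr(matches: List[Match], max_fdr: float = 0.01) -> List[Match]:
    """Return the target matches in the longest top-scoring prefix whose
    estimated FDR (decoys / targets, an assumed convention) is <= max_fdr."""
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    best_cut = 0
    decoys = 0
    targets = 0
    for i, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest cut-off that still satisfies the FDR limit.
        if targets > 0 and decoys / targets <= max_fdr:
            best_cut = i
    # Decoys inside the accepted prefix are dropped from the output.
    return [m for m in ranked[:best_cut] if not m[1]]


if __name__ == "__main__":
    toy = [(10.0, False), (9.0, False), (8.0, True), (7.0, False)]
    print(len(filter_to_fdr(toy, max_fdr=0.01)))
```

Because sequences were searched in forward and reversed orientations, a reversed-sequence hit plays the role of `is_decoy=True` in this sketch.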
The CompPASS algorithm34, 35 compared abundances of the proteins detected in each immunoprecipitation with their average levels across all other immunoprecipitations, returning a z score that quantified the extent to which a protein’s abundance exceeds its average levels across the dataset as well as the empirical NWD-score that accounted for a protein’s abundance, frequency of detection, and consistency across duplicate analyses. Subsequent filtering based on PSM counts, entropy scoring, and each protein’s frequency of detection within each batch of samples minimized false positives, liquid chromatography carryover, and technical artefacts. Putative bait–prey interactions were further filtered using CompPASS-Plus4, a naive Bayes classifier that learns to distinguish true interacting proteins from non-specific background and false positive identifications on the basis of CompPASS scores and several other metrics described previously. The algorithm modelled true interactions using examples from STRING36 and GeneMania37 databases. False positive protein identifications were modelled using decoy identifications that had survived previous filters. All remaining data were used to model background. Cross-validation was applied by batch, with each 96-well plate of immunoprecipitations scored using a model trained on ~57 different plates. Bait–prey interactions were then assembled across immunoprecipitations to produce a single network, combining scores of reciprocal interactions to increase their weight. BioPlex 2.0 was obtained by pruning this network to retain only those interactions that earned scores above 0.75, as described previously4. See Supplementary Table 1 for a list of baits as well as a complete list of interactions. BioPlex 2.0 interaction data were compared with data from BioGRID38, CORUM15, STRING36, GeneMania37, and MINT39 databases as described previously4. 
Because the BioPlex 2.0 dataset incorporates the contents of BioPlex 1.0 and data from this project have been deposited directly into BioGRID, released to the scientific community via the project website (http://bioplex.hms.harvard.edu), and otherwise distributed40 at intervals throughout the project, snapshots of these databases predating public disclosure of any BioPlex data were used to ensure that no interactions derived from BioPlex were included in the comparison. In Extended Data Fig. 1a, several data sources were used to determine the fractions of various protein families included as baits or preys in BioPlex 1.0 or 2.0. The list of human kinases was downloaded from kinase.com (http://kinase.com/web/current/human/; December 2007 update). Mitochondrial proteins were taken from MitoCarta 2.0 (ref. 41). Lists of transcription factors and chromatin-remodelling factors were drawn from http://www.bioguo.org. Drug target lists were taken from http://www.drugbank.ca. Cancer genes were taken from ref. 42. Disease genes were extracted from the curated set of disease–gene associations in the DisGeNET database25. ‘Essential’ genes were taken from recent papers describing clustered regularly interspaced palindromic repeat (CRISPR)–Cas9 screening to identify human genes that confer a fitness advantage6, 7. In each case, protein identifiers were converted to Entrez Gene identifiers, if necessary, and compared against those gene products included in either interaction network. Each of these analyses was performed exactly as described previously4. Brief summaries follow. Subcellular localization predictions relied upon localization information provided for a subset of proteins by the UniProt website (http://www.uniprot.org) in March 2016. 
These localization terms were manually condensed to 13 core localizations: nucleus, cytoplasm, cytoskeleton, endosome, endoplasmic reticulum, extracellular, Golgi, lysosome, mitochondrion, peroxisome, plasma membrane, vesicle, and cell projection. Fisher’s exact test was used to calculate the enrichment of each term among each protein’s primary and secondary neighbours, with multiple testing correction43. Predictions were made when enrichments were significant at an adjusted FDR of 1%. Localization predictions are provided in Supplementary Table 3. Domain–domain associations were uncovered by mapping PFAM domains onto the 56,553 protein–protein interactions in the BioPlex 2.0 network. After counting the numbers of interactions involving each domain individually and the number of interactions in which the domains were brought together within separate proteins, Fisher’s exact test was used to evaluate significance with subsequent correction for multiple hypothesis testing. Domains were considered significantly associated at an adjusted P value less than 0.01. Significant domain–domain associations are summarized in Supplementary Table 4. The enrichment of GO44 terms and PFAM22 domains was determined among each protein’s immediate neighbours and for each network community using Fisher’s exact test with multiple testing correction43. GO and PFAM data were downloaded from the UniProt website (http://www.uniprot.org) in March 2016. Only terms occurring at least twice were considered. Enrichments of GO terms and PFAM domains among each protein’s neighbours are summarized in Supplementary Table 5. The MCL algorithm5 was used to partition the BioPlex 2.0 network into communities of tightly interconnected proteins, using an implementation provided by the algorithm’s creator, S. van Dongen, at http://micans.org/mcl/. The option --force-connected=y was used to ensure that final clusters correspond to connected components. 
The MCL algorithm requires specification of one parameter, the inflation parameter, which controls the granularity of the clusters that are produced. Clustering of BioPlex 2.0 was repeated for several values of the inflation parameter between 1.5 and 2.5. After comparing experimentally derived clusters with known protein complexes, an inflation parameter of 2.0 was selected for final clustering. Clusters containing fewer than three proteins were discarded, producing a final list of 1,320 protein communities. Each cluster and its members are summarized in Supplementary Table 6; GO terms and PFAM domains enriched in each community are provided in Supplementary Table 7. One important question has been the extent to which each of the clusters observed in BioPlex 2.0 is also visible in BioPlex 1.0. To address this question, we mapped each cluster detected in BioPlex 2.0 onto the BioPlex 1.0 network. If a given cluster was also reflected in BioPlex 1.0, then we would expect to see an enrichment of interactions; conversely, if interactions were not enriched among the relevant set of proteins above background, then there would be no evidence to support the indicated cluster. After mapping each cluster of tightly interconnected proteins from BioPlex 2.0 onto the BioPlex 1.0 network, we used a binomial test to evaluate the enrichment of BioPlex 1.0 interactions among matching proteins. The probability of interaction was estimated from the fraction of all possible interactions in the BioPlex 1.0 network that was actually detected (8.08 × 10^−4); the number of trials was taken to be the maximum number of interactions possible among those proteins within the cluster that were part of the BioPlex 1.0 network; the number of interactions actually observed in this portion of BioPlex 1.0 was taken as the number of successes. A one-sided binomial test was performed and a correction for multiple testing was applied43. 
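The one-sided binomial test described above can be sketched as follows. The interaction probability of 8.08 × 10^−4 is taken from the text; the function names, the log-space evaluation, and the example cluster are illustrative assumptions, not the authors' code. The number of trials is the number of protein pairs, n(n − 1)/2, among the n cluster members present in BioPlex 1.0.

```python
from math import exp, lgamma, log, log1p


def binomial_upper_tail(successes: int, trials: int, p: float) -> float:
    """P(X >= successes) for X ~ Binomial(trials, p), with 0 < p < 1.

    Terms are evaluated in log space so that large binomial coefficients
    do not overflow when converted to floating point.
    """
    if successes <= 0:
        return 1.0
    total = 0.0
    for k in range(successes, trials + 1):
        log_pmf = (
            lgamma(trials + 1) - lgamma(k + 1) - lgamma(trials - k + 1)
            + k * log(p) + (trials - k) * log1p(-p)
        )
        total += exp(log_pmf)
    return min(1.0, total)


def cluster_enrichment_p(n_proteins: int, observed_edges: int,
                         p_edge: float = 8.08e-4) -> float:
    """One-sided P value for seeing at least `observed_edges` interactions
    among `n_proteins` cluster members present in the older network."""
    trials = n_proteins * (n_proteins - 1) // 2  # all possible protein pairs
    return binomial_upper_tail(observed_edges, trials, p_edge)


if __name__ == "__main__":
    # Hypothetical cluster: 5 members in BioPlex 1.0, 4 interactions observed.
    print(cluster_enrichment_p(5, 4))
```

The resulting P values would still need the multiple testing correction mentioned in the text before any cluster is called enriched.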
Overall, 45% of complexes detected in BioPlex 2.0 did not show any enrichment for protein interactions in BioPlex 1.0, suggesting that these were macromolecular complexes not covered in the first interaction network. Moreover, although the remaining 55% of complexes were at least partly reflected in BioPlex 1.0, the density of their coverage consistently increased with incorporation of additional AP–MS data into the BioPlex 2.0 network. In addition to using MCL clustering to partition the BioPlex 2.0 network into individual clusters of tightly interconnected proteins, we also wanted to explore patterns of interconnection within the network that related these clusters to each other. For this purpose, we searched for pairs of clusters that were connected to each other through interactions among their constituent proteins more often than would be expected. First, the full set of 56,553 interactions was trimmed to include only those interactions connecting one cluster with another, and the set of all cluster pairs connected by one or more interactions was identified. For each of these pairs of clusters, the number of interactions connecting the pair was determined, as were the numbers of interactions involving each cluster individually. Fisher’s exact test was used to identify pairs of clusters that were enriched for interactions among them, followed by multiple testing correction43. The 929 cluster–cluster associations that were accepted at a 1% FDR are displayed in Fig. 3a and Extended Data Fig. 9 and provided in Supplementary Table 6. GO and PFAM enrichments for each community are summarized in Supplementary Table 7. The first step towards examining network properties of fitness proteins was to combine lists of proteins associated with increased cellular fitness from refs 6, 7 into a single composite list. For our purposes, we used the union of both lists to define the set of fitness proteins. 
Entrez Gene identifiers were associated with proteins on this list and mapped onto the BioPlex 2.0 network. To assess network properties of fitness proteins, the composite list of proteins associated with increased cellular fitness was superimposed onto the BioPlex network, effectively subdividing all proteins in the network into two groups corresponding to fitness and non-fitness proteins. Vertex degrees, local clustering coefficients, and eigenvector centralities were then computed and averaged across all fitness proteins. To evaluate whether these values differed for fitness proteins compared with randomly selected protein subsets of equivalent size, fitness and non-fitness labels were scrambled across the network and a new average was calculated for the randomized list of fitness proteins. This process was repeated 10,000 times to define null distributions for each statistic. Since these distributions were normally distributed, Gaussian distributions were fitted to each and used to assign z scores and P values for each statistic associated with the true set of fitness proteins. To evaluate graph assortativity, the BioPlex network was subdivided into fitness and non-fitness proteins and the assortativity of the partitioned graph was calculated. This process was repeated 10,000 times, randomizing fitness and non-fitness labels, and the resulting distribution was fitted to a Gaussian distribution and used to determine a z score and P value associated with the true assortativity. A second goal was to identify clusters enriched with fitness proteins. For this purpose, a one-sided hypergeometric test was used to evaluate the enrichment of fitness proteins, taking into account the size of the cluster, the size of the BioPlex network, and the fraction of network proteins that were associated with increased cellular fitness. Only clusters containing two or more fitness proteins were considered for this analysis. 
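The label-shuffling procedure used for vertex degree, clustering coefficient, and eigenvector centrality can be sketched generically. This minimal version assumes the per-protein statistic has already been computed and stored in a dictionary, draws random protein subsets of the same size as the fitness set, and reports a one-sided upper-tail P value from the fitted Gaussian. The choice of tail, the function name, and the toy data are assumptions; the text does not specify them.

```python
import random
from statistics import NormalDist, mean, stdev
from typing import Dict, Set, Tuple


def permutation_z(values: Dict[str, float], labelled: Set[str],
                  n_perm: int = 10_000, seed: int = 0) -> Tuple[float, float]:
    """Compare the mean statistic of `labelled` proteins with a null
    distribution built from random subsets of equal size.

    Returns (z, p), where p is the upper-tail probability under a Gaussian
    fitted to the null means (an assumed convention).
    """
    rng = random.Random(seed)
    nodes = sorted(values)  # fixed order keeps the result reproducible
    k = len(labelled)
    observed = mean(values[n] for n in labelled)
    null = [mean(values[n] for n in rng.sample(nodes, k))
            for _ in range(n_perm)]
    mu, sigma = mean(null), stdev(null)
    z = (observed - mu) / sigma
    p = 1.0 - NormalDist().cdf(z)
    return z, p


if __name__ == "__main__":
    # Toy network statistic: protein "p<i>" has degree i.
    degrees = {f"p{i}": float(i) for i in range(100)}
    fitness = {"p95", "p96", "p97", "p98", "p99"}
    print(permutation_z(degrees, fitness, n_perm=2000))
```

Sampling a random subset of fixed size is equivalent to scrambling the fitness and non-fitness labels across the network, which is why no explicit label vector appears in the sketch.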
Once a multiple testing correction43 was applied, 53 communities were found to be enriched with fitness proteins at a 1% FDR. These clusters are summarized in Extended Data Fig. 9. Levels of enrichment are summarized for those communities containing two or more cellular fitness proteins in Supplementary Table 8. To assess the tendency for clusters containing fitness proteins or enriched for fitness proteins to be centrally located within the cluster–cluster association network (Fig. 3a), all clusters were sorted according to their eigenvector centralities. The Kolmogorov–Smirnov test was used to compare distributions of clusters enriched and not enriched with fitness proteins within the ranked list of all clusters. This process was repeated to compare distributions of clusters containing multiple fitness proteins with clusters containing 0 or 1 fitness proteins, as shown in Fig. 3d. The basis for our study of protein complexes and disease was the DisGeNET database of disease–gene associations25. For our analysis we used the full database that relates over 16,000 genes with 13,000 partly redundant disease classifications. Each disease state and its associated proteins were then mapped onto each BioPlex 2.0 complex and evaluated for enrichment using a hypergeometric test, taking into account the size of the complex, the number of disease proteins in the complex, the number of disease proteins within the network, and the total network size. This process was repeated for each community and for each disease state. After multiple testing correction43, those complexes enriched with proteins involved with each disease at a 1% FDR were deemed associated. The resulting disease–complex associations were assembled into a network in which clusters and disease states are both represented as nodes, with edges connecting clusters with significantly associated disease states, depicted in full in Fig. 4a. 
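The hypergeometric enrichment test and the subsequent FDR control can be sketched as below. The four inputs mirror those named in the text (complex size, disease proteins in the complex, disease proteins in the network, total network size). The Benjamini-Hochberg procedure is used here as an assumed stand-in for the multiple testing correction cited as ref. 43, and all function names and numbers are hypothetical.

```python
from math import comb
from typing import List


def hypergeom_upper_tail(k: int, complex_size: int,
                         n_disease: int, network_size: int) -> float:
    """P(X >= k): probability that a random set of `complex_size` proteins,
    drawn without replacement from `network_size` proteins of which
    `n_disease` are disease-associated, contains at least k disease proteins."""
    upper = min(complex_size, n_disease)
    favourable = sum(
        comb(n_disease, i) * comb(network_size - n_disease, complex_size - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(network_size, complex_size)


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return BH-adjusted P values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical complexes: (disease proteins in complex, complex size).
    complexes = [(4, 6), (1, 10), (0, 5)]
    raw = [hypergeom_upper_tail(k, size, 200, 10_000) for k, size in complexes]
    for q in benjamini_hochberg(raw):
        print(q, q <= 0.01)
```

In the analysis described above this test would be repeated for every community and every disease state, so the list passed to the correction step contains one P value per complex-disease pair.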
All significant disease-cluster associations are provided in Supplementary Table 8. The eigenvector centralities assigned to disease states within the composite disease-community network were used to compare across a range of disease states. Disease classifications were taken from the DisGeNET database as reported in their SQLite download. All disease states in the network were ranked according to increasing eigenvector centrality. For each disease classification (for example, ‘neoplasms’), a Kolmogorov–Smirnov test was used to compare the distributions of matching and non-matching disease states within the entire ranked list. After multiple testing correction, disease states that appeared differentially distributed with respect to eigenvector centrality at a 1% FDR were identified and are highlighted in Fig. 4b. HEK293T cells were transfected with Flag–HA–GFP control plasmid, C13orf18–GFP, GFP–BECN1, or RUFY1–Flag–HA plasmids, and, after 48 h, cells were collected in lysis buffer (50 mM Tris pH 7.5, 150 mM NaCl, 1% NP-40), with protease and phosphatase inhibitors (Roche) on ice. Lysates were cleared by centrifugation, and subjected to affinity purification using anti-GFP antibodies (Chromotek, GFP–Trap, GTMA-20) or anti-Flag magnetic beads (Sigma-Aldrich, A2220) for 2 h at 4 °C. Beads were washed four times with lysis buffer, and subsequently subjected to SDS–PAGE and immunoblotting with the following antibodies: BECN1 (Cell Signaling, clone D40C5), GFP (Roche, mouse IgG clones 7.1 and 13.1), C13orf18 (Proteintech, 21183-1-AP), and HA (Biolegend, clone HA.11). For validation of Hippo pathway interactions within BioPlex 2.0, we performed AP–MS experiments in MCF10A cells. 
Unlike HEK293T cells, MCF10A cells undergo contact inhibition and activate the Hippo signalling pathway; therefore we used cells under both sub-confluent and confluent conditions wherein YAP1 expulsion from the nucleus was verified by immunofluorescence (see section on ‘Clone construction and cell culture’). Affinity purification was performed essentially as described previously34, but eluted anti-HA immune complexes (Sigma-Aldrich, clone HA-7) were analysed in two ways. First, immune complexes for PDLIM7, MAGI1, YAP1, WWC1, NF2, and MPP5 (replicate 1) were subjected to LC–MS/MS analysis on an LTQ-Velos instrument and HCIPs identified using CompPASS34 in combination with a false positive background dataset derived in MCF10A cells45. The second replicate set for PDLIM7, MAGI1, YAP1, WWC1, NF2, and MPP5, as well as both replicates for PTPN14 and INADL, were processed identically to the first set except that the HA-eluted proteins were reduced and alkylated with DTT and iodoacetamide before trypsin digestion, and all the digested peptides corresponding to one sub-confluent and one confluent anti-HA immunoprecipitation were labelled heavy and light respectively, by reductive dimethylation46. Sub-confluent and confluent sample pairs corresponding to each bait were mixed to normalize the amount of bait present in each heavy and light fraction to 1:1 and analysed on an Orbitrap Elite Hybrid Ion Trap-Orbitrap Mass Spectrometer (ThermoFisher). Complexes from each growth condition were deconvolved using linear discriminant analysis parameters that filtered for either heavy-only or light-only labelled peptides. The heavy- or light-specific search results were subsequently imported into CompPASS for protein interaction analysis. Spectral count and CompPASS score data for the MCF10A dataset is provided in Supplementary Table 10. Anti-PTPN14 antibodies were from Sigma-Aldrich (GW21498A). 
We used CRISPR–Cas9 gene editing to knockout KIAA0196 using the gRNA sequence (GTCTAAGCCATTTAGACCAA) as described47. The KIAA0196 ORF (a gift from C. Clemen, University of Cologne) was cloned into pLenti-NTAP-IRES-Puro and expressed in KIAA0196−/− cells after selection using puromycin (1 μg ml−1). Immunoprecipitation with anti-Flag (Sigma-Aldrich, M2) antibodies, trypsinization, tandem mass tagging labelling, analysis by mass spectrometry, and quantification were performed as described previously4. Parallel immune complexes or whole-cell lysates were subjected to immunoblotting with anti-WASH1 (Sigma-Aldrich, SAB4200373), anti-KIAA0196 (Santa Cruz Biotechnology, sc-87442), anti-KIAA1033 (Bethyl Labs, A304-919A), anti-CCDC53 (Proteintech, 24445-1-AP), anti-PCNA (Santa Cruz Biotechnology, sc-56), or anti-actin (Santa Cruz Biotechnology, sc-69879) and immunoblot signals quantified using Protein Simple M in biological triplicate. HeLa cells (American Type Culture Collection) were plated on glass coverslips (Zeiss) and transiently transduced with lentiviral vectors expressing C-Flag–HA-tagged baits. At 48 h after infection, cells were fixed with 4% paraformaldehyde for 15 min at room temperature. Cells were washed in PBS, then blocked for 1 h with 5% normal goat serum (Cell Signaling Technology) in PBS containing 0.3% Triton X-100 (Sigma-Aldrich). Coverslips were incubated with anti-HA antibodies (mouse monoclonal, clone HA.11, BioLegend) or anti-HA plus anti-TOMM20 (rabbit polyclonal mitochondrial marker, Santa Cruz Biotechnology, clone FL-145, catalogue number 11415) for 2 h at room temperature in a humidified chamber. Cells were washed three times with PBS, then incubated for 1 h with appropriate Alexa Fluor-conjugated secondary antibodies (ThermoFisher). Nuclei were stained with Hoechst, and cells were washed three times with PBS and mounted on slides using Prolong Gold mounting media (ThermoFisher). 
All images were collected with a Yokogawa CSU-X1 spinning disk confocal scanner with Spectral Applied Research Aurora Borealis modification on a Nikon Ti-E inverted microscope using a 100 × Plan Apo numerical aperture 1.4 objective lens (Nikon Imaging Center, Harvard Medical School). Confocal images were acquired with a Hamamatsu ORCA-AG cooled CCD (charge-coupled device) camera controlled with MetaMorph 7 software (Molecular Devices). Fluorophores were excited using a Spectral Applied Research LMM-5 laser merge module with acousto-optic tuneable filter (AOTF)-controlled solid-state lasers (488 nm and 561 nm). A Lumencor SOLA fluorescence light source was used for imaging Hoechst staining. z series optical sections were collected with a step size of 0.2 μm, using the internal Nikon Ti-E focus motor, and stacked using MetaMorph to construct maximum intensity projections. We performed three major validation experiments using (1) analysis of a dozen bait proteins in both HCT116 colon cells and HEK293T cells to examine overlap in interaction partners, (2) reciprocal AP–MS experiments directed at interacting proteins for a set of 14-3-3 proteins, and (3) analysis of the PDLIM7–PTPN14–YAP1 adhesion network in MCF10A cells. As a validation approach, we selected 12 largely unstudied proteins displaying a range of interaction partners from 1 to 25 in HEK293T cells and performed AP–MS in HCT116 cells, a cell line of distinct tissue origin from HEK293T cells. After identification of HCIPs for proteins in HCT116 cells, we determined the interactions in common with HEK293T cells (Extended Data Fig. 1b–m). Over the 12 bait proteins identified, we observed 30–100% validation of interactions seen for individual baits in HEK293T cells. Cumulatively, this reflected an overall 60% validation (92 of 147 interactions seen in HEK293T cells were seen in HCT116). 
This rate of validation is comparable to that seen in focused studies examining F-box protein interactors in these two cell lines (51%)48. Thus, a substantial fraction of interactions seen in HEK293T cells are recapitulated in HCT116 cells. The 14-3-3 proteins represent a well-studied group of seven proteins (YWHAB, YWHAE, YWHAZ, YWHAH, YWHAQ, YWHAG, and SFN) that typically associate with phosphorylated proteins. Thirty-nine baits in BioPlex 2.0 were found to interact with one or more of these 14-3-3 proteins, with YWHAZ being detected most frequently (35 baits) and SFN being detected the least frequently (4 baits) (Extended Data Fig. 2). Seventeen of these proteins are not known to interact with 14-3-3 proteins on the basis of BioGrid. Because only the atypical 14-3-3 protein SFN had been targeted as a bait in BioPlex 2.0, the remaining six 14-3-3 proteins were submitted to our standard AP–MS pipeline using ORFeome 8.1 clones; while the clone for YWHAE failed at the sequence validation stage, the remaining five 14-3-3 proteins were processed successfully, identifying 130–360 HCIPs (Supplementary Table 2). While eight of 39 BioPlex 2.0 baits that had been observed to interact with one or more 14-3-3 proteins were not detected in HEK293T cells and thus may be impossible to detect in reciprocal immunoprecipitations, 63% of interactions eligible for reciprocal detection were confirmed (Extended Data Fig. 2a–c). This demonstrates that BioPlex 2.0 may reliably reveal novel reciprocally interacting partners even for proteins as well studied as 14-3-3 proteins. PTPN14 is a protein phosphatase that has recently been found to associate with several proteins within the Hippo pathway involving the transcription factor YAP1. The Hippo pathway is regulated by contact inhibition, and promotes YAP1 sequestration in the cytoplasm49. BioPlex 2.0 contains a highly connected group of proteins centred on PTPN14, MAGI1, MPP5, LIN7A/C, and INADL (Extended Data Fig. 2d). 
This network contained several interactions not seen in BioGrid. To validate these interactions, we performed an AP–MS analysis or immunoprecipitation–western analysis of PTPN14, MAGI1, MPP5, PDLIM7, INADL, WWC1, NF2, and YAP1 after stable expression in MCF10A cells in both sub-confluent and confluent states. This series of experiments strongly validated interactions seen in HEK293T cells (Extended Data Fig. 2d, f) with 65% of eligible interactions being seen in both cell lines, further validating our method and the ability of BioPlex 2.0 to robustly identify interactions. Furthermore, 63% of interactions identified in both BioPlex 2.0 and MCF10A cells were novel, having not been previously described in several previous interaction profiling experiments (Extended Data Fig. 2g). Overall, these three lines of study indicate the ability of BioPlex 2.0 to identify interactions that can be validated reciprocally or in other cell lines. The BioPlex 2.0 network and its underlying data are available in several formats. First, all interactions in the BioPlex network have been deposited in the BioGRID protein interaction database. Second, we have created a website devoted to the project (http://bioplex.hms.harvard.edu) which provides tools to download (1) the interactions that make up BioPlex 1.0 and 2.0, (2) a customized viewer that enables browsing of either network to examine the interactions of specific proteins, (3) an interface for download of nearly 12,000 individual RAW files containing mass spectrometry data from individual AP–MS experiments, and (4) an R package and web-based tool for performing CompPASS analyses. Third, the BioPlex 2.0 network as bait–prey pairs has been incorporated into NDEx40, a web-based platform for biological Network Data Exchange. Fourth, our RAW files have been submitted for inclusion in ProteomicsDB50. Finally, all RAW files (3 Tb) from this study will be provided to investigators upon request using investigator-provided hard drives. 
Finally, a table in .tsv format containing all proteins and spectral count information for all 5,891 AP–MS experiments reported here is available for download at the BioPlex website. All other data are available from the corresponding authors upon reasonable request.

News Article | April 27, 2016
Site: www.nature.com

No statistical methods were used to predetermine sample size. The experiments were not randomized and the investigators were not blinded to allocation during experiments and outcome assessment. From 2009–2013, environmental data (Supplementary Table 9) were collected across all major oligotrophic oceanic provinces in the context of the Tara Oceans expeditions20. Sampling stations were selected to represent distinct marine ecosystems at a global scale51. Note that Southern Ocean stations were not examined herein because they were ranked as outliers due to their exceptional environmental characteristics and biota23, 24. Environmental data were obtained from vertical profiles of a sampling package48, 49. It consisted of conductivity and temperature sensors, chlorophyll and CDOM fluorometers, light transmissometer (Wetlabs C-star 25 cm), a backscatter sensor (WetLabs ECO BB), a nitrate sensor (SATLANTIC ISUS) and an underwater vision profiler (Hydroptics UVP52). Nitrate and fluorescence to chlorophyll concentrations as well as salinity were calibrated with water samples collected with Niskin bottle48. Net primary production (NPP) data were extracted from 8-day composites of the vertically generalized production model (VGPM)53 at the week of sampling50. Carbon fluxes and carbon export, corresponding to the carbon flux at 150 m, were estimated based on particle concentration and size distributions obtained from the UVP49 and details are presented below. Previous research has shown that the distribution of particle size follows a power law over the micrometre to the millimetre size range3, 54, 55. This Junge-type distribution translates into the following mathematical equation, whose parameters can be retrieved from UVP images: $n(d) = a\,d^{k}$ (1), where d is the particle diameter, and exponent k is defined as the slope of the number spectrum when equation (1) is log transformed. This slope is commonly used as a descriptor of the shape of the aggregate size distribution. 
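As an illustration of how such a slope can be retrieved, the sketch below fits a straight line to the log-transformed number spectrum by ordinary least squares. It assumes the power-law form n(d) = a·d^k implied by the description above; the function name, the size bins, and the synthetic counts are hypothetical, and the fitting procedure actually applied to the UVP data may differ.

```python
from math import exp, log
from typing import Sequence, Tuple


def fit_power_law(diameters: Sequence[float],
                  counts: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of log n = log a + k * log d.

    Returns (a, k), where k is the slope of the log-transformed spectrum.
    All diameters and counts must be strictly positive.
    """
    xs = [log(d) for d in diameters]
    ys = [log(n) for n in counts]
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx
    a = exp(mean_y - k * mean_x)
    return a, k


if __name__ == "__main__":
    # Synthetic spectrum with a = 2 and k = -3 over hypothetical size bins (mm).
    bins = [0.25, 0.5, 1.0, 1.5]
    spectrum = [2.0 * d ** -3 for d in bins]
    print(fit_power_law(bins, spectrum))
```

A steeper (more negative) slope indicates a spectrum dominated by small particles, which is the sense in which k describes the shape of the size distribution.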
The carbon-based particle size approach relies on the assumption that the total carbon flux of particles (F) corresponds to the flux spectrum integrated over all particle sizes: $F = \int_{0}^{\infty} n(d)\,m(d)\,w(d)\,\mathrm{d}d$ (2), where n(d) is the particle size spectrum, that is, equation (1), and m(d) is the mass (here carbon content) of a spherical particle described as: $m(d) = \alpha \rho d^{3}$, where $\alpha = \pi/6$, $\rho$ is the average density of the particle, and w(d) is the settling rate calculated using Stokes Law: $w(d) = \beta (\rho - \rho_{0})\, g\, \nu^{-1} d^{2}$, where $\beta = 1/(18\rho_{0})$, g is the gravitational acceleration, $\rho_{0}$ the fluid density, and $\nu$ the kinematic viscosity. In addition, mass and settling rates of particles, m(d) and w(d), respectively, are often described as power law functions of their diameter obtained by fitting observed data, $m(d)\,w(d) = A d^{B}$. The particle carbon flux can then be estimated using an approximation of equation (2) over a finite number (x) of small logarithmic intervals for diameter d spanning from 250 μm to 1.5 mm (particles <250 μm and >1.5 mm are not considered, consistent with the method presented in ref. 56) such as $F = \sum_{i=1}^{x} n_{i} A d_{i}^{B} \Delta d_{i}$, where A = 12.5 ± 3.40 and B = 3.81 ± 0.70 have been estimated using a global data set that compared particle fluxes in sediment traps and particle size distributions from the UVP images. For the sake of consistency between all available data sets from the Tara Oceans expeditions, we considered subsets of the data recently published in Science23, 24, 25. In brief, one sample corresponds to data collected at one depth (surface (SRF) or deep chlorophyll maximum (DCM) determined from the profile of chlorophyll fluorometer) and at one station. To study the eukaryotic community in our current manuscript, we selected stations at which we had environmental data and carbon export estimated at 150 m with the UVP and all size fractions. Consequently a subset of 33 stations (corresponding to 56 samples) has been created compared to the 47 stations analysed in ref. 24. A similar procedure has been applied to the prokaryotic and viral data sets, reducing the prokaryotic data set from ref. 
23 to a subset of 104 samples from 62 stations and the viral data set from ref. 25 into a subset of 37 samples from 22 stations (See Supplementary Table 10). In addition a detailed table is provided summarizing which samples (depth and station) are available for each domain (Supplementary Table 11). Photic-zone eukaryotic plankton diversity has been investigated through millions of environmental Illumina reads. Sequences of the 18S ribosomal RNA gene V9 region were obtained by PCR amplification and a stringent quality-check pipeline has been applied to remove potential chimaera or rare sequences (details on data cleaning in ref. 24). For 47 stations, and if possible at two depths (SRF and DCM), eukaryotic communities were sampled in the piconano- (0.8–5 μm), micro- (20–180 μm) and mesoplankton (180–2,000 μm) fractions (a detailed list of these samples is given in Supplementary Table 12). In the framework of the carbon export study, sequences from all size fractions were pooled in order to get the most accurate and statistically reliable data set of the eukaryotic community. The 2.3 million eukaryotic ribotypes were assigned to known eukaryotic taxonomic entities by global alignment to a curated database24. To get the most accurate vision of the eukaryotic community, sequences showing less than 97% identity with reference sequences were excluded. The final eukaryotic relative abundance matrix used in our analyses included 1,750 lineages (taxonomic assignation has been performed using a last common ancestor methodology, and had thus been performed down to species level when possible) in 56 samples from 33 stations. Pooled abundance (number of V9 sequences) of each lineage has been normalized by the total sum of sequences in each sample. To investigate the prokaryotic lineages, communities were sampled in the picoplankton. 
Two filter sizes were used along the Tara Oceans transect: up to station #52, prokaryotic fractions correspond to a 0.22–1.6 μm size fraction, and from station #56, prokaryotic fractions correspond to a 0.22–3 μm size fraction. Prokaryotic taxonomic profiling was performed using 16S rRNA gene tags directly identified in Illumina-sequenced metagenomes (miTAGs) as described in ref. 57. 16S tags were mapped to cluster centroids of taxonomically annotated 16S reference sequences from the SILVA database58 (release 115: SSU Ref NR 99) that had been clustered at 97% sequence identity using USEARCH v. 6.0.30759. 16S tag counts were normalized by the total read count in each sample (further details in ref. 23). The photic-zone prokaryotic relative abundance matrix used in our analyses included 3,253,962 tags corresponding to 1,328 genera in 104 samples from 62 stations. For each prokaryotic sample, gene relative abundance profiles were generated by mapping reads to the OM-RGC using the MOCAT pipeline60. The relative abundance of each reference gene was calculated as gene-length-normalized base counts. Functional abundances were calculated as the sum of the relative abundances of the reference genes annotated to each OG functional group. In our analyses, we used the subset of the OM-RGC that was annotated to Bacteria or Archaea (24.4 million genes). Using a rarefied (to 33 million inserts) gene count table, an OG was considered to be part of the ocean microbial core if at least one insert from each sample was mapped to a gene annotated to that OG. For further details on the prokaryotic profiling please refer to ref. 23. The final prokaryotic functional relative abundance matrix used in our analyses included 37,832 OGs or functions in 104 samples from 62 stations. Genes from functions of FNET1 and FNET2 subnetworks were taxonomically annotated using a modified dual BLAST-based last common ancestor (2bLCA) approach61. 
We used RAPsearch262 rather than BLAST to efficiently process the large data volume and a database of non-redundant protein sequences from UniProt (version: UniRef_2013_07) and eukaryotic transcriptome data not represented in UniRef (see Supplementary Tables 5 and 6, for full annotations). For prokaryote enumeration by flow cytometry, three aliquots of 1 ml of seawater (pre-filtered by 200-μm mesh) were collected from both SRF and DCM. The samples were fixed immediately using cold 25% glutaraldehyde (final concentration 0.125%), left in the dark for 10 min at room temperature, flash-frozen and kept in liquid nitrogen on board and then stored at −80 °C on land. Two subsamples were taken to separate counts of heterotrophic prokaryotes (not shown herein) and phototrophic picoplankton. For heterotrophic prokaryote determination, 400 μl of sample was added to a diluted SYTO-13 (Molecular Probes Inc.) stock (10:1) at 2.5 μ mol l−1 final concentration, left for about 10 min in the dark to complete the staining and run in the flow cytometer. We used a FacsCalibur (Becton & Dickinson) flow cytometer equipped with a 15 mW argon-ion laser (488 nm emission). At least 30,000 events were acquired for each subsample (usually 100,000 events). Fluorescent beads (1 μm, Fluoresbrite carboxylate microspheres, Polysciences Inc.) were added at a known density as internal standards. The bead standard concentration was determined by epifluorescence microscopy. For phototrophic picoplankton, we used the same procedure as for heterotrophic prokaryote, but without addition of SYTO-13. Data analysis was performed with FlowJo software (Tree Star, Inc.). In order to associate viruses to carbon export we used viral populations as defined in ref. 25 using a set of 43 Tara Oceans viromes. In brief, viral populations were defined as large contigs (>10 predicted genes and >10 kb) identified as most likely originating from bacterial or archaeal viruses. 
The 6,322 contigs that remained were then clustered into populations if they shared more than 80% of their genes at >95% nucleotide identity. This resulted in 5,477 ‘populations’ from the 6,322 contigs, where as many as 12 contigs were included per population. For each population, the longest contig was chosen as the ‘seed’ representative sequence. Population abundances were estimated by mapping all quality-controlled reads to the set of 5,477 non-redundant populations (considering only mapping quality scores greater than 1) with Bowtie2 (ref. 63). The relative abundance of a population in a sample was computed as the number of base pairs recruited to the contig normalized to the total number of base pairs available in the virome and the contig length if more than 75% of the reference sequence was covered by virome reads, and set to 0 otherwise (see ref. 25 for further details). The final viral population abundance matrix used in our analyses included 5,291 viral population contigs in 37 samples from 22 stations. The longest contig in a population was defined as the seed sequence and considered the best estimate of that population’s origin. These seed sequences were used to assess taxonomic affiliation of each viral population. Cases where >50% of the genes were affiliated to a specific reference genome from RefSeq Virus (based on a BLASTP comparison with thresholds of 50 for bit score and 1 × 10−5 for e-value) with an identity percentage of at least 75% (at the protein sequence level) were considered as confident affiliations to the corresponding reference virus. The viral population host group was then estimated based on these confident affiliations (see Supplementary Table 13 for host affiliation of viral population contigs associated to carbon export). Viral protein clusters (PCs) correspond to ORFs initially mapped to existing clusters (POV, GOS and phage genomes). 
The remaining, unmapped ORFs were self-clustered, using cd-hit as described in ref. 25. Only PCs with more than two ORFs were considered bona fide and were used for subsequent analyses. To compute PC relative abundance for statistical analyses, reads were mapped back to predicted ORFs in the contigs data set using Mosaik as described in ref. 25. Read counts to PCs were normalized by sequencing depth of each virome. Importantly, we restricted our analyses to 4,294 PCs associated to the 277 viral population contigs significantly associated to carbon export in 37 samples from 22 stations. In order to directly associate eukaryotic lineages to carbon export and other environmental traits (Fig. 1b), we used sparse partial least square (sPLS)64 as implemented in the R package mixOmics29. We applied the sPLS in regression mode, which will model a causal relationship between the lineages and the environmental traits, that is, PLS will predict environmental traits (for example, carbon export) from lineage abundances. This approach enabled us to identify high correlations (see Supplementary Table 1) between certain lineages and carbon export but without taking into account the global structure of the planktonic community. Weighted correlation network analysis (WGCNA) was performed to delineate feature (lineages, viral populations, PCs or functions) subnetworks based on their relative abundance65, 66. A signed adjacency measure for each pair of features was calculated by raising the absolute value of their Pearson correlation coefficient to the power of a parameter p. The default value p = 6 was used for each global network, except for the Prokaryotic functional network where p had to be lowered to 4 in order to optimize the scale-free topology network fit. Indeed, this power allows the weighted correlation network to show a scale-free topology where key nodes are highly connected with others. 
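The adjacency step can be illustrated with a small sketch. It follows the wording above (absolute Pearson correlation raised to a power p) and adds the topological overlap that WGCNA derives from such an adjacency matrix, using the standard formula TOM_ij = (Σ_u a_iu·a_uj + a_ij) / (min(k_i, k_j) + 1 − a_ij). This is a plain-Python illustration under those assumptions, not the R package's code; the function names and the toy abundance profiles are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def adjacency_matrix(profiles, power=6):
    """Adjacency a_ij = |cor(x_i, x_j)| ** power, with a zero diagonal."""
    m = len(profiles)
    adj = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            value = abs(pearson(profiles[i], profiles[j])) ** power
            adj[i][j] = adj[j][i] = value
    return adj


def topological_overlap(adj):
    """Topological overlap: direct link plus shared neighbours, normalised."""
    m = len(adj)
    connectivity = [sum(row) for row in adj]  # diagonal is zero
    tom = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(m) if u != i and u != j)
            denom = min(connectivity[i], connectivity[j]) + 1.0 - adj[i][j]
            tom[i][j] = tom[j][i] = (shared + adj[i][j]) / denom
    return tom


# Toy abundance profiles (rows = features, columns = samples); values are invented.
profiles = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],   # perfectly correlated with the first feature
    [5.0, 1.0, 4.0, 2.0, 3.0],    # weakly anti-correlated with the first feature
]
adj = adjacency_matrix(profiles, power=6)
tom = topological_overlap(adj)
```

Raising the correlation to a power shrinks weak correlations towards zero much faster than strong ones, which is why a weak correlation of 0.3 contributes almost nothing to the network at p = 6.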
The obtained adjacency matrix was then used to calculate the topological overlap measure (TOM), which, for each pair of features, takes into account their weighted pairwise correlation (direct relationships) and their weighted correlations with other features in the network (indirect relationships). To identify subnetworks, a hierarchical clustering was performed using a distance based on the TOM measure. This resulted in the definition of several subnetworks, each represented by its first principal component. These characteristic components play a key role in weighted correlation network analysis. On the one hand, the closeness of each feature to its cluster, referred to as the subnetwork membership, is measured by correlating its relative abundance with the first principal component of the subnetwork. On the other hand, association between the subnetworks and a given trait is measured by the pairwise Pearson correlation coefficients between the considered environmental trait and their respective principal components. The same protocol was applied to the eukaryotic relative abundance matrix, the prokaryotic relative abundance matrix, the prokaryotic functions relative abundance matrix and the viral population and PC relative abundance matrices. All procedures were applied on Hellinger-transformed log-scaled abundances. Notably, the protocol is not sensitive to copy number variation as observed across different eukaryotic species, because the association between two species relies on a correlation score between relative abundance measurements. Computations were carried out using the R package WGCNA33. Given the nature of the eukaryotic data set (three distinct size fractions), the sampling process may lead to the loss of size fractions. In particular, samples 1, 3, 17, 37, 39, 43, 48, 53, 54, 55 and 66 are potentially biased by such a loss (Supplementary Table 12). 
A complementary WGCNA analysis was performed with addition of these samples to evaluate the robustness of our protocol to missing size fractions. The composition of the eukaryotic subnetwork built with an extended data set (that is, 67 samples from 37 stations for which size fractions were missing in 11 samples) was compared to the subnetwork as presented above (that is, 56 samples from 33 stations). The two subnetworks share 75% of their lineages, and four of the top five VIP lineages with the extended data set (see Extended Data Fig. 5 for details) can be found in the top six VIP lineages of the above subnetwork (Supplementary Table 2), emphasizing highly similar results and a low sensitivity to size fraction loss. For each subnetwork (called modules within WGCNA) extracted from each global network, pairwise Pearson correlation coefficients between the subnetwork principal components and the carbon export estimation were computed, as well as corresponding P values corrected for multiple testing using the Benjamini and Hochberg FDR procedure. The subnetworks showing the highest correlation scores are of interest and were investigated. One subnetwork (49 nodes) was significant within the eukaryotic network; one subnetwork (109 nodes) was significant for the prokaryotic network; one subnetwork (277 nodes) was significant within the virus network; two subnetworks (441 and 220 nodes) were significant within the prokaryotic functional network, and two subnetworks (1,879 and 2,147 nodes) were significant within the viral PCs network. In addition to the network analyses, we asked whether the identified subnetworks can be used as predictors for the carbon export estimations. To answer this question, we used partial least squares (PLS) regression, which is a dimensionality-reduction method that aims at determining predictor combinations with maximum covariance with the response variable. 
The identified combinations, called latent variables, are used to predict the response variable. The predictive power of the model is assessed by correlating the predicted vector with the measured values. The significance of the prediction power was evaluated by permuting the data 10,000 times. For each permutation, a PLS model was built to predict the randomized response variable and a Pearson correlation was calculated between the permuted response variable and the values predicted in leave-one-out cross-validation (LOOCV). The 10,000 random correlations were compared to the performance of the PLS model that was used to predict the true response variable. In addition, the predictors were ranked according to their variable importance in projection (VIP)67. The VIP measure of a predictor estimates its contribution in the PLS regression. The predictors having high VIP values are assumed important for the PLS prediction of the response variable. The VIP values of the prokaryotic functional subnetworks are provided in Supplementary Tables 5, 6. For the sake of illustration, only lineages or functions with VIP >1 (ref. 67) are discussed and pictured in Figs 2 and 4. Our computations were carried out using the R package pls68. All programs are available under GPL Licence. Nodes of the subnetworks represent either lineages (eukaryotic, prokaryotic or viral) or functions (prokaryotic or viral). Subnetworks related to the carbon export have been represented in two distinct formats. Scatter plots represent each node based on its Pearson correlation to the carbon export and its node centrality within the subnetwork. The latter has been recomputed using significant Spearman correlations above 0.3 (>0.9 for viral PCs) as edges; this is done for visualization purposes since WGCNA subnetworks (based on the topological overlap measure (TOM) between nodes) are hyper-connected. Node size is proportional to the VIP score after PLS. 
The hive plots depict the same subnetworks by focusing on two main features: x axis and y axis depict nodes of subnetworks ranked by their VIP scores and Pearson correlation to the carbon export, respectively.
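The permutation test of PLS predictive power described in the methods above can be sketched as follows. This is a minimal plain-Python illustration, not the authors' R code: it assumes a single latent variable (PLS1), uses 200 permutations instead of 10,000 to keep the run short, and works on an invented 16-sample data set. All function names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def pls1_predict(x_train, y_train, x_new):
    """Fit a one-latent-variable PLS regression and predict y for x_new."""
    n, p = len(x_train), len(x_train[0])
    x_mean = [sum(row[j] for row in x_train) / n for j in range(p)]
    y_mean = sum(y_train) / n
    xc = [[row[j] - x_mean[j] for j in range(p)] for row in x_train]
    yc = [v - y_mean for v in y_train]
    # Weight vector: predictor combination with maximum covariance with y.
    w = [sum(xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    if norm == 0:
        return y_mean
    w = [v / norm for v in w]
    # Latent variable scores and regression of y on them.
    t = [sum(xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    t_new = sum((x_new[j] - x_mean[j]) * w[j] for j in range(p))
    return y_mean + q * t_new


def loocv_correlation(x, y):
    """Correlation between measured y and leave-one-out predictions."""
    predictions = []
    for i in range(len(y)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        predictions.append(pls1_predict(x_train, y_train, x[i]))
    return pearson(predictions, y)


def permutation_test(x, y, n_perm=200, seed=0):
    """Compare the observed LOOCV correlation with permuted-response runs."""
    rng = random.Random(seed)
    observed = loocv_correlation(x, y)
    hits = 0
    for _ in range(n_perm):
        y_perm = y[:]
        rng.shuffle(y_perm)
        if loocv_correlation(x, y_perm) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Invented data: 16 samples, 2 predictors, response linear in the predictors.
x_data = [[float(i % 4), float(i // 4)] for i in range(16)]
y_data = [2.0 * a - b for a, b in x_data]

observed_r, p_value = permutation_test(x_data, y_data)
print(f"LOOCV r = {observed_r:.3f}, permutation P = {p_value:.4f}")
```

Adding one to both the hit count and the number of permutations keeps the estimated P value away from exactly zero, a common convention that the source does not specify.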

News Article | December 13, 2016
Site: www.biosciencetechnology.com

Understanding rare diseases. An estimated 350 million people worldwide suffer from one of 7000 rare diseases. Currently, there is no approved drug treatment for 95 percent of these diseases—but bioinformatics tools are changing that picture. Drug repurposing is a relatively fast and safe way to identify drugs that have already met pharmacovigilance and regulatory requirements, and are therefore poised for approval for a new indication for a rare disease. Bioinformatics tools give R&D professionals the ability to rapidly mine data from the literature, regulatory documents, clinical trials and other patient-centric information to more quickly target an existing drug to a specific rare disease population, and also to identify other promising repurposing candidates. For example, the UK nonprofit Findacure aims to save the National Health Service money by identifying potentially repurposable off-patent drugs to treat rare diseases. Elsevier is donating bioinformatics tools and expertise to help in the nonprofit’s first proof-of-concept effort—the development of an evidence base to support the launch of a clinical trial that will test the efficacy of the immunosuppressant sirolimus for children with congenital hyperinsulinism. On a larger scale, dozens of institutions are partnering with RD-Connect, a global effort to share, standardize and analyze data from biobanks and registries to facilitate the discovery of diagnostic tools and therapies for rare diseases. The initiative also intends to develop bioinformatics tools to facilitate data sharing and analysis, as well as cohort selection for clinical trials. Accelerating immunotherapy development. Developing targeted, personalized immunotherapies demands a cellular-level understanding of specific immune cells. However, identifying all the relevant data in the scientific literature is difficult because different laboratories may use different nomenclature when referring to the same proteins and processes. 
For example, Carnegie Mellon researchers recently undertook a DARPA project that involved creating a summary of “core knowledge” about KRAS signaling pathways to help with the development of new disease models for cancer. They discovered that the gene of interest alone might be referred to as KRAS, KRAS2 or RASK2, according to UniProt, and its protein may be referred to as GTPase Kras; K-Ras 2; Ki-Ras; c-K-ras; or c-Ki-ras. Moreover, KRAS interacts with about 150 other proteins in the human genome, all of which are likely to have multiple synonyms. Natural language processing software, another bioinformatics tool, helped to standardize the nomenclature, facilitating the understanding of both terminology and, when available, experimental methods. The software and a consultant with domain expertise together enabled “deep reading” of the literature—i.e., the machines could “read” articles more like scientists do, making judgments about statements and findings, and extracting only information that validated or contributed to existing knowledge about KRAS. Improving microbiome analyses. The emerging field of microbiome analysis holds promise for personalized medicine, but for now raises more questions than it answers. A recent blog on the website of the American Microbiome Institute underscores the wide variation that currently exists in microbiome sequencing and analysis, and a recent paper in Science explores the difficulty of establishing just what constitutes a “normal” gut microbiome. These large knowledge gaps didn’t stop at least two companies from offering gut analyses to consumers—and coming up with completely conflicting results. Simply put, for now, personal microbiome analyses have huge margins of error. Filtering out the “noise” with bioinformatics tools is essential for making such analyses usable in R&D and, ultimately, beneficial for patients. Efforts are underway to enable microbiome to fulfill its promise. 
The Microbiome Quality Control Project is a multicenter initiative to evaluate and standardize methods for assessing the human microbiome and its role in health and disease. Elsevier recently launched the open access Human Microbiome Journal, which will help chart progress in this area by making high quality research freely available. A relatively new field, translational bioinformatics, is poised to become an important discipline for precision medicine. Defined by the American Medical Informatics Association as “the development of storage, analytic, and interpretive methods to optimize the transformation of increasingly voluminous biomedical data, and genomic data, into proactive, predictive, preventive, and participatory health,” translational bioinformatics is at the forefront of data-driven health care. To realize the full power of translational bioinformatics and generate actionable results for populations and individuals, more professionals who know how to use those tools, and accurately interpret the findings, are needed. As Joel Dudley of the Icahn School of Medicine at Mount Sinai, New York City, stated in a recent publication, “An important bottleneck in the application of bioinformatics methods in translational research is the lack of investigators who are versed in both biomedical domains and informatics. Efforts to nurture both sets of competencies within individuals and to increase interfield visibility will help to accelerate the adoption and increased application of bioinformatics in translational research.” Jaqui holds a Ph.D. in Molecular Biology from the University of Oxford. Jaqui has worked at Elsevier for most of the last decade; prior to this, she held senior roles at Excerpta Medica and Solvay; Jaqui has also worked in R&D at Glaxo Wellcome. Jaqui’s speciality is the field of genomics and disease biology.

News Article | October 26, 2016
Site: www.nature.com

No statistical methods were used to predetermine sample size. The experiments were not randomized and the investigators were not blinded to allocation during experiments and outcome assessment. The genes encoding TRIC-B1 and -B2 from Caenorhabditis elegans (UniProt codes: Q9NA75, Q9NA73; GI: 290457497, 71998474) were synthesized (Genscript) with optimized codon usage for protein expression in Pichia pastoris. The target cDNA was inserted between the EcoR1/Xho1 sites of the pPICZ-A (or pPICZ-C for TRIC-B2) vector (Invitrogen), yielding a construct with a C-terminal fusion polypeptide containing the c-Myc epitope and a polyhistidine (6 × His) tag. To improve the crystallizability of the target proteins, 48 and 61 amino acid residues at the flexible C-terminal regions of TRIC-B1 and -B2 as well as those of the Myc epitope sequence from the vector were truncated, yielding the expression products (residues 1–247 of TRIC-B1 and 1–252 of TRIC-B2) covering the transmembrane domain of the full-length protein. All point mutations were introduced through the Quikchange site-directed mutagenesis. The expression vectors were linearized with Pme I and transformed into P. pastoris GS115 strain by electroporation using the Micropulser Electroporator (BioRad). The transformants were selected by plating on the YPD agar plates with 0.1 and 1.0 mg/ml zeocin. For the large-scale protein expression, the colony with the highest protein expression level was used to inoculate 250 ml Minimal Glycerol Media containing Histidine (MGYH). When the OD reached 2.0, the culture was used to inoculate 1 l MGYH media at 1:40 (v:v) ratio in a 5 l baffled flask. The cells were grown in the shaking incubator for 24 h at 29 °C and 280 r.p.m. with the feeding of glycerol as the carbon source. When the OD reached 4.0, the cells were spun down and resuspended in the Minimal Methanol + Histidine (MMH) media. After the media exchange, the cells were grown at 27 °C (or 24 °C) and 260 r.p.m. 
The induction of protein expression was initiated by adding methanol to the culture at 0.5% (v:v) final concentration. Protein expression continued for ~48–60 h and during the period, methanol was added every 24 h. The cells were harvested by centrifugation at 8,983g in JLA-8.1000 rotor (Beckman) and were stored at −80 °C after being frozen instantly in liquid nitrogen. For protein purification, the frozen cell pellets were suspended in a lysis buffer (50 mM TRIS-HCl pH 8.0, 150 mM KCl) at 1:10 (m:v) ratio and then homogenized by using the T10 basic homogenizer (IKA). The cells were lysed by passing through a high-pressure homogenizer (ATS Engineering, Shanghai) at 1,300 bar 4 times. To solubilize the membrane proteins, Triton X-100 was added to the cell lysate at 1.5% (v/v) final concentration. The mixture was stirred at room temperature (RT) for 2 h to extract the target proteins from the membrane. The insoluble cell debris was removed by centrifugation at 37,044g in JA-25.50 rotor (Beckman) for 1 h. The supernatant was collected and mixed with the cobalt affinity beads (Talon, BD Science) at 1 ml resin/30 g cell pellet ratio. The resin was pre-equilibrated with a solution containing 150 mM KCl, 25 mM HEPES, pH 7.5, 10 mM imidazole, 0.5% n-decyl-β-d-maltopyranoside (β-DM, Anatrace). After 1-h incubation at RT, the mixture was loaded on a column, washed with five column volumes of buffer A (150 mM KCl, 25 mM HEPES, pH 7.5, 10 mM imidazole, 0.5% β-DM) and then five volumes of buffer B (150 mM KCl, 25 mM HEPES, pH 7.5, 20 mM imidazole, 0.4% β-DM). The target protein was eluted with buffer C (150 mM KCl, 25 mM HEPES, pH 7.5, 300 mM imidazole, 0.4% β-DM). The fractions with protein concentration above 0.3 mg/ml were combined and then diluted by adding two volumes of buffer D (150 mM KCl, 25 mM HEPES, pH 7.5, 0.4% β-DM) to lower imidazole concentration and prevent protein precipitation. 
The protein was concentrated to 10–15 mg/ml in the 50 kDa cut-off Amicon Ultra-4 centrifugal filter unit (Millipore). The concentrated protein sample was further purified through a Superdex-200 10/300 GL (GE Healthcare) gel filtration column in buffer E (150 mM KCl, 10 mM HEPES pH 7.5, 0.3% β-DM). The fractions between 12 and 13 ml were collected, concentrated to ∼10 mg/ml, frozen in liquid nitrogen as small aliquots and stored at −80 °C for further use in crystallization or functional assays. TRIC-B1(CΔ48) was crystallized through the hanging-drop vapour diffusion method at 16 °C. The initial crystallization conditions were identified via the sparse-matrix screening method using the MemGold I and II kits (Molecular Dimensions). Plate crystals (space group C222) grew within drops prepared by mixing the concentrated protein solution (10–13 mg/ml) with the well solution (22% PEG550 MME, 0.2 M NaCl, 0.1 M HEPES, pH 7.0) in 1:1 (v:v) ratio. Another form, namely tetragonal bipyramid crystals (in P4 2 2 space group), was grown with the well solution containing 20–24% PEG400, 10% glycerol, 50 mM ADA buffer (pH 6.5). For the purpose of phasing, the TRIC-B1 crystals in C222 space group were soaked in 22% PEG550 MME, 15% glycerol, 0.2 M NaCl, 0.1 M HEPES, pH 7.0, 0.4% β-DM with 2 mM CH3HgCl for 16 h and yielded diffraction data to 3.9 Å resolution. The P4 2 2 crystals used for solving the structure with Ca2+ bound were co-crystallized with 4 mM CaCl2 by combining 1.0 μl protein-CaCl2 mixture with 1.0 μl well solution (18.4% PEG400, 4.4% PEG550 MME, 8% glycerol, 40 mM ADA buffer, pH 8.0, 40 mM NaCl, 20 mM HEPES, pH 7.0). For derivatizing the P4 2 2 crystals with Rb+ or Cs+, the TRIC-B1(CΔ48) protein was co-purified and co-crystallized with 150 mM RbCl or CsCl and the crystals were soaked for 1–3 min in solutions containing 0.5 M RbCl or 1 M CsCl, 22% PEG400, 4.4% PEG550 MME, 10% glycerol, 40 mM ADA buffer (pH 8.0), 40 mM NaCl, 20 mM HEPES, pH 7.0 and 0.4% β-DM. 
The BaCl2 derivatives were obtained by growing the C222 crystals in a solution with 50 mM BaCl2, 22% PEG550 MME, 0.2 M NaCl, 0.1 M HEPES, pH 7.0. TRIC-B2(CΔ61) was crystallized through the sitting or hanging-drop vapour diffusion method at 16 °C. The well solution contains 20% PEG400, 50 mM NaAc buffer (pH 4.4), 50 mM MgAc2 and 10 mM betaine hydrochloride. Rhombohedron-shaped crystals usually appeared in two weeks and matured in 2–3 months. Before data collection, a crystal was soaked in situ for 15 h at 16 °C in a solution with 20% PEG400, 50 mM NaAc (pH 4.4), 50 mM BaCl2, 150 mM KCl, 0.5% β-DM and 4% 2,2,2-trifluoroethanol. The cryoprotection was achieved by raising PEG400 concentration to 30% while the other components remained constant. Although the soaking solution contains BaCl2, it is evident that Ba2+ did not bind to TRIC-B2 under the acidic soaking condition, as no Ba2+ signals can be detected in the anomalous difference Fourier map, presumably due to the low affinity of metal binding at pH 4.4. On the other hand, the TRIC-B1 crystal with BaCl2 bound was prepared in a solution at pH 7.0, more favourable for the binding of Ba2+. The diffraction data were collected at BL17U of the Shanghai Synchrotron Radiation Facility (SSRF), or at BL1A, BL5A or NW12A beamlines of the Photon Factory (Tsukuba, Japan). The data processing was carried out by using iMosflm or HKL2000 programs. The initial experimental phases were solved by using the CH3HgCl-derivatized TRIC-B1 data (collected at 1.00731 Å wavelength near the L-III edge of Hg) through the single-wavelength anomalous diffraction (SAD) method by using the Autosol program of the Phenix suite31. The initial phases solved from a total of 11 Hg atoms have a figure-of-merit (FOM) of 0.36 and after density modification, the FOM is improved to 0.59. An initial model of TRIC-B1 with seven transmembrane helices was manually built in Coot32. 
The anomalous difference Fourier signals of the two Hg atoms bound to Cys61 on the M2 helix and Cys197 on the M6 helix were used to verify the sequence registration on the transmembrane helices. Moreover, the relative topological relationship of M1, M3–5 and M7 transmembrane helices with respect to M2 and M6 was deduced according to the secondary structure prediction result from the PSIPRED web server. The structural model of TRIC-B1 was completed through iterative manual model building and refinement in CNS program (versions 1.2 and 1.3)33 with the maximum likelihood target function using amplitudes. Each asymmetric unit of the TRIC-B1 crystal in P4 2 2 or C222 space group contains a homotrimer of the TRIC-B1–PIP2 complex. The TRIC-B2 structure was solved through the molecular replacement method using the initial poly-Ala model of a partial TRIC-B1 monomer (with only seven transmembrane helices). After a solution was found in the Phaser program34 (TFZ = 7.7, LLG = 39), the model was completed by running the automatic model building programs with the 2.3 Å high-resolution data in PHENIX (AutoBuild)31 first, leading to an improved model with Rwork = 35.08% and Rfree = 38.76%. A second round of automatic model building was carried out in ARP/wARP35, resulting in a more complete model with Rwork = 22.9% and Rfree = 28.6%. Further model building and adjustment was performed manually with the Coot program32, and the structure refinement was carried out with the CNS program33. Lipid, detergent and water molecules were added manually in the model for refinement at later stages when their electron densities were well defined. Each asymmetric unit of the TRIC-B2 crystal in R32 space group contains a TRIC-B2 monomer in complex with one PIP2 molecule. The crystallographic threefold axis coincides with the C3 axis of the homotrimer of TRIC-B2–PIP2 complex. The presence of PIP2 in TRIC-B1 and -B2 is supported by the following crystallographic evidence. 
The SigmaA-weighted 2Fo − Fc map and simulated annealing (SA) omit map of TRIC-B2 both show strong electron densities that fit well with the PIP structural model (Extended Data Fig. 2a, b). Moreover, the diffraction data collected with the TRIC-B1 crystals at 3 Å wavelength yielded three anomalous difference Fourier peaks matching well with the three phosphorus atoms on the PIP head group (Extended Data Fig. 2d), confirming the presence of PIP in TRIC-B1 protein. The statistics of data analysis, phasing and structure refinement are summarized in Extended Data Table 1a, b. For the structures of TRIC-B1/B2, 91.8%/90.8% of amino acid residues have their main chain dihedral angles in preferred regions of the Ramachandran plot; 8.2%/9.2% are in the allowed regions; and 0.0%/0.0% are outliers. The molecular graphics were produced with PyMOL36. To extract lipids from the purified TRIC-B1 protein samples, the wild-type or K129A/R133L mutant protein (200 μl at ∼5 mg/ml) was mixed with 180 μl of chloroform/methanol/concentrated HCl (1:2:0.02, v/v/v) solution. Subsequently, 60 μl of chloroform and 60 μl of 2 M KCl (Sigma) were added to each tube. The tubes were vortexed and then centrifuged for 5 min at 2,000g to separate the organic phase from the aqueous phase. The organic phase was pipetted out and applied to a PVDF membrane in small dots, and the membrane was dried in air. The PIP lipid was detected by incubating the blotted membrane with the mouse anti-PIP antibody 2C11 (Abcam) as primary antibody, and then with goat anti-mouse IgG*HRP (Zsbio) as secondary antibody. The signals were developed by adding the Western Lightning Ultra ECL horseradish peroxidase substrate (Perkin-Elmer). Images of blots were captured on a chemiluminescence CCD imaging system (ChemiScope 3500 mini imager, Clinx Science Instruments). To further verify the presence of PIP in the TRIC-B2 protein sample, mass spectrometry analysis was performed on the lipid extracts of the sample. 
To extract the lipid, 1.8 mg purified TRIC-B2 protein (at 1.5 mg/ml) was treated with 18 mg SM2 Biobeads (Bio-rad) at 4 °C overnight to remove the detergent (β-DM). After the detergent-free sample was separated from the Biobeads, it was split into two aliquots and then dried to 10 μl under vacuum. Subsequently, a mixture of 300 μl methanol, 1 ml methyl tertiary butyl ether (MTBE) and 13 μl 0.1 N HCl was added to every 10 μl protein sample and then the tubes were vortexed once every 2 min (for 20 min). Then, 250 μl 0.1 N HCl was added to the above mixture and the tubes were shaken to homogenize the samples. The mixture was centrifuged for 5 min at 16,200g and the upper phase was kept as ‘Extract I’. To further process the sample, the mixture of 300 μl methanol, 1 ml MTBE and 13 μl 0.1 N HCl was centrifuged and the lower organic-phase solvent was taken and mixed with ‘Extract I’. This mixture was centrifuged again and the upper layer was kept as ‘Extract II’. One half of ‘Extract II’ was treated with trimethylsilyl diazomethane (the other half was stored at −20 °C as a backup sample) and mass spectrometry was performed according to the protocol described in ref. 37. The PIP molecules were resolved as methylated derivatives. The result confirms that the TRIC-B2 protein sample contains 34:0, 34:1, 34:2, 34:3, 36:1, 36:2, 36:3 and 36:8 PIP molecules (for the spectrum of 34:1-PIP, see Extended Data Fig. 2c). For the preparation of small unilamellar vesicle (SUV) samples for the fluorescence-based K+ flux assay, 10 mg/ml Azolectin lipid mixture dissolved in chloroform was aliquoted and dried under vacuum in a CentriVap Concentrator (Labconco). The dried lipid film was solubilized to 10 mg/ml in the dialysis buffer (10 mM HEPES pH 7.0, 150 mM KCl, 0–5 mM CaCl2) plus 8% n-octyl-β-d-maltopyranoside (β-OM). The sample was then sonicated in a bath sonicator three times (10 s on, 10 s off for 1 min). 
Subsequently, the purified TRIC-B1(CΔ48) protein was added to the mixture at a protein:lipid ratio of 1:100 (w:w). The sample was rotated gently for 1 h at RT before being injected into a 10 kDa-cut-off Slide-A-Lyzer cassette (Pierce) for dialysis at 4 °C. During dialysis, the removal of detergent from the sample was facilitated by adding 40 mg/ml SM2 biobeads (Bio-rad) in the dialysis buffer. After being dialysed for 5–7 days with buffer exchange every 12 h, the sample with reconstituted SUV was retrieved from the cassette with a syringe, aliquoted in 20-μl batches and then stored at −80 °C after being flash frozen in liquid nitrogen. The control vesicles without protein added were prepared as above, while the protein sample was replaced with the blank buffer containing 10 mM HEPES-KOH (pH 7.0), 150 mM KCl and 0.3% β-DM. The protocol for the K+ flux assay was based on the published methods in refs 38,39. For each reaction, 5 μl frozen vesicle sample was thawed, briefly sonicated, and diluted 20 fold into a flux-assay solution containing 150 mM NMDG-Cl (pH 7.0), 10 mM HEPES (pH 7.0), 0.5 mg/ml BSA, 0–5 mM free Ca2+ (in the form of CaCl ), 0.2 mM EGTA, 2 μM 9-amino-6-chloro-2-methoxyacridine (ACMA) and MgCl . EGTA was added to remove background Ca2+, and the concentration of total exogenous Ca2+ added in the assay solution was estimated through an online calculator (MAXCHELATOR: http://maxchelator.stanford.edu/CaMgATPEGTA-TS.htm) by inputting the designated free [Ca2+]. MgCl was added in the assay solution at the concentration of ([Ca2+] − [Ca2+] ) to balance the membrane potential change arising from the difference of internal and external [Ca2+]. The fluorescence (excitation: 410 nm; emission: 490 nm) was monitored in the infinite M1000 PRO plate reader (TECAN) every 30 s. After the fluorescence signal stabilized, K+ efflux was coupled to the influx of proton by adding 2 μM carbonyl cyanide m-chlorophenyl hydrazone (CCCP, a proton ionophore) into the assay buffer. 
The increase of proton concentration within the liposome led to protonation and fluorescence quenching of the ACMA dye. At the end of the experiment, the K+-selective ionophore valinomycin was added at 2 μM final concentration to dissipate the potassium gradient. For the flux assay results shown in Fig. 3 and Extended Data Fig. 7, each set of assays was conducted in parallel in a 96-well black plate (Costar), and each flux assay was repeated four to eight times. The fluorescence data were normalized to the mean value of the initial reading before being plotted against time. To prepare the SUV samples to be transformed into giant unilamellar vesicles (GUVs) for electrophysiology, a lipid mixture containing 90% 1,2-diphytanoyl-sn-glycero-3-phosphocholine (DPhPC, Avanti) and 10% cholesterol (w:w) was dried under vacuum for 4 h. The dry lipid sample was then suspended at 10 mg/ml concentration in a low-salt buffer with 1 mM HEPES-KOH (pH 7.2) and 5 mM KCl, and then subjected to tip sonication to clarity (50 Hz, 1 s on, 5 s off for 2.5 min). The sample was solubilized by adding β-DM to a final concentration of 10 mM, and incubated for 30 min at RT. The purified TRIC-B1(CΔ48) protein was then added to the solubilized lipid mixture to achieve a protein:lipid ratio of 1:100 (w:w). More β-DM was then added to reach a final concentration of 17.5 mM and the resulting mixture was gently agitated for 1 h at 25 °C. (Alternatively, Triton X-100 at a final concentration of 7.8 mM can be used to replace β-DM during the above sample preparation steps.) The detergent was removed by dialysing the protein–lipid mixture against a low-salt buffer (1 mM HEPES pH 7.2, 5 mM KCl) with the addition of 40 mg/ml washed SM2 biobeads (Bio-rad) in the dialysis buffer at 16 °C. The external buffer was changed every 12 h and the dialysis lasted for 6–7 days. After dialysis, the resulting SUV sample was aliquoted, flash-frozen in liquid nitrogen and then stored at −80 °C. 
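The normalization step for the flux-assay fluorescence (each trace divided by the mean of its initial readings) can be sketched as follows. This is a minimal illustration and not the authors' analysis code: the function name, the example readings and the number of readings treated as "initial" (`n_initial`) are assumptions, since the text does not state how many baseline points were averaged.

```python
from statistics import mean


def normalize_trace(readings, n_initial=4):
    """Divide every reading by the mean of the first n_initial readings.

    n_initial is an assumed parameter; the source only says the data were
    normalized to the mean value of the initial reading.
    """
    if n_initial < 1 or n_initial > len(readings):
        raise ValueError("n_initial must be between 1 and len(readings)")
    baseline = mean(readings[:n_initial])
    if baseline == 0:
        raise ValueError("baseline fluorescence is zero; cannot normalize")
    return [r / baseline for r in readings]


# Hypothetical ACMA readings taken every 30 s (arbitrary units).
trace = [1000.0, 1020.0, 980.0, 1000.0, 800.0, 600.0, 500.0]
normalized = normalize_trace(trace, n_initial=4)
```

After normalization the baseline portion averages 1.0, so quenching after CCCP addition reads directly as a fraction of the starting signal.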
The GUV samples used in the single-channel electrophysiological studies were prepared through the electroformation technique by using the Nanion Vesicle Prep Pro device (Nanion). To protect the protein during the partial dehydration process before electroformation, trehalose was added to the preformed SUV sample to a final concentration of 10 mM and then the mixture was sonicated in water bath for 30 s. Subsequently, about 10 μl SUV solution was pipetted in small droplets (∼0.2 μl/droplet) on the indium tin oxide (ITO)-coated glass slide. The droplets were exposed under room atmosphere for approximately 15 min to let them dry. After partial dehydration, the lipid films were rehydrated by adding 270 μl of 1 M sorbitol solution and then a cassette sandwiching the sample in the middle was assembled. The ITO layers of the slides face and contact the sample during the cassette assembling. The electroformation was first run at 0.1 to 1.0 V and 12 Hz frequency for 3 h. Afterwards, the frequency was lowered to 4 Hz and the voltage was increased to 2 V. The procedure was continued for 30 min to release the GUVs from the glass slides. Throughout the electroformation process, the temperature was maintained at 36 °C. The single-channel activity recordings were performed with the inside-out mode. The data were recorded under symmetrical solutions with 210 mM KCl and 10 mM HEPES (pH 7.2). The data were acquired at 50 kHz with a 0.5-kHz filter and 50 Hz notch filter, using an EPC-10 amplifier (HEKA). The Clampfit Version 10.0 (Axon Instruments) was used for data analysis, Excel Version 2010 (Microsoft) and OriginPro 8 were used for statistical analysis and Igor Pro 6.37A (WaveMetrics) was used for making the graphs. The single-channel conductance was obtained through linear fitting of the data recorded under different voltage values. The open probability data shown in Fig. 
3i, j were based on $P_i = t_i/T$, where $t_i$ is the total time that the channel was observed in the open state at the ith level and $T$ is the total recording time. Analyses of the potential cooperative/independent gating behaviours among the three pores within the TRIC-B1 trimer were performed according to the methods reported by Ding and Sachs40 (see Supplementary Discussion for details). For the cysteine accessibility assay experiments, the mutant proteins (M38C, A126C and S166C on a Cys-less template) were purified and reconstituted on the GUVs (protein:lipid = 1:200–250, w:w; lipid: 95% azolectin + 5% cholesterol, w:w) through a modified sucrose method41. The data were recorded on the inside-out patches from the reconstituted GUVs under ±40 mV at 50 kHz by using an EPC-10 amplifier (HEKA) with a 0.5-kHz filter and 50 Hz notch filter. The bath and pipette buffers both contained 210 mM KCl and 10 mM HEPES (pH 7.2). The experiments were performed at room temperature (21–24 °C). During the experiments, freshly prepared stock solution of MTSET (1 M in water) was added to the bath solution to a final concentration of 2 mM. The same patch of membrane containing active TRIC channels was used for collecting the data before and after MTSET was added. Tryptophan fluorescence spectra were measured on a Spectrofluorometer F7000 (Hitachi). All samples were excited at 295 nm and the fluorescence emission spectra were scanned in the range 300–400 nm. During the measurements, 90° and 0° polarizers were used for excitation and emission, respectively. The slit size was 5 nm × 5 nm and the voltage was set at 650 V. To minimize the influence of background Ca2+ in the starting sample, EGTA was added to the TRIC-B1 or W180A mutant protein samples (at 6.9 μM protein concentration in 10 mM HEPES, 150 mM KCl, 0.3% β-DM, pH 7.5 buffer) at a final concentration of 10 mM. The addition of 10 mM EGTA to the sample reduced the background free Ca2+ to about 10 nM. 
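The two single-channel quantities described in the electrophysiology analysis (open probability as time at a level divided by total recording time, and conductance from a linear fit of current against voltage) can be sketched as below. This is only an illustration under stated assumptions: the recordings were analysed in Clampfit, not with this code, and the idealized trace format, function names and example values are hypothetical.

```python
def open_probability(levels, level, dt):
    """Return t_i / T for an idealized trace.

    levels: per-sample open-level index (0 = closed, 1 = first open level, ...).
    level:  the level i of interest.
    dt:     sampling interval in seconds (2e-5 s corresponds to 50 kHz).
    """
    total_time = len(levels) * dt
    time_at_level = sum(1 for x in levels if x == level) * dt
    return time_at_level / total_time


def conductance_pS(voltages_mV, currents_pA):
    """Least-squares slope of current versus voltage, in picosiemens.

    pA / mV equals nS, so the slope is multiplied by 1000 to give pS.
    """
    n = len(voltages_mV)
    mean_v = sum(voltages_mV) / n
    mean_i = sum(currents_pA) / n
    sxx = sum((v - mean_v) ** 2 for v in voltages_mV)
    sxy = sum((v - mean_v) * (i - mean_i)
              for v, i in zip(voltages_mV, currents_pA))
    return 1000.0 * sxy / sxx


# Hypothetical idealized trace (10 samples) and current-voltage points.
levels = [0, 0, 1, 1, 1, 2, 0, 1, 0, 0]
p_open_level_1 = open_probability(levels, level=1, dt=2e-5)
g = conductance_pS([-40.0, -20.0, 20.0, 40.0], [-4.0, -2.0, 2.0, 4.0])
```

In this toy input the channel spends 4 of 10 samples at level 1, and the current-voltage points lie on a line with a slope of 0.1 nS.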
The sample buffer with 10 mM EGTA was used as the blank, providing background data to be subtracted from the overall fluorescence spectroscopic data of the sample. Stock solution of 1 M CaCl2 was titrated into 200 μl protein sample to yield final [Ca2+] of 1 μM, 10 μM, 100 μM, 1 mM and 10 mM, respectively. Experiments with the wild-type TRIC-B1 and W180A mutant were repeated 10 times. The difference spectra were derived by subtracting the data of the W180A mutant from the data of the wild-type sample. The range of fluorescence signals used for calculating the integrated intensity was from 305 to 345 nm. The fluorescence below 305 nm may include some reflection from incident light, while the background noise increases at wavelengths longer than 345 nm. The integrated fluorescence signals were normalized as a ratio: the fluorescence intensity under conditions with Ca2+ added was divided by the intensity of the sample with minimal Ca2+ (10 nM Ca2+ in the presence of 10 mM EGTA). Excel Version 2010 (Microsoft), Origin 8.0 and Prism 6.0 (GraphPad) programs were used for statistical analysis and making the graphs. To verify the conformational change of the M5–6 loop region, a double cysteine mutant, namely A49C/N185C, was designed and used for disulphide cross-linking experiments under conditions with or without Ca2+. The two Cys mutation sites are located at the monomer–monomer interface of TRIC-B1. One of the mutation sites (N185C) is located on the M5–6 loop, and the other (A49C) is in an invariable region adjacent to the M5–6 loop. To eliminate the potential background signal from endogenous cysteine residues during cross-linking, all the endogenous Cys residues in TRIC-B1 were mutated to Ser, and then the A49C/N185C mutation was introduced to such a Cys-less template. The A49C/N185C mutant protein was expressed and purified according to the same protocol used for purifying wild-type TRIC-B1 protein. 
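The tryptophan-fluorescence processing described above (wild-type minus W180A difference spectra, integration over 305 to 345 nm, and normalization to the minimal-Ca2+ sample) can be sketched as follows. This is an assumption-laden illustration: the authors used Excel, Origin and Prism, and the trapezoidal integration, function names and flat example spectra here are hypothetical.

```python
def difference_spectrum(wild_type, mutant):
    """Point-by-point wild-type minus mutant intensities (same wavelength grid)."""
    return [a - b for a, b in zip(wild_type, mutant)]


def integrated_intensity(wavelengths_nm, intensities, lo=305.0, hi=345.0):
    """Trapezoidal integral of the spectrum restricted to [lo, hi] nm."""
    pts = [(w, i) for w, i in zip(wavelengths_nm, intensities) if lo <= w <= hi]
    return sum((w2 - w1) * (i1 + i2) / 2.0
               for (w1, i1), (w2, i2) in zip(pts, pts[1:]))


def normalized_intensity(wavelengths_nm, spectrum_with_ca, spectrum_min_ca):
    """Integrated intensity with Ca2+ added, relative to the minimal-Ca2+ sample."""
    return (integrated_intensity(wavelengths_nm, spectrum_with_ca)
            / integrated_intensity(wavelengths_nm, spectrum_min_ca))


# Hypothetical blank-subtracted spectra on a 5 nm grid from 300 to 400 nm.
wavelengths = [float(w) for w in range(300, 401, 5)]
wt_min_ca = [150.0] * len(wavelengths)
w180a_min_ca = [50.0] * len(wavelengths)
wt_with_ca = [140.0] * len(wavelengths)
w180a_with_ca = [50.0] * len(wavelengths)

diff_min = difference_spectrum(wt_min_ca, w180a_min_ca)
diff_ca = difference_spectrum(wt_with_ca, w180a_with_ca)
ratio = normalized_intensity(wavelengths, diff_ca, diff_min)
```

Restricting the integral to 305 to 345 nm mirrors the stated reasons: reflected incident light below 305 nm and rising noise above 345 nm.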
Before the cross-linking reactions were started, the protein sample was treated with 100 μM tris(2-carboxyethyl)phosphine (TCEP) for 10 min at room temperature to reset the Cys residues to a maximally reduced state. Subsequently, EGTA or CaCl2 was added to the sample at 10 mM final concentration and the sample was incubated for 10 min at room temperature. To induce cross-linking between two Cys residues, diamide was titrated into the samples at various concentrations (0, 0.1, 0.3 and 0.5 mM) and the reactions were conducted at 25 °C for 10 min. To quench the reaction, N-ethylmaleimide (NEM) was added to the sample at 50 mM final concentration to block the remaining free sulfhydryls. The samples were mixed with non-reducing SDS–PAGE loading buffer containing 150 mM NEM and then the products of the cross-linking reaction were separated through SDS–PAGE. The gels were stained with Coomassie brilliant blue R-250. As a control, the A49C/N185C/W180A triple mutant was produced and cross-linked under the same conditions as the A49C/N185C mutant. Three repeats were performed for each mutant and ImageJ was used for quantifying the intensities of bands on the SDS–PAGE gel. The Excel Version 2010 (Microsoft) and Prism 6 (GraphPad) programs were used for analysing and plotting the data. To analyse the oligomeric state of TRIC-B1 protein (wild type and K129A/R133L mutant), chemical cross-linking was performed with the purified protein sample in detergent solution or directly with protein on the cellular membranes. For cross-linking of purified protein in detergent solution, the wild-type or K129A/R133L mutant TRIC-B1 protein was diluted to 1 mg/ml in a reaction buffer consisting of 10 mM HEPES (pH 7.5), 150 mM KCl and 0.3% β-DM. The protein was cross-linked with 0–10 mM glutaraldehyde for 20 min at room temperature and the reactions were stopped by adding 100 mM TRIS-HCl (pH 7.5). 
The cross-linked samples were loaded on SDS–PAGE and the gels were stained with Coomassie brilliant blue R250. For cross-linking of target proteins on the membrane, the cells containing wild-type or K129A/R133L mutant protein were re-suspended in a buffer consisting of 50 mM TRIS-HCl, pH 8.0, and 150 mM KCl. After the cells were lysed by passing through high-pressure homogenizer (ATS Engineering), the cell debris was removed through low-speed centrifugation at 10,000g for 30 min and the membrane fraction was further collected through ultracentrifugation at 100,000g for 1 h at 4 °C. The membrane pellets were re-suspended in a buffer consisting of 100 mM HEPES (pH 7.5) and 150 mM KCl, and then treated with disuccinimidyl suberate (DSS) at 0–10 mM final concentration for 30 min at room temperature. The reactions were quenched by adding 100 mM TRIS-HCl (pH 7.5). The cross-linked samples were solubilized by adding 5 × SDS–PAGE loading buffer, and then loaded on SDS–PAGE. The target protein bands were detected through western blot by using the HRP-conjugated His-tag antibody (Genscript). After being developed with the western lightning Ultra ECL horseradish peroxidase substrate (Perkin–Elmer), the western blots were imaged on a chemiluminescence CCD system (ChemiScope 3500 mini imager, Clinx Science Instruments).

Animal experiments were performed according to procedures approved by the Institutional Animal Care and Use Committee of the Beth Israel Deaconess Medical Center. Unless otherwise stated, mice used were male C57BL/6J (8–12 weeks of age; Jackson Laboratories), and housed in a temperature-controlled (20–22 °C) room on a 12 h light/dark cycle. All compounds administered to mice in vivo were injected at the stated dose i.p. 10 min before subsequent interventions unless otherwise stated. Body temperature and cold exposure experiments were assessed using a mouse rectal probe (World Precision Instruments). When studying acute activation of thermogenesis, mice were housed from birth at 20–22 °C to allow for recruitment of thermogenic adipose tissue7. Before individual housing at 4 °C, mice were placed at thermoneutrality (30 °C) for 3 days which allows both for maintenance BAT UCP1 protein content31 and for measurement of acute induction of BAT thermogenesis upon cold exposure. Upon exposure to 4 °C, temperature was measured every 30 min. When studying body temperature after 4 °C acclimation, WT and Ucp1−/− mice (equal numbers of male and female mice in each group) were acclimated using established protocols: mice were individually housed for 1 week at 15 °C, 1 week at 10 °C, and 24 h at 4 °C before the experiment. Mice were individually restrained to limit non-shivering muscle activity and two EMG needle electrodes were inserted subcutaneously above the nuchal muscles in the back of the neck. EMG leads were connected to a computerized data acquisition system via a communicator. EMG was recorded at thermoneutrality to determine non-shivering basal nuchal muscle activity, before placement of mice at 4 °C. EMG data were collected and burst activity was determined as described previously32. Briefly, EMG data were collected from the implanted electrodes at a sampling rate of 2 kHz using LabChart 8 Pro Software (ADInstruments). 
The raw signal was converted to root mean square activity. Root mean square activity was analysed for shivering bursts in 10 s windows. Whole-body energy metabolism was evaluated using a Comprehensive Lab Animal Monitoring System (CLAMS, Columbia Instruments). For 6 h measurements, mice were acclimated in the metabolic chambers for 48 h before experiments to minimize stress from the housing change. CO2 and O2 levels were collected every 12 or 32 min for each mouse over the period of the experiment. For acute measurements, CO2 and O2 levels were collected every 10 s. CL 316,243 (Sigma-Aldrich; 1 mg kg−1) was injected i.p. into mice at the indicated times. Aconitase activity was measured as described previously33. In brief, after the relevant in vivo intervention, mouse BAT was rapidly excised and homogenized in mitochondrial isolation buffer (250 mM sucrose, 2 mM EDTA, 10 mM sodium citrate, 0.6 mM MnCl2, 100 mM Tris-HCl, pH 7.4) followed by mitochondrial isolation by differential centrifugation. Samples (1–2 mg mitochondrial protein) were added to a 96-well plate with 190 μl assay buffer (50 mM Tris-HCl (pH 7.4), 0.6 mM MnCl2, 5 mM sodium citrate, 0.2 mM NADP+, 0.1% (v/v) Triton X-100, 0.4 U ml−1 ICDH). Absorbance was measured at 340 nm for 7 min at 37 °C. To control for mitochondrial content, aconitase activity was normalized to citrate synthase activity34 and the result was expressed as a percentage of control levels. Lipid hydroperoxide content in mouse BAT was estimated by rapid snap freezing of BAT tissue followed by lipid extraction and assessment using a modified ferric thiocyanate assay (Cayman Chemical Lipid Hydroperoxide Assay Kit) according to the manufacturer’s instructions. Cysteine redox status of Prx3 and UCP1 was measured as described previously16, 35. After the relevant in vivo intervention, mouse BAT was rapidly excised and homogenized in 100 mM NEM, 1 mM EGTA, 50 mM Tris-HCl, pH 7.4. 
Samples were incubated at 37 °C for 5 min before the addition of SDS (2% final) and further incubation at 37 °C for 10 min. Incubations at 37 °C proceeded in a thermomixer at 1,300 r.p.m. Samples were then precipitated in five volumes of ice-cold acetone to remove excess NEM before resuspension in 1 mM EGTA, 2% SDS, 10 mM TCEP, 50 mM Tris-HCl, pH 7.4 containing a polyethylene glycol polymer conjugated to maleimide (50 mM PEG-Mal). Resuspended samples were incubated for 30 min at 37 °C before a second acetone precipitation to remove excess PEG-Mal, followed by sample resuspension and immunoblot detection by standard methods described below. For UCP1 experiments, to ensure gel shift signals were specific to reversible cysteine oxidation, oxidized samples were separately treated with TCEP before differential labelling as described above. Calibrating the number of UCP1 cysteines oxidized was achieved by treating TCEP-reduced samples with increasing proportions of PEG-Mal:NEM to generate a cysteine-dependent ladder35. In addition, to ensure higher molecular mass signals were specific to UCP1, UCP1 antibody specificity was tested in BAT. It should be noted that while the UCP1 antibody used here is highly specific for UCP1 in BAT (Extended Data Fig. 4c), the same antibody applied to cultured brown adipocyte samples can generate non-specific signals at molecular mass >35 kDa. Therefore, the UCP1 gel shift assay as described here is only compatible with in vivo tissue experiments. Reduced and oxidized glutathione were profiled in negative ionization mode by liquid chromatography tandem mass spectrometry (LC–MS) methods as described previously36. Data were acquired using an ACQUITY UPLC (Waters) coupled to a 5500 QTRAP triple quadrupole mass spectrometer (AB SCIEX). Tissue homogenates (30 μl) were extracted using 120 μl of 80% methanol containing 0.05 ng μl−1 inosine-15N , 0.05 ng μl−1 thymine-d , and 0.1 ng μl−1 glycocholate-d as internal standards (Cambridge Isotope Laboratories). 
The samples were centrifuged (10 min, 9,000g, 4 °C) and the supernatants (10 μl) were injected directly onto a 150 mm × 2.0 mm Luna NH2 column (Phenomenex). The column was eluted at a flow rate of 400 μl min−1 with initial conditions of 10% mobile phase A (20 mM ammonium acetate and 20 mM ammonium hydroxide (Sigma-Aldrich) in water (VWR)) and 90% mobile phase B (10 mM ammonium hydroxide in 75:25 v/v acetonitrile/methanol (VWR)) followed by a 10 min linear gradient to 100% mobile phase A. The ion spray voltage was −4.5 kV and the source temperature was 500 °C. Raw data were processed using MultiQuant 2.1 software (AB SCIEX) for automated peak integration. LC–MS data were processed and visually inspected using TraceFinder 3.1 software (Thermo Fisher Scientific). After the relevant in vivo intervention, mouse BAT was rapidly excised and homogenized in 20% (w/v) TCA to stabilize thiols. The homogenate was incubated on ice for 30 min and then pelleted for 30 min at 16,000g at 4 °C. The pellet was washed with 10% and 5% (w/v) TCA and then resuspended in 80 μl denaturing alkylating buffer (DAB; 6 M urea, 2% (w/v) SDS, 200 mM Tris-HCl, 10 mM EDTA, 100 μM DTPA, 10 μM neocuproine). The contents of one vial of iodoTMT reagent (Thermo Scientific) was added to each of three biological replicate samples to label reduced cysteine residues at 37 °C and 1,300 r.p.m. for 1 h. Sample protein was precipitated with five volumes of ice-cold acetone, incubated at −20 °C for 2 h, and pelleted at 4 °C and 16,000g for 30 min. The amount of protein to be processed was optimized to ensure saturation of thiol labelling by the iodoTMT reagent as per the manufacturer’s instructions. The pellet was washed twice with ice-cold acetone and then re-solubilized in 80 μl DAB containing 1 mM tris(2- carboxyethyl)phosphine (TCEP), reducing previously reversibly oxidised cysteine residues in the presence of a second, distinct iodoTMT reagent. Proteins were incubated at 37 °C and 1,400 r.p.m. 
for 1 h, precipitated and resuspended for protease digestion. After digestion, iodoTMT-labelled cysteine-containing peptides were enriched using the anti-TMT resin as per the manufacturer’s instructions. Proteins with cysteine thiols exhibiting differential redox status (defined as >10% shift in cysteine oxidation status upon cold exposure) were assessed for Gene Ontology (GO) term enrichment37. The total identified population of cysteine thiol containing proteins was used as the reference background. Enriched GO terms were filtered after Benjamini–Hochberg correction at an adjusted P value <0.1. All data analysis used R (R Core Team, Vienna, Austria, http://www.R-project.org). Tissue or cellular samples were prepared by adapting a protocol used previously to stabilize endogenous protein sulfenic acids38. Briefly, samples were homogenized in 50 mM Tris base, containing 100 mM NaCl, 100 μM DTPA, 0.1% SDS, 0.5% sodium deoxycholate, 0.5% Triton X-100, 5 mM dimedone. To minimize lysis-dependent oxidation, buffers were bubbled with argon before use. Samples were incubated for 15 min at room temperature, at which point SDS was added to a final concentration of 1% and samples were incubated for a further 15 min. After dimedone treatment, 10 mM TCEP and 50 mM NEM were added and samples were incubated for a further 15 min at 37 °C to reduce and alkylate all non-sulfenic acid protein cysteine residues. Protein sulfenic acids were then assessed by immunoblotting against dimedone (1:1,000 antibody dilution). After dimedone and NEM labelling of samples as described above, samples were resolved by SDS–PAGE and bands in the UCP1 containing region of the gel (30–35 kDa) were excised, destained with acetonitrile and subjected to dehydration by a speed vacuum concentrator. Gel bands were rehydrated with digestion buffer (75 μl of 50 mM HEPES and 500 ng of trypsin (Promega)) and subjected to 12 h of digestion at 37 °C. 
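The Benjamini–Hochberg filtering step mentioned above (enriched GO terms kept at an adjusted P value below 0.1) can be sketched as follows. The authors carried out their analysis in R; this Python version is only an illustration of the standard step-up adjustment, and the GO term labels and raw P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # indices by ascending P
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical GO terms with raw enrichment P values.
terms = ["term_A", "term_B", "term_C", "term_D"]
raw_p = [0.001, 0.02, 0.03, 0.5]
adj_p = benjamini_hochberg(raw_p)
significant = [t for t, q in zip(terms, adj_p) if q < 0.1]
```

The enrichment test that produces the raw P values (with all identified cysteine-containing proteins as background) is not shown here.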
Peptides were extracted and labelled with TMT 10 reagents (Thermo Fisher) as previously described39. Protein pellets were dried and resuspended in 8 M urea containing 50 mM HEPES (pH 8.5). Protein concentrations were measured by BCA assay (Thermo Scientific) before protease digestion. Protein lysates were diluted to 4 M urea and digested with LysC (Wako, Japan) in a 1/100 enzyme/protein ratio overnight. Protein extracts were diluted further to a 1.0 M urea concentration, and trypsin (Promega) was added to a final 1/200 enzyme/protein ratio for 6 h at 37 °C. Digests were acidified with 20 μl of 20% formic acid (FA) to a pH ~2, and subjected to C18 solid-phase extraction (Sep-Pak, Waters). All spectra were acquired using an Orbitrap Fusion mass spectrometer (Thermo Fisher) in line with an Easy-nLC 1000 (Thermo Fisher Scientific) ultra-high pressure liquid chromatography pump. Peptides were separated on a 100 μm inner diameter column containing 1 cm of Magic C4 resin (5 μm, 100 Å, Michrom Bioresources) followed by 30 cm of Sepax Technologies GP-C18 resin (1.8 μm, 120 Å) with a gradient consisting of 9–30% (ACN, 0.125% FA) over 180 min at ~250 nl min−1. For all LC–MS/MS experiments, the mass spectrometer was operated in the data-dependent mode. We collected MS1 spectra at a resolution of 120,000, with an AGC target of 150,000 and a maximum injection time of 100 ms. The ten most intense ions were selected for MS2 (excluding 1 Z-ions). MS1 precursor ions were excluded using a dynamic window (75 s ± 10 ppm). The MS2 precursors were isolated with a quadrupole mass filter set to a width of 0.5 Th. For the MS3-based TMT quantitation, MS2 spectra were collected at an AGC of 4,000, maximum injection time of 150 ms, and CID collision energy of 35%. MS3 spectra were acquired with the same Orbitrap parameters as the MS2 method except HCD collision energy was increased to 55%. Synchronous-precursor-selection was enabled to include up to six MS2 fragment ions for the MS3 spectrum. 
A compilation of in-house software was used to convert .raw files to mzXML format, as well as to adjust monoisotopic m/z measurements and erroneous peptide charge state assignments. Assignment of MS2 spectra was performed using the SEQUEST algorithm40. All experiments used the Mouse UniProt database (downloaded 10 April 2014) to which reversed protein sequences and known contaminants such as human keratins were appended. SEQUEST searches were performed using a 20 ppm precursor ion tolerance, while requiring each peptide’s amino/carboxy (N/C) terminus to have trypsin protease specificity and allowing up to two missed cleavages. IodoTMT tags on cysteine residues (+329.226595 Da) were set as a static modification, while methionine oxidation (+15.99492 Da) was set as a variable modification. For targeted assessment of UCP1 cysteine sulfenylation, TMT tags on lysine residues and peptide N termini (+229.16293 Da) and NEM on cysteine residues (+125.047679 Da) were set as static modifications, and oxidation of methionine residues (+15.99492 Da) and dimedone on cysteine residues (+13.020401 Da versus NEM) as variable modifications. The sulfenylation status of the Cys253 peptide was determined by comparing TMT reporter ion abundance of the dimedone-alkylated and NEM-alkylated peptides as a proportion of total precursor ion intensity. An MS2 spectra assignment false discovery rate of less than 1% was achieved by applying the target-decoy database search strategy41. Protein filtering was performed using an in-house linear discrimination analysis algorithm to create one combined filter parameter from the following peptide ion and MS2 spectra metrics: XCorr, ΔCn score, peptide ion mass accuracy, peptide length and missed-cleavages42. Linear discrimination scores were used to assign probabilities to each MS2 spectrum for being assigned correctly, and these probabilities were further used to filter the data set to a 1% protein-level false discovery rate. 
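The target-decoy idea behind the 1% false discovery rate filter can be illustrated with a deliberately simplified sketch. It is not the in-house linear discriminant pipeline described above: it assumes a single score per spectrum match, estimates the FDR as the number of decoy (reversed-sequence) hits divided by the number of target hits above a threshold, and uses hypothetical data.

```python
def score_threshold_for_fdr(psms, max_fdr=0.01):
    """Return the lowest score cut-off whose estimated FDR is <= max_fdr.

    psms: list of (score, is_decoy) tuples; higher scores are better.
    The FDR estimate (decoys / targets among accepted matches) is one common
    convention and is an assumption here. Returns None if no cut-off qualifies.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    best = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            best = score
    return best


# Hypothetical matches: 150 high-scoring targets, then decoys start to appear.
psms = [(200.0 - i, False) for i in range(150)]
psms += [(50.0, True), (49.0, True), (48.0, False)]
threshold = score_threshold_for_fdr(psms, max_fdr=0.01)
accepted = [p for p in psms if p[0] >= threshold]
```

With these toy values the cut-off stops before the second decoy, because accepting it would push the estimated FDR above 1%.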
For quantification, a 0.03 m/z window centred on the theoretical Th value of each reporter ion was used for the nearest signal intensity. Reporter ion intensities were adjusted to correct for the isotopic impurities from the different TMT reagents (manufacturer specifications). The signal to noise values for all peptides were summed within each TMT channel. For each peptide, a total minimum sum signal to noise value of 200 and an isolation purity greater than 70% were required43. Percentage cysteine oxidation status of protein thiols was calculated as the percentage of the cysteine containing peptide (total or mitochondrial) labelled with iodoTMT (129, 130, 131) for each condition over the sum of the reduced peptide labelled with iodoTMT (126, 127, 128) plus reversibly oxidized labelled peptide (129, 130, 131): (oxidized peptide 129, 130, 131)/(reduced peptide 126, 127, 128 + oxidized peptide 129, 130, 131) × 100. Interscapular brown adipose stromal vascular fraction (SVF) was obtained from 2- to 6-day-old pups as described previously44. Interscapular brown adipose was dissected, washed in PBS, minced, and digested for 45 min at 37 °C in PBS containing 1.5 mg ml−1 collagenase B, 123 mM NaCl, 5 mM KCl, 1.3 mM CaCl2, 5 mM glucose, 100 mM HEPES, and 4% essentially fatty-acid-free BSA. Tissue suspension was filtered through a 40 μm cell strainer and centrifuged at 600g for 5 min to pellet the SVF. The cell pellet was resuspended in adipocyte culture medium and plated. Primary brown pre-adipocytes were counted and plated in the evening, 12 h before differentiation at 15,000 cells per well of a Seahorse plate. Pre-adipocyte plating was scaled according to surface area. The following morning, brown pre-adipocytes were induced to differentiate for 2 days with an adipogenic cocktail (1 μM rosiglitazone, 0.5 mM IBMX, 5 μM dexamethasone, 0.114 μg ml−1 insulin, 1 nM T3, and 125 μM indomethacin) in adipocyte culture medium. 
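The percentage cysteine oxidation formula given above, (oxidized 129, 130, 131)/(reduced 126, 127, 128 + oxidized 129, 130, 131) × 100, can be written out as a short sketch. The function name and the reporter signal-to-noise values are hypothetical, and summing the three channels of each label set follows the formula as printed.

```python
def percent_oxidized(reduced_channels, oxidized_channels):
    """Percentage of a cysteine peptide found in the reversibly oxidized pool.

    reduced_channels:  reporter signals for iodoTMT 126, 127, 128.
    oxidized_channels: reporter signals for iodoTMT 129, 130, 131.
    """
    reduced = sum(reduced_channels)
    oxidized = sum(oxidized_channels)
    total = reduced + oxidized
    if total == 0:
        raise ValueError("no reporter signal for this peptide")
    return 100.0 * oxidized / total


# Hypothetical summed signal-to-noise values for one cysteine peptide.
warm = percent_oxidized([300.0, 320.0, 280.0], [100.0, 110.0, 90.0])
cold = percent_oxidized([240.0, 260.0, 250.0], [240.0, 260.0, 250.0])
shift = cold - warm  # compared against the >10% criterion for differential status
```

A peptide like this toy example, moving from 25% to 50% oxidized, would exceed the 10% shift used to define differential redox status.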
Two days after induction, cells were re-fed every 48 h with adipocyte culture medium containing 1 μM rosiglitazone and 0.5 μg ml−1 insulin. Cells were fully differentiated by day 5 after induction. Cellular OCR of primary brown adipocytes was determined using a Seahorse XF24 Extracellular Flux Analyzer. Adipocytes were plated and differentiated in XF24 V7 cell culture microplates. Before analysis, adipocyte culture medium was changed to DMEM respiration medium lacking NaHCO3 (Sigma), and including 1.85 g l−1 NaCl, 3 mg l−1 phenol red, 2% fatty-acid-free BSA, 1 mM sodium pyruvate, pH 7.4. Basal respiration was determined to be the OCR in the presence of substrate alone. ATP-synthase-independent respiration was determined after addition of 2.5 μM oligomycin. Unless otherwise stated, leak respiration was determined after addition of 2.5 μM oligomycin and 100 nM noradrenaline. Maximal respiration was determined after addition of 2 μM FCCP. To determine OCR after plasma membrane permeabilization, cells were treated with 50 μg ml−1 saponin, and sequestration of free fatty acids after permeabilization was achieved through addition of 2% fatty-acid-free BSA. RNA from murine BAT was reverse-transcribed and used as template for PCR of Ucp1. Sequences for Ucp1 amplification were as follows: sense, CAC CAT GGT GAA CCC GAC AAC TTC C; antisense, TTA TGT GGT ACA ATC CAC TG. PCR fragments were gel-purified and cloned into the pENTR/D-TOPO entry vector according to the manufacturer’s instructions (Invitrogen; K2400). Cloned Ucp1 was shuttled into the pAd/CMV/V5-DEST Gateway vector, and confirmed by sequencing. Cysteine mutants were generated using the Quik-Change site-directed mutagenesis kit (Stratagene). 
Primers for generating mutants were as follows: Ucp1 C24A forward 5′-AGCCGGAGTTTCAGCTGCCCTGGCAGATATCATC-3′, reverse 5′-GATGATATCTGCCAGGGCAGCTGAAACTCCGGCT-3′; Ucp1 C188A forward 5′-TGAGAAATGTCATCATCAATGCTACAGAGCTGGTAACATATG-3′, reverse 5′-CATATGTTACCAGCTCTGTAGCATTGATGATGACATTTCTCA-3′; UCP1 C213A forward 5′-TGGCAGATGACGTCCCCGCCCATTTACT GTCAGCTC-3′, reverse 5′-GAGCTGACAGTAAATGGGCGGGGACG TCATCTGCCA-3′; Ucp1 C224A forward 5′-TCTTGTTGCCGGGTT TGCCACCACACTCCTGGCC-3′, reverse 5′-GGCCAGGAGTGTGGTG GCAAACCCGGCAACAAGA-3′; Ucp1 C253A forward 5′-CCCAAGC GTACCAAGCGCTGCGATGTCCATGTAC-3′, reverse 5′-GTACATGGAC ATCGCAGCGCTTGGTACGCTTGGG-3′; Ucp1 C287A forward 5′-GGAAC GTCATCATGTTTGTGGCCTTTGAACAGCTGAAAAAAG-3′, reverse 5′-CTTTTTTCAGCTGTTCAAAGGCCACAAACATGATGACGTTCC-3′; Ucp1 C304A forward 5′-CAGACAGACAGTGGATGCTACCACATAAGGATCC-3′, reverse 5′-GGATCCTTATGTGGTAGCATCCACTGTCTGTCTG-3′. pAd/CMV/V5-DEST/Ucp1 was linearized with PacI and transfected (3 μg) into 293A cells with lipofectamine 2000 (Invitrogen). Crude adenovirus was generated according to the manufacturer’s instructions (Invitrogen; V493-20). Crude adenovirus was amplified by infecting 293A cells, and purified using the Fast Trap Adenovirus Purification and Concentration Kit (EMD Millipore). Virus was quantified by examining viral DNA. Briefly, viral particles were treated with Proteinase K and DNA was isolated with phenol and chloroform/isoamylalcohol (24:1). Preliminary experiments with titrations of viral transductions in Ucp1−/− adipocytes were used to determine the amount of virus yielding a Ucp1 messenger RNA (mRNA) and protein level similar to the level detected from Ucp1+/+ adipocytes. For subsequent experiments, primary brown adipocytes were transduced with purified adenovirus in the evening of day 3 after differentiation with medium replacement the following morning. Adipocytes were used for experiments on day 5 after differentiation. A comparative model of UCP1 was built by using the structure of the bovine AAC19. 
This structure corresponds to the ‘c-state’ of the carrier, open to the mitochondrial inner membrane. The protein sequence of human UCP1 was taken from UniProt. To align the AAC and UCP1 sequences, MUSCLE45 and manual editing in Jalview46 were used. To improve the quality of the comparative models, the alignments were edited to remove the N- and C-terminal residues of the UCP1 sequences that did not align with resolved residues in the AAC structure, and to place gaps in the UCP1 sequences so as to minimize the distance between these residues in the initial target structure. Fifty comparative models of human UCP1 were built from the AAC structure and the sequence alignment by using MODELLER. The structure with the lowest MODELLER energy score was taken as the best representative structure. The cardiolipin molecules of the AAC were added to the modelled UCP1 structure by aligning the two structures, and copying the lipid molecules21, 22, 47. This structure was examined and figures produced by using the PyMOL molecular visualization system (PyMOL Molecular Graphics System, version 1.4.1, Schrödinger). ROS production was estimated by oxidation of DHE and ratiometric assessment as described previously33. Cells were plated and differentiated onto 96-well plates suitable for fluorescence analysis. Before imaging, cell media was removed and replaced with imaging buffer (156 mM NaCl, 1.25 mM KH2PO4, 3 mM KCl, 2 mM MgCl2, 10 mM HEPES, pH 7.4) supplemented with 1 mM sodium pyruvate. Cells were loaded with 5 μM DHE (Invitrogen), which remained present throughout the time course. DHE was excited at 355 nm and the emitted signal was acquired at 460 nm. Oxidized DHE was excited at 544 nm and emission was acquired at 590 nm. Mitochondrial membrane potential was measured in permeabilized cells using TMRM (Life Technologies) in dequench mode. 
In this mode, mitochondrial depolarization causes redistribution of a high concentration of signal-quenched TMRM from mitochondria to the cytosol, such that the lower concentration results in dequenching and an increase in fluorescence48. Cells were pre-loaded at room temperature with imaging buffer containing 1 μM TMRM. TMRM fluorescence was excited at 544 nm and emission was collected at 590 nm. Total RNA was extracted from frozen tissue using TRIzol (Invitrogen), purified with RNeasy Mini spin columns (QIAGEN) and reverse transcribed using a High-Capacity cDNA Reverse Transcription kit (Applied Biosystems). The resultant complementary DNA (cDNA) was analysed by quantitative PCR with reverse transcription (qRT–PCR). Briefly, 20 ng cDNA and 150 nmol of each primer were mixed with SYBR GreenER qPCR SuperMix (Applied Biosystems). Reactions were performed in a 384-well format using an ABI PRISM 7900HT real time PCR system (Applied Biosystems). Relative mRNA levels were calculated using the comparative CT method and normalized to cyclophilin mRNA. The following primers were used in these studies: Cyclophilin forward 5′-GGAGATGGCACAGGAGGAA-3′, reverse 5′-GCCCGTAGTGCTTCAGCTT-3′; Ucp1 forward 5′-ACTGCCACACCTCCAGTCATT-3′, reverse 5′-CTTTGCCTCACTCAGGATTGG-3′; Dio2 forward 5′-CAGTGTGGTGCACGTCTCCAATC-3′, reverse 5′-TGAACCAAAGTTGACCACCAG-3′; Pgc1α forward 5′-CCCTGCCATTGTTAAGACC-3′, reverse 5′-TGCTGCTGTTCCTGTTTTC-3′; PPAR-γ forward 5′-TGAAAGAAGCGGTGAACCACTG-3′, reverse 5′-TGGCATCTCTGTGTCAACCATG-3′; Pgc1β forward 5′-CTGACGTGGACGAGCTTTCA-3′, reverse 5′-CGTCCTTCAGAGCGTCAGAG-3′; Nrf2 forward 5′-CCAGCTACTCCCAGGTTGCC-3′, reverse 5′-GGGATATCCAGGGCAAGCGA-3′; Ap2 forward 5′-AAGGTGAAGAGCATCATAACCCT-3′, reverse 5′-TCACGCCTTTCATAACACATTCC-3′. 
Adipocytes were incubated in respiration medium absent BSA and treated with indicated concentrations of noradrenaline for 2 h before collection of medium and quantification of glycerol using free glycerol reagent (Sigma-Aldrich) relative to glycerol standard and normalized to protein content. Immunodetection after SDS–PAGE used the following antibodies: UCP1 (Abcam ab10983), Prx3 (Abcam ab16751), Dimedone (Millipore 07-2139), Vinculin (Sigma V9264), ATP5A and NDUFB8 (Abcam ab110413), ATGL (CST 2138), ATGL pS406 (Abcam ab135093), HSL (CST 4107), HSL pS660 (CST 4126), pPKA substrate (CST 9624 s), PPAR-γ (CST 2435S). Data were expressed as mean ± s.e.m. and P values were calculated using two-tailed Student’s t-test for pairwise comparisons, one-way ANOVA for multiple comparisons, and two-way ANOVA for multiple comparisons involving two independent variables. ANOVA analyses were subjected to Bonferroni’s post hoc test. Sample sizes were determined on the basis of previous experiments using similar methodologies. For in vivo studies, mice were randomly assigned to treatment groups. Mass spectrometric analyses were blinded to experimental conditions.

News Article | November 30, 2016
Site: www.nature.com

No statistical methods were used to predetermine sample size. The experiments were not randomized and the investigators were not blinded to allocation during experiments and outcome assessment. The C. marinus laboratory stocks were bred according to Neumann1, care was provided by the MFPL aquatic facility. Briefly, C. marinus were kept in 20 × 20 × 5 cm plastic containers with sand and natural seawater diluted to 15‰ with desalted water, fed diatoms (Phaeodactylum tricornutum, strain UTEX 646) in early larval stages and nettle powder in later stages. Temperature in the climate chambers was set to 20 °C and the light–dark cycle was 12:12 (except where noted differently). Moonlight was simulated with an incandescent flashlight bulb (about 1 lx), which was switched on all night for four successive nights every 30 days. The genome assembly process (Extended Data Fig. 9a) was based on three sequencing libraries (Supplementary Table 10): a 0.2-kb insert library was prepared from a single adult male of the Jean laboratory strain (established from field samples taken at St. Jean-de-Luz, France, in 2007; >12 generations in the laboratory), which was starved and kept in seawater with penicillin (60 units per ml), streptomycin (60 μg ml−1) and neomycin (120 μg ml−1) during the last 2 weeks of development. DNA was extracted with a salting-out method46, sheared on a Covaris S2 sonicator (frequency sweeping mode; 4 °C; duty cycle, 10%; intensity, 7; cycles per burst, 300; microTUBE AFA fibre 6 × 16 mm; 30 s) and prepared for Illumina sequencing with standard protocols. A 2.2-kb and a 7.6-kb insert library were prepared from a polymorphic DNA pool of >300 field-caught Jean adult males by Eurofins MWG Operon (Ebersberg, Germany) according to the manufacturer’s protocol. Each library was sequenced in one lane of an Illumina HiSeq2000 with 100-bp paired-end reads at the Next Generation Sequencing unit of the Vienna Biocenter Core Facilities (http://vbcf.ac.at). 
Reads were filtered for read quality, adaptor and spacer sequences with cutadapt47 (−b −n 3 −e 0.1 −O 8 −q 20 −m 13) and duplicates were removed with fastq-mcf from ea-utils48 (−D 70). Read pairs were interleaved with ngm-utils49, leaving only paired reads. Contamination with human DNA found in the 0.2-kb library was removed by deleting reads matching the human genome at a phred-scaled quality score ≥ 20 (alignment with BWA50). Assembly into contigs with Velvet51 (scaffolding disabled; 57-bp kmers as determined by VelvetOptimiser52) was based solely on the less polymorphic 0.2-kb library. About 600 remaining adaptor sequences at the ends of assembled contigs were trimmed with cutadapt (−O 8 −e 0.1 −n 3). For assembly statistics see Supplementary Table 11. Scaffolding of the contigs was based on all three libraries and performed with SSPACE53 in two iterations, that is, scaffolds from the first round were scaffolded again. Using different parameters in the iterations (Supplementary Table 12) allowed different connections to be made and thus increased scaffold connectivity (Supplementary Table 13). The effect is probably owing to the polymorphic nature of the 2.2-kb and 7.6-kb libraries; it results in a ‘population-consensus most common arrangement of the scaffolds’. The iterative scaffolding process was performed with and without applying a size cut-off excluding contigs <1 kb, resulting in two independent assemblies (CLUMA_0.3 and CLUMA_0.4; see Extended Data Fig. 9a), which differed in overall connectivity and sequence content (Supplementary Table 11), but also in the identity and structure of the large scaffolds. In order to combine both connectivity and sequence content, and in order to resolve the contradictions in the structure of the largest scaffolds, the two assemblies were compared and reconciled in a manual super-scaffolding process, as detailed in Supplementary Method 1. 
Briefly, the overlap of scaffolds from the two assemblies was tested with BLAST searches and represented in a graphical network structure. Scaffolds with congruent sequence content in both assemblies would result in a linear network, whereas scaffolds with contradictory sequence content would result in branching networks. At the same time, both assemblies were subject to genetic linkage mapping based on genotypes obtained from restriction-site-associated DNA sequencing (RAD sequencing) of a published mapping family6 (Supplementary Method 2). The resulting genetic linkage information served to resolve the branching networks into the longest possible unambiguous linear sub-networks with consistent genetic linkage information (see scheme A in Supplementary Method 1). Finally, the structure of the resulting super-scaffolds was coded in YAML format and translated into DNA sequence with Scaffolder54, resulting in 75 mapped super-scaffolds. The remaining small and unmapped scaffolds were filtered for fragments of the mitochondrial genome, the histone gene cluster and the 18S/28S ribosomal DNA gene cluster, which were assembled separately (Supplementary Method 3; Extended Data Fig. 10). Unmapped scaffolds were also filtered for obvious contamination from other species (Supplementary Method 3). The degree to which the remaining unmapped scaffolds are leftover polymorphic variants of parts of the mapped super-scaffolds was estimated by blasting the former against the latter (Supplementary Method 3 and Supplementary Table 14). All scaffolds were subject to gap closing with GapFiller55 and repeated edges, that is, gaps with almost identical sequences at both sides that are generally not closed because of genetic polymorphisms, were assessed and if possible removed with a custom script (Supplementary Method 4; code supplied as Source Data File). The final assembly CLUMA_1.0 was submitted under project PRJEB8339 (75 mapped scaffolds; 23,687 unmapped scaffolds ≥100 bp). 
The assembly and further information can also be obtained from ClunioBase (http://cluniobase.cibiv.univie.ac.at). Genetic linkage information for the final 75 super-scaffolds was obtained by repeating read mapping to genotype calling for the RAD sequencing experiment as described above (Supplementary Method 2), but now with assembly CLUMA_1.0 as a reference. This allowed us to place and orient super-scaffolds along the genetic linkage map (Fig. 1a and Extended Data Fig. 2). The positions of the recombination events within a scaffold were approximated as the middle between the positions of the two RAD markers between which the marker pattern changed from one map location to the next. The published genetic linkage map was refined and revised (Supplementary Method 5 and Extended Data Fig. 2). Based on the refined linkage map, QTL analysis of the published mapping family was repeated as described6 (Supplementary Table 4 and Supplementary Note 5). Using the correspondence between the reference assembly and the genetic linkage map, we were able to directly identify the genomic regions corresponding to the confidence intervals of the QTLs (Fig. 1 and Extended Data Fig. 5a, b). Assembled transcripts of a normalized cDNA library of all life stages and various C. marinus strains (454 sequencing) were available from previous experiments and RNA sequencing data was available for Jean strain adults (Illumina sequencing). Furthermore, specifically for genome annotation, RNA from 80 third instar larvae from the Jean and Por laboratory strains each was prepared for RNA sequencing according to standard protocols (Supplementary Method 6). Each sample was sequenced on a single lane of an Illumina HiSeq 2000. All transcript reads were submitted to the European Nucleotide Archive (ENA) under project PRJEB8339. 
For the adult and larval RNA sequencing data, raw reads were quality checked with fastqc56, trimmed for adaptors and quality with cutadapt47 and filtered to contain only read pairs using the interleave command in ngm-utils49. Reads were assembled separately for larvae and adults with Trinity57 (path_reinforcement_distance: 25; maximum paired-end insert size: 1,500 bp; otherwise default parameters). Automated annotation was performed with MAKER258. Repeats were masked based on all available databases in repeatmasker. MAKER2 combined evidence from assembled transcripts (see above), mapped protein data sets from Culex quinquefasciatus (CpipJ1), Anopheles gambiae (AgamP3), Drosophila melanogaster (BDGP5), Danaus plexippus (DanPle_1.0), Apis mellifera (Amel4.0), Tribolium castaneum (Tcas3), Strigamia maritima (Smar1) and Daphnia pulex (Dappu1) and ab initio gene predictions with AUGUSTUS59 and SNAP60 into gene models. AUGUSTUS was trained for C. marinus based on assembled transcripts from the normalized cDNA library. SNAP was run with parameters for A. mellifera, which had the highest congruence with known C. marinus genes in preliminary trials (Supplementary Method 7). MAKER was set to infer gene models from all evidence combined (not transcripts only) and gene predictions without transcript evidence were allowed. Splice variant detection was enabled, single-exon genes had to be larger than 250 bp and intron size was limited to a maximum of 10 kb. All gene models within the QTL confidence intervals, as well as all putative circadian clock genes and light receptor genes were manually curated: exon–intron boundaries were corrected according to transcript evidence (approximately 500 gene models), chimeric gene models were separated into the underlying individual genes (approximately 100 gene models separated into around 300 gene models) and erroneously split gene models were joined (approximately 15 gene models). 
Finally, this resulted in 21,672 gene models, which were given IDs from CLUMA_CG000001 to CLUMA_CG021672 (‘CLUMA’ for Clunio marinus, following the controlled vocabulary of species from the UniProt Knowledgebase; CG for ‘computated gene’). Splice variants of the same gene (detected in 752 gene models) were identified by the suffix ‘-RA’, ‘-RB’ and so on, and the corresponding proteins by the suffix ‘-PA’, ‘-PB’ and so forth. Gene models were considered as supported if they overlapped with mapped transcripts or protein data (Supplementary Table 1). Gene counts for D. melanogaster were retrieved from BDGP5, version 75.546 and for A. gambiae from AgamP3, version 75.3. The putative identities of the C. marinus gene models were determined in reciprocal BLAST searches, first against UniProtKB/Swiss-Prot (8,379 gene models assigned) and if no hit was found, second against the non-redundant protein sequences (nr database) at NCBI (1,802 additional genes assigned). Reciprocal best hits with an e value < 1 × 10−10 were considered putative orthologues (termed ‘putative gene X’), non-reciprocal hits with the same e value were considered paralogues (termed ‘similar to’). All remaining gene models were searched against the PFAM database of protein domains (111 gene models assigned; termed ‘gene containing domain X’). If still no hit was found, the gene models were left unassigned (‘NA’). Genome-wide synteny between the C. marinus, D. melanogaster and A. gambiae genomes was assessed based on reciprocal best BLAST hits (e value < 10 × 10-10) between the three protein data sets (Ensembl Genomes, Release 22, for D. melanogaster and A. gambiae). Positions of pairwise orthologous genes were retrieved from the reference genomes (BDGP5, AgamP3 and CLUMA_1.0) and plotted with Circos61. C. marinus chromosome arms were delimited based on centromeric and telomeric signatures in genetic diversity and linkage disequilibrium (Extended Data Fig. 
3c and Supplementary Table 3; for data source see ‘strain re-sequencing’ below). Homologues for C. marinus chromosome arms were assigned based on enrichment with putative orthologous genes from specific chromosome arms in D. melanogaster and A. gambiae (Extended Data Figs 3, 4 and Supplementary Table 3). Additionally, for the 5,388 detected putative 1:1:1 orthologues (C. marinus:D. melanogaster:A. gambiae), microsynteny was assessed by testing if all pairs of directly adjacent genes in one species were also directly adjacent in the other species. The degree of microsynteny was then calculated as the fraction of conserved adjacencies among all pairs of adjacent genes. From this fraction the relative levels of chromosomal rearrangements in the evolutionary lineage leading to C. marinus were estimated (Supplementary Note 3 and Extended Data Fig. 4). Genetic variation in five C. marinus strains (Extended Data Fig. 1) was assessed based on pooled-sequencing data from field-caught males from the strains of St. Jean-de-Luz (Jean; Basque Coast, France; sampled in 2007; n = 300), Port-en-Bessin (Por; Normandie, France; 2007; n = 300), as well as Vigo (Spain; 2005; n = 100), Helgoland (He; Germany; 2005; n = 300) and Bergen (Ber; Norway; 2005; n = 100). Samples from Vigo and Bergen, were provided by D. Neumann and C. Augustin, respectively. For each strain we chose the largest available number of individuals to obtain the best possible resolution of allele frequencies. Females are not available, because they are virtually invisible in the field. For an overview of the experimental procedure, see Extended Data Fig. 9b. DNA was extracted with a salting-out method46 from sub-pools of 50 males, the DNA pools were mixed at equal DNA amounts, sheared and prepared as described above and sequenced on four lanes of an Illumina HiSeq2000 with paired-end 100-bp reads (Ber and Vigo combined in one lane, distinguished by index reads). 
All reads were submitted to the European Nucleotide Archive (ENA) under project PRJEB8339. Sequencing reads were filtered for read quality and adaptor sequences with cutadapt47 (−b −n 2 −e 0.1 −O 8 −q 13 −m 15), interleaved with ngm-utils49 and duplicates were removed with fastq-mcf from ea-utils48 (−D 70). Reads were aligned to the mapped super-scaffolds of assembly CLUMA_1.0 with BWA50 (aln and sampe; maximal insert size (bp): −a 1500). Based on the unfiltered alignments, the samples from Por and Jean were screened for genomic inversions and indels relative to the reference sequence with the multi-sample version of DELLY62. Paired-end information was only considered if the mapping quality was high (q ≥ 20) (see also Supplementary Note 3). For population genomic analysis (Extended Data Fig. 9b), the alignments of the pool-sequencing (pool–seq) data from Vigo, Jean, Por, He and Ber were filtered for mapping quality (q ≥ 20), sorted, merged and indexed with SAMtools63. Reads were re-aligned around indels with the RealignerTargetCreator and the IndelRealigner in GATK64. The resulting coverage per strain is given in Supplementary Table 5. For identification of SNPs, a pileup file was created with the mpileup command of SAMtools63. Base Alignment Quality computation was disabled (−B); instead, after creating a synchronized file with the mpileup2sync script in PoPoolation265, indels that occurred more than ten times were masked (including 3 bp upstream and downstream) with the identify-indel-regions and filter-sync-by-gtf scripts of PoPoolation2. FST values were determined with the fst-sliding script of PoPoolation2, applying a minimum allele count of 10 (so that any false-positive SNPs resulting from the remaining unmasked indels were effectively excluded) and a minimum coverage of 40× for the comparison between Por and Jean or 10× for the comparison of all five strains. FST was calculated at a single base resolution, as well as in windows of 5 kb (step size, 1 kb). 
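The 5-kb windows with 1-kb steps mentioned above can be illustrated with a short sketch. It only shows how overlapping windows are laid out along a sequence and how per-site values might be summarized within them. Averaging per-site values is an assumption made for illustration and is not the estimator implemented in the fst-sliding script of PoPoolation2. The function name and input layout are hypothetical.

```python
from bisect import bisect_left, bisect_right
from itertools import accumulate
from typing import List, Optional, Tuple


def sliding_window_means(
    sites: List[Tuple[int, float]],
    seq_len: int,
    window: int = 5000,
    step: int = 1000,
) -> List[Tuple[int, int, Optional[float]]]:
    """Summarize per-site values in overlapping windows.

    sites   -- (position, value) pairs, positions 1-based.
    seq_len -- length of the reference sequence in bp.
    Returns (start, end, mean) per window; mean is None for empty windows.
    Illustrative assumption: the window value is the plain mean of the
    per-site values falling inside the window.
    """
    sites = sorted(sites)
    positions = [pos for pos, _ in sites]
    # prefix[i] holds the sum of the first i values, for O(1) window sums
    prefix = [0.0] + list(accumulate(val for _, val in sites))
    windows = []
    for start in range(1, seq_len - window + 2, step):
        end = start + window - 1
        lo = bisect_left(positions, start)
        hi = bisect_right(positions, end)
        n = hi - lo
        mean = (prefix[hi] - prefix[lo]) / n if n else None
        windows.append((start, end, mean))
    return windows


if __name__ == "__main__":
    # Hypothetical per-site values on a 7-kb sequence
    toy_sites = [(500, 0.2), (1500, 0.4), (6500, 0.9)]
    for start, end, mean in sliding_window_means(toy_sites, seq_len=7000):
        print(start, end, mean)
```

With a 1-kb step, each genomic position away from the sequence ends is covered by five overlapping 5-kb windows.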
Individual SNPs were only considered for further analyses or plotted if they were significantly differentiated as assessed by Fisher’s exact test (fisher-test in PoPoolation2). Average genome-wide genetic differentiation between timing strains, as obtained by averaging over 5-kb sliding-windows, was compared to the respective timing differences and geographic distances (see Supplementary Table 8) in Mantel tests (Pearson’s product moment correlation; 9,999 permutations), as implemented in the vegan package in the R statistical programming environment (ref. 66). Geographic distances and circadian timing differences were determined as described previously67 (see Supplementary Table 8). For determination of lunar timing differences when comparing lunar with semilunar rhythms see Supplementary Note 6. In order to find genomic regions for which genetic differentiation is correlated with the timing differences between strains, the Mantel test was then applied to 5-kb genomic windows every 1 kb along the reference sequence. 5 kb is roughly the average size of a gene locus in C. marinus. Windows with a correlation coefficient of r ≥ 0.5 were tested for significance (999 permutations). For each genomic position the number of overlapping significantly correlated 5-kb windows was enumerated, resulting in a correlation score (CS; ranging from 0 to 5). Genetic diversity, measured as Watterson’s theta (θW), for each strain was assessed with PoPoolation1.1.2 (ref. 68) in 20-kb windows with 10-kb steps. In order to save computing time, the pileup files of Jean, Por and He were linearly downscaled to 100× coverage with the subsample-pileup script (‘fraction’ option), positions below 100× coverage were discarded. Indel regions were excluded (default in PoPoolation 1.1.2) and a minimum of 66% of a sliding window needed to be covered. 
SNPs were only considered in θW calculations if present ≥2 times, leading to slight inconsistencies in θW estimates between strains due to differing coverage, but not affecting diversity comparisons within strains. Linkage disequilibrium between the SNPs was determined for the Por and Jean strains with LDx69, assuming physical linkage between alleles on the same read or read pairs. r2 was determined by a maximum likelihood estimator, minimum and maximum read depths corresponded to the 2.5% and 97.5% coverage depths for each population (Jean, 111–315; Por, 98–319), total insert distance was limited to 600 bp, minimum phred-scaled base quality was 20, minimum allele frequency was 0.1 and a minimum coverage per pair of SNPs was 11. SNPs were binned by their physical distance for the plots (0–200 bp, 200–400 bp, 400–600 bp), with the mean value plotted. Finally, small indels (<30 bp) in the Por and Jean strains were detected with the UnifiedGenotyper (−glm INDEL) in GATK64 for positions with more than 20× coverage. Genetic differentiation for indels was calculated with the classical formula FST = (HT − HS)/HT, where HS is the average expected heterozygosity according to Hardy–Weinberg Equilibrium (HWE) in the two subpopulations and HT is the expected heterozygosity in HWE of the hypothetical combined total population. If more than two alleles were present, only the two most abundant alleles were considered in the calculation of FST. Gene models from the automated annotation were considered candidate genes, if they fulfilled the following criteria. (1) The gene was located within the reference sequence corresponding to the QTL confidence intervals as determined for the Por and Jean strains. (2) The gene contained a strongly differentiated SNP or small indel or it was directly adjacent to such a SNP or small indel (FST ≥ 0.8 for Por versus Jean, that is, the strains used in QTL mapping). 
This resulted in a preliminary list of 133 genes based on the comparison between Por and Jean (Supplementary Table 6). These candidate genes were narrowed down based on their overlap with genomic 5-kb windows, for which genetic differentiation between five European timing strains correlated with their timing differences (Fig. 1a, Extended Data Fig. 5a, b and Supplementary Table 9). The location and putative effects of the SNPs and indels relative to the gene models were assessed with SNPeff70 (−ud 0, otherwise default parameters; Extended Data Fig. 5c, d and Supplementary Tables 6, 9). For Gene Ontology (GO) term analysis, all C. marinus gene models with putative orthologues in the UniProtKB/Swiss-Prot and non-redundant protein sequences (nr) databases based on reciprocal best BLAST hits (see above) were annotated with the GO terms of their detected orthologues (6,837 gene models). Paralogues were not annotated. The enrichment of candidate SNPs and indels (FST ≥ 0.8 between Por and Jean) in specific GO terms was tested with SNP2GO71 (min.regions = 1, otherwise default parameters). Hyper-geometric sampling was applied to test if individual genes of a GO term or a whole pathway of genes are enriched for SNPs (Supplementary Table 7). RNA-seq data of the Por and Jean strains for CaMKII.1 were obtained from the larval RNA sequencing experiment described above. Besides four assembled full-length transcripts (RA–RD) from RNA-seq and assembled EST libraries, additional partial transcripts (RE–RO) were identified by PCR amplification (for PCR primers see Supplementary Table 15), gel extraction (QIAquick Gel Extraction Kit, Qiagen), cloning with the CloneJET PCR Cloning Kit (Thermo Scientific) and Sanger sequencing with pJET1.2 primers (LGC Genomics & Microsynth). 
cDNA was prepared from RNA extracted from third instar larvae of the Por and Jean laboratory strains (RNA extraction with RNeasy Plus Mini Kit, Qiagen; reverse transcription with QuantiTect Reverse Transcription Kit, Qiagen). qPCR was performed with variant-specific primers and actin was used as a control gene (Supplementary Table 16). cDNA was obtained from independent pools of 20 third instar larvae of the Por and Jean strains. Sample size was ten pools per strain to cover different time points during the day and to test for reproducibility (two samples each at zeitgeber times 0, 4, 8, 16 and 20; for one Por sample extraction failed; RNA extraction and reverse transcription as above). qPCR was performed with Power SYBR Green PCR Master Mix on a StepOnePlus Real Time System (both Applied Biosystems). Fold-changes were calculated according to ref. 72 in a custom excel sheet. The assumption of equal variance was violated for the RD comparison (F-test) and the assumption of normal distribution was violated for the data of RA and RC in the Por strain (Shapiro–Wilk normality test), possibly reflecting circadian effects in the samples from different times of day. Thus, expression differences were assessed for significance in a two-tailed Wilcoxon rank-sum test (wilcox.test in R66). Holm correction73 was used for multiple testing (default in p.adjust function of R). PCR fragments containing the CaMKII.1 linker region (exons 10–15) were amplified from genomic Por or Jean DNA, respectively, with primers CaMKII-Sc61-F-344112 and CaMKII-Sc61-R-351298 (Supplementary Table 15), cloned with the CloneJET PCR Cloning Kit (Thermo Scientific), transferred into the pcDNA3.1+ vector using NotI and XbaI (Thermo Scientific). These constructs were transfected into D. melanogaster S2R+ cells and RNA was prepared 48 h after transfection. 
After DNase digestion, isoform expression was analysed by radioactive, splicing-sensitive RT–PCR (primers in Supplementary Table 17) and phosphorimager quantification as described74. Identity of isoforms is based on size and sequencing of PCR products. To test for reproducibility, there were seven biological replicates (raw data in Supplementary Table 18). As the assumptions of equal variance (F-test) and normal distribution of data (Shapiro–Wilk normality test) were not violated, the significance of expression differences was assessed in unpaired, two-sided two-sample t-tests. Holm correction73 was used for multiple testing (default in p.adjust function of R). S2R+ cells were obtained from the laboratory of S. Sigrist, regularly authenticated by morphology and routinely tested for absence of mycoplasma contamination. The entire experiment was reproduced several months later with three biological replicates (raw data in Supplementary Table 18). Firefly luciferase is driven from a period 3X69 promoter under control of the CLOCK and CYCLE proteins19, 21. The D. melanogaster pAc–clk construct was obtained from F. Rouyer, pCopia–Renilla luciferase and period 3X69–luc reporter constructs from M. Rosbash, and a [Ca2+]-independent mouse CaMKIIT286D was provided by M. Mayford. The CaMKII inhibitor KN-93 was purchased from Abcam (#ab120980). C. marinus Cyc, C. marinus Clk and C. marinus CaMKII.1–RD were cloned into the pAc5.1/V5–His A plasmid (Invitrogen) with stop codons before the tag. The Q5 Site-Directed Mutagenesis Kit (NEB) was used to make kinase-dead and [Ca2+]-independent versions of C. marinus CaMKII.1–RD (for primers, see Supplementary Table 17). D. melanogaster S2 cells (Invitrogen) were cultured at 25 °C in Schneider’s D. melanogaster medium (Lonza) supplemented with fetal bovine serum (FBS, 10%, heat-inactivated), penicillin (100 U ml−1), streptomycin (100 μg ml−1) and 2 mM l-glutamine (Sigma). 
Cells were seeded into 24-well plates (800,000 cells per well) and transfected with Effectene transfection reagent (Qiagen) according to the manufacturer’s instructions. Experiment with mouse [Ca2+]-independent CaMKII: 25 ng pCopia–Renilla, 10 ng period 3X69–luc, 0.5 ng D. melanogaster pAc–clk, 200 ng mouse pAc–CaMKIIT286D. Experiment with CaMKII inhibitor KN-93: 25 ng pCopia–Renilla, 10 ng period 3X69–luc, 0.5 ng D. melanogaster pAc–clk, various amounts of KN-93. Experiment with C. marinus genes: 25 ng pCopia–Renilla, 10 ng period 3X69–luc, 100 ng C. marinus pAc–cyc, 100 ng C. marinus pAc–clk, 200 ng C. marinus CaMKII.1–RDK42R or 200 ng C. marinus CaMKII.1–RDT286D. In all experiments, the transfection mix was filled up with empty pAc5.1/V5–His A vector to a total of 435 ng DNA per well. After 48 h, cells were washed with PBS and lysed with Passive Lysis Buffer (Promega). Luciferase activities were determined on a Synergy H1 plate reader (Biotek) using a Dual-Luciferase Reporter Assay System (Promega). For each biological replicate three independent cell lysates were measured and their mean value determined. Firefly luciferase activity was normalized to Renilla luciferase activity and values were normalized to controls transfected with D. melanogaster pAc–clk or C. marinus pAc–clk and C. marinus pAc–cyc, respectively. S2 cells (Invitrogen/Life Technologies, Cat.no. R690-07) were regularly authenticated by morphology and routinely tested for absence of mycoplasma contamination (Lonza MycoAlert). Sample size was chosen to test for reproducibility. For circadian free-run experiments, culture boxes of the Por, He and Jean strains were transferred from light–dark cycle (16:8) to constant dim light (light–light cycle, about 100 lx). Emerging adults were collected in 1-h intervals by a custom made C. marinus fraction collector (similar to those described in ref. 75) and counted once a day. 
Because collection was automated, the experimenter had no influence on the results and blinding was not necessary. As the circalunar clock restricts adult emergence to a few days, the circadian emergence rhythm can only be assessed over a few days. Several culture boxes were transferred to a light–light cycle at different time points. The resulting emergence data were combined for each strain using the switch to a light–light cycle as a common reference point. We used the maximum number of available individuals. Free-running period was calculated as the mean interval between subsequent emergence peaks, weighting each peak by the number of individuals. All sequence data are deposited in the European Nucleotide Archive (ENA) under PRJEB8339. The reference genome is also on ClunioBase (http://cluniobase.cibiv.univie.ac.at). Machine readable super-scaffolding data and the computer source code for the removal of repeated edges are supplied as source data files.
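The free-running period calculation described above (mean interval between subsequent emergence peaks, weighted by the number of individuals) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the input layout and, in particular, the choice to weight each interval by the summed counts of its two flanking peaks are assumptions, since the text does not specify how peak weights are transferred to intervals.

```python
def free_running_period(peak_times_h, peak_counts):
    """Weighted mean interval (in hours) between consecutive emergence peaks.

    peak_times_h: peak times in hours after the switch to constant light.
    peak_counts:  number of individuals that emerged in each peak.
    Assumption: each interval is weighted by the summed counts of the two
    peaks that delimit it (the source text does not specify this detail).
    """
    if len(peak_times_h) != len(peak_counts):
        raise ValueError("peak_times_h and peak_counts must have equal length")
    if len(peak_times_h) < 2:
        raise ValueError("at least two peaks are needed to define an interval")

    weighted_sum = 0.0
    total_weight = 0.0
    for i in range(1, len(peak_times_h)):
        interval = peak_times_h[i] - peak_times_h[i - 1]
        weight = peak_counts[i] + peak_counts[i - 1]
        weighted_sum += interval * weight
        total_weight += weight
    return weighted_sum / total_weight
```

With this weighting, intervals flanked by sparsely populated peaks contribute less to the estimate than intervals between well-supported peaks.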

News Article | February 15, 2017
Site: www.nature.com

No statistical methods were used to predetermine sample size. The experiments were not randomized and the investigators were not blinded to allocation during experiments and outcome assessment. We sequenced Chenopodium quinoa Willd. (quinoa) accession PI 614886 (BioSample accession code SAMN04338310; also known as NSL 106399 and QQ74). DNA was extracted from leaf and flower tissue of a single plant, as described in the “Preparing Arabidopsis Genomic DNA for Size-Selected ~20 kb SMRTbell Libraries” protocol (http://www.pacb.com/wp-content/uploads/2015/09/Shared-Protocol-Preparing-Arabidopsis-DNA-for-20-kb-SMRTbell-Libraries.pdf). DNA was purified twice with Beckman Coulter Genomics AMPure XP magnetic beads and assessed by standard agarose gel electrophoresis and Thermo Fisher Scientific Qubit Fluorometry. 100 Single-Molecule Real-Time (SMRT) cells were run on the PacBio RS II system with the P6-C4 chemistry by DNALink (Seoul). De novo assembly was conducted using the smrtmake assembly pipeline (https://github.com/PacificBiosciences/smrtmake) and the Celera Assembler, and the draft assembly was polished using the quiver algorithm. DNA was also sequenced using an Illumina HiSeq 2000 machine. For this, DNA was extracted from leaf tissue of a single soil-grown plant using the Qiagen DNeasy Plant Mini Kit. 500-bp paired-end (PE) libraries were prepared using the NEBNext Ultra DNA Library Prep Kit for Illumina. Sequencing reads were processed with Trimmomatic (v0.33)42, and reads <75 nucleotides in length after trimming were removed from further analysis. The remaining high-quality reads were assembled with Velvet (v1.2.10)43 using a k-mer of 75. High-molecular-weight DNA was isolated and labelled from leaf tissue of three-week old quinoa plants according to standard BioNano protocols, using the single-stranded nicking endonuclease Nt.BspQI. 
Labelled DNA was imaged automatically using the BioNano Irys system and de novo assembled into consensus physical maps using the BioNano IrysView analysis software. The final de novo assembly used only single molecules with a minimum length of 150 kb and eight labels per molecule. PacBio-BioNano hybrid scaffolds were identified using IrysView’s hybrid scaffold alignment subprogram. Using the same DNA prepared for PacBio sequencing, a Chicago library was prepared as described previously10. The library was sequenced on an Illumina HiSeq 2500. Chicago sequence data (in FASTQ format) was used to scaffold the PacBio-BioNano hybrid assembly using HiRise, a software pipeline designed specifically for using Chicago data to assemble genomes10. Chicago library sequences were aligned to the draft input assembly using a modified SNAP read mapper (http://snap.cs.berkeley.edu). The separations of Chicago read pairs mapped within draft scaffolds were analysed by HiRise to produce a likelihood model, and the resulting likelihood model was used to identify putative mis-joins and score prospective joins. A population was developed by crossing Kurmi (green, sweet) and 0654 (red, bitter). Homozygous high- and low-saponin F lines were identified by planting 12 F seeds derived from each F line, harvesting F seed from these F plants, and then performing foam tests on the F seed. Phenotyping was validated using gas chromatography/mass spectrometry (GC/MS). RNA was extracted from inflorescences containing a mixture of flowers and seeds at various stages of development from the parents and 45 individual F progeny. RNA extraction and Illumina sequencing were performed as described above. Sequencing reads from all lines were trimmed using Trimmomatic and mapped to the reference assembly using TopHat44, and SNPs were called using SAMtools mpileup (v1.1)45. For linkage mapping, markers were assigned to linkage groups on the basis of the grouping by JoinMap v4.1. 
Using the maximum likelihood algorithm of JoinMap, the order of the markers was determined; using this as start order and fixed order, regression mapping in JoinMap was used to determine the cM distances. Genes differentially expressed between bitter and sweet lines and between green and red lines were identified using default parameters of the Cuffdiff function of the Cufflinks program46. A second mapping population was developed by crossing Atlas (sweet) and Carina Red (bitter). Bitter and sweet F lines were identified by performing foam and taste tests on the F seed. DNA sequencing was performed with DNA from the parents and 94 sweet F lines, as described above, and sequencing reads were mapped to the reference assembly using BWA. SNPs were called in the parents and in a merged file containing all combined F lines. Genotype calls were generated for the 94 F genotypes by summing up read counts over a sliding window of 500 variants, at all variant positions for which the parents were homozygous and polymorphic. Over each 500-variant stretch, all reads with Atlas alleles were summed, and all reads with the Carina Red allele were summed. Markers were assigned to linkage groups using JoinMap, with regression mapping used to obtain the genetic maps per linkage group. The Kurmi × 0654 and Atlas × Carina Red maps were integrated with the previously published quinoa linkage map13, with the Kurmi × 0654 map being used as the reference for the positions of anchor markers and scaling. We selected markers from the same scaffold that were in the same 10,000-bp bin in the assembly. The anchor markers on the alternative map received the position of the Kurmi × 0654 map anchor marker in the integrated map. This process was repeated with anchor markers at the 100,000-bp bin level. The assumption is that at the 100,000-bp bin level recombination should essentially be zero. 
On this level, a regression of cM position on both maps yielded R2 values >0.85 and often >0.9, so the regression line can easily be used for interpolating the positions of the alternative map towards the corresponding position on the Kurmi × 0654 map. All Kurmi × 0654 markers went into the integrated map on their original position. Pseudomolecules were assembled by concatenating scaffolds based on their order and orientation as determined from the integrated linkage map. An AGP (‘A Golden Path’) file was made that describes the positions of the scaffold-based assembly in coordinates of the pseudomolecule assembly, with 100 ‘N’s inserted between consecutive scaffolds. Based on these coordinates, custom scripts were used to generate the pseudomolecule assembly and to recoordinate the annotation file. DNA was extracted from C. pallidicaule (PI 478407) and C. suecicum (BYU 1480) and was sent to the Beijing Genomic Institute (BGI, Hong Kong) where one 180-bp PE library and two mate-pair libraries with insert sizes of 3 and 6 kb were prepared and sequenced on the Illumina HiSeq platform to obtain 2 × 100-bp reads for each library. The generated reads were trimmed using the quality-based trimming tool Sickle (https://github.com/najoshi/sickle). The trimmed reads were then assembled using the ALLPATHS-LG assembler47, and GapCloser v1.1248 was used to resolve N spacers and gap lengths produced by the ALLPATHS-LG assembler. Repeat families found in the genome assemblies of quinoa, C. pallidicaule and C. suecicum (see Supplementary Information 3) were first independently identified de novo and classified using the software package RepeatModeler49. RepeatMasker50 was used to discover and identify repeats within the respective genomes. AUGUSTUS51 was used for ab initio gene prediction, using model training based on coding sequences from Amaranthus hypochondriacus, Beta vulgaris, Spinacia oleracea and Arabidopsis thaliana. 
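The pseudomolecule construction described above (scaffolds concatenated in map order and orientation, with 100 ‘N’s between consecutive scaffolds) can be illustrated with a short sketch. The authors used custom scripts that are not shown in the text; the function names, the tuple layout and the 1-based inclusive coordinates below are assumptions made for illustration, and the returned coordinate list only loosely mirrors what an AGP file records.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_pseudomolecule(ordered_scaffolds, gap_len=100):
    """Concatenate scaffolds into one pseudomolecule.

    ordered_scaffolds: list of (name, sequence, strand) tuples, already in
    linkage-map order; strand is '+' or '-' (hypothetical input layout).
    Returns (sequence, coords), where coords holds (name, start, end, strand)
    with 1-based inclusive positions on the pseudomolecule.
    """
    parts = []
    coords = []
    pos = 0  # number of bases placed so far
    for idx, (name, seq, strand) in enumerate(ordered_scaffolds):
        if idx > 0:
            parts.append("N" * gap_len)  # fixed-length spacer between scaffolds
            pos += gap_len
        oriented = seq if strand == "+" else reverse_complement(seq)
        start, end = pos + 1, pos + len(oriented)
        coords.append((name, start, end, strand))
        parts.append(oriented)
        pos = end
    return "".join(parts), coords
```

The same coordinate list could be used to shift gene annotations from scaffold coordinates to pseudomolecule coordinates.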
RNA-seq and isoform sequencing reads generated from RNA of different tissues were mapped onto the reference genome using Bowtie 2 (ref. 52) and GMAP53, respectively. Hints with locations of potential intron–exon boundaries were generated from the alignment files with the software package BAM2hints in the MAKER package54. MAKER with AUGUSTUS (intron–exon boundary hints provided from RNA-seq and isoform sequencing) was then used to predict genes in the repeat-masked reference genome. To help guide the prediction process, peptide sequences from B. vulgaris and the original quinoa full-length transcript (provided as EST evidence) were used by MAKER during the prediction. Genes were characterized for their putative function by performing a BLAST search of the peptide sequences against the UniProt database. PFAM domains and InterProScan ID were added to the gene models using the scripts provided in the MAKER package. The following quinoa accessions were chosen for DNA re-sequencing: 0654, Ollague, Real, Pasankalla (BYU 1202), Kurmi, CICA-17, Regalona (BYU 947), Salcedo INIA, G-205-95DK, Cherry Vanilla (BYU 1439), Chucapaca, Ku-2, PI 634921 (Ames 22157), Atlas and Carina Red. The following accessions of C. berlandieri were sequenced: var. boscianum (BYU 937), var. macrocalycium (BYU 803), var. zschackei (BYU 1314), var. sinuatum (BYU 14108), and subsp. nuttaliae (‘Huauzontle’). Two accessions of C. hircinum (BYU 566 and BYU 1101) were also sequenced. All sequencing was performed with an Illumina HiSeq 2000 machine, using either 125-bp (Atlas and Carina Red) or 100-bp (all other accessions) paired-end libraries. Reads were trimmed using Trimmomatic and mapped to the reference assembly using BWA (v0.7.10)55. Read alignments were manipulated with SAMtools, and the mpileup function of SAMtools was used to call SNPs. Orthologous and paralogous gene clusters were identified using OrthoMCL28. 
Recommended settings were used for all-against-all BLASTP comparisons (Blast+ v2.3.056) and OrthoMCL analyses. Custom Perl scripts were used to process OrthoMCL outputs for visualization with InteractiVenn57. Using OrthoMCL, orthologous gene sets containing two copies in quinoa and one copy each in C. pallidicaule, C. suecicum, and B. vulgaris were identified. In total, 7,433 gene sets were chosen, and their amino acid sequences were aligned individually for each set using MAFFT58. The 7,433 alignments were converted into PHYLIP format files by the seqret command in the EMBOSS package59. Individual gene trees were then constructed using the maximum likelihood method using proml in PHYLIP60. In addition, the genomic variants of all 25 sequenced taxa (Supplementary Data 5) relative to the reference sequence were called based on the mapped Illumina reads in 25 BAM files using SAMtools. To call variants in the reference genome (PI 614886), Illumina sequencing reads were mapped to the reference assembly. Variants were then filtered using VCFtools61 and SAMtools, and the qualified SNPs were combined into a single VCF file which was used as an input into SNPhylo62 to construct the phylogenetic relationship using maximum likelihood and 1,000 bootstrap iterations. To identify FT homologues, the protein sequence from the A. thaliana flowering time gene FT was used as a BLAST query. Filtering for hits with an E value <1 × 10^−3 and with RNA-seq evidence resulted in the identification of four quinoa proteins. One quinoa protein (AUR62013052) appeared to consist of two tandem repeats, which were separated for the purposes of phylogenetic analysis. For the construction of the phylogenetic tree, the resulting five quinoa FT homologue sequences were aligned using Clustal Omega63 along with two B. vulgaris (gene models: BvFT1-miuf.t1, BvFT2-eewx.t1) and one A. thaliana (AT1G65480.1) homologue. Phylogenetic analysis was performed with MEGA64 (v6.06). 
The JTT model was selected as the best fitting model. The initial phylogenetic tree was estimated using the neighbour joining method (bootstrap value = 50, Gaps/ Missing Data Treatment = Partial Deletion, Cutoff 95%), and the final tree was estimated using the maximum likelihood method with a bootstrap value of 1,000 replicates. The syntenic relationships between the coding sequences of the chromosomal regions surrounding these FT genes were visualized using the CoGE65 GEvo tool and the Multi-Genome Synteny Viewer66. The alignment of bHLH domains was performed with Clustal Omega63, using sequences from Mertens et al.39. The phylogeny was inferred using the maximum likelihood method based on the JTT matrix-based model67. Initial trees for the heuristic search were obtained automatically by applying Neighbour-Join and BioNJ algorithms to a matrix of pairwise distances estimated using a JTT model, and then selecting the topology with superior log likelihood value. All positions containing gaps and missing data were eliminated. Trimmed PE Illumina sequencing reads that were used for the de novo assembly of C. suecicum and C. pallidicaule were mapped onto the reference quinoa genome using the default settings of BWA. For every base in the quinoa genome, the depth coverage of properly paired reads from the C. suecicum and C. pallidicaule mapping was calculated using the program GenomeCoverage in the BEDtools package68. A custom Perl script was used to calculate the percentage of each scaffold with more than 5× coverage from both diploids. Scaffolds were assigned to the A or B sub-genome if >65% of the bases were covered by reads from one diploid and <25% of the bases were covered by reads from the other diploid. The relationship between the quinoa sub-genomes and the diploid species C. pallidicaule and C. suecicum was presented in a circle proportional to their sizes using Circos69. 
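The sub-genome assignment rule above (>65% of bases covered by one diploid, <25% by the other, counting only bases with more than 5× coverage) can be sketched as follows. The authors used a custom Perl script that is not reproduced in the text; the Python below is only an illustration with hypothetical function names, and it deliberately leaves open which diploid species corresponds to which sub-genome, because this passage does not state it.

```python
def covered_fraction(depths, min_depth=5):
    """Fraction of bases in a scaffold with depth strictly greater than min_depth."""
    if not depths:
        raise ValueError("scaffold has no bases")
    return sum(1 for d in depths if d > min_depth) / len(depths)


def assign_subgenome(frac_a_diploid, frac_b_diploid, high=0.65, low=0.25):
    """Assign a scaffold to sub-genome 'A' or 'B', or return None if ambiguous.

    frac_a_diploid / frac_b_diploid: covered fractions computed from reads of
    the diploid taken to represent the A or B sub-genome, respectively
    (which species is which is an input decision, not encoded here).
    """
    if frac_a_diploid > high and frac_b_diploid < low:
        return "A"
    if frac_b_diploid > high and frac_a_diploid < low:
        return "B"
    return None  # neither threshold pair is satisfied
```

Scaffolds for which the function returns None would remain unassigned under this rule.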
Orthologous regions in the three species were identified using BLASTN searches of the quinoa genome against each diploid genome individually. Single top BLASTN hits longer than 8 kb were selected and presented as links between the quinoa genome assembly (arranged in chromosomes, see Supplementary Information 7.3) and the two diploid genome assemblies on the Circos plot (Fig. 2a). Sub-genome synteny was analysed by plotting the positions of homoeologous pairs of A- and B-sub-genome pairs within the context of the 18 chromosomes using Circos. Synteny between the sub-genomes and B. vulgaris was assessed by first creating pseudomolecules by concatenating scaffolds which were known to be ordered and oriented within each of the nine chromosomes. Syntenic regions between these B. vulgaris chromosomes and those of quinoa were then identified using the recommended settings of the CoGe SynMap tool70 and visualized using MCScanX71 and VGSC72. For the purposes of visualization, quinoa chromosomes CqB05, CqA08, CqB11, CqA15 and CqB16 were inverted. Quinoa seeds were embedded in a 2% carboxymethylcellulose solution and frozen above liquid nitrogen. Sections of 50 μm thickness were obtained using a Reichert-Jung Frigocut 2800N, modified to use a Feather C35 blade holder and blades at −20 °C using a modified Kawamoto method73. A 2,5-dihydroxybenzoic acid (Sigma-Aldrich) matrix (40 mg ml−1 in 70% methanol) was applied using a HTX TM-Sprayer (HTX Technologies LLC) with attached LC20-AD HPLC pump (Shimadzu Scientific Instruments). Sections were vacuum dried in a desiccator before analysis. The optical image was generated using an Epson 4400 Flatbed Scanner at 4,800 d.p.i. For mass spectrometric analyses, a Bruker SolariX XR with 7T magnet was used. Images were generated using Bruker Compass FlexImaging 4.1. Data were normalized to the TIC, and brightness optimization was employed to enhance visualization of the distribution of selected compounds. 
Individual spectra were recalibrated using Bruker Compass DataAnalysis 4.4 to internally lock masses of known DHB clusters: C14H9O6+ = 273.039364 and C21H13O9+ = 409.055408 m/z. Accurate mass measurements for individual saponins and identified compounds were run using continuous accumulation of selected ions (CASI) with mass windows of 50–100 m/z and a 4-megaword transient of 2.93 s, providing a mass resolving power of approximately 390,000 at 400 m/z. Lipids were putatively assigned by searching the LipidMaps database74 (http://www.lipidmaps.org) and lipid class was confirmed by collision-induced dissociation using a 10 m/z window centred around the monoisotopic peak with a collision energy of 15–20 V. Quinoa flowers were marked at anthesis, and seeds were sampled at 12, 16, 20 and 24 days after anthesis. A pool of five seeds from each time point was analysed using GC/MS. Quantification of saponins was performed indirectly by quantifying oleanolic acid (OA) derived from the hydrolysis of saponins extracted from quinoa seeds. The derivatized solution was analysed using a single-quadrupole GC/MS system (Agilent 7890 GC/5975C MSD) equipped with an EI source at an ionisation energy of 70 eV. Chromatographic separation was performed using a DB-5MS fused silica capillary column (30 m × 0.25 mm I.D., 0.25 μm film thickness; Agilent J&W Scientific), chemically bonded with 5% phenyl 95% methylpolysiloxane cross-linked stationary phase. Helium was used as the carrier gas with a constant flow rate of 1.0 ml min−1. The quantification of OA in each sample was performed using a standard curve based on standards of OA. Specific, individual saponins were identified in quinoa using a preparation of 20 mg of seeds performed according to a modified protocol from Giavalisco et al.75. 
Samples were measured with a Waters ACQUITY Reversed Phase Ultra Performance Liquid Chromatography (RP-UPLC) system coupled to a Thermo-Fisher Exactive mass spectrometer, which consists of an electrospray ionisation source and an Orbitrap mass analyser. A C18 column was used for the hydrophilic measurements. Chromatograms were recorded in full-scan MS mode (mass range, 100–1,500). Extraction of the LC/MS data was accomplished with the software REFINER MS 7.5 (GeneData). SwissModel76 was used to produce homology models for the bHLH region of AUR62017204, AUR62017206 and AUR62010677. RaptorX77 was used for prediction of secondary structure and disorder. QUARK78 was used for ab initio modelling of the C-terminal domain, and the DALI server79 was used for 3D homology searches of this region. Models were manually inspected and evaluated using the PyMOL program (http://pymol.org). The genome assemblies and sequence data for C. quinoa, C. pallidicaule and C. suecicum were deposited at NCBI under BioProject codes PRJNA306026, PRJNA326220 and PRJNA326219, respectively. Additional accession numbers for deposited data can be found in Supplementary Data 9. The quinoa genome can also be accessed at http://www.cbrc.kaust.edu.sa/chenopodiumdb/ and on the Phytozome database (http://www.phytozome.net/).
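As an illustration of the sliding-window read-count summation used earlier for genotyping the Atlas × Carina Red lines (reads carrying each parental allele summed over 500 consecutive variant positions), a minimal sketch is given below. It is not the authors' implementation: the function name and input layout are hypothetical, the one-variant step size is an assumption, and the sketch stops at the per-window sums because the text does not state how these sums were converted into genotype calls.

```python
def window_allele_sums(atlas_counts, carina_counts, window=500):
    """Sum parental-allele read counts over a sliding window of variants.

    atlas_counts / carina_counts: per-variant read counts supporting the
    Atlas or Carina Red allele for one line, in genomic order, restricted to
    positions where the parents are homozygous and polymorphic.
    Returns a list of (atlas_sum, carina_sum), one tuple per window position.
    Assumption: the window advances one variant at a time.
    """
    n = len(atlas_counts)
    if n != len(carina_counts):
        raise ValueError("count lists must have equal length")
    if n < window:
        return []

    atlas_sum = sum(atlas_counts[:window])
    carina_sum = sum(carina_counts[:window])
    sums = [(atlas_sum, carina_sum)]
    for start in range(1, n - window + 1):
        # Slide by one variant: add the entering position, drop the leaving one.
        atlas_sum += atlas_counts[start + window - 1] - atlas_counts[start - 1]
        carina_sum += carina_counts[start + window - 1] - carina_counts[start - 1]
        sums.append((atlas_sum, carina_sum))
    return sums
```

Pooling reads over many adjacent variants in this way compensates for low per-site coverage in individual lines.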

News Article | November 23, 2016
Site: www.nature.com

Experiments were approved by the local ethical committee of the University of Bordeaux (approval number 501350-A) and the French Ministry of Agriculture and Forestry (authorization number 3306369). Mice were maintained under standard conditions (food and water ad libitum; 12 h–12 h light–dark cycle, light on at 7:00; experiments were performed between 9:00 and 17:00). Male C57BL/6N mice were purchased from Janvier (France). Wild-type (CB +/+) and CB −/− female and male mice (2–4 months old) were obtained, bred and genotyped as described31. Only male mice were used for behavioural experiments. For most experiments CB +/+ and CB −/− were littermates. For primary cell cultures, pups were obtained from homozygote pairs. No method of randomization to assign experimental groups was performed and the number of mice in each experimental group was similar. No statistical methods were used to predetermine sample size. THC was obtained from THC Pharm GmbH (Frankfurt, Germany). HU210 was synthesized as described32. WIN55-212-2, KH7, PTX, bicarbonate (HCO3−), forskolin, carbonyl cyanide-4-(trifluoromethoxy)phenylhydrazone (FCCP), oligomycin, antimycin, rotenone, picrotoxin, GTPγS and other chemicals used in this study were purchased from Sigma-Aldrich (St. Louis, USA). [3H]CP55,940 (162.5 Ci mmol−1) and [35S]GTPγS (1,250 Ci mmol−1) were purchased from Perkin Elmer NEN (Boston, USA). For in vivo administration, WIN was dissolved in a mixture of saline (0.9% NaCl) with 2% DMSO and 2% cremophor; THC was dissolved in a mixture of 4% ethanol, 5% cremophor and saline; and KH7 was dissolved in 10% cremophor, 2.5% DMSO and saline. Vehicles contained the same amounts of solvents. All drugs were prepared freshly before the experiments. For in vitro experiments, PTX, HCO3− and forskolin were dissolved in water. KH7, HU210 and WIN were dissolved in DMSO. THC, oligomycin, FCCP, antimycin and rotenone were dissolved in ethanol. Corresponding vehicle solutions were used in control experiments. 
Final concentrations of DMSO and ethanol did not exceed 0.001%. Doses and concentrations of the different drugs were chosen on the basis of previously published data or preliminary experiments. The N-terminal deletion of the first 22 amino acids (66 base pairs) in the mouse CB -receptor coding sequence, to obtain the DN22-CB mutant, as well as the generation of the mitochondrially targeted constitutively active form of PKA (MLS–PKA-CA), was achieved by polymerase chain reaction (PCR). In brief, for DN22-CB a forward primer hybridizing from the 67th base starting from ATG was coupled to a reverse primer hybridizing to the end of the coding sequence, including the TGA stop codon. In order to guarantee accurate translation of the construct, the forward primer included an ATG codon upstream of the hybridizing sequence. The cDNA for DN22-CB was amplified using HF Platinum DNA polymerase (Invitrogen) and inserted into a PCRII-Topo vector (Invitrogen) according to the manufacturers’ instructions. The absence of amplification mismatches was then verified by DNA sequencing. Primers used were: forward, with the inserted ATG in bold, 5′-ATGGTGGGCTCAAATGACATTCAG-3′; reverse, with the stop codon in bold, 5′-TCACAGAGCCTCGGCAGACGTG-3′. The cDNA sequence for CB or DN22-CB was inserted into a modified version of a pcDNA3.1 mammalian expression vector using BamHI–EcoRV according to standard cloning procedures. This modification allowed the co-expression of CB or DN22-CB with an mCherry fluorescent protein for control of transfection efficiency. For the study of mitochondrial motility, the coding sequence of CB or DN22-CB was fused to GFP using the pEGFP-N1 vector (Addgene) according to the manufacturer’s instructions. 
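The primer design logic for the DN22 deletion described above can be checked with a few lines of code. The sketch below is illustrative only: it uses the two primer sequences quoted in the text, while the function and variable names are hypothetical. It confirms that deleting 22 codons removes 66 bases (so the first retained base is the 67th), that the forward primer starts with the inserted ATG, and that the reverse primer, read on the sense strand, ends with the TGA stop codon.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Primer sequences as quoted in the text (5' to 3').
FORWARD_PRIMER = "ATGGTGGGCTCAAATGACATTCAG"
REVERSE_PRIMER = "TCACAGAGCCTCGGCAGACGTG"
DELETED_CODONS = 22


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def check_deletion_primers(forward, reverse, deleted_codons):
    """Summarise basic sanity checks for an N-terminal deletion primer pair."""
    deleted_bp = deleted_codons * 3
    return {
        "deleted_bp": deleted_bp,
        "first_retained_base": deleted_bp + 1,  # 1-based, counted from the A of ATG
        "forward_starts_with_atg": forward.startswith("ATG"),
        "sense_strand_ends_with_tga": reverse_complement(reverse).endswith("TGA"),
    }


report = check_deletion_primers(FORWARD_PRIMER, REVERSE_PRIMER, DELETED_CODONS)
```

Here `report["first_retained_base"]` corresponds to the "67th base" mentioned in the text.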
For MLS–PKA-CA, a forward primer including a restriction site after the initial ATG codon for future subcloning with mitochondrial leading sequences was coupled to a reverse primer hybridized to the end of the coding sequence of the catalytic subunit of PKA (pET15b PKA Cat from Addgene)33, including a myc epitope and a TGA stop codon. Subsequently, the construct was subcloned into a pcDNA3.1 vector as an intermediate step and the QuikChange Multi Site-Directed Mutagenesis Kit (Agilent Genomics, Santa Clara, CA, USA) was used to mutate histidine-87 to glutamine, and tryptophan-196 to arginine to generate a constitutively active form of PKA24. Finally, the construct was fused to a 4×MLS sequence to target the constitutively active PKA to mitochondria (MLS–PKA-CA). The absence of amplification mismatches and confirmation of mutagenesis was then verified by DNA sequencing. Primers used were: forward, with the inserted ATG in bold, 5′-TATCTGGATCCCTATGCAATTGGGCAACGCCGCCGCCGCCAAGAAGG-3′; reverse, with the stop codon and the myc epitope in bold, 5′-TATGATCTAGAGATCACAGATCCTCTTCTGAGATGAGTTTTTGTTCAAACTCAGTAAACTCCTTGCCACACTTC-3′; and for mutagenesis of H87, 5′-AAAGCAGATCGAGCAAACTCTGAATGAGAAG-3′; and W196 5′-GTGAAAGGCCGTACTAGGACCTTGTGTGGGA-3′ (in bold are the mutated codons). The phosphomimetic version of NDUFS2 was custom synthesized by Eurofins Genomics (Germany). Briefly, the NDUFS2 sequence (NM_153064) was modified to obtain a phosphomimetic form mutating the 3 potential phosphorylation sites of PKA. The sites were chosen because consensus for their PKA phosphorylation nature was found between the two online available phosphorylation prediction algorithms, PhosphoMotif Finder (http://www.hprd.org/PhosphoMotif_finder) and PKA prediction site (http://mendel.imp.ac.at/pkaPS/)25, 26. By this approach, four sites were identified. One of these was excluded, because it is present on the mitochondrial leading sequence of NDUFS2. 
Thus, serines 296, 349 and 374 were mutated to aspartic acid to obtain a phosphomimetic version of NDUFS2 (NDUFS2-PM). A myc epitope was added at the C terminus of the protein for detection. The cDNAs coding for mouse CB , DN22-CB , MLS–PKA-CA, NDUFS2-PM and for GFP were subcloned into the pAM–CBA vector using standard molecular cloning techniques. The resulting vectors were transfected by calcium phosphate precipitation into HEK293 cells together with the rAAV-helper-plasmid pFd6 and AAV1/2-serotype-packaging plasmids pRV1 and pH21 (ref. 34). The viruses were then purified and titred as previously described35, 36. Virus titres were between 10^10 and 10^11 genomic copies per ml for all batches of virus used in the study. All cell lines were originally obtained from ATCC (https://www.lgcstandards-atcc.org/Products/Cells_and_Microorganisms/Cell_Lines.aspx?geo_country=fr). Mouse 3T3 cells (3T3 F442A), HeLa and HEK293 cells were grown in Dulbecco’s modified Eagle’s medium (DMEM) supplemented with 10% fetal bovine serum (FBS), 4.5 g l−1 glucose, 2 mM glutamine, 1 mM pyruvate. HEK293 cells were transfected with control plasmid, CB or DN22-CB cDNA coupled with mCherry cloned in pcDNA 3.1(+), respectively. Cells were transfected with sAC–HA or mtsAC–HA provided by G. Manfredi (see refs 17, 21) or small hairpin RNA (shRNA) targeting AKAP121 provided by A. Feliciello (see ref. 22). HeLa cells were transfected with MLS–PKA-CA or NDUFS2-PM (see above). The transfections were carried out using FugeneHD (Roche, France) for 3T3 cells and polyethylenimine (PEI, Polysciences, USA) for HEK293 and HeLa cells, according to the manufacturers’ protocols. For biochemical experiments, primary hippocampal cultures were prepared from CB +/+ and CB −/− P0–P1 mice. Briefly, after mice were killed by decapitation, hippocampi were extracted in dissection medium (10 mM HEPES, 0.3% glucose in Hank’s balanced salt solution, pH 7.4) and dissociated in 0.25% trypsin for 30 min. 
Where indicated, dissociated cells were transfected with sAC–HA using the Amaxa P3 primary cell 4D-nucleofector kit (Lonza, France), according to the manufacturer’s protocol. Cells were plated on poly-l-lysine-coated 96-well dishes using neurobasal/B27 medium (supplemented with 5% FBS, 2 mM l-glutamine, 1 mM pyruvate, 1 mM sodium lactate, 0.3% glucose and 37.5 mM NaCl) at a density of 50,000 cells per well. One hour after plating, the serum was removed. Primary hippocampal cultures contained both neurons and astrocytes, and were used at 3 days in vitro (DIV). For live imaging of mitochondrial mobility, primary hippocampal cultures were prepared from CB −/− P0–P1 mice. Brains were extracted in PBS containing 0.6% glucose and 0.5% bovine serum albumin (BSA) and the hippocampi were dissected. To dissociate cells, a kit for dissociation of postnatal neurons was used following the manufacturer’s instructions (Miltenyi Biotec, France). Cells were seeded onto 0.5 mg ml−1 poly-l-lysine-coated 35-mm glass-bottom dishes (MatTek Corporation, France) for live imaging in neurobasal medium (Gibco, France) containing 2 mM l-glutamine, 120 μg ml−1 penicillin, 200 μg ml−1 streptomycin and B27 supplement (Invitrogen, France), and were maintained at 37 °C in the presence of 5% CO2. Cells were cultured for 7 to 9 days. Neuron transfection was carried out at 4–5 DIV, using a standard calcium phosphate transfection protocol, with a 1:2 DNA ratio of plasmids expressing pDsred2–mito37 to GFP, CB -GFP or DN22CB -GFP, respectively. Axonal mitochondrial mobility was recorded 72–96 h after transfection (see below). Cannabinoid treatments altered the percentage of axonal mobile mitochondria, without altering velocity, dwelling time or travelled distance (data not shown). Mouse fibroblasts were generated from P0–P1 CB −/− pups. After mice were killed by decapitation, the dorsal skin was excised and minced in PBS. 
Cells were then separated by incubation in 0.25% trypsin solution in PBS, collected by centrifugation and resuspended in DMEM with 10% fetal bovine serum, 1% l-glutamine and 2% penicillin/streptomycin solution (Invitrogen, France). Cells were seeded in 25-cm2 flasks and then expanded in 75-cm2 flasks until reaching 90% confluence. Transfections were carried out by using a BTX-electroporator ECM 830 (Harvard Apparatus, France) (175 V, 1-ms pulse, five pulses, 0.5-s interval between pulses). Cells were electroporated in Optimem medium (Invitrogen, France) at 2 × 10^7 cells per ml (fibroblasts from two 75 cm2 flasks at 90% confluence in 300 μl) in a 2-mm gap cuvette using 30 μg of either control plasmid (mCherry), CB or DN22-CB cDNA coupled with mCherry, respectively. After electroporation, cells were resuspended in DMEM with 10% fetal bovine serum, 1% l-glutamine and 2% penicillin/streptomycin solution (Invitrogen, France) and seeded in three 100 cm2 Petri dishes. All cells were maintained at 37 °C and 5% CO2 and collected 48 h after transfection for respiration experiments. The brains of CB +/+ and CB −/− littermates were dissected and mitochondria were purified using a Ficoll gradient as previously described7, 8. In brief, brains were extracted in ice-cold isolation buffer (250 mM sucrose, 10 mM Tris, 1 mM EDTA, pH 7.6) containing protease inhibitors (Roche, France) and 2 M NaF and homogenized with a Teflon potter. Homogenates were centrifuged at 1,500g for 5 min (4 °C). The supernatant was then centrifuged at 12,500g (4 °C). The pellet was collected and the cycle of centrifugation was repeated. To purify mitochondria, the final pellet was resuspended in 400 μl of isolation buffer, layered on top of a discontinuous Ficoll gradient (10% and 7% fractions) and centrifuged at 100,000g for 1 h (4 °C). Purified mitochondria were recovered from the pellet obtained after ultracentrifugation. 
All experiments using freshly isolated brain mitochondria were performed within 3 h after purification. The 3T3 cells were collected, resuspended in isolation buffer and disrupted with 25 strokes using a 25G needle. The total cell lysate was centrifuged at 500g (4 °C) to remove cell debris and nuclei. The supernatant was kept and centrifuged at 12,500g for 10 min (4 °C). The supernatant was then kept (cytosolic fraction), the pellet was resuspended, and the centrifugation cycle was repeated. Finally, the mitochondrial fractions were obtained from the last pellet. The oxygen consumption of isolated mitochondria, homogenized hippocampus and cell lines was monitored at 37 °C in a glass chamber equipped with a Clark oxygen electrode (Hansatech, UK). Purified mitochondria (75–100 μg) were suspended in 500 μl of respiration buffer (75 mM mannitol, 25 mM sucrose, 10 mM KCl, 10 mM Tris-HCl pH 7.4, 50 mM EDTA) in the chamber. Respiratory substrates were added directly to the chamber. Pyruvate (5 mM), malate (2 mM) and ADP (5 mM) were successively added to measure complex-I-dependent mitochondrial respiration. Complex-II-dependent respiration was measured using rotenone (0.5 μM), succinate (10 mM) and ADP (5 mM). Complex-IV-dependent respiration was measured using N,N,N′,N′-tetramethyl-p-phenylenediamine (TMPD, 0.5 mM) and ascorbate (2 mM), in the presence of ADP (5 mM) and antimycin A (2.5 μM). Complex-I-dependent respiration was evaluated, unless stated otherwise. For respiration in homogenized hippocampi, both hippocampi of each mouse were dissected and homogenized with a Teflon potter in 800 μl Mir05 buffer (Mitochondrial Physiology Network: 0.5 mM EGTA, 3 mM MgCl₂, 60 mM lactobionate, 20 mM taurine, 10 mM KH₂PO₄, 20 mM HEPES, 110 mM sucrose and 1 g l−1 BSA) containing 12.5 μg ml−1 of saponin. 
Subsequently, 15 μl of the homogenate was diluted in 1 ml Mir05 buffer and the oxygen consumption was measured with the respiratory substrates pyruvate (5 mM), malate (2 mM) and ADP (5 mM) to measure complex-I-dependent mitochondrial respiration before and after WIN (100 nM) or vehicle addition. Oligomycin (2 μg ml−1), FCCP (0.5 μM), rotenone (0.5 μM) and antimycin A (2.5 μM) were injected subsequently into the chamber as modifiers of the respiration. A 50-μl aliquot of the homogenate was saved for WB and protein quantification experiments. The experiments using cell lines were performed on 2 × 10⁶ cells ml−1 in growth medium. Intact cells were transferred directly into the chamber and basal respiration was recorded. Drugs were added directly into the chambers. Mitochondria were incubated with PTX, KH7 and H89 for 5 min before addition of CB₁ agonists. HCO₃− and 8-Br-cAMP were added 5 min after the addition of CB₁ agonists. Oxygen consumption of primary hippocampal cultures was monitored using an XF96 Seahorse Bioscience analyser (Seahorse Bioscience, Denmark), according to the manufacturer’s protocol. When indicated, oligomycin (2 μg ml−1) and FCCP (1 μM) were injected directly into the wells. Other drugs were directly added into the medium 1 h before measurements. Respiration of HEK293 cells co-expressing CB₁ and NDUFS2-PM or MLS–PKA-CA was analysed using the Oxygraph-2k (Oroboros Instruments, Austria). These experiments were performed on 5 × 10⁵ cells ml−1 in growth medium. WIN was directly added into the medium 30 min before measurements. Then, intact cells were transferred into the 2-ml chamber and basal respiration was recorded. NADH oxidation into NAD+ by the first complex of the respiratory chain is coupled to the reduction of ubiquinone (coenzyme Q). The rate of this reaction is analysed by the measurement of NADH disappearance, which is spectrophotometrically detected (SAFAS, UVmc2) at 340 nm. The NADH extinction coefficient is 6.22 mM−1 cm−1. 
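As a rough illustration of how the absorbance readout at 340 nm can be turned into an activity value, the sketch below applies the Beer–Lambert law with the extinction coefficient stated above. The 1-cm optical path, the 1-ml reaction volume, the function name and the example numbers are assumptions made for illustration; they are not taken from the original protocol.

```python
NADH_EXTINCTION_MM_CM = 6.22  # mM^-1 cm^-1 at 340 nm, as stated in the text


def nadh_oxidation_rate(delta_a340_per_min, protein_mg, volume_ml=1.0, path_cm=1.0):
    """Return NADH oxidation in nmol per min per mg protein.

    delta_a340_per_min: decrease in absorbance at 340 nm per minute (positive value)
    protein_mg: amount of protein in the cuvette, in mg
    volume_ml: reaction volume in ml (assumed, not given in the text)
    path_cm: optical path length in cm (assumed to be 1 cm)
    """
    if protein_mg <= 0 or path_cm <= 0 or volume_ml <= 0:
        raise ValueError("protein_mg, path_cm and volume_ml must be positive")
    # Beer-Lambert law: concentration change (mM per min) = dA / (epsilon * path)
    conc_mm_per_min = delta_a340_per_min / (NADH_EXTINCTION_MM_CM * path_cm)
    # mM equals umol per ml, so mM * ml gives umol; multiply by 1000 for nmol
    nmol_per_min = conc_mm_per_min * volume_ml * 1000.0
    return nmol_per_min / protein_mg


# Hypothetical example: 50 ug of mitochondrial protein, absorbance falling by 0.0622 per min
rate = nadh_oxidation_rate(0.0622, 0.05)
print(round(rate, 3))  # 200.0
```

Expressing the result per milligram of protein lets readings from total cell extracts and from purified mitochondria be compared on the same scale.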
Final composition of the reaction solution was 50 mM K₂HPO₄ pH 7.2, 2.5 mg ml−1 BSA, 0.1 mM ubiquinone and 200 μg total cell extract proteins or 50 μg purified brain mitochondria. The reaction was initiated by adding 0.1 mM NADH. The assay was monitored at 37 °C for 5 min. The intracellular ATP content was measured using the bioluminescent ATP kit HS II (Roche, France). CB₁+/+ and CB₁−/− primary hippocampal cultures (50,000 cells per well in a 96-well dish) were treated with THC (1 μM), WIN (1 μM) or vehicle in the presence or absence of rotenone (0.1 μM) for 1 h. Then, ATP measurements were performed as previously described (ref. 38). In brief, cells were lysed to release the intracellular ATP using the lysis buffer provided with the kit (equal volume) for 20 min. The lysate was then analysed in a 96-well plate luminometer (Luminoskan, Thermo Scientific, France) using the luciferin/luciferase reaction system provided with the kit. For this, 100 μl of luciferin/luciferase reagent was injected into the wells and, after 10 s of incubation, bioluminescence was read (1 s integration time). Standardizations were performed with known quantities of standard ATP provided with the kit. The ATP content derived from mitochondria was determined by subtracting the ATP values measured in the presence of rotenone from the total ATP values (ATP_mito = ATP_total − ATP_rotenone). Mitochondria (100 μg) were suspended in isolation buffer, untreated or incubated with 0.01% trypsin in the presence or absence of 0.05% triton X-100 for 15 min at 37 °C. Proteins were then processed for western immunoblot analyses. Freshly purified brain mitochondria were resuspended in PBS (5 mg ml−1) supplemented with protease inhibitor cocktail (Roche, France) and 2 mM NaF, and solubilized with 1% lauryl maltoside for 30 min (4 °C). For co-immunoprecipitation of sAC and G proteins, mitochondria were incubated with THC (800 nM) or vehicle for 5 min at 37 °C. Proteins were incubated with a C-terminal anti-CB₁ antibody (Cayman, USA) or sAC R21 antibody (CEP Biotech, USA) overnight (4 °C). 
For immunoprecipitation of complex I, mitochondrial proteins were treated with THC (800 nM), HCO₃− (5 mM), 8-Br-cAMP (500 μM) or vehicle for 5 min at 37 °C and then incubated with complex-I-agarose-conjugated beads (Abcam, UK). Protein A/G PLUS-agarose beads (Santa Cruz, USA) were then added and the incubation continued for 4 h (4 °C). The elution was performed using glycine buffer (0.2 M glycine, 0.05% lauryl maltoside pH 2.5) and samples were processed for western immunoblotting. Following transfection (mCherry, CB₁ or DN22-CB₁, respectively), cells were allowed to recover in serum-containing medium for 24 h. Cells were then starved overnight in serum-free DMEM before treatment and lysis. The cells were then treated at 37 °C with HU210 (100 nM) or vehicle for 10 min. The medium was rapidly aspirated and the samples were snap-frozen in liquid nitrogen and stored at −80 °C before preparation for western blotting. For ERK-phosphorylation assays, lysis buffer (1 mM EGTA, 50 mM NaF, 1 mM Na₃VO₄, 50 mM Tris pH 7.5, 1% triton X-100, protease inhibitors, 30 mM 2-mercaptoethanol) was added and the cells were collected by scraping; lysates were centrifuged at 12,500g (4 °C) for 5 min to remove cell debris. Protein concentrations were measured using the Pierce BCA protein assay kit (Thermo Scientific); samples were then mixed with Laemmli buffer and kept at −80 °C. For western immunoblotting, the proteins were separated on Tris-glycine 7%, 10% or 12% acrylamide gels and transferred to PVDF membranes. Membranes were soaked in 5% milk (5% BSA for phosphorylation immunoblots) in Tris-buffered saline (TBS; Tris 19.82 mM, NaCl 151 mM, pH 7.6) containing Tween 20 (0.05%). 
Mitochondrial proteins were immunodetected using antibodies against complex III core 2 (Abcam, ab14745; 1:1,000, 1 h, room temperature), succinate dehydrogenase subunit A (Abcam, ab14715; 1:10,000, 1 h, room temperature), NDUFA9 (Abcam, ab14713; 1:1,000, 1 h, room temperature), NDUFS2 (Abcam, ab110249; 1:1,000, 1 h, room temperature) and TOM20 (Santa Cruz, sc-11415; 1:1,000, 1 h, 4 °C). Cytosolic proteins were probed with LDHa (Santa Cruz, sc-137243; 1:500, overnight, 4 °C). Samples were also probed with antibodies against G proteins (Enzo Life Science, SA-126; 1:1,000, 1 h, room temperature), sAC (CEP Biotech, sAC R21; 1:500, overnight, 4 °C), PKA (cAMP protein kinase catalytic subunit, Abcam, ab76238; 1:1,000, 1 h, room temperature), an antiserum directed against the C terminus of the CB₁ receptor (Cayman, 10006590; 1:200, overnight, 4 °C), AKAP121 (from A. Feliciello; 1:1,000, overnight, 4 °C), PKA-dependent phosphorylation sites (phospho (Ser/Thr)-PKA substrate, Cell Signaling, 9621; 1:1,000, overnight, 4 °C), HA (Abcam, ab18181; 1:500, overnight, 4 °C), p-ERK (phospho-p44/42 MAPK) corresponding to residues around Thr202/Tyr204 (Cell Signaling, 4370; 1:1,000, overnight, 4 °C) and ERK (p44/p42 MAPK; Cell Signaling, 9102; 1:2,000, 1 h, room temperature). Mitochondrial proteins were also separated by two-dimensional electrophoresis as described (ref. 39). Purified brain mitochondria were solubilized (10 mg ml−1) in 0.75 M aminocaproic acid, 50 mM Bis-Tris (pH 7.0) with 1.5% n-dodecyl-maltoside for 30 min on ice, and were then centrifuged at 16,000g (4 °C). The supernatant was collected and supplemented with 0.25% coomassie blue G and protease inhibitors (Roche, France). Proteins were then separated with 4–16% gradient native-PAGE gels (Invitrogen, France). The different lanes were cut out and processed for the second dimension on 12.5% SDS–PAGE gels after denaturation and reduction in 1% (w/v) sodium dodecyl sulphate and 1% (v/v) mercaptoethanol. 
The second-dimension gels were immunoblotted for detection of PKA-dependent phosphorylated proteins. A second-dimension gel was kept for coomassie blue staining. Then, membranes were washed and incubated with appropriate secondary horseradish peroxidase (HRP)-coupled antibodies (1 h, room temperature). Finally, the HRP signal was detected using the ECL-plus reagent (Amersham) and the Bio-Rad Quantity One system. Labelling was quantified by densitometric analysis using ImageJ (NIH) software. HeLa cells were fixed in 4% formaldehyde dissolved in PBS (0.1 M, pH 7.4) and then washed with PBS. Cells were pre-incubated in a blocking solution of 10% normal goat serum, 0.1% triton X-100, 0.05% deoxycholate and 0.2 M glycine prepared in PBS for 1 h and then incubated with the primary antibodies rabbit anti-TOM20 (Santa Cruz, sc-11415; 1:500) and mouse anti-myc (Roche, 11667149001; 1:500) for 2 h in the same blocking solution. The cells were then washed in PBS for 1 h and incubated with fluorescent anti-mouse Alexa488 or anti-rabbit Alexa561 (Jackson ImmunoResearch; 1:800) in blocking solution for 1 h. Finally, cells were washed and mounted with Fluoromount-G (Electron Microscopy Sciences). All the procedures were carried out at room temperature. The cells were analysed with a Leica DMI6000 confocal microscope (Leica). Samples were digested by trypsin as previously described (ref. 40). Peptides were further analysed by nano-liquid chromatography coupled to a MS/MS LTQ-Orbitrap XL mass spectrometer (Thermo Fisher Scientific, Germany). Peptides were identified with SEQUEST and MASCOT algorithms through the Proteome Discoverer interface (Thermo Fisher Scientific, Germany) against a subset of the UniProt database restricted to Reference Proteome Set of Mus musculus (UniProtKB Release 2011_12, 14th December, 2011, 46,638 entries). 
Peptide validation was performed using the Percolator algorithm (ref. 41) and only ‘high confidence’ peptides, corresponding to a 1% false positive rate at the peptide level, were retained. Cyclic AMP levels and PKA activity of mitochondria isolated from the brain were assayed using the Direct Correlate-EIA cAMP kit (Assay Designs Inc., USA) and an ELISA kit (Enzo Life Science), respectively, according to the manufacturers’ instructions. The different treatments described in the main text were performed at 37 °C for 1 h. Mitochondrial mobility in hippocampal neurons was recorded using an inverted Leica DMI6000 microscope (Leica Microsystems, Wetzlar, Germany) equipped with a confocal head Yokogawa CSU-X1 (Yokogawa Electric Corporation, Tokyo, Japan) and a sensitive Quantem camera (Photometrics, Tucson, USA). The diode lasers used were at 491 nm and 561 nm and the objective was a HCX PL APO CS 63× oil 1.32 NA lens. The z stacks were obtained with a piezo P721.LLQ (Physik Instrumente (PI), Karlsruhe, Germany). The 37 °C atmosphere during time-lapse image acquisition was created with an incubator box and an air heating system (Life Imaging Services, Basel, Switzerland) in the presence of 5% CO₂. This system was controlled by MetaMorph software (Molecular Devices, Sunnyvale, USA). For mitochondrial axonal transport analysis, time-lapse series of image stacks composed of 6 images (512 × 512 pixels) were taken every 3 s for 15 min. HU210 was added just after the recording and 15 min later the same neuron was recorded for another 15 min. KH7 or vehicle was added 15 min before the first recording. All stacks obtained were processed first with MetaMorph software. Further image processing, analysis and video compilation (28 frames per second) and editing were done with ImageJ software (NIH, USA). Kymographs were generated with the KymoToolBox plugin (ref. 42). Between 10 and 32 axons were registered and analysed in each condition. 
In all cases, a mitochondrion was considered mobile when it moved more than 5 μm during the time of recording. Distances and speeds of retrograde and anterograde transport and dwelling time were measured separately from the corresponding kymographs, as previously described (refs 43, 44). The microarrays were composed of a collection of membrane homogenates isolated from HEK293 cells transfected with mCherry CB₁ or DN22-CB₁, or from hippocampi of CB₁+/+(GFP), CB₁−/−(GFP), CB₁−/−(CB₁) or CB₁−/−(DN22-CB₁) mice (see below), together with increasing amounts of BSA and membranes isolated from rat cerebral cortex, as positive internal controls (ref. 45). Briefly, samples were homogenized using a Teflon-glass grinder (Heidolph RZR 2020) or a disperser (Ultra-Turrax T10 basic, IKA) in 20 volumes of homogenization buffer (1 mM EGTA, 3 mM MgCl₂, and 50 mM Tris-HCl pH 7.4) supplemented with 250 mM sucrose. The crude homogenate was subjected to a 40g centrifugation for cells or 200g for tissue for 5 min, and the resultant supernatant was centrifuged again at 18,000g for 15 min (4 °C, Microfuge 22R centrifuge, Beckman Coulter). The pellet was washed in 20 volumes of homogenization buffer and re-centrifuged under the same conditions. The homogenate aliquots were stored at −80 °C until use. Protein concentration was measured by the Bradford method and adjusted to the required concentrations. Microarrays were fabricated by a non-contact microarrayer (Nano_plotter NP 2.1) placing the cell membrane homogenates (4 nl per spot, 3–5 replicates per sample) onto glass slides (ref. 46). Microarrays were stored at −20 °C until use. After thawing, cell membrane microarrays were incubated in assay buffer (50 mM Tris-Cl; 1% BSA; pH 7.4) for 30 min at room temperature. A second incubation was performed using the same buffer for 120 min at 37 °C in the presence of [3H]CP55,940 (3 nM). Non-specific binding was determined with 10 μM WIN55,212-2. 
Afterwards, microarrays were washed twice in buffer, dipped in deionized water and dried. Finally, they were exposed to films, developed, scanned and quantified as described below. [35S]GTPγS binding studies were carried out according to the patented methodology for the screening of molecules that act through G-protein-coupled receptors using cell membrane microarrays (ref. 45). Briefly, thawed cell membrane microarrays were dried for 20 min at room temperature and were subsequently incubated in assay buffer (50 mM Tris-Cl; 1 mM EGTA; 3 mM MgCl₂; 100 mM NaCl; 0.5% BSA; pH 7.4) for 15 min at room temperature. Microarrays were transferred into assay buffer containing 50 μM GDP and 0.1 nM [35S]GTPγS, with the cannabinoid agonists WIN55,212-2 or HU210, at increasing concentrations, and incubated at 30 °C for 30 min. Non-specific binding was determined with GTPγS (10 μM). After washing, microarrays, together with ARC [14C]-standards, were exposed to films, developed, scanned and quantified. The protein concentration in each spot was measured by the Bradford method and used to normalize the [35S]GTPγS binding results to nCi per ng protein. Data from the dose–response curves (5 replicates in triplicate) were analysed using the program Prism (GraphPad Software Inc., San Diego, CA) to yield EC₅₀ (effective concentration 50%) and Emax (maximal effect) of the drugs on each different sample by nonlinear regression analysis. Samples displaying [3H]CP55,940 binding below the values of hippocampi from CB₁−/− mice were excluded from [35S]GTPγS binding analysis. Mice (7–9 weeks of age) were anaesthetized by i.p. injection of a mixture of ketamine (100 mg kg−1; Imalgene 500, Merial) and xylazine (10 mg kg−1; Rompun, Bayer) and placed into a stereotaxic apparatus (David Kopf Instruments) with mouse adaptor and lateral ear bars. 
For intracerebroventricular injections of drugs, mice were unilaterally implanted with a 1.0-mm stainless-steel guide cannula targeting the lateral ventricle with the following coordinates: anterior–posterior −0.2; lateral ± 0.9; dorsal–ventral −2.0. For intrahippocampal injections of drugs, mice were bilaterally implanted with 1.0-mm stainless-steel guide cannulae targeting the hippocampus with the following coordinates: anterior–posterior −3.1; medial–lateral ± 1.3; dorsal–ventral −0.5. Guide cannulae were secured with cement anchored to the skull by screws. Mice were allowed to recover for at least one week in individual cages before the start of experiments. Mice were weighed daily and individuals that failed to return to their pre-surgery body weight were excluded from subsequent experiments. The intrahippocampal and intracerebroventricular drug injections were performed by using injectors protruding 1 mm from the tip of the cannula. For viral intrahippocampal AAV delivery, mice were submitted to stereotaxic surgery (as above) and AAV vectors were injected with the help of a microsyringe (0.25-ml Hamilton syringe with a 30-gauge bevelled needle) attached to a pump (UMP3-1, World Precision Instruments). Mice were injected directly into the hippocampus (0.5 μl per injection site at a rate of 0.5 μl per min), with the following coordinates: dorsal hippocampus, anterior–posterior −1.8; medial–lateral ± 1; dorsal–ventral −2.0 and −1.5; ventral hippocampus: anterior–posterior −3.5; medial–lateral ± 2.7; dorsal–ventral −4 and −3. Following virus delivery, the syringe was left in place for 1 min before being slowly withdrawn from the brain. CB₁+/+ mice were injected with AAV–GFP to generate CB₁+/+(GFP) mice; CB₁−/− mice were injected with AAV–GFP, AAV–CB₁ or AAV–DN22-CB₁, to obtain CB₁−/−(GFP), CB₁−/−(CB₁) and CB₁−/−(DN22-CB₁) mice, respectively. Animals were used for experiments 4–5 weeks after injections. 
Mice were weighed daily and individuals that failed to return to their pre-surgery body weight were excluded from subsequent experiments. CB₁-receptor expression was verified by fluorescent or electron microscopic immunohistochemistry (see below). The AAV vectors, MLS–PKA-CA or NDUFS2-PM were injected directly into the dorsal hippocampus (1.0 μl per injection site at a rate of 0.5 μl per min) of C57BL/6N mice, with the following coordinates: anterior–posterior −1.8; medial–lateral ± 1; dorsal–ventral −2.0 and −1.5. Following virus delivery, the syringe was left in place for 1–2 min before being slowly withdrawn from the brain. Animals were used for experiments 4–5 weeks after viral delivery. Mice were habituated to i.p. injections (saline) before the behavioural paradigm (see below). The hippocampal expression of myc-tagged MLS–PKA-CA and NDUFS2-PM was verified by immunohistochemistry using anti-myc antibodies. Mice were anaesthetized with chloral hydrate (400 mg kg−1 body weight) and transcardially perfused with Ringer solution (NaCl (135 mM), KCl (5.4 mM), MgCl₂·6H₂O (1 mM), CaCl₂·2H₂O (1.8 mM), HEPES (5 mM)). Heparin Choay (25,000 IU per 5 ml) was added extemporaneously and tissues were then fixed with 500 ml of 4% formaldehyde dissolved in PBS (0.1 M, pH 7.4) and prepared at 4 °C. After perfusion, the brains were removed and incubated for several additional hours in the same fixative. Serial vibrosections were cut at 40–50-μm thickness and collected in PBS at room temperature. Sections were pre-incubated in a blocking solution of 10% donkey serum, 0.1% sodium azide and 0.3% triton X-100 prepared in PBS for 30 min–1 h at room temperature. Free-floating sections were incubated for 48 h (4 °C) with goat anti-CB₁ polyclonal antibodies raised against a C-terminal sequence of 31 amino acids (NM007726) of the mouse CB₁ receptor (CB₁-Go-Af450-1; 2 μg ml−1; Frontier Science Co. Ltd) or overnight (4 °C) with rabbit anti-myc (Ozyme; 1:1,000). 
The antibody was prepared in 10% donkey serum in PBS containing 0.1% sodium azide and 0.5% triton X-100. Then, the sections were washed in PBS for 30 min at room temperature. The tissue was subsequently incubated with fluorescent anti-goat Alexa488 (1:200, Jackson ImmunoResearch) for 4 h and washed in PBS at room temperature, before being incubated with DAPI (1:20,000) for 10 min for nuclear counterstaining. Finally, sections were washed, mounted, dried and a coverslip was added on top with DPX (Fluka Chemie AG). The slides were analysed with an epifluorescence Leica DM6000 microscope (Leica). CB₁+/+(GFP), CB₁−/−(GFP), CB₁−/−(CB₁) and CB₁−/−(DN22-CB₁) mice (n = 3 per group) were processed for electron microscope pre-embedding immunogold labelling as previously described (refs 7, 8). Immunodetection was performed in 50-μm-thick sections of hippocampus with goat anti-CB₁ polyclonal antibodies raised against a 31 amino acid C-terminal sequence (NM007726) of the mouse CB₁ receptor (Frontier Institute Co. Ltd, CB₁-Go-Af450-1; 2 μg ml−1). Immunogold particles were identified and counted. To exclude the risk of counting possible false positive mitochondrial labelling, we used strict semi-quantification methods of mtCB₁ receptors as recently described, excluding immunogold particles that were located on mitochondrial membranes but at a distance ≤80 nm from other cellular structures (ref. 8). The normalized number of immunogold particles located on mitochondria versus the total amount of immunogold particles in each field was used to calculate the proportion of mtCB₁ receptors over total CB₁. Mice were anaesthetized with isoflurane and killed by decapitation. Brains were rapidly removed and chilled in an ice-cold, carbonated (bubbled with 95% O₂–5% CO₂) cutting solution containing 180 mM sucrose, 2.5 mM KCl, 0.2 mM CaCl₂, 12 mM MgCl₂, 1.25 mM NaH₂PO₄, 26 mM NaHCO₃ and 11 mM glucose (pH 7.4). 
Sagittal hippocampal slices (350-μm thick) were cut using a Leica VT1200S vibratome and incubated with artificial cerebrospinal fluid (ACSF) containing 123 mM NaCl, 1.25 mM NaH₂PO₄, 11 mM glucose, 2.5 mM KCl, 2.5 mM CaCl₂, 1.3 mM MgCl₂ and 26 mM NaHCO₃ (osmolarity of 298 ± 7; pH 7.4) for 30 min at 34 °C. The slices were subsequently transferred to a holding chamber, where they were maintained at room temperature until experiments. Slices were individually transferred to a submerged chamber for recording and continuously perfused with oxygenated (95% O₂–5% CO₂) ACSF (3–5 ml min−1). All experiments were performed at room temperature. fEPSPs were recorded using glass micropipettes (2–4 MΩ) filled with normal ACSF positioned in the CA1 hippocampal region. Slices from the middle hippocampus were used preferentially. fEPSP responses were evoked by stimulation (0.1-ms duration, 10–30-V amplitude) delivered to the stratum radiatum to stimulate the Schaffer collateral fibres, using glass electrodes similar to those used for the recordings, in the presence of 100 μM picrotoxin. Recordings were obtained using an Axon Multiclamp 700B amplifier (Molecular Devices). Signals were filtered at 2 kHz, digitized, sampled and analysed using Axon Clampfit software (Molecular Devices). In CB₁−/−(CB₁) and CB₁−/−(DN22-CB₁), two slices (1 each) were excluded from analysis, because immunohistochemistry showed no re-expression. To study the effect of mtCB₁ receptor signalling on cannabinoid-induced amnesia, we used the hippocampal-dependent NOR memory task in an L-maze (L-M/NOR) (refs 14, 47, 48). 
As compared to other hippocampal-dependent memory tasks, this test presents several advantages for the aims of the present study: (i) the acquisition of L-M/NOR occurs in one step and previous studies revealed that the consolidation of this type of memory is deeply altered by acute immediate post-training administration of cannabinoids via hippocampal CB₁ receptors (refs 14, 48); (ii) this test allows repeated independent measurements of memory performance in individual animals (ref. 47), thereby allowing within-subject comparisons, eventually excluding potential individual differences in viral infection and/or expression of proteins; (iii) notably, CB₁−/− mice do not respond to the administration of cannabinoids, but they do not show any spontaneous impairment of performance in L-M/NOR (ref. 14), thereby allowing the use of re-expression approaches to study the role of hippocampal mtCB₁ receptors in the cannabinoid-induced blockade of memory consolidation. This task was performed with an L-maze made out of dark-grey Plexiglas with two corridors (35 cm and 30 cm long, respectively, for external and internal V walls, 4.5 cm wide and 15-cm high walls) set at a 90° angle and under a weak light intensity (50 Lux). The task consisted of 3 sequential daily trials of 9 min. Day 1 (habituation): mice were placed at the intersection of the two arms and were left free to explore the maze. Day 2 (acquisition): two identical objects were placed at the end of each arm. After 9 min of exploration, mice were removed and injected. Day 3 (retrieval): a novel object different in its shape, colour and texture was placed at the end of one of the arms, whereas the familiar object remained at the end of the other arm. The position of the novel object and the pairings of novel and familiar objects were randomized. Exploration of each object was scored off-line by at least two experienced observers blind to treatments and/or genotypes. 
Exploration was defined as the time spent by the mouse with the nose pointing to the object at a distance of less than 1 cm, whereas climbing on or chewing the object was not considered as exploration (ref. 14). Memory performance was assessed by the discrimination index. The discrimination index was calculated as the difference between the time spent exploring the novel (TN) and the familiar object (TF) divided by the total exploration time (TN+TF): discrimination index = (TN−TF)/(TN+TF). Mice receiving the acute intrahippocampal infusion of KH7 (10 mM) and WIN (5 mg kg−1) i.p., and mice that received intrahippocampal injection with AAV–MLS–PKA-CA or AAV–NDUFS2-PM were submitted to a single L-M/NOR session. Due to the limited numbers of available mice, null CB₁−/− mice virally injected with AAV–GFP, AAV–CB₁ or AAV–DN22-CB₁ were tested twice with a one-week interval using different pairs of objects and treated the first time with vehicle and the second time with WIN. Every pair of objects was previously screened to exclude that the animals might exhibit significant preference for any specific item. After the NOR task, the hippocampi of vehicle-treated AAV–MLS–PKA-CA and AAV–NDUFS2-PM animals were dissected and used for respiration experiments (see above). Expression of the myc epitope was verified by immunohistochemistry in the hippocampi of animals treated with WIN. All graphs and statistical analyses were performed using GraphPad software (version 5.0 or 6.0). Results were expressed as means of independent data points ± s.e.m. For biochemical quantifications (cAMP levels, PKA and complex-I activities and oxygen consumption), data are presented as percentage of controls with or without the application of cannabinoid drugs. With the exception of KH7 (see Extended Data Fig. 5a), no other drugs and plasmids had any effect per se on any measured parameter (not shown). 
Data were analysed using paired or unpaired Student’s t-test, one-way (followed by Tukey's post hoc test) or two-way ANOVA (followed by Bonferroni's post hoc test), as appropriate. Detailed statistical data for each experiment are reported in Supplementary Tables 1–3.
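The discrimination index defined above for the L-M/NOR task, (TN−TF)/(TN+TF), can be sketched in a few lines. The function name, the handling of a zero total exploration time and the example durations are assumptions for illustration only; the text does not specify how animals with no exploration were treated.

```python
def discrimination_index(time_novel_s, time_familiar_s):
    """Return (TN - TF) / (TN + TF) for one animal.

    time_novel_s: time spent exploring the novel object (TN), in seconds
    time_familiar_s: time spent exploring the familiar object (TF), in seconds
    """
    if time_novel_s < 0 or time_familiar_s < 0:
        raise ValueError("exploration times must be non-negative")
    total = time_novel_s + time_familiar_s
    if total == 0:
        # Assumption: the index is undefined when the animal explored neither object
        raise ValueError("total exploration time is zero; index is undefined")
    return (time_novel_s - time_familiar_s) / total


# Hypothetical animal: 30 s on the novel object, 10 s on the familiar one
print(discrimination_index(30.0, 10.0))  # 0.5
```

The index ranges from −1 (only the familiar object explored) to +1 (only the novel object explored), with 0 indicating no preference.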

No statistical methods were used to pre-evaluate the sample size in this study. The experiments (including animal experiments) were not randomized. The investigators were not blinded to experiments. No samples/data were excluded except any obviously unhealthy xenografted mice. H1299, U2OS, MCF7, H460 and HCT116 cell lines were cultured in DMEM supplemented with 10% (vol/vol) FBS. The SU-DHL-5 cell line was cultured in IMDM supplemented with 10% (vol/vol) FBS. MEFs were cultured in DMEM supplemented with 10% (vol/vol) heat-inactivated FBS. All the cell lines were obtained from ATCC and have been proven to be negative for mycoplasma contamination. No cell lines used in this work were listed in the ICLAC database. The cell lines were freshly thawed from the purchased seed cells and were cultured for no more than 2 months. The morphology of cell lines was checked every week and compared with the ATCC cell line image to avoid cross-contamination or misuse of cell lines. SET stable knockdown cells were generated by lentivirus-based infection of shRNA. SET cDNA was purchased from Addgene (Plasmid number 24998) and the full-length cDNA or the various fragments were sub-cloned into pWG-F-HA, pCMV-Myc or PGEX-2TL vectors. Each p53 plasmid was generated by sub-cloning human p53 cDNA (including full-length or various fragments) into pWG-F-HA, pcDNA3.1 or PGEX-2TL vectors. The point-mutation constructs (including p53-KR and -KQ) were generated by using a site-directed mutagenesis Kit (Stratagene, 200521). Introduction of the expressing construct and siRNA transfection were performed by Lipofectamine 2000 (Invitrogen, 11668-019) according to the manufacturer’s protocol. To transfer oligos into SU-DHL-5 cells, we used electroporation following the manufacturer’s protocol (Lonza PBC3-00675). The DNA damage inducer doxorubicin was used at 1 μM for 24 h. The proteasome inhibitor epoxomicin was used at 100 nM for 6 h. 
Cells were treated with TSA (1 μM) and nicotinamide (5 mM) for 6 h to inhibit HDAC activity in the assays in which p53 acetylation needed to be maintained. Ad–GFP and Ad–Cre–GFP viruses were purchased from Vector Biolabs (Catalogue numbers 1761 and 1710). To generate the knock-in mice, W4/129S6 mouse embryonic stem (ES) cells (Taconic) were electroporated with a targeting vector containing homologous regions flanking the mouse p53 exon 11, in which all 7 lysines were mutated to glutamines (p53KQ allele). A neomycin-resistance gene cassette flanked by two LoxP sites (LNL) was inserted into intron 10 to allow selection of targeted ES cell clones with G418. ES cell clones were screened by Southern blotting with EcoRI-digested genomic DNA, using a probe generated from PCR amplification in the region outside the homologous region in the targeting vector. The correctly targeted ES cell clones containing the K-to-Q mutations were injected into C57BL/6 blastocysts, which were then implanted into pseudopregnant females to generate chimaeras. Germ-line transmission was accomplished by breeding chimaeras with C57BL/6 mice. Subsequently, mice containing the targeted allele were bred with Rosa26-Cre mice to remove the LNL cassette and to generate mice with only the K-to-Q mutations. To confirm the mutations inserted in p53+/KQ mice, we sequenced p53 cDNA derived from mRNA isolated from p53+/KQ spleen. All seven K-to-Q mutations were confirmed and no additional mutations were found. The offspring were genotyped by PCR using the following primer set, forward: 5′-GGGAGGATAAACTGATTCTCAGA-3′, reverse: 5′-GATGGCTTCTACTATGGGTAGGGAT-3′. To generate a Set conditional knockout mouse, exon 2 of the Set gene was floxed and deletion of exon 2 resulted in a frameshift and the truncation of the C-terminal domain. The targeting vector of Set contained 10 kb genomic DNA spanning exon 2; a neomycin-resistance gene cassette and loxP sites were inserted flanking exon 2. 
To increase targeting frequency, a diphtheria toxin A cassette was inserted at the 3′ end of the targeting vector to reduce random integration of the modified Set genomic DNA. A new BglII restriction site was also inserted to facilitate Southern blot screening. Of the 200 mouse ES cell clones screened, eight were identified to have integrated the floxed exon 2 by Southern blot using a 5′ probe, which detects a 14-kb band for the wild-type allele and an 11-kb band for the floxed exon 2 allele (Setflox). Two of the clones were then injected into blastocysts to generate Set chimaera mice and they were bred to produce germ-line transmission of the floxed exon 2 allele. Setflox/+ mice were intercrossed to generate Set homozygous conditional knockout mice (Setflox/flox). Maintenance and experimental procedures of mice were approved by the Institutional Animal Care and Use Committee (IACUC) of Columbia University. For the in vitro peptide binding assay: equal amounts of each synthesized biotin-conjugated peptide (made as column or as batch) were incubated with highly concentrated HeLa nuclear extract (NE) or purified proteins for 1 h or overnight at 4 °C. After washing with BC100 buffer (20 mM Tris-HCl pH 7.9, 100 mM NaCl, 10% glycerol, 0.2 mM EDTA, 0.1% triton X-100) three times, the binding components were eluted in high-salt buffer (20 mM Tris-HCl pH 7.9, 1,000 mM NaCl, 1% DOC, 10% glycerol, 0.2 mM EDTA, 0.1% triton X-100) or by boiling with 1 × Laemmli buffer for further analysis. For the in vitro GST-fusion protein binding assay: Escherichia coli containing GST or GST-fusion protein expressing constructs were grown in a shaking incubator at 37 °C until the OD was about 0.6. Next 0.1 mM IPTG was added and the E. coli were incubated at 25 °C for 4 h or overnight, to induce GST or GST-fusion protein expression. 
After purification by GST·Bind Resin (Novagen, 70541), equal amounts of immobilized GST or GST-fusion proteins were incubated with other purified proteins for 1 h at 4 °C, followed by washing with BC100 buffer three times. The binding components were eluted by boiling with 1 × Laemmli buffer and were analysed by western blot. Whole cellular extracts (WCE) were prepared in BC100 buffer with sonication. Nuclear extract (NE) was prepared by sequentially lysing cells with HB buffer (20 mM Tris-HCl pH 7.9, 10 mM KCl, 1.5 mM MgCl2, 1 mM PMSF, 1 × protease inhibitor (Sigma)) for the cytosolic fraction and BC400 buffer (20 mM Tris-HCl pH 7.9, 400 mM NaCl, 10% Glycerol, 0.2 mM EDTA, 0.5% triton X-100, 1 mM PMSF, 1 × protease inhibitor) for nuclear fraction. The salt concentration of NE was adjusted to 100 mM. 2 μg of the indicated antibody (or 20 μl Flag M2 Affinity Gel (Sigma, A2220)) was added into WCE or NE and incubated overnight at 4 °C, followed by addition of 20 μl protein A/G agarose (Santa Cruz, sc-2003; only for IP with unconjugated antibodies mentioned above) for 2 h. After washing with BC100 buffer three times, the binding components were eluted using Flag peptide (Sigma, F3290), 0.1% trifluoroacetic acid (TFA, Sigma, 302031) or by boiling with 1 × Laemmli buffer, and were analysed by western blot. For preparation of Ub-p53: H1299 cells were co-transfected with p53, MDM2 and 6 × HA-Ub (human) expressing plasmids for 48 h. The cells were lysed with Flag lysis buffer (50 mM Tris-HCl pH 7.9, 137 mM NaCl, 10 mM NaF, 1 mM Na3VO4, 10% glycerol, 0.5 mM EDTA, 1% triton X-100, 0.2% sarkosyl (sodium lauroyl sarcosinate), 0.5 mM DTT, 1 mM PMSF, 1 × protease inhibitor) and total Ub-conjugated proteins were purified by anti-HA-agarose (Sigma, A2095) and eluted by 1 × HA peptide (Sigma I2149). 
For the preparation of Sumo-p53 or Nedd-p53: H1299 cells were co-transfected with p53, MDM2 (only for Nedd-p53 preparation) and 6 × His-HA-Sumo1 (human) or 6 × His-HA-Nedd8 (human) expressing plasmids for 48 h. The cells were lysed with guanidine lysis buffer (6 M guanidine-HCl, 0.1 M Na2HPO4, 6.8 mM NaH2PO4, 10 mM Tris-HCl pH 8.0, 0.2% triton-X100, freshly supplemented with 10 mM β-mercaptoethanol and 5 mM imidazole) with mild sonication. After overnight pull-down by Ni2+-NTA agarose (Qiagen 30230), the binding fractions were sequentially washed with guanidine lysis buffer, urea buffer I (8 M urea, 0.1 M Na2HPO4, 6.8 mM NaH2PO4, 10 mM Tris-HCl pH 8.0, 0.2% triton-X100, freshly supplemented with 10 mM β-mercaptoethanol and 5 mM imidazole) and urea buffer II (8 M urea, 18 mM Na2HPO4, 80 mM NaH2PO4, 10 mM Tris-HCl pH 6.3, 0.2% triton-X100, freshly supplemented with 10 mM β-mercaptoethanol and 5 mM imidazole). Precipitates were eluted in elution buffer (0.5 M imidazole, 0.125 M DTT). All purified proteins were dialysed against BC100 buffer before use in the subsequent pull-down assay. After the pull-down assay, the interaction between SET and each p53-conjugate was detected by western blot with anti-p53 (DO-1) antibody. The protein complex was separated by SDS–PAGE and stained with GelCode Blue reagent (Pierce, 24592). The visible band was cut and digested with trypsin and then subjected to liquid chromatography (LC)-MS/MS analysis. A firefly reporter (p21-Luci reporter) and a Renilla control reporter were co-transfected with indicated constructs in H1299 cells for 48 h and the relative luciferase activity was measured by dual-luciferase assay protocol (Promega, E1910). 
Highly purified p53 or SET was incubated with a 32P-labelled probe (160 bp) containing the p53-binding element of the p21 promoter in 1× binding buffer (10 mM HEPES, pH 7.6, 40 mM NaCl, 50 μM EDTA, 6.25% glycerol, 1 mM MgCl2, 1 mM spermidine, 1 mM DTT, 50 ng μl−1 BSA, 5 ng μl−1 sheared single strand salmon DNA) for 20 min at room temperature (RT). For the super-shift assay, α-p53 or α-SET antibody was pre-incubated with purified p53 and SET in the reaction system without probe for 30 min at RT and then the probe was added for a further 20 min. The complex was analysed by 4% Tris-Borate-EDTA buffer–polyacrylamide gel electrophoresis (TBE–PAGE) and visualized by autoradiography. The probe was obtained by PCR, labelled by T4 kinase (NEB, M0201S) and purified by Bio-Spin column (Bio-Rad, 732-6223). Cells were fixed with 1% formaldehyde for 10 min at room temperature and lysed with ChIP lysis buffer (50 mM Tris-HCl pH 8.0, 5 mM EDTA, 1% SDS, 1× protease inhibitor) for 10 min at 4 °C. After sonication, the lysates were centrifuged, and the supernatants were collected and pre-cleaned by salmon sperm DNA saturated protein A agarose (Millipore, 16-157) in dilution buffer (20 mM Tris-HCl pH 8.0, 2 mM EDTA, 150 mM NaCl, 1% triton X-100, 1× protease inhibitor) for 1 h at 4 °C. The pre-cleaned lysates were aliquoted equally and incubated with indicated antibodies overnight at 4 °C. Saturated protein A agarose was added into each sample and incubated for 2 h at 4 °C. The agarose was washed with TSE I (20 mM Tris-HCl pH 8.0, 2 mM EDTA, 150 mM NaCl, 0.1% SDS, 1% triton X-100), TSE II (20 mM Tris-HCl pH 8.0, 2 mM EDTA, 500 mM NaCl, 0.1% SDS, 1% triton X-100), buffer III (10 mM Tris-HCl pH 8.0, 1 mM EDTA, 0.25 M LiCl, 1% DOC, 1% NP40), and buffer TE (10 mM Tris-HCl pH 8.0, 1 mM EDTA), sequentially. The binding components were eluted in 1% SDS and 0.1 M NaHCO3 and reverse cross-linkage was performed at 65 °C for at least 6 h. 
DNA was extracted using the PCR purification Kit (Qiagen, 28106). Real-time PCR was performed to detect relative enrichment of each protein or modification on indicated genes. Approximately 10^5 MEFs or U2OS cells, as indicated in each figure, were seeded into 6-well plates with three replicates. Their cell growth was monitored on consecutive days, as indicated, by using the Countess automated cell counter (Invitrogen) or by staining with 0.1% crystal violet. For quantitative analysis of the crystal violet staining, the crystal violet was extracted from cells using 10% acetic acid and the relative cell number was measured by detecting the absorbance at 590 nm. A total of 10^6 HCT116-derived cells, as indicated in each figure, were mixed with Matrigel (Corning, 354248) in a 1:1 ratio in a total volume of 200 μl. The cell–matrix complex was subcutaneously injected into nude mice (NU/NU; 8 weeks old; female; strain 088; Charles River). After 3 weeks, the mice were killed and the weight of the tumours was measured. The experimental procedures were approved by the Institutional Animal Care and Use Committee (IACUC) of Columbia University. None of the experiments exceeded the limit for tumour burden (10% of total bodyweight or 2 cm in diameter). Total RNA was extracted by TRIzol (Invitrogen, 15596-026) and precipitated in ethanol. 1 μg of total RNA was reverse transcribed into cDNA using the SuperScript III First-Strand Synthesis SuperMix (Invitrogen, 11752-50). The relative expression of each target was measured by qPCR and the data were normalized by the relative expression of GAPDH or ActB. FFPE sections of mouse brain tissue samples were stained with indicated antibodies and visualized by DAB exposure. The Flag-tagged p53 or SET construct was transfected into H1299 cells for 48 h and the cells were lysed in Flag lysis buffer. After centrifugation, the Flag M2 Affinity Gel was added to supernatant and incubated for 1 h at 4 °C. 
After washing with Flag lysis buffer six times, the purified proteins were eluted with Flag peptide. For purification of acetylated p53, the CBP construct was co-transfected with the p53 vector for 48 h. TSA and nicotinamide were added into the medium for the last 6 h and the cells were harvested in Flag lysis buffer supplemented with TSA and nicotinamide. The C-terminal unacetylated p53 was removed by p53-PAb421 antibody and then the acetylated p53 was purified as described above. 0.5 μg recombinant H3 was incubated with 20 ng purified p300 in 1× HAT buffer (50 mM Tris-HCl, pH 7.9; 1 mM DTT; 10 mM sodium butyrate, 10% glycerol) containing 0.1 mM Ac-CoA for 30 min at 30 °C. After the reaction, the products were assayed by western blot with indicated antibodies. To measure the effect of SET on p300-mediated H3 acetylation, H3 and purified SET (1 μg) were pre-incubated in 1× HAT buffer for 20 min at room temperature before addition of the other components (p300 and Ac-CoA) for the subsequent in vitro acetylation assay. Cells were transfected with constructs expressing Cas9-D10A (Nickase) and control sgRNAs or sgRNAs targeting p53 exon 3 (Santa Cruz: sc-437281 for control; sc-416469-NIC for targeting of p53). After 48 h of transfection, cells were suspended, diluted and re-seeded to ensure single clone formation. More than 30 clones were picked and the expression of p53 in each single clone was evaluated by western blot with both α-p53 (DO-1) and α-p53 (FL-393) antibodies. Positive clones were further verified by sequencing the genomic DNA to make sure that functional genomic editing had occurred (insertion or deletion-mediated frame-shift of the p53 open reading frame (ORF)). Two (U2OS) or three (HCT116) clones were finally selected for subsequent experiments. The p53 knockout-mediated effect was verified to be reproducible in these independent clones. 
The targeting sequences of p53 loci for the sgRNAs were: 1) TTGCCGTCCCAAGCAATGGA; 2) CCCCGGACGATATTGAACAA. U2OS (CRISPR Ctr or CRISPR p53-KO) cells were transfected with control siRNA or SET-specific siRNA (three oligos) for 4 days. Each sample group had at least two biological replicates. Total RNA was prepared using TRIzol (Invitrogen, 15596-026). The RNA quality was evaluated by Bioanalyzer (Agilent), which confirmed that the RIN was > 8. Before performing RNA-seq analysis, a small aliquot of each sample was analysed by RT–qPCR to confirm SET knockdown efficiency. RNA-seq analysis was performed at the Columbia Genome Center. Specifically, from total RNA samples, mRNAs were enriched by poly-A pull-down and then processed for library preparation by using the Illumina TruSeq RNA prep kit (Illumina RS-122-2001). Libraries were then sequenced using the Illumina HiSeq2000. Samples were multiplexed in each lane and yielded the targeted number of single-end 100-bp reads for each sample. RTA (Illumina) was used for base calling and bcl2fastq (version 1.8.4) was used for converting BCL to fastq format, coupled with adaptor trimming. Reads were mapped to a reference genome (Human: NCBI/build37.2) using TopHat (version 2.0.4). Relative abundance of genes and splice isoforms was determined using Cufflinks (version 2.0.2) with the default settings. Differentially expressed genes were tested under various conditions using DEseq, an R package based on a negative binomial distribution that models the number of reads from RNA-seq experiments and tests for differential expression. 
To further analyse the differentially expressed genes in a more reliable interval, the following filter strategies were applied: 1) the average of FPKM (Fragments per kilobase of transcript per million mapped reads) in either sample group exceeded 0.1; 2) the fold change between the CRISPR Ctr/si-Ctr group and the CRISPR Ctr/si-SET group exceeded 2; 3) the P value between the CRISPR Ctr/si-Ctr group and the CRISPR Ctr/si-SET group < 0.01. To retrieve potential p53 target genes which were repressed by SET in a p53-dependent manner, we searched the filtered RNA-seq results using the following strategies: 1) the expression level in the CRISPR Ctr/si-SET group was at least 2-fold higher than that in the CRISPR Ctr/si-Ctr group; 2) the expression level in the CRISPR Ctr/si-SET group was at least 2-fold higher than that in the CRISPR p53-KO/si-SET group. The filtered genes which were also verified as p53 target genes from the literature were collected and presented as a heatmap. For the discovery of acidic domains in the human proteome: our motif-finding algorithm initially searched for sequence motifs with a minimum acidic composition of 76% using a sliding window of 36 residues, as dictated by experimental results. Motifs found to be partially overlapping were merged into single motifs. Flanking non-acidic residues were subsequently cropped-out from the final motif. Motif discovery was carried out using the UniProt database, which contains 20,187 canonical human proteins, that have been manually annotated and reviewed. For prediction of proteins that bound acidic domain-containing proteins and were regulated by acetylation: we identified proteins that can potentially bind long acidic domains in a similar way to p53: using a K-rich region whose binding properties can be regulated by acetylation. 
We used the training set assembled in SSPKA, which combines lysine acetylation annotations from multiple resources obtained either experimentally or in the scientific literature. This dataset individually lists all annotated acetylation sites for a given protein. We generated acetylation motifs with multiple acetylation sites by clustering those sites found within a maximum distance of 11 residues in sequence. Following this, we searched for acetylation motifs with five or more lysines where at least three of them are annotated as acetylation sites. Results are shown as means ± s.d. Statistical significance was determined by using a two-tailed, unpaired Student's t-test in all figures except those described below. In Fig. 1g, significance was determined by one-way ANOVA with a Bonferroni post hoc test. In Fig. 2d and g and Extended Data Figs 2c, 3b, d, 4f and 7h, statistical significance was measured by two-way ANOVA with a Bonferroni post hoc test. All statistical analysis was performed using GraphPad Prism software. P < 0.05 was denoted as statistically significant.
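The two sequence-scanning steps described in this section (the sliding-window search for acidic domains, and the clustering of annotated acetylation sites into motifs) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, "acidic" is taken to mean aspartate (D) or glutamate (E), windows are merged only when they share at least one residue, sites are clustered by the distance between consecutive annotated positions, and lysines are counted between the first and last site of each cluster.

```python
ACIDIC = frozenset("DE")  # assumption: acidic residues are Asp (D) and Glu (E)


def find_acidic_domains(seq, window=36, min_fraction=0.76):
    """Return half-open (start, end) spans of merged, trimmed acidic windows."""
    n = len(seq)
    if n < window:
        return []
    is_acidic = [aa in ACIDIC for aa in seq]
    count = sum(is_acidic[:window])  # acidic residues in the first window
    spans = []
    for start in range(n - window + 1):
        if start > 0:
            # Slide by one residue: add the entering one, drop the leaving one.
            count += is_acidic[start + window - 1] - is_acidic[start - 1]
        if count / window >= min_fraction:
            end = start + window
            if spans and start < spans[-1][1]:
                spans[-1][1] = end  # overlaps the previous hit: merge
            else:
                spans.append([start, end])
    trimmed = []
    for start, end in spans:
        # Crop flanking non-acidic residues from both ends of the motif.
        while start < end and not is_acidic[start]:
            start += 1
        while end > start and not is_acidic[end - 1]:
            end -= 1
        trimmed.append((start, end))
    return trimmed


def acetylation_motifs(seq, sites, max_gap=11, min_lys=5, min_acetyl=3):
    """Cluster 1-based acetylation sites and keep lysine-rich clusters.

    Returns (first_site, last_site, n_lysines, n_acetylated) per motif.
    """
    clusters, current = [], []
    for pos in sorted(set(sites)):
        if current and pos - current[-1] > max_gap:
            clusters.append(current)
            current = []
        current.append(pos)
    if current:
        clusters.append(current)
    motifs = []
    for cluster in clusters:
        first, last = cluster[0], cluster[-1]
        n_lys = seq[first - 1:last].count("K")
        if n_lys >= min_lys and len(cluster) >= min_acetyl:
            motifs.append((first, last, n_lys, len(cluster)))
    return motifs
```

The defaults mirror the thresholds stated in the text (36-residue window, 76% acidic composition, 11-residue maximum distance, at least five lysines of which at least three are annotated); all other choices are illustrative.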

HSP90 inhibitors used in this study including PU-H71, PU-DZ13, NVP-AUY922, and SNX-2112 were synthesized as previously reported7, 19. 17-DMAG was purchased from Sigma. HSP90 bait (PU-H71 beads)21, HSP70 bait (YK beads)22, biotinylated YK (YK-biotin)22, fluorescently labelled PU-H71 (PU-FITC)23, the control derivatives PU-TEG and PU-FITC9 (ref. 24), and the radiolabelled PU-H71-derivative 124I-PU-H71 (ref. 25) were generated as previously described. The specificity of PU-H71 for HSP90 over other proteins was extensively analysed7. Thus binding of PU-H71 in cell homogenates, live cells and organisms denotes binding to HSP90 species characteristic of each analysed tumour or tissue. Combined with the findings that PU-H71 binds more tightly to HSP90 in type 1 than in type 2 cells, an observation true for cell homogenates, live cells, and in vivo, at the organismal level, we propose that labelled versions of PU-H71 are reliable tools to perturb, identify and measure the expression of the high-molecular-weight, multimeric HSP90 complexes in tumours. The specificity of YK probes for HSP70 was previously reported22, 26, 27, 28. Cell lines were obtained from laboratories at WCMC or MSKCC, or were purchased from the American Type Culture Collection (ATCC) or Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSMZ). Cells were cultured as per the providers’ recommended culture conditions. Cells were authenticated using short tandem repeat profiling and tested for mycoplasma. The pancreatic cancer cell lines include: ASPC-1 (CRL-1682), PL45 (CRL-2558), MiaPaCa2 (CRL-1420), SU.86.86 (CRL-1837), CFPAC (CRL-1918), Capan-2 (HTB-80), BxPc-3 (CRL-1687), HPAFII (CRL-1997), Capan-1 (HTB-79), Panc-1 (CRL-1469), Panc05.04 (CRL-2557) and Hs766t (HTB-134) (purchased from the ATCC); 931102 and 931019 are patient derived cell lines provided by Y. Janjigian, MSKCC. 
Breast cancer cell lines were obtained from ATCC and include MDA-MB-468 (HTB-132), HCC1806 (CRL-2335), MDA-MB-231 (CRM-HTB-26), MDA-MB-415 (HTB-128), MCF-7 (HTB-22), BT-474 (HTB-20), BT-20 (HTB-19), MDA-MB-361 (HTB-27), SK-Br-3 (HTB-30), MDA-MB-453 (HTB-131), T-47D (HTB-133), AU565 (CRL-2351), ZR-75-30 (CRL-1504), ZR-75-1 (CRL-1500). Lymphoma cell lines include: Akata1, Mutu-1 and Rae-1 (provided by W. Tam, WCMC); BCP-1 (CRL-2294), Daudi (CCL-213), EB1 (HTB-60), NAMALWA (CRL-1432), P3HR-1 (HTB-62), SU-DHL-6 (CRL-2959), Farage (CRL-2630), Toledo (CRL-2631) and Pfeiffer (CRL-2632) (obtained from ATCC); HBL-1, MD901 and U2932 (kindly provided by J. Angel Martinez-Climent, Centre for Applied Medical Research, Pamplona, Spain); Karpas422 (ACC-32), RCK8 (ACC-561) and SU-DHL-4 (ACC-495) (obtained from the DSMZ); OCI-LY1, OCI-LY3, OCI-LY4, OCI-LY7 and OCI-LY10 (obtained from the Ontario Cancer Institute); TMD8 (kindly provided by L. M. Staudt, NIH); BC-1 (derived from an AIDS-related primary effusion lymphoma); IBL-1 and IBL-4 (derived from an AIDS-related immunoblastic lymphoma) and BC3 (derived from a non-HIV primary effusion lymphoma). Leukaemia cell lines include: REH (CRL-8286), HL-60 (CCL-240), KASUMI-1 (CRL-2724), KASUMI-4 (CRL-2726), TF-1 (CRL-2003), KG-1 (CCL-246), K562 (CCL-243), TUR (CRL-2367), THP-1 (TIB-202), U937 (CRL-1593.2), MV4-11 (CRL-9591) (obtained from ATCC); KCL-22 (ACC-519), OCI-AML3 (ACC-582) and MOLM-13 (ACC-554) (obtained from DSMZ). The lung cancer cell lines include: NCI-H3122, NCI-H299 (provided by M. Moore, MSKCC); EBC1 (provided by Dr Mellinghoff, MSKCC); PC9 (kindly provided by D. 
Scheinberg, MSKCC), HCC15 (ACC-496) (DSMZ), HCC827 (CRL-2868), NCI-H2228 (CRL-5935), NCI-H1395 (CRL-5868), NCI-H1975 (CRL-5908), NCI-H1437 (CRL-5872), NCI-H1838 (CRL-5899), NCI-H1373 (CRL-5866), NCI-H526 (CRL-5811), SK-MES-1 (HTB-58), A549 (CCL-185), NCI-H647 (CRL-5834), Calu-6 (HTB-56), NCI-H522 (CRL-5810), NCI-H1299 (CRL-5803), NCI-H1666 (CRL-5885) and NCI-H1703 (CRL-5889) (obtained from ATCC). The gastric cancer cell lines include: MKN74 (obtained from G. Schwarz, Columbia University), SNU-1 (CRL-5971) and NCI-N87 (CRL-5822) (obtained from ATCC), OE19 (ACC-700) (DSMZ). The non-transformed cell lines MRC-5 (CCL-171), human lung fibroblast and HMEC (PCS-600-010), human mammary epithelial cells were obtained from ATCC. NIH-3T3, and NIH-3T3 cell lines stably expressing either mutant MET (Y1248H) or vSRC, were provided by L. Neckers, National Cancer Institute (NCI), USA, and were previously reported29, 30. Patient tissue was obtained with informed consent and authorized through institutional review board (IRB)-approved bio-specimen protocol number 09-121 at Memorial Sloan Kettering Cancer Centre (New York, New York). Specimens were treated for 24 h or 48 h with the indicated concentrations of PU-H71 as previously described31. Following treatment, slices were fixed in 4% formalin solution for 1 h, then stored in 70% ethanol. For tissue analysis, slices were embedded in paraffin, sectioned, slide-mounted, and stained with haematoxylin and eosin (H&E). Apoptosis and necrosis of the tumour cells (as percentage) was assessed by reviewing all the H&E slides of the case (controls and treated ones) in toto, blindly, allowing for better estimation of the overall treatment effect to the tumour. In addition, any effects to precursor lesions (if present) and any off-target effects to benign surrounding tissue, were analysed. 
Tissue slides were assessed blindly by a breast cancer pathologist who determined the apoptotic events in the tumour, as well as any effect on adjacent normal tissue31. Cryopreserved primary AML samples were obtained with informed consent and Weill Cornell Medical College IRB approval (IRB number 0910010677 and IRB number 0909010629). Samples were thawed and cultured for in vitro treatment as described previously32. The microdose 124I-PU-H71 PET-CT (Dunphy, M. PET imaging of cancer patients using 124I-PUH71: a pilot study available from: http://clinicaltrials.gov; NCT01269593) and phase I PU-H71 therapeutic (Gerecitano, J. The first-in-human phase I trial of PU-H71 in patients with advanced malignancies available from: http://clinicaltrials.gov; NCT01393509) studies were approved by the institutional review board (protocols 10-139 and 11-041, respectively), and conducted under an exploratory investigational new drug (IND) application approved by the US Food and Drug Administration. Patients provided signed informed consent before participation. 124I-PU-H71 tracer was synthesized in-house by the institutional cyclotron core facility at high specific activity. For PU-PET, research PET-CT was performed using an integrated PET-CT scanner (Discovery DSTE, General Electric). CT scans for attenuation correction and anatomic coregistration were performed before tracer injection. Patients received 185 megabecquerel (MBq) of 124I-PU-H71 by peripheral vein over two minutes. PET data were reconstructed using a standard ordered subset expectation maximization iterative algorithm. Emission data were corrected for scatter, attenuation, and decay. 124I-PU-H71 scans (PU-PET) were performed at 24 h after tracer administration. Each picture shown in Fig. 4c and Extended Data Fig. 6a is a scan taken of an individual patient. PET window display intensity scales for FDG and PU-PET fusion PET-CT images are given for both PU-PET and FDG-PET. 
Numbers in the scale bar indicate upper and lower SUV thresholds that define pixel intensity on PET images. The phase I trial included patients with solid tumours and lymphomas who had undergone prior treatment and currently had no curative treatment options. Patient cohorts were treated with PU-H71 at escalating dose levels determined by a modified continuous reassessment model. Each patient was treated with his or her assigned dose of PU-H71 on day 1, 4, 8, and 11 of each 21-day cycle. Human embryonic stem cells (hESCs) were differentiated with a modified dual-SMAD inhibition protocol towards floor plate-based midbrain dopaminergic (mDA) neurons as described previously33. hESCs were maintained on mouse embryonic fibroblasts and passaged with Dispase (STEMCELL Technologies). For each differentiation, hESCs were harvested with Accutase (Innovative Cell Technology). At day 30 of differentiation, hESC-derived mDA neurons were replated and maintained on dishes precoated with polyornithine (PO; 15 μg ml−1), laminin (1 μg ml−1), and fibronectin (2 μg ml−1) in Neurobasal/B27/l-glutamine-containing medium (NB/B27; Life Technologies) supplemented with 10 μM Y-27632 (until day 32) and with BDNF (brain-derived neurotrophic factor, 20 ng ml−1; R&D), ascorbic acid (AA; 0.2 mM, Sigma), GDNF (glial cell line-derived neurotrophic factor, 20 ng ml−1; R&D), TGFβ3 (transforming growth factor type β3, 1 ng ml−1; R&D), dibutyryl cAMP (0.5 mM; Sigma), and DAPT (10 nM; Tocris). Two days after replating, mDA neurons were treated with 1 μg ml−1 mitomycin C (Tocris) for 1 h to kill any remaining non-post mitotic contaminants. Assays were performed at day 65 of neuron differentiation. The PU-FITC assay was performed as previously described7, 23. Briefly, cells were incubated with 1 μM PU-FITC at 37 °C for 4 h. Then cells were washed twice with FACS buffer (PBS/0.5% FBS), and resuspended in FACS buffer containing 1 μg ml−1 DAPI. 
HL-60 cells were used as internal control to calculate fold binding for all cell lines tested. The mean fluorescence intensity (MFI) of PU-FITC in treated viable cells (DAPI negative) was evaluated by flow cytometry. For primary AML specimens, cells were also stained with anti-CD45-APC-H7, to identify blasts and lymphocyte populations (BD biosciences). Blasts and lymphocyte populations were gated based on SSC versus CD45. The fold PU-FITC binding of leukaemic blasts (CD45dim) was calculated relative to lymphocytes (CD45hiSSClow). The FITC derivative FITC9 was used as a negative control. Cells were seeded on coverslips in 6-well plate and cultured overnight. Cells were treated with 1 μM PU-FITC or negative control (PU-FITC9, an HSP90 inert PU-H71 derivative labelled with FITC). At 4 h post-treatment, cells were fixed with 4% formaldehyde at room temperature for 30 min, and the coverslips were mounted on slides with DAPI-Fluoromount-G Mounting Media (Southern Biotech). The images were captured using EVOS FL Auto imaging system (ThermoFisher Scientific) or a confocal microscope (Zeiss LSM5). Cells were seeded on coverslips and cultured overnight. Cells were fixed with 4% formaldehyde at room temperature for 30 min, washed three times with PBS, and permeabilized with 0.2% Triton X-100 in blocking buffer (PBS/5% BSA) for 10 min. Cells were incubated in blocking buffer for 30 min, and then incubated with rabbit anti-human HSP90α antibody (1:500, Abcam 2928) and mouse anti-human HSP90β (1:500, Stressmarq H9010), or rabbit and mouse normal IgG, in blocking buffer for 1 h. Cells were washed three times with PBS, and incubated with goat anti-mouse Alexa Fluor 568 and goat anti-rabbit Alexa Fluor 488 (1:1,000, ThermoFisher Scientific) in blocking buffer in the dark for 1 h. Cells were then washed three times with PBS, and the coverslips were removed from the plate, and mounted on slides with DAPI-Fluoromount-G Mounting Media (Southern Biotech). 
The images were captured using EVOS FL Auto imaging system (ThermoFisher Scientific) or a confocal microscope (Zeiss LSM5). Fluorescence intensity was quantified by the integrated density algorithm as implemented in ImageJ. Assays were carried out in black 96-well microplates (Greiner Microlon Fluotrac 200). A stock of 10 μM PU-FITC (or GM-cy3B34) was prepared in DMSO and diluted with Felts buffer (20 mM Hepes (K), pH 7.3, 50 mM KCl, 2 mM DTT, 5 mM MgCl2, 20 mM Na2MoO4, and 0.01% NP40 with 0.1 mg ml−1 BGG). To each well was added the fluorescent dye-labelled HSP90 ligand (3 nM PU-FITC or 6 nM GM-cy3B), and cell lysates (7.5 μg) in a final volume of 100 μl Felts buffer. For each assay, background wells (buffer only), and tracer controls (PU-FITC only) were included on each assay plate. To determine the equilibrium binding of GM-cy3B, increasing amounts of lysate (up to 20 μg of total protein) were incubated with tracer. The assay plate was placed on a shaker at room temperature for 60 min and the FP values in mP were measured every 5 min. At time t = 60 min, dissociation of fluorescent ligand was initiated by adding 1 μM PU-H71 in Felts buffer to each well and then placing the assay plate on a shaker at room temperature and measuring the FP values in mP every 5 min. The assay window was calculated as the difference between the FP value recorded for the bound fluorescent tracer and the FP value recorded for the free fluorescent tracer (defined as mP − mPf). Measurements were performed on a Molecular Devices SpectraMax Paradigm instrument (Molecular Devices, Sunnyvale, CA), and data were imported into SoftMaxPro6 and analysed in GraphPad Prism 5. To identify and separate chaperome complexes in tumours, and to overcome the limitations of classical protein chromatography methods for resolving complexes of similar composition and size, we took advantage of a capillary-based platform that combines isoelectric focusing (IEF) with immunoblotting capabilities35. 
This methodology uses an immobilized pH gradient to separate native multimeric protein complexes based on their isoelectric point (pI), and allows for subsequent probing of immobilized complexes with specific antibodies. The method uses only minute amounts of sample, thus enabling the interrogation of primary specimens. Cultured cells were lysed in 20 mM HEPES pH 7.5, 50 mM KCl, 5 mM MgCl2, 0.01% NP40, 20 mM Na2MoO4 buffer, containing protease and phosphatase inhibitors. Primary specimens were lysed in either Bicine-Chaps or RIPA buffers (ProteinSimple). Total protein assay was performed on an automated system, NanoPro 1000 Simple Western (ProteinSimple), for charge-based separation. Briefly, total cell lysates were diluted to a final protein concentration of 250 ng μl−1 using a master mix containing 1× Premix G2 pH 3-10 separation gradient (ProteinSimple) and 1× isoelectric point standard ladders (ProteinSimple). Samples diluted in this manner maintained their native charge state, and were loaded into capillaries (ProteinSimple) and separated based on their isoelectric points at a constant power of 21,000 μWatts for 40 min. Immobilization was performed by UV-light embedded in the Simple Western system, followed by incubations with anti-HSP90β (SMC-107A, StressMarq Biosciences), anti-HSP90α (ab2928, Abcam), anti-HSP70 (SPA-810, Enzo), AKT (4691), P-AKT (9271) or BCL2 (2872) from Cell Signaling Technology and subsequently with HRP-conjugated anti-Mouse IgG (1030-05, SouthernBiotech) or with HRP-conjugated anti-Rabbit IgG (4010-05, SouthernBiotech). Protein signals were quantitated by chemiluminescence using SuperSignal West Dura Extended Duration Substrate (Thermo Scientific), and digital imaging and associated software (Compass) in the Simple Western system, resulting in a gel-like representation of the chromatogram. This representation is shown for each figure. 
Protein was extracted from cultured cells in 20 mM Tris pH 7.4, 150 mM NaCl, 1% NP-40 buffer with protease and phosphatase inhibitors added (Complete tablets and PhosSTOP EASYpack, Roche). Ten to fifty μg of total protein was subjected to SDS–PAGE, transferred onto nitrocellulose membrane, and incubated with indicated antibodies. HSP90β (SMC-107) and HSP110 (SPC-195) antibodies were purchased from Stressmarq; HER2 (28-0004) from Zymed; HSP70 (SPA-810), HSC70 (SPA-815), HIP (SPA-766), HOP (SRA-1500), and HSP40 (SPA-400) from Enzo; HSP90β (ab2927), HSP90α (ab2928), p23 (ab2814), GAPDH (ab8245) and AHA1 (ab56721) from Abcam; cleaved PARP (G734A) from Promega; CDC37 (4793), CHIP (2080), EGFR (4267), S6K (2217), phospho-S6K (S235/236) (4858), P-AKT (S473) (9271), AKT (4691), P-ERK (T202/Y204) (4377), ERK (4695), MCL1 (5453), Bcl-XL (2764), BCL2 (2872), c-MYC (5605) and HER3 (4754) from Cell Signaling Technology; and β-actin (A1978) from Sigma-Aldrich. The blots were washed with TBS/0.1% Tween 20 and incubated with appropriate HRP-conjugated secondary antibodies. Chemiluminescent signal was detected with Enhanced Chemiluminescence Detection System (GE Healthcare) following the manufacturer’s instructions. We screened a panel of anti-chaperome antibodies for those that interacted with the target protein in its native form. We reasoned that these antibodies were more likely to capture stable multimeric forms of the chaperome members. These native-cognate antibodies were used in native-PAGE and IEF analyses of chaperome complexes. HSP90β (SMC-107) and HSP110 (SPC-195) antibodies were purchased from Stressmarq; HSP70 (SPA-810), HSC70 (SPA-815), HOP (SRA-1500), and HSP40 (SPA-400) from Enzo; HSP90β (ab2927), HSP90α (ab2928), and AHA1 (ab56721) from Abcam; CDC37 (4793) from Cell Signaling Technology. Cells were lysed in 20 mM Tris pH 7.4, 20 mM KCl, 5 mM MgCl2, 0.01% NP40, and 10% glycerol buffer by a freeze-thaw procedure. 
Primary samples were lysed in either Bicine-Chaps or RIPA buffers (ProteinSimple). Twenty-five to one hundred μg of protein was loaded onto 4–10% native gradient gel and resolved at 4 °C. The gels were immunoblotted as described above following either incubation in Tris-Glycine-SDS running buffer for 15 min before transfer in regular transfer buffer for 1 h, or directly transferred in 0.1% SDS-containing transfer buffer for 1 h. Cells were plated at 1 × 10⁶ per 6-well plate and transfected with an siRNA against human AHA1 (AHSA1; 5′-TTCAAATTGGTCCACGGATAA-3′), HSP90α (HSP90AA1; no. 1 5′-ATGGCATGACAACTACTTTAA-3′; no. 2 5′-AACCCTGACCATTCCATTATT-3′; no. 3 5′-TGCACTGTAAGACGTATGTAA-3′), HSP90β (HSP90AB1; no. 1 5′-CAAGAATGATAAGGCAGTTAA-3′; no. 2 5′-TACGTTGCTCACTATTACGTA-3′; no. 3 5′-CAGAAGACAAGGAGAATTACA-3′), HSP90α/β (no. 1 5′-CAGAATGAAGGAGAACCAGAA-3′, no. 2 5′-CACAACGATGATGAACAGTAT-3′), HSP110 (HSPH1; 5′-AGGCCGCTTTGTAGTTCAGAA-3′) from Qiagen or HOP (STIP1) (Dharmacon; M-019802-01), or a negative control (scramble; 5′-CAGGGTATCGACGATTACAAA-3′) with Lipofectamine RNAiMAX reagent (Invitrogen), incubated for 72 h and subjected to further analysis. Total mRNA was isolated using TRIzol Reagent (Invitrogen) following the manufacturer’s recommended protocol. Reverse transcription of mRNA into cDNA was performed using QuantiTect Reverse Transcription Kit (Qiagen). qRT–PCR was performed using PerfeCTa SYBR (Quanta Bioscience), 10 nM AHSA1 (forward: 5′-GCGGCCGCTTCTAGTAGTTT-3′ and reverse: 5′-CATCTCTCTCCGTCCAGTGC-3′) and GAPDH (forward: 5′-CAAAGGCACAGTCAAGGCTGA-3′ and reverse: 5′-TGGTGAAGACGCCAGTAGATT-3′) primers, or 1× QuantiTect Primers for HSP110 (HSPH1), HSP90α (HSP90AA1), HSP90β (HSP90AB1), HSP70 (HSPA1A), HOP (STIP1) (Qiagen) following recommended PCR cycling conditions. Melting curve analysis was performed to ensure product uniformity. To investigate which of the two HSP70 paralogues is involved in epichaperome formation we performed immunodepletions with HSP70 and HSC70 antibodies. 
Protein lysates were immunoprecipitated consecutively three times with either an HSP70 (Enzo, SPA-810), HSC70 (Enzo, SPA-815) or HOP (kindly provided by M. B. Cox, University of Texas at El Paso) antibody, or with normal antibody of the same species as a negative control (Santa Cruz). The resulting supernatant was collected and run on a native or a denaturing gel. Tumour lysates were mixed with 10 M urea (dissolved in Felts buffer) to reach the indicated final concentrations of 2 M, 4 M and 6 M. After incubation for 10 min at room temperature or frozen overnight at −80 °C, the lysates were loaded onto 4–10% native gradient gel and resolved at 4 °C or applied to the IEF capillary. The HSP90β bands were detected by using antibody purchased from StressMarq (SMC-107). A lentiviral vector expressing the MYC shRNA, as previously described36, was requested from Addgene (Plasmid 29435, c-MYC shRNA sequence: GACGAGAACAGTTGAAACA). Viruses were prepared by co-transfecting the shRNA vector, the packaging plasmid psPAX2 and the envelope plasmid pMD2.G into HEK293 cells. OCI-LY1 cells were then infected with lentiviral supernatants in the presence of 4 μg ml⁻¹ polybrene for 24 h. Following flow cytometry selection for positive cells, cells were expanded for further experiments. The MYC protein level was confirmed at 10 days post-infection by western blot using the anti-MYC antibody (Cell Signaling Technology, 5605). Viruses were prepared by co-transfection of the lentiviral vector expressing the MYC shRNA with pLM-mCerulean-2A-cMyc (Addgene, 23244) or pCDH-puro-cMYC (Addgene, 46970), the packaging plasmid psPAX2, and the envelope plasmid pMD2.G into HEK293 cells. ASPC1 cells were then infected with lentiviral supernatants in the presence of 4 μg ml⁻¹ polybrene for 24 h and sorted for mCerulean positive cells or selected with puromycin treatment. Changes in cell size after infection were monitored by analysing the forward scatter (FSC) of intact cells via flow cytometry. 
MYC protein levels were analysed at 4 days post-infection by western blot. Whole cell extracts were prepared by homogenizing cells in RIPA buffer (20 mM HEPES pH 7.5, 150 mM NaCl, 1% NP40, 0.25% sodium deoxycholate, 10% glycerol, protease inhibitors). MYC activity was determined using the TransAM c-Myc Kit (Active Motif, 43396), following the manufacturer’s instructions. Cell viability was assessed using CellTiter-Glo Luminescent Cell Viability Assay (Promega) after a 72 h PU-H71 treatment. The method determines the number of viable cells in culture based on quantification of the ATP present, which signals the presence of metabolically active cells, and was performed as previously reported37. For the annexin V staining, cells were labelled with Annexin V-PE and 7AAD after PU-H71 treatment for 48 h, as previously reported38. The necrotic cells were defined as annexin V+/7AAD+, and the early apoptotic cells were defined as annexin V+/7AAD−. The LDH assay relies on the fact that lactate dehydrogenase (LDH) is released into the culture medium only upon cell death. Following indicated treatment, the culture medium was collected and centrifuged to remove living cells and cell debris. The collected medium was incubated at room temperature for 30 min with the Cytotox-96 Non-radioactive Assay kit (Promega) LDH substrate. All animal studies were conducted in compliance with MSKCC’s Institutional Animal Care and Use Committee (IACUC) guidelines. Female athymic nu/nu mice (NCRNU-M, 20–25 g, 6 weeks old) were obtained from Harlan Laboratories and were allowed to acclimatize at the MSKCC vivarium for 1 week before implanting tumours. Mice were provided with food and water ad libitum. Tumour xenografts were established on the forelimbs for PET imaging and on the flank for efficacy studies. 
Tumours were initiated by sub-cutaneous injection of 1 × 10⁷ cells for MDA-MB-468 and 5 × 10⁶ for ASPC1 in a 200 μl cell suspension of a 1:1 v/v mixture of PBS with reconstituted basement membrane (BD Matrigel, Collaborative Biomedical Products). Before administration, a solution of PU-H71 was formulated in citrate buffer. Sample size was chosen empirically based on published data39. No statistical methods were used to predetermine sample size. Animals were randomly assigned to groups. Studies were not conducted blinded. Imaging was performed with a dedicated small-animal PET scanner (Focus 120 microPET; Concorde Microsystems, Knoxville, TN). Mice were maintained under 2% isoflurane (Baxter Healthcare, Deerfield, IL) anaesthesia in oxygen at 2 litres per min during the entire scanning period. To reduce the thyroid uptake of free iodide arising from metabolism of tracer, mice received 0.01% potassium iodide solution in their drinking water starting 48 h before tracer administration. For PET imaging, each mouse was administered 9.25 MBq (250 μCi) of ¹²⁴I-PU-H71 via the tail vein. List-mode data (10 to 30 min acquisitions) were obtained for each animal at various time points post-tracer administration. An energy window of 420–580 keV and a coincidence timing window of 6 ns were used. The resulting list-mode data were sorted into 2-dimensional histograms by Fourier rebinning; transverse images were reconstructed by filtered back projection (FBP). The image data were corrected for non-uniformity of scanner response, dead-time count losses, and physical decay to the time of injection. There was no correction applied for attenuation, scatter, or partial-volume averaging. The measured reconstructed spatial resolution of the Focus 120 is 1.6-mm FWHM at the centre of the field of view. 
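The correction for physical decay to the time of injection mentioned above can be illustrated with a minimal sketch. The half-life value for iodine-124 (about 4.18 days) and the function name `decay_correct` are assumptions introduced for illustration; the text does not state which value or software routine was used.

```python
import math

# Assumed physical half-life of iodine-124, in hours (about 4.18 days).
# This value is not given in the text.
I124_HALF_LIFE_H = 4.176 * 24.0


def decay_correct(measured, hours_since_injection, half_life_h=I124_HALF_LIFE_H):
    """Scale a measured activity (or count rate) back to the time of injection."""
    decay_constant = math.log(2.0) / half_life_h
    return measured * math.exp(decay_constant * hours_since_injection)
```

Under these assumptions, a signal measured exactly one half-life after injection is doubled by the correction, and a signal measured at the time of injection is left unchanged.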
Region of interest (ROI) analysis of the reconstructed images was performed using ASIPro software (Concorde Microsystems, Knoxville, TN), and the maximum pixel value was recorded for each tissue/organ ROI. A system calibration factor (that is, μCi per ml per cps per voxel) that was derived from reconstructed images of a mouse-size water-filled cylinder containing ¹⁸F was used to convert the ¹²⁴I voxel count rates to activity concentrations (after adjustment for the ¹²⁴I positron branching ratio). The resulting image data were then normalized to the administered activity to parameterize the microPET images in terms of per cent injected dose per gram (%ID per g) (corrected for decay of ¹²⁴I to the time of injection). Post-reconstruction smoothing was applied only for visual representation of images in the figures. Upon euthanasia, radioactivity (¹²⁴I) was measured in a gamma-counter (Perkin Elmer 1480 Wizard 3 Auto Gamma counter) using a 400–600 keV energy window. Count data were background- and decay-corrected to the time of injection, and the per cent injected dose per gram (%ID per g) for each tumour sample was calculated using a calibration curve to convert counts to radioactivity, followed by normalization to the total activity injected. Mice (n = 5) bearing MDA-MB-468 or ASPC1 tumours reaching a volume of 100–150 mm³ were treated i.p. using PU-H71 (75 mg per kg) or vehicle, on a 3 times per week schedule, as indicated. Tumour volume (in mm³) was determined by measurement with Vernier calipers, and was calculated as the product of its length × width² × 0.5. Tumour volume was expressed on indicated days as the median tumour volume ± s.d. indicated for groups of mice. Mice were euthanized after similar PU-H71 treatment periods, and at a time before tumours reached a size that resulted in discomfort or difficulty in physiological functions of mice in the individual treatment group, in accordance with our IACUC protocol. 
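The two calculations described above, the caliper-based tumour volume and the per cent injected dose per gram, can be written out as a short sketch. The function and parameter names are hypothetical, and the %ID per g helper assumes its activity inputs are already background- and decay-corrected and expressed in the same unit.

```python
def tumour_volume_mm3(length_mm, width_mm):
    """Tumour volume as length x width^2 x 0.5, with both inputs in mm."""
    return length_mm * width_mm ** 2 * 0.5


def percent_id_per_gram(sample_activity, injected_activity, sample_mass_g):
    """Per cent injected dose per gram of tissue.

    Assumes both activities are decay-corrected to the time of injection
    and given in the same unit (for example, uCi).
    """
    return 100.0 * sample_activity / injected_activity / sample_mass_g
```

For example, a tumour measuring 10 mm by 6 mm gives `tumour_volume_mm3(10.0, 6.0)` = 180 mm³, which would sit just above the 100–150 mm³ range at which treatment was started.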
Frozen tissue was dried and weighed before homogenization in acetonitrile/H₂O (3:7). PU-H71 was extracted in methylene chloride, and the organic layer was separated and dried under vacuum. Samples were reconstituted in mobile phase. The concentrations of PU-H71 in tissue or plasma were determined by high-performance LC-MS/MS. PU-H71-d was added as the internal standard40. Compound analysis was performed on the 6410 LC-MS/MS system (Agilent Technologies) in multiple reaction monitoring mode using positive-ion electrospray ionization. For tissue samples, a Zorbax Eclipse XDB-C18 column (2.1 × 50 mm, 3.5 μm) was used for the LC separation, and the analyte was eluted under an isocratic condition (80% H₂O + 0.1% HCOOH: 20% CH₃CN) for 3 min at a flow rate of 0.4 ml min⁻¹. For plasma samples, a Zorbax Eclipse XDB-C18 column (4.6 × 50 mm, 5 μm) was used for the LC separation, and the analyte was eluted under a gradient condition (H₂O + 0.1% HCOOH:CH₃CN, 95:5 to 70:30) at a flow rate of 0.35 ml min⁻¹. Protein extracts were prepared either in 20 mM HEPES pH 7.5, 50 mM KCl, 5 mM MgCl₂, 1% NP40, and 20 mM Na₂MoO₄ for PU-H71 beads pull-down, or in 20 mM Tris pH 7.4, 150 mM NaCl, and 1% NP40 for YK beads pull-down. Samples were incubated with the PU-H71 beads (HSP90 bait) for 3–4 h or with the YK beads (HSP70 bait, for chemical precipitation) overnight, at 4 °C, then washed and subjected to SDS–PAGE with subsequent immunoblotting and western blot analysis. For HSP70 proteomic analyses, cells were incubated with a biotinylated YK-derivative, YK-biotin. Briefly, MDA-MB-468 cells were treated for 4 h with 100 μM biotin-YK5 or d-biotin as a negative control. Cells were collected and lysed in 20 mM Tris pH 7.4, 150 mM NaCl, and 1% NP40 buffer. Protein extracts were incubated with streptavidin agarose beads (Thermo Scientific) for 1 h at 4 °C, washed with 20 mM Tris pH 7.4, 150 mM NaCl, and 0.1% NP40 buffer and applied onto SDS–PAGE. 
The gels were stained with SimplyBlue Coomassie stain (Invitrogen Life Science Technologies). Proteomic analyses were performed using the published protocol7, 18, 22. Control beads contained an inert molecule as previously described7, 18, 22. Affinity-purified protein complexes from type 1 tumours (n = 6; NCI-H1975, MDA-MB-468, OCI-LY1, Daudi, IBL1, BC3), type 2 tumours (n = 3; ASPC1, OCI-LY4, Ramos) and from non-transformed cells (n = 3; MRC5, HMEC and neurons) were resolved using SDS-polyacrylamide gel electrophoresis, followed by staining with colloidal, SimplyBlue Coomassie stain (Invitrogen Life Science Technologies) and excision of the separated protein bands. Control beads that contained an inert molecule were subjected to the same steps as PU-H71 and YK beads and served as a control experiment. To ensure that we captured a majority of the HSP90 complexes in each cell type, we performed these studies under conditions of HSP90-bait saturation. The number of gel sections per lane averaged to be 14. In situ trypsin digestion of gel bound proteins, purification of the generated peptides and LC–MS/MS analysis were performed using our published protocols7, 18, 22. After the acquisition of raw files, Proteowizard (version 3.0.3650)41 was used to create a Mascot Generic Format (mgf) file containing accurate mass for each peak and its corresponding ms2 ions. Each mgf was then subjected to search a human segment of Uniprot protein database (20,273 sequences, European Bioinformatics Institute, Swiss Institute of Bioinformatics and Protein Information Resource) using Mascot (Matrix Science; version 2.5.0; http://www.matrixscience.com). Decoy proteins were added to the search to allow for the calculation of false discovery rates (FDR). 
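The role of the decoy proteins can be illustrated with a minimal target-decoy sketch. This is a generic estimate (decoy hits divided by target hits above a score threshold), not the exact calculation performed by Mascot or Scaffold, and the function name and input layout are hypothetical.

```python
def decoy_fdr(matches, threshold):
    """Estimate FDR as decoy hits / target hits at or above a score threshold.

    `matches` is an iterable of (score, is_decoy) pairs; this layout is an
    illustrative assumption, not a search-engine output format.
    """
    targets = sum(1 for score, is_decoy in matches
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches
                 if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0  # no accepted target hits, so nothing to be falsely discovered
    return decoys / targets
```

Raising the threshold generally removes decoy hits faster than target hits, which is how a score cut-off corresponding to a chosen FDR (for example 1%) can be located.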
The search parameters were as follows: (i) two missed cleavage tryptic sites were allowed; (ii) precursor ion mass tolerance = 10 p.p.m.; (iii) fragment ion mass tolerance = 0.8 Da; and (iv) variable protein modifications were allowed for methionine oxidation, deamidation of asparagine and glutamines, cysteine acrylamide derivatization and protein N-terminal acetylation. MudPit scoring was typically applied using significance threshold score P < 0.01. Decoy database search was always activated and, in general, for merged LC–MS/MS analysis of a gel lane with P < 0.01, false discovery rate averaged around 1%. The Mascot search result was finally imported into Scaffold (Proteome Software, Inc.; version 4_4_1) to further analyse tandem mass spectrometry (MS/MS) based protein and peptide identifications. An X! Tandem search (The GPM, http://thegpm.org; version CYCLONE (2010.12.01.1)) was then performed and its results were merged with those from Mascot. The two search engine results were combined and displayed at 1% FDR. Protein and peptide probability was set at 95% with a minimum peptide requirement of 1. Protein identifications were expressed as Exclusive Spectrum Counts that identified each protein listed. Primary data, such as raw mass spectrometry files, Mascot generic format files and proteomics data files created by Scaffold have been deposited onto the MassIVE site (https://massive.ucsd.edu/ProteoSAFe/static/massive.jsp; MassIVE Accession ID: MSV000079877). In each of the Scaffold files that validate and import Mascot searched files, peptide matches, scoring information (Mascot, as well as X! Tandem search scores) for peptide and protein identifications, MS/MS spectra, protein views with sequence coverage and more, can be easily accessed. To read the Scaffold files, free viewer software can be found at (http://www.proteomesoftware.com/products/free-viewer/). 
Peptide matches and scoring information that demonstrate the data processing are available in Supplementary Table 1f–q. The exclusive spectrum count values, an alternative for quantitative proteomic measurements42, were used for protein analyses. CHIP and PP5 were examined and used as internal quality controls among the samples. Statistics were performed using R (version 3.1.3) limma package43, 44. To enable further analyses, entries with zero spectral counts were assigned an arbitrary small value of 0.1. The data were then log10-transformed for analysis. Linear models were fit to the transformed data and moderated standard errors were calculated using empirical Bayesian methods. For Fig. 1f and Extended Data Fig. 5a, a moderated t-statistic was used to compare protein enrichment between type 1 cells and combined type 2 and non-transformed cells45. For Extended Data Fig. 5b, the t-statistic was used to compare protein enrichment among type 1 cells, type 2 cells and non-transformed cells (see Supplementary Table 1). Heat maps were created to display the selected proteins using the packages “gplots” and “lattice”46, 47. See Supplementary Table 1 in which the table tab ‘a’ corresponds to Fig. 1f and contains core chaperome networks in type 1, type 2 and non-transformed cells; the table tab ‘b’ corresponds to Extended Data Fig. 5a and contains comprehensive chaperome networks in type 1, type 2 and non-transformed cells; the table tab ‘c’ corresponds to Extended Data Fig. 5b and Extended Data Fig. 8b and contains the HSP90 interactome as isolated by the HSP90 bait in type 1, type 2 and non-transformed cells; the table tab ‘d’ corresponds to Extended Data Fig. 8a and contains upstream transcriptional regulators that explain the protein signature of type 1 tumours and the table tab ‘e’ contains metastasis-related proteins characteristic of type 1 tumours. 
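The pre-processing step described above (replacing zero spectral counts with 0.1 and taking the base-10 logarithm) can be sketched as follows. The analysis itself was run in R with limma; this Python fragment only illustrates the transformation, does not reproduce the moderated t-statistic, and uses a hypothetical function name.

```python
import math

ZERO_REPLACEMENT = 0.1  # small value assigned to zero spectral counts, as stated in the text


def log10_counts(counts):
    """Replace zero counts with 0.1, then take log10 of every entry."""
    return [math.log10(c if c > 0 else ZERO_REPLACEMENT) for c in counts]
```

With this choice, a protein that was not detected maps to −1 on the log scale, one unit below a protein identified by a single exclusive spectrum (log10 of 1 is 0).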
To understand the physical and functional protein-interaction properties of the HSP90-interacting chaperome proteins enriched in type 1 tumours, we used the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database48. Proteins displayed in the heat map were uploaded in STRING database to generate the PPI networks. STRING builds functional protein-association networks based on compiled available experimental evidence. The thickness of the edges represents the confidence score of a functional association. The score was calculated based on four criteria: co-expression, experimental and biochemical validation, association in curated databases, and co-mentioning in PubMed abstracts48. Proteins with no adjacent interactions were not shown. The colour scale in nodes indicates the average enrichment of the protein (measured as exclusive spectral counts) in type 1, type 2, and non-transformed cells, respectively. The network layout for type 1 tumours was generated using edge-weighted spring-electric layout in Cytoscape with slight adjustments of marginal nodes for better visualization49. The layout for type 2 and non-transformed cells retains that of type 1 for better comparison. Proteins with average relative abundance values less than 1 were deleted from analyses. The biological processes in which they participate and the functionality of proteins enriched in type 1 tumours were assigned based on gene ontology terms and based on their designated interactome from UniProtKB, STRING, and/or I2D databases48, 50, 51, 52, 53. The Upstream Regulator analytic, as implemented in Ingenuity Pathways Analysis (IPA, QIAGEN Redwood City, http://www.qiagen.com/ingenuity), was used to identify the cascade of upstream transcriptional regulators that can explain the observed protein expression changes in type 1 tumours. 
The analysis is based on prior knowledge of expected effects between transcriptional regulators and their target genes stored in the Ingenuity Knowledge Base. The analysis examines how many known targets of each transcription regulator are present in the data set, and calculates an overlap P value for upstream regulators based on significant overlap between dataset genes and known targets regulated by a transcription regulator. For Extended Data Fig. 8b, proteins were selected based on 3 pre-curated lists (MYC target genes based on the analysis report from INGENUITY, MYC signature genes based on the reported list provided in ref. 54, and MYC expression/function activators manually curated from UniProt and GeneCards databases). Cell lines with information available in the cBioPortal for cancer genomics (http://www.cbioportal.org) were evaluated for mutations in pathways implicated in cancer: P53, RAS, RAF, PTEN, PIK3CA, AKT, EGFR, HER2, CDKN2A/B, RB, MYC, STAT1, STAT3, JAK2, MET, PDGFR, KDM6A, KIT. Mutations in major chaperome members (HSP90AA1, HSP90AB1, HSPH1, HSPA8, STIP1, AHSA1) were also evaluated. Data were visualized and statistical analyses performed using GraphPad Prism (version 6; GraphPad Software) or R statistical package. In each group of data, estimate variation was taken into account and is indicated in each figure as s.d. or s.e.m. If a single panel is presented, data are representative of 2 or 3 biological or technical replicates, as indicated. P values for unpaired comparisons between two groups with comparable variance were calculated by two-tailed Student’s t-test. Pearson’s tests were used to identify correlations among variables. Significance for all statistical tests was shown in figures for not significant (NS), *P < 0.05, **P < 0.01, ***P < 0.001 and ****P < 0.0001. No samples or animals were excluded from analysis, and sample size estimates were not used. Animals were randomly assigned to groups. 
Studies were not conducted blinded, with the exception of all patient specimen histological analyses.
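The overlap P value used in the Upstream Regulator analysis can be illustrated with a hypergeometric tail probability, which is equivalent to a right-tailed Fisher's exact test. The text does not state which test the software applies, so treating it as a hypergeometric tail is an assumption, and the function and parameter names below are hypothetical.

```python
from math import comb


def overlap_p_value(population, known_targets, dataset_size, overlap):
    """Probability of seeing at least `overlap` known targets by chance.

    Models drawing `dataset_size` genes without replacement from a
    `population` of genes that contains `known_targets` targets of a
    given regulator.
    """
    total = comb(population, dataset_size)
    upper = min(known_targets, dataset_size)
    tail = sum(
        comb(known_targets, k) * comb(population - known_targets, dataset_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

A small P value indicates that the data set contains more of the regulator's known targets than would be expected from a random gene selection of the same size.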
