VILNIUS, Lithuania

Agency: Cordis | Branch: FP7 | Program: CSA-SA | Phase: FP7-PEOPLE-2009-NIGHT | Award Amount: 75.79K | Year: 2009

The main idea of the event is to show that researchers are ordinary people with an out-of-the-ordinary profession: their work is full of excitement, unpredictability and unbelievable discoveries. The event, focused on the natural sciences, takes place at 6 rather distant sites, including the 4 largest cities, and covers a wide geographical area of Lithuania, which makes it possible to attract as many participants as possible. The project is aimed at the whole of society: all ages, all genders, all nationalities and all ethnic groups. Everyone will be welcomed to the amazing world of research and will find that they, too, can perform experimental research. Many different activities are offered to visitors, who will see not only the serious side of the natural sciences (astrophysics, Earth science, physics, chemistry, biology/biotechnology, biomedicine, agronomy, forestry, environmental research) but also the funny one. Popular lectures and the interactive program "Hands on experiments" in the open research laboratories at 5 universities, 2 research institutes and a high-tech industrial company comprise one part of the event. This is complemented by the "Amalgam" program, which comprises a public science show, visits to specialized exhibitions, theater performances, the finals of the announced competitions and other diverse activities in four botanical gardens (and two museums). Activities in both parts of the event are based on direct communication and discussion between academics and the broad public, involving participants, students and researchers in joint project activities. Young people will get to know researchers through direct communication, through presentations of selected successful individual research careers in Lithuania, and through the acting out of real stories about what the life of a researcher is like.

Expression plasmid pJH114 containing the five E. coli bamABCDE genes which were under the control of a trc promoter, and with an octa-histidine (8 × His) tag at the C terminus of bamE was initially used for overexpression of BamABCDE complex in E. coli HDB150 cells16. Expression of the native BamABCDE complex was induced with 100 μmol l−1 isopropyl-β-D-1-thiogalactopyranoside (IPTG; Formedium) at 20 °C overnight when the absorbance of the cell culture at 600 nm reached 0.5–0.8. The selenomethionine-labelled BAM complexes were expressed in M9 medium supplemented with selenomethionine Medium Nutrient Mix (Molecular Dimensions) and 100 mg l−1 L-(+)-selenomethionine (Generon) using the similar conditions as the native BamABCDE. Both native and selenomethionine-labelled BamABCDE complexes were purified using a similar protocol. In brief, the cells were pelleted and resuspended in lysis buffer containing 20 mM Tris-HCl, pH 8.0, 150 mM NaCl, 10 μg ml−1 DNase I and 100 μg  ml−1 lysozyme and lysed by passing through a cell disruptor (Constant Systems) at 206 MPa. The lysate was centrifuged to remove the cell debris and unbroken cells, and the supernatant was ultracentrifuged to pellet the membranes at 100,000g for 1 h. The cell membranes were resuspended in solubilization buffer containing 20 mM Tris-HCl, pH 8.0, 300 mM NaCl, 10 mM imidazole and 1–2% n-dodecyl-β-D-maltopyranoside (DDM; all detergents were purchased from Anatrace) and rocked for 1 h at room temperature or overnight at 4 °C. The suspension was ultracentrifuged and the supernatant was applied to a 5-ml pre-equilibrated HisTrap HP column (GE Healthcare). The column was washed with wash buffer containing 20 mM Tris-HCl, pH 8.0, 300 mM NaCl and 35 mM imidazole and eluted with elution buffer containing 300 mM imidazole. The eluent was applied to HiLoad 16/600 Superdex 200 prep grade column (GE healthcare) pre-equilibrated with gel filtration buffer containing 20 mM Tris-HCl, pH 7.8, 300 mM NaCl and detergents. 
Different detergents were used in protein purification procedures. The purified BamABCDE complex was analysed by SDS–PAGE (Extended Data Fig. 1 and Supplementary Fig. 1), which indicated that BamB is not enough in the complex, and BamB is absent in the determined structure. We therefore decided to generate a new plasmid to express the BamABCDE complex. Additional copy of the E. coli bamB gene was introduced into pJH114 (ref. 16) after the 8 × His tag to generate a new expression plasmid pYG120 using a modified sequence and ligation-independent cloning (SLIC) method42. In brief, vector backbone and bamB gene fragments were amplified by PCR using Q5 Hot Start High-Fidelity DNA Polymerase (New England BioLabs), and plasmid pJH114 as template and primers PF_pJH114_SLIC (5′-GTTAATCGACCTGCAGGCATGCAAG-3′) and PR_pJH114_SLIC (5′-CTCTAGAGGATCTTAGTGGTGATGATGGTG-3′), and PF_EBB_SLIC (5′-TCATCACCACTAAGATCCTCTAGAGAGGGACCCGATGCAATTGC-3′) and PR_EBB_SLIC (5′-CTTGCATGCCTGCAGGTCGATTAACGTGTAATAGAGTACACGGTTCC-3′), respectively. Gel-extracted fragments were digested by T4 DNA polymerase (Fermentas) at 22 °C for 35 min followed by 70 °C for 10 min, and then placed on ice immediately. The digested fragments were annealed in an annealing buffer (10 mM Tris, pH 8.0, 100 mM NaCl and 1 mM EDTA) by incubating at 75 °C for 10 min and decreasing by 0.1 °C every 8 s to 20 °C. The mixture was transformed into E. coli DH5α for plasmid preparation. The DNA sequences were confirmed by sequencing. For the purification of the BamABCDE complex from the pYG120 construct, the wash buffer, elution buffer and gel filtration buffer were supplemented with different detergent combinations. A second gel filtration was performed to change detergents with gel filtration buffer containing 1 CMC N-octyl-β-D-glucopyranoside (OG) and 1 CMC N-dodecyl-N,N-dimethylamine-N-oxide (LDAO). 
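The SLIC step relies on each insert primer carrying a 5′ tail that is homologous to one end of the amplified vector. As a minimal sanity check, assuming only the four primer sequences quoted above, the sketch below measures that homology by comparing the start of each insert primer with the reverse complement of the matching vector primer. The function and variable names are illustrative and are not taken from the original protocol.

```python
# Primer sequences copied from the text (5'->3').
PF_PJH114 = "GTTAATCGACCTGCAGGCATGCAAG"
PR_PJH114 = "CTCTAGAGGATCTTAGTGGTGATGATGGTG"
PF_EBB = "TCATCACCACTAAGATCCTCTAGAGAGGGACCCGATGCAATTGC"
PR_EBB = "CTTGCATGCCTGCAGGTCGATTAACGTGTAATAGAGTACACGGTTCC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def homology_length(vector_primer, insert_primer):
    """Longest prefix of insert_primer found at the end of revcomp(vector_primer)."""
    target = revcomp(vector_primer)
    for k in range(min(len(target), len(insert_primer)), 0, -1):
        if target.endswith(insert_primer[:k]):
            return k
    return 0


# Junction after the His tag: vector reverse primer vs. bamB forward primer.
junction_a = homology_length(PR_PJH114, PF_EBB)
# Junction at the other vector end: vector forward primer vs. bamB reverse primer.
junction_b = homology_length(PF_PJH114, PR_EBB)
print(junction_a, junction_b)
```

Both junctions come out at 25 nt of shared sequence, which is the overlap the T4 DNA polymerase digestion would expose for annealing.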
For BamABCDE complex purification from construct pJH114, the wash buffer, elution buffer and gel filtration buffer were supplemented with 2 CMC N-nonyl-β-D-glucoside (β-NG) and 1 CMC tetraethylene glycol monooctyl ether (C8E4). The peak fraction was pooled and concentrated using a Vivaspin 20 centrifugal concentrator (Sartorius, molecular mass cut off: 100 kDa). The selenomethionine-labelled proteins were purified in the same way as the native proteins of the BamABCDE complex. The purified proteins were concentrated to 8–12 mg ml−1 for crystallization. For NaI co-crystallization, NaCl was replaced by NaI in the gel filtration buffer. All crystallizations were carried out by the sitting-drop vapour diffusion method in MRC 96-well crystallization plates (Molecular Dimensions) at 22 °C. The protein solution was mixed in a 1:1 ratio with the reservoir solution using the Gryphon crystallization robot (Art Robbins Instruments). The best NaI co-crystallized crystals were grown from 150 mM HEPES, pH 7.5, 30% PEG6000 and CYMAL-4 in MemAdvantage (Molecular Dimensions) as additive. The best native crystals were grown from 150 mM HEPES, pH 7.5 and 27.5% PEG6000. The best selenomethionine-labelled crystals were grown from 100 mM Tris, pH 8.0, 200 mM MgCl2·6H2O, 24% PEG1000 MME and OGNG in MemAdvantage as additive. The crystals were harvested, flash-cooled and stored in liquid nitrogen for data collection. The data sets of the selenomethionine-labelled BAM complex were collected on the I03 beamline at Diamond Light Source (DLS) at a wavelength of 0.9795 Å. All data were indexed, integrated and scaled using XDS43. The crystals belong to space group P4 2 2, with the cell dimensions a = b = 254.16 Å, c = 179.22 Å, α = β = γ = 90°. There are two complexes in the asymmetric unit. The structure was determined to 3.9 Å resolution (Extended Data Table 1) using ShelxD44, 45. Fifty-six selenium sites were found, which gave a figure of merit (FOM) of 0.32.
After density modification using DM46, the BamACDE complex was clearly visible in the electron density map, but without BamB. Using the individual high-resolution models, the BamACDE complex was built using Coot47 by skeletonizing the electron density map and docking the BAM subunits in the electron density map with selenomethionine sites used as guides. Rigid body refinement was performed following manual docking. NCS refinement was used along with TLS refinement against groups automatically determined using PHENIX48. Restrained refinement was performed with group B-factors alongside reference model secondary structure restraints from higher resolution models. Weights were automatically optimised by PHENIX48. To obtain the BamABCDE complex structure, the new construct was used to produce sufficient BamB to form the BamABCDE complex. The data sets of BamABCDE complex were collected on the I02 beamline at DLS. The crystals belong to space group P4 2 2, with the cell dimensions a = b = 116.69 Å, c = 435.19 Å, α = β = γ = 90°. There is one complex molecule in the asymmetric unit. Although the crystals diffracted to 2.90 Å, the crystal structure of BamABCDE could not be determined by molecular replacement. BamABCDE complex was crystallized in presence of 0.2 M sodium iodide, and SAD data sets were collected at a wavelength of 1.8233 Å. Four 360° data sets were collected on different regions of the same crystal of NaI co-crystallization then combined. The phases were determined by ShelxD44, 45 at 4 Å resolution. Eleven iodide sites were found, which gave a FOM of 0.28. The phases were extended to 2.90 Å by DM46, and the model was built using Coot47 by skeletonizing the electron density map and docking the individual high-resolution subunits in the electron density map and rigid body fit this model into the higher resolution native data set while retaining and extending the free R set from the iodide data set. The BamABCDE complex was refined using PHENIX48. 
TLS groups were automatically determined using PHENIX48 and used for refinement along with individual B-factors. Weights were automatically optimised and secondary structure restraints were used. An E. coli bamA expression plasmid was constructed for functional assays using SLIC method as described above. An N-terminal 10 × His tag fused with bamA starting from residue 22 was amplified by PCR using Q5 Hot Start High-Fidelity DNA Polymerase (New England BioLabs), and plasmid pJH114 as template and primers PF_bamA_SLIC (5′-CCATCATCATCATCATCATCATCATGAAGGGTTCGTAGTGAAAGATATTCATTTCGAAG-3′) and PR_bamA_SLIC (5′-AGACTCGAGTTACCAGGTTTTACCGATGTTAAACTGGAAC-3′). Vector backbone was amplified from a modified pRSFDuet-1 vector (Novagen, Merck Millipore) containing an N-terminal pelB signal peptide coding sequence with primers PF_RSFM_SLIC (5′-CGGTAAAACCTGGTAACTCGAGTCTGGTAAAGAAACCGCTGC-3′) and PR_RSFM_SLIC (5′-ATGATGATGATGATGATGATGATGGTGATGGGCCATCGCCGGCTG-3′). Plasmids were prepared using GeneJET Plasmid Miniprep Kit (Thermo Scientific). Site-directed mutagenesis was performed according to a previously described protocol49 with slight modification (PCR conditions and the sequences of the primers are available on request). The sequences of the wild type and all mutant constructs of BamA were confirmed by sequencing. E. coli JCM166 cells3 transformed with the wild-type BamA or its mutants were plated on LB agar plates supplemented with 50 μg ml−1 kanamycin and 100 μg ml−1 carbenicillin in the presence or absence of 0.05% L-(+)-arabinose and grown overnight at 37 °C. Single colonies grown on arabinose-containing plates were inoculated in 10 ml LB medium supplemented with 50 μg ml−1 kanamycin, 100 μg ml−1 carbenicillin and 0.025% L-(+)-arabinose, and incubated at 200 r.p.m. at 37 °C for 16 h. 
For plate assays, the cells were pelleted and resuspended in fresh LB medium supplemented with 50 μg ml−1 kanamycin and 100 μg ml−1 carbenicillin, diluted to an A600 of ~0.3, streaked onto LB agar plates supplemented with 50 μg ml−1 kanamycin and 100 μg ml−1 carbenicillin in the presence or absence of 0.05% L-(+)-arabinose, and cultured at 37 °C for 12–14 h. Western blotting was performed to examine protein expression levels of BamA in the membrane. 50 ml of overnight cultures of JCM166 cells transformed with wild-type BamA or each BamA mutant were pelleted. The cells were resuspended in 25 ml 20 mM Tris-HCl, pH 8.0, 150 mM NaCl and sonicated. The cell debris and unbroken cells were removed by centrifugation at 7,000g for 30 min. The supernatant was centrifuged at 100,000g for 60 min and the membrane fraction was collected. The membrane fraction was suspended in 5 ml buffer containing 20 mM Tris-HCl, pH 8.0, 150 mM NaCl and 1% 3-(N,N-dimethylmyristylammonio)-propanesulfonate (Sigma) and solubilized for 30 min at room temperature. Samples were mixed with 5 × SDS–PAGE loading buffer, heated for 5 min at 90 °C, cooled for 2 min on ice and centrifuged. Ten microlitres of each sample was loaded onto a 4–20% Mini-PROTEAN TGX Gel (Bio-Rad) for SDS–PAGE and then subjected to immunoblot analysis. The proteins were transferred to PVDF membrane using the Trans-Blot Turbo Transfer Starter System (Bio-Rad) according to the manufacturer’s instructions. The PVDF membranes were blocked in 10 ml protein-free T20 (TBS) blocking buffer (Fisher) overnight at 4 °C. The membranes were incubated with 10 ml His-Tag monoclonal antibody (diluted, 1:1,000) (Millipore) for 1 h at room temperature, washed with PBST four times and then incubated with IRDye 800CW goat anti-mouse IgG (diluted, 1:5,000) (LI-COR) for 1 h. The membrane was washed with PBST four times and PBS twice. Images were acquired using LI-COR Odyssey (LI-COR).
The JCM166 cells containing the double cysteine mutants Gly393Cys/Gly584Cys, Glu435Cys/Ser665Cys and Glu435Cys/Ser658Cys of BamA were cultured overnight in LB medium with 50 μg ml−1 kanamycin, 100 μg ml−1 carbenicillin and 0.025% L-(+)-arabinose. The membrane fraction from 50 ml cells was isolated and solubilized as described above. The samples were mixed with SDS loading buffer and then boiled for 5 min or kept at room temperature for 5–10 min. SDS–PAGE was performed at 4 °C by running the gel for 60 min at 150 V. The proteins were transferred to PVDF membrane as described above and the BamA mutants were detected by western blotting. All molecular dynamics simulations were performed using GROMACS v5.0.2 (ref. 50). The Martini 2.2 force field51 was used to run an initial 1 μs Coarse Grained (CG) molecular dynamics simulation to permit the assembly and equilibration of a 1-palmitoyl, 2-cis-vaccenyl, phosphatidylglycerol (PVPG): 1-palmitoyl, 2-cis-vaccenyl, phosphatidylethanolamine (PVPE) bilayer around the BamABCDE complexes52. Using the self-assembled system as a guide, the coordinates of the BAM complexes were inserted into an asymmetric model E. coli OM, comprised of PVPE, PVPG and cardiolipin in the periplasmic leaflet and the inner core of Rd1 LPS lipids in the outer leaflet53, using Alchembed54. This equated to a total system size of ~500,000 atoms. The systems were then equilibrated for 1 ns with the protein restrained before 100 ns of unrestrained atomistic molecular dynamics using the Gromos53a6 force field55. The lipid-modified cysteine parameters were created from lipid parameters for diacylglycerol and palmitoyl and appended to the parameters of the N-terminal cysteines56. Systems were neutralised with Mg2+ ions, to preserve the integrity of the outer leaflet of the OM, and a 150 mM concentration of NaCl. All ~500,000-atom systems were run for 100 ns, with box dimensions in the region of 200 × 200 × 150 Å³.
To assess the stability of the subunit stoichiometry we simulated various combinations of BAM assemblies. For both BamACDE and BamABCDE crystal structures, we investigated ABCDE, AD and A alone, with three repeats each; single simulations were also performed for BamABD, ACD, ADE, ABDE and ACDE, with a total simulation time equating to 2.8 μs. In cases where domains or subunits were missing these were added to the complex by structurally aligning the resolved units from the companion structure. BamB was added to the BamACDE complex by structurally aligning POTRA 3. The full BamC was added to the BamABCDE complex by aligning the resolved N-terminal domains. Individual protein complexes were configured and built using Modeller57 and PyMOL (The PyMOL Molecular Graphics System, version 1.8, Schrödinger, LLC). All simulations were performed at 37 °C, with protein, lipids and solvent separately coupled to an external bath, using the velocity-rescale thermostat58. Pressure was maintained at 1 bar, with a semi-isotropic compressibility of 4 × 10−5 using the Parrinello–Rahman barostat59. All bonds were constrained with the LINCS algorithm60, 61. Electrostatics were calculated using the Particle Mesh Ewald (PME) method62, while a cut-off was used for Lennard–Jones parameters, with a Verlet cut-off scheme to permit GPU calculation of non-bonded contacts. Simulations were performed with an integration time-step of 2 fs. The linear interpolation between the three structures was performed using the morph operation in Gromacs tools50. Analysis of the molecular simulations was performed using Gromacs tools50, MDAnalysis63 and locally written scripts. Conservation analysis was performed using Consurf64. For each subunit, 150 homologues were collected from UNIREF9065 using three iterations of CSI-Blast66, with an E-value of 0.0001. The Consurf scores were then mapped into the B-factor column for each of the subunits.
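The quoted total of 2.8 μs can be reproduced from the simulation plan described above. The short tally below is a sketch that assumes the five single simulations were run for each of the two crystal structures and that every run lasted the 100 ns of unrestrained dynamics stated earlier; both readings are assumptions rather than statements from the authors.

```python
# Assemblies simulated with three repeats each, per crystal structure.
REPEATED = ["ABCDE", "AD", "A"]
# Assemblies simulated once, per crystal structure (assumed to apply to both).
SINGLE = ["ABD", "ACD", "ADE", "ABDE", "ACDE"]
N_STRUCTURES = 2      # BamACDE and BamABCDE crystal structures
REPEATS = 3
NS_PER_RUN = 100      # unrestrained atomistic run length (ns)
TIMESTEP_FS = 2       # integration time-step (fs)

n_runs = N_STRUCTURES * (len(REPEATED) * REPEATS + len(SINGLE))
total_us = n_runs * NS_PER_RUN / 1000
# 1 ns = 1,000,000 fs, so steps per run = run length in fs / time-step.
steps_per_run = NS_PER_RUN * 1_000_000 // TIMESTEP_FS

print(n_runs, total_us, steps_per_run)
```

Under these assumptions there are 28 runs of 100 ns, which gives 2.8 μs in total and 50 million integration steps per run.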

NOD/SCID Il2rgnull mice (Jackson Laboratory) were bred and maintained in the Stem Cell Unit animal barrier facility at McMaster University. All procedures were approved by the Animal Research Ethics Board at McMaster University. All patient samples were obtained with informed consent and with the approval of local human subject research ethics boards at McMaster University. Human umbilical cord blood mononuclear cells were collected by centrifugation with Ficoll-Paque Plus (GE), followed by red blood cell lysis with ammonium chloride (StemCell Technologies). Cells were then incubated with a cocktail of lineage-specific antibodies (CD2, CD3, CD11b, CD11c, CD14, CD16, CD19, CD24, CD56, CD61, CD66b, and GlyA; StemCell Technologies) for negative selection of Lin− cells using an EasySep immunomagnetic column (StemCell Technologies). Live cells were discriminated on the basis of cell size, granularity and, as needed, absence of viability dye 7-AAD (BD Biosciences) uptake. All flow cytometry analysis was performed using a BD LSR II instrument (BD Biosciences). Data acquisition was conducted using BD FACSDiva software (BD Biosciences) and analysis was performed using FlowJo software (Tree Star). To quantify MSI2 expression in human HSPCs, Lin− cord blood cells were stained with the appropriate antibody combinations to resolve HSC (CD34+ CD38− CD45RA− CD90+), MPP (CD34+ CD38− CD45RA− CD90−), CMP (CD34+ CD38+ CD71−) and EP (CD34+ CD38+ CD71+) fractions as similarly described previously18, 19 with all antibodies from BD Biosciences: CD45RA (HI100), CD90 (5E10), CD34 (581), CD38 (HB7) and CD71 (M-A712). Cell viability was assessed using the viability dye 7AAD (BD Biosciences). All cell subsets were isolated using a BD FACSAria II cell sorter (BD Biosciences) or a MoFlo XDP cell sorter (Beckman Coulter). HemaExplorer20 analysis was used to confirm MSI2 expression in human HSPCs and across the hierarchy. 
For all qRT–PCR determinations total cellular RNA was isolated with TRIzol LS reagent according to the manufacturer’s instructions (Invitrogen) and cDNA was synthesized using the qScript cDNA Synthesis Kit (Quanta Biosciences). qRT–PCR was done in triplicate with PerfeCTa qPCR SuperMix Low ROX (Quanta Biosciences) with gene-specific probes (Universal Probe Library (UPL), Roche) and primers: MSI2 UPL-26, F-GGCAGCAAGAGGATCAGG, R-CCGTAGAGATCGGCGACA; HSP90 UPL-46, F-GGGCAACACCTCTACAAGGA, R-CTTGGGTCTGGGTTTCCTC; CYP1B1 UPL-20, F-ACGTACCGGCCACTATCACT, R-CTCGAGTCTGCACATCAGGA; GAPDH UPL-60, F-AGCCACATCGCTCAGACAC, R-GCCCAATACGACCAAATCC; ACTB (UPL Set Reference Gene Assays, Roche). The mRNA content of samples compared by qRT–PCR was normalized based on the amplification of GAPDH or ACTB. MSI2 shRNAs were designed with the Dharmacon algorithm (http://www.dharmacon.com). Predicted sequences were synthesized as complementary oligonucleotides, annealed and cloned downstream of the H1 promoter of the modified cppt-PGK-EGFP-IRES-PAC-WPRE lentiviral expression vector18. Sequences for the MSI2 targeting and control RFP targeting shRNAs were as follows: shMSI2, 5′-GAGAGATCCCACTACGAAA-3′; shRFP, 5′-GTGGGAGCGCGTGATGAAC-3′. Human MSI2 cDNA (BC001526; Open Biosystems) was subcloned into the MA bi-directional lentiviral expression vector21. Human CYP1B1 cDNA (BC012049; Open Biosystems) was cloned into psMALB22. All lentiviruses were prepared by transient transfection of 293FT (Invitrogen) cells with pMD2.G and psPAX2 packaging plasmids (Addgene) to create VSV-G pseudotyped lentiviral particles. All viral preparations were titrated on HeLa cells before use on cord blood. Standard SDS–PAGE and western blotting procedures were performed to validate the effects of knockdown on transduced NB4 cells (DSMZ) and overexpression on 293FT cells.
Immunoblotting was performed with anti-MSI2 rabbit monoclonal IgG (EP1305Y, Epitomics) and β-actin mouse monoclonal IgG (ACTBD11B7, Santa Cruz Biotechnology) antibodies. Secondary antibodies used were IRDye 680 goat anti-rabbit IgG and IRDye 800 goat anti-mouse IgG (LI-COR). 293FT and NB4 cell lines tested negative for mycoplasma. NB4 cells were authenticated by ATRA treatment before use. Cord blood transductions were conducted as described previously18, 23. Briefly, thawed Lin− cord blood or flow-sorted Lin− CD34+ CD38− or Lin− CD34+ CD38+ cells were prestimulated for 8–12 h in StemSpan medium (StemCell Technologies) supplemented with growth factors interleukin 6 (IL-6; 20 ng ml−1, Peprotech), stem cell factor (SCF; 100 ng ml−1, R&D Systems), Flt3 ligand (FLT3-L; 100 ng ml−1, R&D Systems) and thrombopoietin (TPO; 20 ng ml−1, Peprotech). Lentivirus was then added in the same medium at a multiplicity of infection of 30–100 for 24 h. Cells were then given 2 days after transduction before use in in vitro or in vivo assays. For in vitro cord blood studies biological (experimental) replicates were performed with three independent cord blood samples. Human clonogenic progenitor cell assays were done in semi-solid methylcellulose medium (Methocult H4434; StemCell Technologies) with flow-sorted GFP+ cells post transduction (500 cells per ml) or from day seven cultured transduced cells (12,000 cells per ml). Colony counts were carried out after 14 days of incubation. CFU-GEMMs can seed secondary colonies owing to their limited self-renewal potential24. Replating of MSI2-overexpressing and control CFU-GEMMs for secondary CFU analysis was performed by picking single CFU-GEMMs at day 14 and disassociating colonies by vortexing. Cells were spun and resuspended in fresh methocult, mixed with a blunt-ended needle and syringe, and then plated into single wells of a 24-well plate. 
Secondary CFU analysis for shMSI2- and shControl-expressing cells was performed by harvesting total colony growth from a single dish (as nearly equivalent numbers of CFU-GEMMs were present in each dish), resuspending cells in fresh methocult by mixing vigorously with a blunt-ended needle and syringe and then plating into replicate 35-mm tissue culture dishes. In both protocols, secondary colony counts were done following incubation for 10 days. For primary and secondary colony forming assays performed with the AHR agonist FICZ (Santa Cruz Biotechnology), 200 nM FICZ or 0.1% DMSO was added directly to H4434 methocult medium. Two-way ANOVA analysis was performed to compare secondary CFU output and FICZ treatment for MSI2-overexpressing or control conditions. Colonies were imaged with a Q-Colour3 digital camera (Olympus) mounted to an Olympus IX5 microscope with a 10× objective lens. Image-Pro Plus imaging software (Media Cybernetics) was used to acquire pictures and subsequent image processing was performed with ImageJ software (NIH). Transduced human Lin− cord blood cells were sorted for GFP expression and seeded at a density of 10⁵ cells per ml in IMDM 10% FBS supplemented with human growth factors IL-6 (10 ng ml−1), SCF (50 ng ml−1), FLT3-L (50 ng ml−1), and TPO (20 ng ml−1) as previously described25. To generate growth curves, every seven days cells were counted, washed, and resuspended in fresh medium with growth factors at a density of 10⁵ cells per ml. Cells from suspension cultures were also used in clonogenic progenitor, cell cycle and apoptosis assays. Experiments performed on transduced Lin− CD34+ cord blood cells used serum-free conditions as described in the cord blood transduction subsection of Methods. For in vitro cord blood studies, biological (experimental) replicates were performed with three independent cord blood samples.
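Because the suspension cultures are counted and reseeded at a fixed density every seven days, a growth curve is usually expressed as cumulative fold expansion rather than as raw counts. The sketch below shows one way to compute it, reading the seeding density as 10^5 cells per ml; the weekly counts are made-up illustrative numbers and the function name is hypothetical.

```python
SEED_DENSITY = 1e5  # cells per ml at each weekly reseeding (as read from the text)


def cumulative_fold_expansion(weekly_counts_per_ml, seed_density=SEED_DENSITY):
    """Multiply the weekly fold changes to get cumulative expansion over time."""
    folds = []
    cumulative = 1.0
    for count in weekly_counts_per_ml:
        # Fold change for this passage: cells counted / cells seeded.
        cumulative *= count / seed_density
        folds.append(cumulative)
    return folds


# Hypothetical counts (cells per ml) at days 7, 14 and 21.
example = cumulative_fold_expansion([4e5, 3e5, 2e5])
print(example)  # [4.0, 12.0, 24.0]
```

Multiplying the per-passage fold changes accounts for the cells discarded at each reseeding, so curves from different conditions remain comparable.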
Cell cycle progression was monitored with the addition of BrdU to day 10 suspension cultures at a final concentration of 10 μM. After 3 h of incubation, cells were assayed with the BrdU Flow Kit (BD Biosciences) according to the manufacturer’s protocol. Cell proliferation and quiescence were measured using Ki67 (BD Biosciences) and Hoechst 33342 (Sigma) on day 4 suspension cultures after fixing and permeabilizing cells with the Cytofix/Cytoperm kit (BD Biosciences). For apoptosis analysis, Annexin V (Invitrogen) and 7-AAD (BD Biosciences) staining of day 7 suspension cultures was performed according to the manufacturer’s protocol. Lin− cord blood cells were initially stained with anti-CD34 PE (581) and anti-CD38 APC (HB7) antibodies (BD Biosciences) then fixed with the Cytofix/Cytoperm kit (BD Biosciences) according to the manufacturer’s instructions. Fixed and permeabilized cells were immunostained with anti-MSI2 rabbit monoclonal IgG antibody (EP1305Y, Abcam) and detected by Alexa-488 goat anti-rabbit IgG antibody (Invitrogen). CD34+ cells were transduced with an MSI2-overexpression or MSI2-knockdown lentivirus along with their corresponding controls and sorted for GFP expression 3 days later. Transductions for MSI2 overexpression or knockdown were each performed on two independent cord blood samples. Total RNA from transduced cells (>1 × 10⁵) was isolated using TRIzol LS as recommended by the manufacturer (Invitrogen), and then further purified using RNeasy columns (Qiagen). Sample quality was assessed using Bioanalyzer RNA Nano chips (Agilent). Paired-end, barcoded RNA-seq sequencing libraries were then generated using the TruSeq RNA Sample Prep Kit (v2) (Illumina) following the manufacturer’s protocols starting from 1 μg total RNA. The quality of library generation was then assessed using a Bioanalyzer platform (Agilent), and either an Illumina MiSeq-QC run was performed or libraries were quantified by qPCR using the KAPA quantification kit (KAPA Biosystems).
Sequencing was performed using an Illumina HiSeq2000 using TruSeq SBS v3 chemistry at the Institute for Research in Immunology and Cancer’s Genomics Platform (University of Montreal) with cluster density targeted at 750,000 clusters per mm2 and paired-end 2 × 100-bp read lengths. For each sample, 90–95 million reads were produced and mapped to the hg19 (GRCh37) human genome assembly using CASAVA (version 1.8). Read counts generated by CASAVA were processed in EdgeR (edgeR_3.12.0, R 3.2.2) using TMM normalization, paired design, and estimation of differential expression using a generalized linear model (glmFit). The false discovery rate (FDR) was calculated from the output P values using the Benjamini–Hochberg method. The fold change of logarithm of base 2 of TMM normalized data (logFC) was used to rank the data from top upregulated to top downregulated genes and FDR (0.05) was used to define significantly differentially expressed genes. RNA-seq data have been deposited in NCBI’s Gene Expression Omnibus (GEO) and are accessible through GEO Series accession number GSE70685. iRegulon26 was used to retrieve the top 100 AHR predicted targets with a minimal occurrence count threshold of 5. The data were analysed using GSEA27 with ranked data as input with parameters set to 2,000 gene-set permutations. The GEO dataset GSE28359, which contains Affymetrix Human Genome U133 Plus 2.0 Array gene expression data for CD34+ cells treated with SR1 at 30 nM, 100 nM, 300 nM and 1,000 nM was used to obtain lists of genes differentially expressed in the treated samples compared to the control ones (0 nM)2. Data were background corrected using Robust Multi-Array Average (RMA) and quantile normalized using the expresso() function of the affy Bioconductor package (affy_1.38.1, R 3.0.1). Lists of genes were created from the 150 top upregulated and downregulated genes from the SR1-treated samples at each dose compared to the non-treated samples (0 nM). 
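For readers unfamiliar with the Benjamini–Hochberg adjustment mentioned above, the following is a minimal standard-library sketch of the procedure. It is not the edgeR implementation, and the P values in the example are invented purely for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    n = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical P values for four genes.
p_example = [0.01, 0.04, 0.03, 0.005]
q_example = benjamini_hochberg(p_example)
# Genes called significant at the FDR threshold of 0.05 used in the text.
significant = [q < 0.05 for q in q_example]
print(q_example, significant)
```

Each P value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values non-decreasing with rank, which is what allows a single FDR cut-off such as 0.05 to be applied.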
The data were analysed using GSEA with ranked data as input with parameters set to 2,000 gene-set permutations. The normalized enrichment score (NES) and false discovery rate (FDR) were calculated for each comparison. The GEO data set GSE24759, which contains Affymetrix GeneChip HT-HG_U133A Early Access Array gene expression data for 38 distinct haematopoietic cell states4, was compared to the MSI2 overexpression and knockdown data. GSE24759 data were background corrected using Robust Multi-Array Average (RMA), quantile normalized using the expresso() function of the affy Bioconductor package (affy_1.38.1, R 3.0.1), batch corrected using the ComBat() function of the sva package (sva_3.6.0) and scaled using the standard score. Bar graphs were created by calculating for significantly differentially expressed genes the number of scaled data that were above (>0) or below (<0) the mean for each population. Percentages indicating for how long the observed value (set of up- or downregulated genes) was better represented in that population than random values were calculated from 1,000 trials. A unique list of genes closest to AHR-bound regions previously identified from TCDD-treated MCF7 ChIP–seq data14 was used to calculate the overlap with genes showing >1.5-fold downregulation in response to treatment with UM171 (35 nM) or SR1 (500 nM) relative to DMSO-treated samples3 as well as with genes significantly downregulated in MSI2-overexpressing versus control treated samples (FDR < 0.05). The percentage of downregulated genes with AHR-bound regions was then plotted for each gene set. P values were generated with Fisher’s exact test for comparisons between gene lists. AHR transcription factor binding sites in downregulated gene sets were identified with oPOSSUM-328. 
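The gene-list comparisons above rest on Fisher's exact test applied to a 2 × 2 table. The sketch below computes the one-sided (over-representation) P value directly from the hypergeometric distribution. Whether the original analysis used a one- or two-sided test is not stated, so the one-sided form is an assumption, and the counts in the example are invented.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher P value, P(X >= a), for the table [[a, b], [c, d]].

    a: genes in the list with an AHR-bound region, b: genes in the list without,
    c: background genes with an AHR-bound region, d: background genes without.
    """
    row1 = a + b            # size of the gene list
    col1 = a + c            # all genes with an AHR-bound region
    total = a + b + c + d
    denom = comb(total, col1)
    upper = min(row1, col1)
    # Sum hypergeometric probabilities for tables at least as extreme as observed.
    tail = sum(comb(row1, k) * comb(total - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / denom


# Hypothetical counts, chosen only to keep the arithmetic easy to follow.
p_value = fisher_exact_greater(3, 1, 1, 3)
print(p_value)
```

For this toy table the tail sum is 17 of the 70 possible arrangements, so the P value is 17/70, roughly 0.243.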
Genes showing >1.5-fold downregulation in response to treatment with UM171 (35 nM) or SR1 (500 nM) relative to DMSO-treated samples3 were used along with significantly downregulated genes (FDR < 0.05) with EdgeR-analysed MSI2-overexpressing versus control-treated samples. The three gene lists were uploaded into oPOSSUM-3 and the AHR:ARNT transcription factor binding site profile was used with the matrix score threshold set at 80% to analyse the region 1,500 bp upstream and 1,000 bp downstream of the transcription start site. The percentage of downregulated genes with AHR-binding sites in their promoters was then plotted for each gene set. Fisher’s exact test was used to identify significant overrepresentation of AHR-binding sites in gene lists relative to background. Eight- to 12-week-old male or female NSG mice were sublethally irradiated (315 cGy) one day before intrafemoral injection with transduced cells carried in IMDM 1% FBS at 25 μl per mouse. Injected mice were analysed for human haematopoietic engraftment 12–14 weeks after transplantation or at 3 and 6.5 weeks for STRC experiments. Mouse bones (femurs, tibiae and pelvis) and spleen were removed and bones were crushed with a mortar and pestle then filtered into single-cell suspensions. Bone marrow and spleen cells were blocked with mouse Fc block (BD Biosciences) and human IgG (Sigma) and then stained with fluorochrome-conjugated antibodies specific to human haematopoietic cells. For multilineage engraftment analysis, cells from mice were stained with CD45 (HI30) (Invitrogen), CD33 (P67.6), CD15 (HI98), CD14 (MφP9), CD19 (HIB19), CD235a/GlyA (GA-R2), CD41a (HIP8) and CD34 (581) (BD Biosciences). For MSI2 knockdown in HSCs, 5.0 × 10⁴ and 2.5 × 10⁴ sorted Lin− CD34+ CD38− cells were used per short-hairpin transduction experiment, leading to transplantation of day zero equivalent cell doses of 10 × 10³ and 6.25 × 10³, respectively, per mouse.
For STRC LDA transplantation experiments, 10^5 sorted CD34+CD38+ cells were used per control or MSI2-overexpressing transduction. After assessing levels of gene transfer, day zero equivalent GFP+ cell doses were calculated to perform the LDA. Recipients with greater than 0.1% GFP+CD45+/− cells were considered to be repopulated. For STRC experiments that read out extended engraftment at 6.5 weeks, 2 × 10^5 CD34+ CD38+ cells were used per overexpressing or control transduction to allow non-limiting 5 × 10^4 day zero equivalent cell doses per mouse. For HSC expansion and LDA experiments, CD34+CD38− cells were sorted and transduced with MSI2-overexpressing or control vectors (50,000 cells per condition) for 3 days and then analysed for gene-transfer levels (% GFP+/−) and primitive cell marker expression (% CD34 and CD133). To ensure that equal numbers of GFP+ cells were transplanted into both control and MSI2-overexpressing recipient mice, we added identically cultured GFP− cells to the MSI2 culture to match the % GFP+ of the control culture (necessary owing to the differing efficiency of transduction). The adjusted MSI2-overexpressing culture was recounted and aliquoted (63,000 cells) to match the output of half of the control culture. Three day 0 equivalent GFP+ cell doses (1,000, 300 and 62 cells) were then transplanted per mouse to perform the D3 primary LDA. A second aliquot of the adjusted MSI2-overexpressing culture was then taken and put into culture in parallel with the remaining half of the control culture to perform another LDA after 7 days of growth (10 days total growth, D10 primary LDA). Altogether, four cell doses were transplanted; when converted back to day 0 equivalents these equalled approximately 1,000, 250, 100, and 20 GFP+ cells per mouse, respectively. 
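The GFP+ matching step and the day 0 equivalent doses described above come down to simple proportional arithmetic, sketched below. Both function names, the definition of a day 0 equivalent dose used here (input cells × GFP+ fraction × fraction of the culture injected) and all example numbers are assumptions made for illustration; the text does not give the authors' exact calculation.

```python
def gfp_negative_cells_to_add(n_cells, frac_gfp_pos, target_frac):
    """Number of GFP- cells to add so the culture reaches target_frac GFP+.

    Solves frac_gfp_pos * n_cells / (n_cells + x) = target_frac for x.
    """
    if not 0 < target_frac <= frac_gfp_pos <= 1:
        raise ValueError("need 0 < target_frac <= frac_gfp_pos <= 1")
    return n_cells * (frac_gfp_pos / target_frac - 1.0)


def day0_equivalent_gfp_dose(day0_input_cells, frac_gfp_pos, culture_fraction):
    """GFP+ dose expressed in day 0 cells (assumed definition, see lead-in)."""
    return day0_input_cells * frac_gfp_pos * culture_fraction


if __name__ == "__main__":
    # Hypothetical culture: 100,000 cells at 60% GFP+, control at 40% GFP+.
    print(gfp_negative_cells_to_add(100_000, 0.60, 0.40))
    # Hypothetical dose: 50,000 input cells, 40% GFP+, 5% of culture injected.
    print(day0_equivalent_gfp_dose(50_000, 0.40, 0.05))
```

Expressing doses in day 0 equivalents keeps the comparison between cultures independent of how much each culture expanded before transplantation.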
Pooled bone marrow from six engrafted primary mice that received D10 cultured control or MSI2-overexpressing cells (from the two highest doses transplanted) was aliquoted into five cell doses of 15 million, 10 million, 6 million, 2 million and 1 million cells. The numbers of GFP+ cells within primary mice were estimated from nucleated cell counts obtained from NSG femurs, tibias and pelvises and from Colvin et al.29. The actual numbers of GFP+ cells used for determining numbers of GFP+ HSCs and the number of mice transplanted for all LDA experiments are shown in Supplementary Tables 3–5. The cut-off for HSC engraftment was a demonstration of multilineage reconstitution that was set at bone marrow having >0.1% GFP+ CD33+ and >0.1% GFP+ CD19+ cells. HSC and STRC frequency was assessed using ELDA software30. For all mouse transplantation experiments, mice were age- (6–12 weeks) and sex-matched. All transplanted mice were included for analysis unless mice died from radiation sickness before the experimental endpoint. No randomization or blinding was performed for animal experiments. Approximately 3–6 mice were used per cell dose for each cord blood transduction and transplantation experiment. CLIP–seq was performed as previously described15. Briefly, 25 million NB4 cells (a transformed human cell line of haematopoietic origin) were washed in PBS and UV-cross-linked at 400 mJ cm−2 on ice. Cells were pelleted, lysed in wash buffer (PBS, 0.1% SDS, 0.5% Na-deoxycholate, 0.5% NP-40) and DNase-treated, and supernatants from lysates were collected for immunoprecipitation. MSI2 was immunoprecipitated overnight using 5 μg of anti-MSI2 antibody (EP1305Y, Abcam) and Protein A Dynabeads (Invitrogen). Beads containing immunoprecipitated RNA were washed twice with wash buffer, high-salt wash buffer (5× PBS, 0.1% SDS, 0.5% Na-deoxycholate, 0.5% NP-40), and PNK buffer (50 mM Tris-Cl pH 7.4, 10 mM MgCl2, 0.5% NP-40). 
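The stem cell frequency estimate from limiting dilution data mentioned above can be sketched with a single-hit Poisson model, in which a mouse receiving d cells stays negative with probability exp(−f·d). The code below finds the maximum-likelihood frequency f by a golden-section search. It is a simplified stand-in written for illustration, not the ELDA software, and it reports no confidence intervals; the function names and the example data are hypothetical.

```python
import math


def log_likelihood(freq, doses, tested, positive):
    """Log-likelihood of the single-hit Poisson model at frequency freq."""
    ll = 0.0
    for d, n, r in zip(doses, tested, positive):
        p_pos = -math.expm1(-freq * d)       # 1 - exp(-freq * d)
        p_pos = max(p_pos, 1e-300)           # guard against log(0)
        ll += r * math.log(p_pos) - (n - r) * freq * d
    return ll


def estimate_frequency(doses, tested, positive, lo=1e-9, hi=1.0, iters=200):
    """Maximum-likelihood frequency via golden-section search on log(freq)."""
    a, b = math.log(lo), math.log(hi)
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c = b - g * (b - a)
        d = a + g * (b - a)
        if log_likelihood(math.exp(c), doses, tested, positive) > \
                log_likelihood(math.exp(d), doses, tested, positive):
            b = d
        else:
            a = c
    return math.exp((a + b) / 2.0)


if __name__ == "__main__":
    # Hypothetical data: cells per mouse, mice tested, mice engrafted.
    doses, tested, positive = [1000, 250, 100, 20], [5, 5, 5, 5], [5, 4, 2, 0]
    f = estimate_frequency(doses, tested, positive)
    print(f"estimated frequency: 1 in {1.0 / f:.0f} cells")
```

The search works on log(f) because plausible frequencies span several orders of magnitude; the likelihood has a single maximum whenever at least one mouse is positive and at least one is negative.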
Samples were then treated with 0.2 U MNase for 5 min at 37 °C with shaking to trim immunoprecipitated RNA. MNase inactivation was then carried out with PNK + EGTA buffer (50 mM Tris-Cl pH 7.4, 20 mM EGTA, 0.5% NP-40). The sample was dephosphorylated using alkaline phosphatase (CIP, NEB) at 37 °C for 10 min followed by washing with PNK + EGTA, PNK buffer, and then 0.1 mg ml−1 BSA in nuclease-free water. 3′RNA linker ligation was performed at 16 °C overnight with the following adaptor: 5′P-UGGAAUUCUCGGGUGCCAAGG-puromycin. Samples were then washed with PNK buffer, radiolabelled using [γ-32P]ATP (Perkin Elmer), run on a 4–12% Bis-Tris gel and then transferred to a nitrocellulose membrane. The nitrocellulose membrane was developed via autoradiography and RNA–protein complexes 15–20 kDa above the molecular weight of MSI2 were extracted with proteinase K followed by RNA extraction with acid phenol-chloroform. A 5′RNA linker (5′HO-GUUCAGAGUUCUACAGUCCGACGAUC-OH) was ligated to the extracted RNA using T4 RNA ligase (Fermentas) for 2 h and the RNA was again purified using acid phenol-chloroform. Adaptor-ligated RNA was resuspended in nuclease-free water and reverse transcribed using Superscript III reverse transcriptase (Invitrogen). Twenty cycles of PCR were performed using NEB Phusion Polymerase with a 3′PCR primer that contained a unique Illumina barcode sequence. PCR products were run on an 8% TBE gel. Products ranging between 150 and 200 bp were extracted using the QIAquick gel extraction kit (Qiagen) and resuspended in nuclease-free water. Two separate libraries were prepared and sent for single-end 50-bp Illumina sequencing at the Institute for Genomic Medicine at the University of California, San Diego. 47,098,127 reads from the first library passed quality filtering, of which 73.83% mapped uniquely to the human genome. 57,970,220 reads from the second library passed quality filtering, of which 69.53% mapped uniquely to the human genome. 
CLIP-data reproducibility was verified through high correlation between gene RPKMs and statistically significant overlaps in the clusters and genes within replicates. CLIP–seq data have been deposited in NCBI’s GEO and are accessible through GEO Series accession number GSE69583. Before sequence alignment of CLIP–seq reads to the human genome was performed, sequencing reads from libraries were trimmed of polyA tails, adapters, and low quality ends using Cutadapt with parameters --match-read-wildcards --times 2 -e 0 -O 5 --quality-cutoff 6 -m 18 -b TCGTATGCCGTCTTCTGCTTG -b ATCTCGTATGCCGTCTTCTGCTTG -b CGACAGGTTCAGAGTTCTACAGTCCGACGATC -b TGGAATTCTCGGGTGCCAAGG -b AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA -b TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT. Reads were then mapped against a database of repetitive elements derived from RepBase (version 18.05). Bowtie (version 1.0.0) with parameters -S -q -p 16 -e 100 -l 20 was used to align reads against an index generated from RepBase sequences31. Reads not mapped to RepBase sequences were aligned to the hg19 human genome (UCSC assembly) using STAR (version 2.3.0e)32 with parameters --outSAMunmapped Within --outFilterMultimapNmax 1 --outFilterMultimapScoreRange 1. To identify clusters in the genome of significantly enriched CLIP–seq reads, reads that were PCR replicates were removed from each CLIP–seq library using a custom script implementing the same method as in ref. 33; otherwise, reads were kept at each nucleotide position when more than one read’s 5′-end was mapped. Clusters were then assigned using the CLIPper software with parameters --bonferroni --superlocal --threshold-34. The ranked list of significant targets was calculated assuming a Poisson distribution, where the observed value is the number of reads in the cluster, and the background is the number of reads across the entire transcript and/or across a window of 1,000 bp ± the predicted cluster. 
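The Poisson ranking described above can be sketched as an upper-tail probability: given a background read density, how likely is it to see at least the observed number of reads in a cluster of a given length? The code below is a minimal standard-library illustration and not the CLIPper implementation. Scaling the background count by cluster length over background length is an assumption, as are the function names and the example numbers.

```python
import math


def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    if lam <= 0:
        return 0.0
    if k <= lam:
        # The lower tail is the smaller part here: subtract it from 1.
        term, cdf = math.exp(-lam), 0.0
        for i in range(k):
            cdf += term
            term *= lam / (i + 1)
        return max(0.0, 1.0 - cdf)
    # Upper tail: add pmf(k), pmf(k+1), ... until terms are negligible.
    term = math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))
    total, i = 0.0, k
    while term > total * 1e-17:
        total += term
        i += 1
        term *= lam / i
    return min(total, 1.0)


def cluster_p_value(cluster_reads, cluster_len, background_reads, background_len):
    """P value for a cluster, with the expected count scaled by length."""
    lam = background_reads * cluster_len / background_len
    return poisson_sf(cluster_reads, lam)


if __name__ == "__main__":
    # Hypothetical cluster: 30 reads in 50 bp; transcript: 200 reads in 5,000 bp.
    print(cluster_p_value(30, 50, 200, 5000))
```

Clusters can then be ranked by this P value, with a multiple-testing correction such as Bonferroni applied across all clusters tested.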
Transcriptomic regions and gene classes were defined using annotations found in Gencode v17. Depending on the analysis, clusters were associated with the Gencode-annotated 5′UTR, 3′UTR, CDS or intronic regions. If a cluster overlapped multiple regions, or a single part of a transcript was annotated as multiple regions, clusters were iteratively assigned first as CDS, then 3′UTR, 5′UTR and finally as proximal (<500 bases from an exon) or distal (>500 bases from an exon) introns. Overlapping peaks were calculated using bedtools and pybedtools35, 36. Significantly enriched gene ontology (GO) terms were identified using a hypergeometric test that compared the number of genes that were MSI2 targets in each GO term to genes expressed in each GO term as the proper background. Expressed genes were identified using the control samples in SRA study SRP012062. Mapping was performed identically to CLIP–seq mapping, except that no peak calling was performed and the STAR parameter outFilterMultimapNmax was changed to 10. Counts were calculated with featureCounts37 and RPKMs were then computed. Only genes with a mean RPKM > 1 between the two samples were used in the background expressed set. Randomly located clusters within the same genic regions as predicted MSI2 clusters were used to calculate a background distribution for motif and conservation analyses. Motif analysis was performed using the HOMER algorithm as in ref. 34. For evolutionary sequence conservation analysis, the mean (mammalian) phastCons score for each cluster was used. CD34+ cells (>5 × 10^4) were transduced with an MSI2-overexpression or control lentivirus. Three days later, GFP+ cells were sorted and then returned to StemSpan medium containing growth factors IL-6 (20 ng ml−1), SCF (100 ng ml−1), FLT3-L (100 ng ml−1) and TPO (20 ng ml−1). A minimum of 10,000 cells were used for immunostaining at culture days 3 and 7 after GFP sorting. Cells were fixed in 2% PFA for 10 min, washed with PBS and then cytospun onto glass slides. 
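The priority rule used above to assign a cluster to a single transcript region (CDS first, then 3′UTR, 5′UTR, then proximal or distal intron) can be written as a short lookup. This is an illustrative sketch: the region labels and function names are hypothetical, and because the text does not say how a distance of exactly 500 bases is handled, the sketch treats it as distal.

```python
# Regions in decreasing priority, following the order given in the text.
PRIORITY = ["CDS", "3UTR", "5UTR", "proximal_intron", "distal_intron"]


def classify_intron(distance_to_exon, cutoff=500):
    """Label an intronic cluster by its distance (in bases) to the nearest exon."""
    return "proximal_intron" if distance_to_exon < cutoff else "distal_intron"


def assign_region(overlapping_regions):
    """Return the highest-priority region among those a cluster overlaps."""
    for region in PRIORITY:
        if region in overlapping_regions:
            return region
    return None  # cluster overlaps no annotated region


if __name__ == "__main__":
    print(assign_region({"3UTR", "CDS"}))              # CDS takes priority
    print(assign_region({classify_intron(120), "5UTR"}))
    print(assign_region({classify_intron(2500)}))
```

A fixed priority order guarantees that each cluster is counted once, even where overlapping transcript isoforms annotate the same bases differently.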
Cytospun cells were then permeabilized (PBS, 0.2% Triton X-100) for 20 min, blocked (PBS, 0.1% saponin, 10% donkey serum) for 30 min and stained with primary antibodies (CYP1B1 (EPR14972, Abcam); HSP90 (68/hsp90, BD Biosciences)) in PBS with 10% donkey serum for 1 h. Detection with secondary antibody was performed in PBS 10% donkey serum with Alexa-647 donkey anti-rabbit antibody or Alexa-647 donkey anti-mouse antibodies for 45 min. Slides were mounted with Prolong Gold Antifade containing DAPI (Invitrogen). Several images (200–1,000 cells total) were captured per slide at 20× magnification using an Operetta HCS Reader (Perkin Elmer) with epifluorescence illumination and standard filter sets. Columbus software (Perkin Elmer) was used to automate the identification of nuclei and cytoplasm boundaries in order to quantify mean cell fluorescence. A 271-bp region of the CYP1B1 3′UTR that flanked CLIP–seq-identified MSI2-binding sites was cloned from human HEK293FT genomic DNA using the forward primer GTGACACAACTGTGTGATTAAAAGG and reverse primer TGATTTTTATTATTTTGGT AATGGTG and placed downstream of renilla luciferase in the dual-luciferase reporter vector pGL4 (Promega). A 271-bp geneblock (IDT) with 6 TAG > TCC mutations was cloned in to pGL4 using XbaI and NotI. The HSP90 3′UTR was amplified from HEK293FT genomic DNA with the forward primer TCTCTGGCTGAGGGATGACT and reverse primer TTTTAAGGCCAAGGAATTAAGTGA and cloned into pGL4. A geneblock of the HSP90 3′UTR (IDT) with 14 TAG > TCC mutations was cloned in to pGL4 using SfaAI and NotI. Co-transfection of wild-type or mutant luciferase reporter (40 ng) and control or MSI2-overexpressing lentiviral expression vector (100 ng) was performed in the NIH-3T3 cell line, which does not express MSI1 or MSI2 (50,000 cells per co-transfection). Reporter activity was measured using the Dual-Luciferase Reporter Assay System (Promega) 36–40 h later. 
For MSI2-overexpressing cultures with the AHR antagonist SR1, Lin− CD34+ cells were transduced with MSI2-overexpression or control lentivirus in medium supplemented with SR1 (750 nM; Abcam) or DMSO vehicle (0.1%). GFP+ cells were isolated (20,000 cells per culture) and allowed to proliferate with or without SR1 for an additional 7 days at which point they were counted and immunophenotyped for CD34 and CD133 expression. For MSI2-overexpressing cultures with the AHR agonist FICZ, Lin− CD34+ cells were transduced with MSI2-overexpression or control lentivirus. GFP+ cells were isolated (20,000 cells per culture) and allowed to proliferate with FICZ (200 nM; Santa Cruz Biotechnology) or DMSO (0.1%) for an additional 3 days, at which point they were immunophenotyped for CD34 and CD133 expression. Lin− CD34+ cells were cultured for 72 h (lentiviral treated but non-transduced flow-sorted GFP− cells) in StemSpan medium containing growth factors IL-6 (20 ng ml−1), SCF (100 ng ml−1), FLT3-L (100 ng ml−1) and TPO (20 ng ml−1) before the addition of the CYP1B1 inhibitor TMS (Abcam) at a concentration of 10 μM or mock treatment with 0.1% DMSO. Equal numbers of cells (12,000 per condition) were then allowed to proliferate for 7 days at which point they were counted and immunophenotyped for CD34 and CD133 expression. Unless stated otherwise (that is, analysis of RNA–seq and CLIP–seq data sets), all statistical analysis was performed using GraphPad Prism (GraphPad Software version 5.0). Unpaired student t-tests or Mann–Whitney tests were performed with P < 0.05 as the cut-off for statistical significance. No statistical methods were used to predetermine sample size.

No statistical methods were used to predetermine sample size. The experiments were not randomized. The investigators were not blinded to allocation during experiments and outcome assessment. The strains used in this study are listed in Supplementary Table 1. Unless otherwise specified, for all deletion mutants, the corresponding alleles from the Keio collection22 were transferred into the MC4100 wild-type strain using standard P1 transduction procedures23 and checked by PCR. To excise the resistance cassette, we used pCP20 (refs 22, 24). Strain AG227, deleted for the entire yedYZ operon, was constructed as follows. First, a cat-sacB cassette, encoding chloramphenicol acetyl transferase and SacB, a protein conferring sensitivity to sucrose, was amplified from strain CH1990 using primers yedYZ::cat-sacB_Fw and yedYZ::cat-sacB_Rv. The resulting PCR product shared a 40-base-pair (bp) homology to the 5′ untranslated region of yedY (msrP) and to the 3′ untranslated region of yedZ (msrQ) at its 5′ and 3′ ends, respectively. After purification, the PCR product was transformed by electroporation into CH1940. These cells harbour the pSIM5-tet vector, which encodes the Red recombination system proteins Gam, Beta and Exo under the control of the temperature-sensitive repressor cI859, encoded by the same vector. Expression of the Gam, Beta and Exo proteins was induced by shifting the cells to 42 °C for 15 min before making them electrocompetent. Recombinant cells were selected on chloramphenicol-containing plates (25 μg ml−1) at 37 °C for 16 h. At this temperature, the pSIM5-tet vector, which has a temperature-sensitive origin of replication, is lost. Colonies were also tested for the presence of the cat-sacB cassette by negative selection on sucrose-containing media (5% sucrose, no NaCl). Finally, we verified that the cat-sacB cassette replaced the msrPQ operon in the resulting strain (AG219) by sequencing across the junctions. 
The cat-sacB cassette was subsequently moved from AG219 to TP1004 by P1 transduction. The cat-sacB cassette was eliminated from the resulting strain (AG220) by transforming it with the pSIM5-tet plasmid, electroporating it with the oligonucleotide Delta_yedYZ (300 ng) and performing lambda red recombination as described above. Recombinants were selected on sucrose-containing media at 30 °C for 16 h. To eliminate the plasmid, the selected colonies were grown at 37 °C for 16 h. Loss of the cassette in the resulting AG227 strain was verified by positive (sucrose resistance) and negative (chloramphenicol sensitivity) selection and by PCR. The msrQ deletion mutant (strain BE105) was generated using the PCR knockout method developed in ref. 24. Briefly, a DNA fragment containing the cat gene flanked with the homologous sequences found upstream and downstream of the yedZ gene was PCR-amplified using pKD3 as template and the oligonucleotides P1_Up_YedZ and P2_Down_YedZ. Strain BE100, carrying plasmid pKD46, was then transformed by electroporation with the amplified linear fragment. Chloramphenicol-resistant clones were selected and verified by PCR. The msrP::lacZ fusion was constructed using the method described in ref. 25. Briefly, the msrP promoter region lying between nucleotide −797 and nucleotide +63, using the A nucleotide within the initiation triplet as a reference, was amplified by PCR with the appropriate oligonucleotides (lacI-msrP and lacZ-msrP' ). Using mini-lambda-mediated recombineering, the PCR product was then directly recombined with the chromosome of a modified E. coli wild-type strain (PM1205), carrying a P -cat-sacB cassette inserted in front of lacZ, at the ninth codon. Recombinants were selected for loss of the cat-sacB genes, resulting in the translational fusion of msrP to lacZ. The plasmids and primers used in this study are listed in Supplementary Tables 2 and 3, respectively. The YedY-His (MsrP-His ) expression vector was constructed as follows. 
Site-directed mutagenesis using primers pTAC_NdeI_Fw and pTAC_NdeI_Rv was performed using pTAC-MAT-Tag-2 as template to introduce an NdeI restriction site in the vector, yielding vector pAG177. yedY (msrP) DNA was amplified from the chromosome (MC4100) using primers pTAC_yedY_Fw and pTAC_yedY-His _Rv, which resulted in the fusion of a His tag coding sequence at the 3′ end. The PCR product was subsequently cloned into pAG177 using NdeI and BglII restriction sites, generating plasmid pAG178. To construct IPTG-inducible pTAC-MAT-Tag-2 vectors expressing either MsrP (without tag) or both MsrP and MsrQ, we first amplified the corresponding coding DNA sequences (msrP or the msrPQ operon) from the chromosome of strain MC4100 using primer pairs pTAC_yedY_Fw/ pTAC_yedY_Rv and pTAC_yedY_Fw/ pTAC_yedZ_Rv, respectively. The PCR products were then cloned into pAG177 using restriction sites NdeI and BglII, yielding pAG192 (MsrP) and pAG195 (MsrPQ). The complementation pAM238 vectors constitutively expressing either MsrP or MsrQ alone (without tag) or both MsrP and MsrQ were constructed as follows. We first amplified the corresponding coding DNA sequences (msrP, msrQ or the msrPQ locus) in addition to a 50 bp upstream region from each start codon (to include a ribosomal binding site) from the chromosome of strain MG1655 using primer pairs pAM238_yedY_Fw/ pAM238_yedY_Rv, pAM238_yedZ_Fw/ pAM238_yedZ_Rv and pAM238_yedY_Fw/ pAM238_yedZ_Rv, respectively. The PCR products were then cloned into pAM238 using restriction sites KpnI and PstI, yielding pAG264 (MsrP), pAG275 (MsrQ), and pAG265 (MsrPQ). The vector allowing the arabinose-inducible expression of SurA was constructed as follows. The surA-encoding DNA and its 50 bp upstream region (to include a ribosomal binding site) were amplified from the chromosome of strain MG1655 using the primer pair surA_Fw/surA_Rv. The PCR product was then cloned into pBAD33 using restriction sites KpnI and XbaI, yielding vector pAG290. 
Expression levels of the yedYZ (msrPQ) mRNA were assessed in M63 minimal medium supplemented with 0.5% glycerol, 0.15% casamino acids, 1 mM MgSO4, 1 mM MoNa2O4, 17 μM Fe2(SO4)3 and vitamins (thiamine 10 μg ml−1, biotin 1 μg ml−1, riboflavin 10 μg ml−1 and nicotinamide 10 μg ml−1). Overnight cultures of MG1655 were diluted to A600 = 0.04 in fresh M63 minimal medium (100 ml) and cultured aerobically at 37 °C until A600 = 0.8. Cells (10 ml) were then pelleted, resuspended in TriPure (Roche) and homogenized. After mixing with chloroform, RNA was isolated by centrifugation (15 min, 15,700g, 4 °C), precipitated with isopropanol, washed with 70% ethanol, dried and finally resuspended in DEPC water. Any residual DNA was eliminated by treatment of the sample with DNase (Turbo DNA-free Kit, Ambion). A RevertAid RT kit (Thermo Scientific) was used to generate complementary DNA (cDNA) from 1 μg RNA extracted from each of the cultured strains. cDNAs were then diluted 1/10 and submitted to qPCR, using a qPCR Core kit for SYBR Green I No ROX (Eurogentec) and a MyiQ Single-Colour Real-Time PCR Detection System (Bio-Rad). Expression levels of yedYZ were normalized to the expression of gapA. Primers used for qPCR analysis were qPCR_yedYZ_Fw and qPCR_yedYZ_Rv for yedYZ, and qPCR_gapA_Fw and qPCR_gapA_Rv for gapA (Supplementary Table 3). Synthesis of MsrP in strains JB590 and BE100 was assessed as follows. Overnight cultures were diluted to A600 = 0.04 in fresh M63 minimal medium (100 ml) and cultured aerobically at 37 °C until A600 = 0.8. Nine hundred microlitres of each culture were then precipitated with 10% ice-cold trichloroacetic acid (TCA), pellets were washed with ice-cold acetone, dried, resuspended and heated at 95 °C in Laemmli SDS sample buffer (SB buffer) (2% SDS, 10% glycerol, 60 mM Tris-HCl, pH 7.4, 0.01% bromophenol blue), and loaded on an SDS–PAGE gel for immunoblot analysis. The protein amounts loaded were standardized by taking into account the A600 values of the cultures. 
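The normalization of yedYZ expression to gapA described above is commonly computed from threshold cycle (Ct) values. The sketch below uses the 2^(−ΔCt) convention and assumes perfect doubling per cycle; the text does not state which formula the authors used, so the method, the function names and the Ct values are all assumptions for illustration.

```python
def relative_expression(ct_target, ct_reference, efficiency=2.0):
    """Target abundance relative to the reference gene: efficiency ** -(dCt)."""
    return efficiency ** (ct_reference - ct_target)


def fold_change(ct_target_a, ct_ref_a, ct_target_b, ct_ref_b):
    """Normalized expression in condition A divided by that in condition B."""
    return (relative_expression(ct_target_a, ct_ref_a)
            / relative_expression(ct_target_b, ct_ref_b))


if __name__ == "__main__":
    # Hypothetical Ct values: yedYZ and gapA in a treated and a control culture.
    print(relative_expression(24.0, 20.0))
    print(fold_change(22.0, 20.0, 24.0, 20.0))
```

Dividing by the reference gene corrects for differences in the amount of input RNA and in reverse transcription efficiency between samples.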
To monitor the MsrP expression levels after NaOCl or H2O2 treatment, overnight cultures of wild-type cells (MG1655) were diluted to A600 = 0.04 in fresh lysogeny broth (LB) medium (100 ml) and grown aerobically at 37 °C to A600 = 0.5. NaOCl (2 mM) or H2O2 (1 mM) was then added to the cultures. Samples were TCA-precipitated, washed with ice-cold acetone, dried, suspended in SB buffer, heated at 95 °C and loaded on an SDS–PAGE gel for immunoblot analysis. The protein amounts loaded were standardized by taking into account the A600 values of the cultures. The specificity of the anti-MsrP antibody was verified (Supplementary Fig. 5). l-Methionine sulfoxide ([α]D24 = +14.3° (water)), triethylamine (>99%) and methanol (>99.6%) were obtained from Sigma-Aldrich, picric acid from Prolabo and D2O from SDS. Water was purified using Millipore Elix Essential 3 apparatus. 1H and 13C NMR were recorded on a Bruker Avance III Nanobay spectrometer (1H: 400 MHz; {1H}13C: 100 MHz). Chemical shifts (δ) were referenced to dioxane (1H: δ = 3.75 p.p.m.; 13C: δ = 67.19 p.p.m.)26, which was added as an internal reference; resonances are detailed as follows: 1H, δ in parts per million (multiplicity, J-coupling in hertz, integration, signal attribution); {1H}13C, δ in parts per million (signal attribution). For each diastereoisomer, chemical shifts are similar to those previously reported27. 13C resonance assignments were confirmed by heteronuclear single quantum coherence experiments. Optical rotations were measured on an Anton Paar Modular Circular Polarimeter 200 instrument at 25 °C and 589 nm from aqueous solution containing 0.8–1.2 g per 100 ml of l-methionine sulfoxide. The values reported are the average and s.d. relative to three independent measurements recorded on distinct solutions. The commercial mixture of diastereoisomers was separated following the previously reported method28. 
Briefly, 10 ml of water was added to l-methionine sulfoxide (1.333 g, 8.069 mmol) and picric acid (1.849 g, 8.071 mmol). The suspension was heated to reflux until complete dissolution and then slowly cooled to room temperature (~25 °C). The suspension was filtered on a sintered funnel and the solid was washed with cold water (10 ml in total). Both the solid (dextro) and filtrate (levo) were collected separately for further purification. Dextro. To the dried solid, 20 ml of water were added and the mixture was heated to reflux then allowed to cool slowly to room temperature. The solid was filtered out, washed with 10 ml water and dried. Again, 11 ml of methanol were added to the resulting solid and the mixture heated to reflux. After slow cooling, the yellow crystals were filtered, washed with 5 ml methanol and dried. A portion was used for structure determination by X-ray analysis. To the dextrogyre picrate salt (1.345 g, 3.42 mmol), ~1.1 equivalents of triethylamine were added as a dilute aqueous solution (22 ml, 175 mM, 3.85 mmol). Subsequently, 200 ml of acetone were added portion-wise to the above stirring suspension and a white solid precipitated. This was filtered, washed, triturated with acetone and finally dried in vacuum (533 mg, 80%). Levo. The volume of the filtrate was reduced in vacuum at 40 °C to about 3–4 ml to obtain a saturated solution and a small amount of precipitate. Then, 1.5 ml of water were added, the suspension was filtered and the solid washed with minimal water (2 ml). The whole step was repeated once (reduce the volume, dilute, filter and wash), and the resulting solution was then completely dried in vacuum. To the resulting yellow residue, 15 ml of methanol were added and the suspension was heated to reflux. In our hands, no solid precipitated upon cooling (in contrast with the reported method28); therefore the solution was dried again in vacuum. 
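The quantities reported above for the dextro fraction can be cross-checked with a few lines of arithmetic. The sketch below converts the stated masses to millimoles using molar masses computed from the molecular formulas (methionine sulfoxide C5H11NO3S, picric acid C6H3N3O7) and recomputes the triethylamine amount and the yield. Expressing the yield relative to half of the starting material, that is, assuming an equimolar mixture of the two diastereoisomers, is an interpretation that reproduces the reported 80% and is not stated in the text.

```python
MW_MET_O = 165.21    # g/mol, methionine sulfoxide (C5H11NO3S)
MW_PICRIC = 229.10   # g/mol, picric acid (C6H3N3O7)


def mmol(mass_g, molar_mass):
    """Convert a mass in grams to millimoles."""
    return mass_g / molar_mass * 1000.0


met_o_mmol = mmol(1.333, MW_MET_O)       # starting l-methionine sulfoxide
picric_mmol = mmol(1.849, MW_PICRIC)     # picric acid, about 1 equivalent
tea_mmol = 22.0 * 175.0 / 1000.0         # 22 ml of 175 mM triethylamine
tea_equiv = tea_mmol / 3.42              # relative to 3.42 mmol picrate salt

# Assumption: yield is relative to one diastereoisomer of a 1:1 mixture.
dextro_yield = mmol(0.533, MW_MET_O) / (met_o_mmol / 2.0)

if __name__ == "__main__":
    print(f"Met-O: {met_o_mmol:.3f} mmol, picric acid: {picric_mmol:.3f} mmol")
    print(f"triethylamine: {tea_mmol:.2f} mmol ({tea_equiv:.2f} equivalents)")
    print(f"dextro yield: {dextro_yield:.0%}")
```

The recomputed values agree with the amounts given in the protocol (8.069 mmol, 8.071 mmol, 3.85 mmol and about 1.1 equivalents).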
Following the same protocol as before, to the levogyre-enriched picrate salt (1.354 g, 3.44 mmol), ~1.1 equivalents of triethylamine were added as a concentrated aqueous solution (3.8 ml, 1 M, 3.8 mmol). Afterwards, 200 ml of acetone were added portion-wise and a white solid precipitated. This was filtered, washed, triturated with acetone and finally dried in vacuum (515 mg, 77%). Levo (l-methionine-R-sulfoxide): [α]D25 = −72.7 ± 0.5° (water); 1H NMR (400 MHz, D2O, pD = 6.5): 3.86 (t, 3J = 6.3, 1H, Hα), 3.12 (ddd, J = 13.4, 9.6, 7.0, 1H, Hγ or Hγ), 3.02 (m, 2H, Hγ), 2.93 (ddd, J = 13.5, 9.1, 6.8, 1H, Hγ or Hγ), 2.74 (s, 3H, Hε), 2.31 (m, 2H, Hβ); {1H}13C NMR (100 MHz, D2O): 173.9 (COO−), 54.2 (Cα), 54.0 (Cα), 48.9 (Cγ), 37.2 (Cε), 37.0 (Cε), 24.4 (Cβ). Literature values from ref. 28: [α]D26 = −71.6° (water), from ref. 27: [α]D = −78° (water, room temperature); 1H NMR (300 MHz, D2O): 4.10 (m, 1H), 3.08–2.78 (m, 2H), 2.59 (s, 3H), 2.32–2.13 (m, 2H); 13C NMR (75 MHz, D2O): 171.1, 52.1, 48.4, 37.0, 23.7. In the 1H NMR spectra, the resonance centred at 3.02 p.p.m. was attributed to the S- diastereoisomer. The relative integral values suggest that R-Met-O is contaminated by 3% of the S- diastereoisomer. Moreover, comparing the measured [α]D25 values with those reported in ref. 27, the data are consistent with the presence of 3% S- diastereoisomer as a contaminant. Such purity is in line with previous reports using the same separation method28, 29. The absolute configuration of the l-methionine-S-sulfoxide was confirmed by X-ray structural analysis and matches previous assignments27, 30. To synthesize N-acetyl-Met-O, Met-O (30 mg; Sigma-Aldrich) was solubilized in 2 ml 100% acetic acid. After addition of 2 ml of 97% acetic anhydride, the resulting mixture was incubated for 2 h at 23 °C. Then, 2 ml of water were added and the mixture was lyophilized overnight. 
Finally, the lyophilized N-acetyl-Met-O was washed three times with 6 ml of water, re-lyophilized and suspended in 500 mM Na2HPO4, pH 9.0 to a final concentration of 1.5 M. The pH was then adjusted to 7 with NaOH. The MsrP reductase activity was followed spectrophotometrically at 600 nm by monitoring the substrate-dependent oxidation of reduced benzyl viologen, serving as an electron donor. Reactions were performed anaerobically at 30 °C in degassed and nitrogen-flushed 50 mM MOPS, pH 7.0 using stoppered cuvettes. Benzyl viologen was used at a final concentration of 0.4 mM (molar extinction coefficient, ε, of reduced benzyl viologen = 7,800 M–1 cm–1) and reduced with sodium dithionite. The final reaction volume was kept constant, with the ordered addition of benzyl viologen, sodium dithionite, 1–32 mM N-acetyl-methionine sulfoxide (NacMet-O) and 10 nM MsrP-His. The concentrations used for the R- and S-Met-O diastereoisomers were 1–64 mM. The Michaelis–Menten parameters (maximum velocity (Vmax) and Km) were determined using GraphPad Prism software. The reductase activities of MsrA and MsrB were followed spectrophotometrically at 340 nm by monitoring the substrate-dependent oxidation of NADPH (ε = 6,220 M–1 cm–1). Reactions were performed at 37 °C in HEPES–KOH 20 mM, pH 7.4, NaCl 10 mM, and the final reaction volumes were kept constant, with the ordered addition of 250 μM NADPH (Roche), 2.6 μM TrxR, 40 μM Trx, 64 mM substrate and 1.5 μM of either MsrA or MsrB. The identification of the MsrP substrates was performed as follows. AG89 cells (2 l) were grown aerobically at 37 °C in terrific broth to A600 = 0.8. Periplasmic extracts were prepared as described previously31. Briefly, cells were pelleted by centrifugation at 3,000g for 20 min at 4 °C and incubated on ice with gentle shaking for 30 min in 100 mM Tris-HCl, pH 8.0, 20% sucrose, 1 mM EDTA. 
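The kinetic analysis described above has two steps that lend themselves to a short sketch: converting an absorbance slope into a rate with the Beer–Lambert law, and fitting the Michaelis–Menten equation to rates measured at several substrate concentrations. The code below is a standard-library illustration and not the GraphPad Prism procedure. A 1 cm path length is assumed, the fit uses a one-dimensional search over Km with the best Vmax computed in closed form, and the data are synthetic.

```python
import math


def rate_from_absorbance(delta_a_per_min, epsilon=7800.0, path_cm=1.0):
    """Beer-Lambert: absorbance change per minute -> molar change per minute."""
    return delta_a_per_min / (epsilon * path_cm)


def michaelis_menten(s, vmax, km):
    return vmax * s / (km + s)


def _best_vmax(km, conc, rates):
    # For a fixed Km the model is linear in Vmax, so least squares is closed form.
    x = [s / (km + s) for s in conc]
    return sum(xi * vi for xi, vi in zip(x, rates)) / sum(xi * xi for xi in x)


def _sse(km, conc, rates):
    vmax = _best_vmax(km, conc, rates)
    return sum((v - michaelis_menten(s, vmax, km)) ** 2
               for s, v in zip(conc, rates))


def fit_michaelis_menten(conc, rates, km_lo=1e-3, km_hi=1e3, iters=200):
    """Least-squares (Vmax, Km) via golden-section search on log(Km)."""
    a, b = math.log(km_lo), math.log(km_hi)
    g = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c = b - g * (b - a)
        d = a + g * (b - a)
        if _sse(math.exp(c), conc, rates) < _sse(math.exp(d), conc, rates):
            b = d
        else:
            a = c
    km = math.exp((a + b) / 2.0)
    return _best_vmax(km, conc, rates), km


if __name__ == "__main__":
    conc = [1, 2, 4, 8, 16, 32]                            # mM, as in the assay range
    rates = [michaelis_menten(s, 10.0, 4.0) for s in conc]  # synthetic rates
    print(fit_michaelis_menten(conc, rates))
    print(rate_from_absorbance(0.078))                      # M per min
```

The search assumes the residual sum of squares has a single minimum in Km over the chosen range, which generally holds for well-behaved saturation data but is not guaranteed for noisy measurements.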
This mixture also contained 20 mM N-ethylmaleimide to alkylate reduced cysteine residues in proteins to prevent their subsequent oxidation. Periplasmic proteins were then isolated by centrifugation of the cells at 3,000g for 20 min at 4 °C. The periplasmic extract was subsequently concentrated by ultrafiltration in an Amicon cell (3,000 Da cutoff, YM-3 membrane) and loaded on a PD-10 column (GE Healthcare) equilibrated with 50 mM NaPi, pH 8.0, 50 mM NaCl. After concentration using a 5 kDa cutoff Vivaspin 4 (Sartorius) concentrator, the extract was finally split into three samples. Two samples were incubated for 10 min at 37 °C with 2 mM NaOCl whereas the third was left untreated to serve as reduced control. NaOCl was then removed by gel filtration using a NAP-5 column (GE Healthcare) equilibrated with 50 mM MOPS, pH 7.0. The untreated sample was also subjected to the NAP-5 gel filtration. One of the NaOCl-oxidized fractions was then reduced in vitro by incubation for 1 h at 37 °C with 10 μM MsrP, 10 mM benzyl viologen and an excess of sodium dithionite. The other NaOCl-oxidized fraction, used as an oxidized control, and the non-oxidized fraction were incubated with 10 mM benzyl viologen and an excess of sodium dithionite but without MsrP. The three samples were then de-salted by dialysis against 50 mM MOPS, pH 7.0 by using Slide-A-Lyzer 3,500 MWCO G2 cassettes (Thermo Scientific). The three samples (500 μg) were precipitated by adding TCA to a final concentration of 10% w/v. The resulting pellets were washed with ice-cold acetone, dried in a Speedvac, suspended in 0.1 M NH4HCO3, pH 8.0, digested overnight at 30 °C with 3 μg sequencing-grade trypsin, and analysed by two-dimensional LC–MS/MS essentially as described32. Briefly, peptides were first separated on a first-dimension hydrophilic interaction liquid chromatography (HILIC) column with a reverse acetonitrile gradient and 25 fractions of 1 ml collected (2 min per fraction). 
After drying, peptides were analysed by LC–MS/MS on a C18 column. The MS scan routine was set to analyse by MS/MS the five most intense ions of each full MS scan; dynamic exclusion was enabled to assure detection of co-eluting peptides. Raw data collection of approximately 230,000 MS/MS spectra per two-dimensional LC–MS/MS experiment was followed by protein identification using SEQUEST. All MS raw files have been deposited in the ProteomeXchange Consortium33 via the PRIDE partner repository with the data set identifier PXD002804. In detail, peak lists were generated using extract-msn (ThermoScientific) within Proteome Discoverer 1.4.1. From raw files, MS/MS spectra were exported with the following settings: peptide mass range 350–5,000 Da; minimal total ion intensity 500. The resulting peak lists were searched using SequestHT against a target-decoy E. coli protein database (release 07.01.2008, 8,678 entries comprising forward and reverse sequences) obtained from Uniprot. The following parameters were used: trypsin was selected with proteolytic cleavage only after arginine and lysine, number of internal cleavage sites was set to 1, mass tolerance for precursors and fragment ions was 1.0 Da, and considered dynamic modifications were +15.99 Da for oxidized methionine and +125.12 Da for N-ethylmaleimide on cysteines. Peptide matches were filtered using the q value and posterior error probability calculated by the Percolator algorithm ensuring an estimated false positive rate below 5%. The filtered SEQUEST HT output files for each peptide were grouped according to the protein from which they were derived using the multiconsensus results tool within Proteome Discoverer. Then the values of the spectral matches of only Met-containing peptides were combined from the three two-dimensional LC–MS/MS experiments and exported in a Microsoft Excel spreadsheet, with the rows referring to the peptide sequences and the columns to the fractions of the HILIC column. 
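The database search settings listed above (cleavage after arginine and lysine, one internal cleavage site, and dynamic modifications of +15.99 Da and +125.12 Da) can be illustrated with a small in silico digest. This sketch is not part of Proteome Discoverer or SEQUEST; the function names and the example sequence are hypothetical, and only the cleavage rule and the two mass shifts come from the text.

```python
MET_OXIDATION = 15.99   # Da, oxidized methionine (value from the text)
NEM_CYSTEINE = 125.12   # Da, N-ethylmaleimide on cysteine (value from the text)


def tryptic_peptides(sequence, missed_cleavages=1):
    """Cut after every K or R, allowing up to `missed_cleavages` internal sites."""
    pieces, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            pieces.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        pieces.append(sequence[start:])          # C-terminal remainder
    peptides = []
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(pieces[i:j + 1]))
    return peptides


def met_containing(peptides):
    """Keep only peptides with at least one methionine."""
    return [p for p in peptides if "M" in p]


def mass_shift(n_oxidized_met, n_nem_cys):
    """Total mass added by the two dynamic modifications."""
    return n_oxidized_met * MET_OXIDATION + n_nem_cys * NEM_CYSTEINE


if __name__ == "__main__":
    peptides = tryptic_peptides("AMKCGRTT")      # hypothetical sequence
    print(peptides)
    print(met_containing(peptides))
    print(mass_shift(1, 1))
```

Restricting the downstream comparison to Met-containing peptides mirrors the filtering step described in the text, since only those peptides can report on methionine oxidation.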
Oxidation of Met residues to Met-O by NaOCl causes a hydrophilic shift, which influences their retention time and makes them elute later (4–8 min) than their reduced counterparts on a HILIC column. If these Met-O are reduced by MsrP, they will then show a hydrophobic shift and elute at the same retention time on the HILIC column as in the control sample. By comparing the retention times and the number of peptide spectral matches of the Met-O-containing peptides in a periplasmic extract under three experimental conditions (control, oxidized by NaOCl with and without MsrP), one can identify ‘bona fide’ potential MsrP substrates. TP1004 cells harbouring plasmid pAG178 and overexpressing MsrP-His protein were grown aerobically at 30 °C in terrific broth (Sigma-Aldrich) supplemented with sodium molybdate (1.5 mM) and ampicillin (200 μg ml−1). When cells reached A = 0.8, expression was induced with 0.1 mM IPTG for 3 h. Periplasmic proteins were then extracted as in ref. 32. MsrP-His was then purified by loading the periplasmic extract on a 1 ml HisTrap FF column (GE Healthcare) equilibrated with buffer A (NaPi 50 mM, pH 8.0, NaCl 300 mM). After washing the column with buffer A, MsrP-His was eluted by applying a linear gradient of imidazole (from 0 to 300 mM) in buffer A. The fractions containing MsrP-His were pooled, concentrated using a 5 kDa cutoff Vivaspin 15 (Sartorius) device and de-salted on a PD-10 column (GE Healthcare) equilibrated with 50 mM NaPi, pH 8.0, 150 mM NaCl. VU1 CaM, MsrA and MsrB were expressed and purified as described previously34, 35. Trx was expressed and purified as follows. BL21 (DE3) cells harbouring plasmid pMD205, overexpressing Trx with a carboxy (C)-terminal His tag, were grown aerobically at 37 °C in LB supplemented with kanamycin (50 μg ml−1). Expression was induced at A = 0.6 with 1 mM IPTG for 3 h.
Cells were then pelleted, resuspended in buffer A (NaPi 50 mM, pH 8.0, NaCl 300 mM) and disrupted by two passes through a French pressure cell at 12,000 psi. The lysate was then centrifuged at 30,000g at 4 °C for 45 min to remove cell debris, and Trx was purified as described for MsrP-His. Ni-NTA-purified Trx was then loaded on a 120 ml HiLoad 16/60 Superdex 75 PG column (GE Healthcare) previously equilibrated with HEPES–KOH 50 mM, pH 7.4, NaCl 100 mM. The resulting Trx-containing fractions were pooled and concentrated using a 5 kDa cutoff Vivaspin 15 device. Thioredoxin reductase (TrxR) was expressed and purified as follows. BL21 (DE3) cells harbouring plasmid pPL223-2, overexpressing TrxR with an amino (N)-terminal His tag, were grown aerobically at 37 °C in LB supplemented with ampicillin (200 μg ml−1). Expression was induced at A = 0.6 with 1 mM IPTG for 3 h. Protein extraction was performed as described for Trx and purification was performed as described for MsrP-His. BL21 (DE3) cells harbouring plasmid pKD11, overexpressing SurA with a C-terminal His tag, were grown aerobically at 37 °C in LB supplemented with kanamycin (50 μg ml−1). Expression was induced at A = 0.6 with 1 mM IPTG for 3 h. Protein extraction and purification were performed as described for MsrP-His. MG1655 cells harbouring plasmid pKD84, overexpressing SurA with a C-terminal Strep-tag, were grown aerobically at 37 °C in LB supplemented with ampicillin (200 μg ml−1). Expression was induced at A = 0.7 with a final concentration of 200 μg l−1 anhydrotetracycline (AHT) for 5 h. Protein extraction was performed as described for MsrP-His. SurA-Strep was then purified by loading the periplasmic extract on a 5 ml Strep-Tactin Superflow cartridge H-PR (IBA) equilibrated with buffer A (Tris-HCl 100 mM, pH 8.0, NaCl 150 mM, EDTA 1 mM). After washing the column with buffer A, SurA-Strep was eluted by applying a linear gradient of desthiobiotin (from 0 to 2.5 mM) in buffer A.
The fractions containing SurA-Strep were pooled, concentrated using a 5 kDa cutoff Vivaspin 15 (Sartorius) device and de-salted on a PD-10 column (GE Healthcare) equilibrated with 50 mM NaPi, pH 8.0, 150 mM NaCl. A modified version of Pal lacking the signal sequence and in which the first cysteine of the lipobox was replaced by an alanine (Pal ) was expressed with an N-terminal His tag from the pEB0513 vector in BL21 (DE3) cells. Cells were grown aerobically at 37 °C in LB supplemented with ampicillin (200 μg ml−1). Expression was induced at A = 0.6 with 1 mM IPTG for 3 h. Protein extraction was performed as described for Trx and purification was performed as described for MsrP-His. CaM was oxidized in vitro as described previously36. SurA-His and Pal were oxidized in vitro by incubating the purified proteins (50 μM) for 2 h 30 min at 30 °C with 100 mM H₂O₂ in a buffer containing 50 mM NaPi, pH 8.0, 50 mM NaCl. H₂O₂ was then removed by gel filtration using a NAP-5 column (GE Healthcare) equilibrated with 50 mM NaPi, pH 8.0, 150 mM NaCl. In vitro repair of oxidized CaM (CaMox), SurA (SurA ox) and Pal (Pal ox) was assessed by incubating the oxidized proteins (2 μM of CaMox and SurA ox, 5 μM of Pal ox) with purified MsrP-His (2 μM for CaMox and SurA ox, 5 μM for Pal ox), 10 mM benzyl viologen and an excess of sodium dithionite at 37 °C for 1 h. As controls, the oxidized proteins were incubated separately with either MsrP-His or the inorganic reducing system (benzyl viologen and sodium dithionite). The reactions were stopped by adding SB buffer and heating at 95 °C for the CaM and SurA samples or by adding 0.1% trifluoroacetic acid for the Pal samples. The CaM and SurA samples were then loaded on an SDS–PAGE gel and the proteins visualized with the PageBlue Protein Staining Solution (Fermentas).
For the Pal samples (20 μg), proteins were separated by reverse-phase high-performance liquid chromatography on a C4 column (Vydac 214TP54, 4.6 mm × 250 mm) at a flow rate of 400 μl min−1 with a linear gradient of acetonitrile in 0.1% trifluoroacetic acid (0–70% acetonitrile in 90 min). Absorbance was monitored at 214 nm and the peaks were collected. The fractions were dried in a Speedvac and the proteins resuspended in 25 μl of 100 mM NH₄HCO₃ before overnight digestion at 30 °C with 0.5 μg of trypsin or EndoGlu-C. The peptides were then analysed as described below. For CaM and SurA, the gel bands corresponding to the different oxidation states were in-gel digested with trypsin and the resulting peptides analysed by LC–MS/MS on a C18 reverse-phase column as described above. Relative abundances of every Met-containing peptide in its different oxidation states were obtained by integration of peak area intensities, taking into account the extracted ion chromatogram of both doubly and triply charged ions. The in vivo repair of SurA ox and Pal ox by the MsrPQ system or MsrP alone expressed from plasmids pAG195 and pAG192, respectively, was performed as follows. Overnight cultures of AG233 (containing the empty pAG177 vector), AG234 (containing the pAG195 plasmid) and AG289 (containing the pAG192 plasmid) were diluted to A = 0.04 into fresh LB medium (100 ml) and cells were grown aerobically at 37 °C in the presence of 0.1 mM IPTG and 200 μg ml−1 ampicillin. At A = 0.5, cells were subjected to NaOCl treatment (3.5 mM) and protein synthesis was blocked by the addition of chloramphenicol (300 μg ml−1). Samples were taken at different time points after NaOCl addition and precipitated with TCA. The pellets were then washed with ice-cold acetone, suspended in SB buffer, heated at 95 °C and loaded on an SDS–PAGE gel for immunoblot analysis using anti-Pal37 and anti-SurA antibodies. The specificity of the anti-SurA antibody was verified (Supplementary Fig. 6).
The protein amounts loaded were standardized by taking into account the A values of the cultures. SurA-Strep was oxidized in vitro by incubating the purified protein (200 μM) for 3 h at 30 °C with 100 mM H₂O₂ in a buffer containing 50 mM NaPi, pH 8.0, 150 mM NaCl. H₂O₂ was then removed by gel filtration using a NAP-5 column (GE Healthcare) equilibrated with 50 mM NaPi, pH 8.0, 150 mM NaCl. For the in vitro repair of oxidized SurA (SurA ox), the oxidized protein (30 μM) was incubated with purified MsrP-His (30 μM), 10 mM benzyl viologen and 10 mM of sodium dithionite at 37 °C for 1 h. Following repair, SurA was purified by passing the sample through a gravity flow column containing 200 μl Strep-Tactin Sepharose beads (from a 50% suspension, IBA), previously equilibrated with buffer A (Tris-HCl 100 mM, pH 8.0, NaCl 150 mM, EDTA 1 mM). After washing with buffer A, repaired SurA was eluted using buffer A containing 2.5 mM desthiobiotin. The elution fractions were pooled and submitted to buffer exchange using a NAP-5 column (GE Healthcare) equilibrated with 50 mM NaPi, pH 8.0, 150 mM NaCl. To check for the correct oxidation, repair and purification of SurA, samples were loaded on an SDS–PAGE gel and the proteins visualized with the PageBlue Protein Staining Solution (Fermentas). The ability of SurA to act as a chaperone preventing the thermal aggregation of citrate synthase (Sigma, reference C3260) was assessed as follows. The aggregation of citrate synthase (0.15 μM) was monitored at 43 °C in 40 mM HEPES–KOH, pH 7.5, in the absence or in the presence of 0.6 μM SurA, SurA ox or MsrP-repaired SurA ox using light-scattering measurements. To avoid effects that might have been caused by the protein buffer, all samples were added to the assay in constant volume. SurA ox and MsrP-repaired SurA ox were obtained as described above.
Light-scattering measurements were made using a Varian Cary Eclipse spectrofluorometer with both excitation and emission wavelengths set to 500 nm at a spectral bandwidth of 2.5 nm. Data points were recorded every 0.1 s. The ability of various E. coli strains (BE100, JB08, CH193, BE104) to assimilate Met-O was assessed on M9 minimal medium supplemented with either Met or Met-O at 20 μg ml−1. Plates were incubated at 37 °C for 72 h. Overnight cultures of strains AG272, AG273, AG279 and AG274 were diluted to A = 0.04 into fresh M63 minimal medium (100 ml) supplemented with 0.5% glycerol, 150 μg ml−1 of each amino acid, 1 mM MgSO₄, 1 mM MoNa₂O₄, 17 μM Fe₂(SO₄)₃, vitamins (thiamine 10 μg ml−1, biotin 1 μg ml−1, riboflavin 10 μg ml−1, and nicotinamide 10 μg ml−1) and 100 μg ml−1 spectinomycin, and grown aerobically at 37 °C. When A reached 0.5, cells (5 ml) were washed three times with M63 medium containing 150 μg ml−1 Met-O instead of methionine, and serially diluted in the same medium. Five microlitres of each dilution were then spotted on M63 plates containing either Met or Met-O at 150 μg ml−1, and plates were subsequently incubated at 37 °C for 40 h. The msrP::lacZ-containing strains (CH183, CH186 and CH187) were grown at 37 °C with shaking in M9 minimal medium. When cells reached A ≈ 0.2, cultures were split into two plastic tubes, one of them containing HOCl (200 μM). These tubes were then incubated with an inclination of 90° with shaking at 37 °C. After 30 min of incubation, 1 ml was harvested and the bacteria were resuspended in 1 ml of β-galactosidase buffer. Levels of β-galactosidase were measured as described38.
NR744, NR745, CH0127 and AG190 cells were grown aerobically at 37 °C with shaking in 50 ml of LB medium in 500 ml flasks. When cells reached A ≈ 0.45, 5 ml samples were transferred to conical polypropylene centrifuge tubes (50 ml; Sarstedt) and HOCl (2 mM) was added. Cells were then incubated at 37 °C with shaking (150 r.p.m.) at 90° inclination. Samples were taken at various time points after stress, diluted in PBS buffer, spotted on LB agar and incubated at 37 °C for 16 h. Cell survival was determined by counting colony-forming units (c.f.u.) per millilitre. The absolute c.f.u. at time-point 0 (used as 100%) was ~10⁸ cells per millilitre in all experiments. For strains CH194, CH196 and CH197, the same protocol was used with chloramphenicol (25 μg ml−1) and arabinose (0.2%) added to the cultures. Cells (MG1655 and BE107) were grown at 37 °C with shaking in 10 ml of LB (in 100 ml flasks). When cells reached A ≈ 0.8, 5 ml samples were transferred to conical polypropylene centrifuge tubes (50 ml, Sarstedt) and HOCl (2 mM) was added. After 5 min of incubation, samples were taken and diluted in PBS buffer to ~2 × 10³ cells per millilitre. Aliquots (100 μl) were then spread on LB agar plates containing SDS (1%). Colonies were counted the next day. A non-redundant local protein database containing 1,342 complete prokaryotic proteomes available in NCBI (http://www.ncbi.nlm.nih.gov/) as of 30 July 2014 was built. This database was queried with the BlastP program (default parameters)39, using YedY (NP_416480) and YedZ (NP_416481) of E. coli strain K-12 substr. MG1655 as a seed. Distinction between homologous and non-homologous sequences was assessed by visual inspection of each BlastP output (no arbitrary cut-off on the E value or score). To ensure that we did not overlook divergent YedY or YedZ proteins, iterative BlastP queries were performed using homologues identified at each step as new seeds. The list of YedY and YedZ homologues is provided in Supplementary Data 1. The retrieved sequences were aligned using MAFFT version 7 (default parameters40; Supplementary Data 2 and 3). Each alignment was visually inspected and manually refined when necessary using the ED program from the MUST package41.
Regions where the homology between amino-acid positions was doubtful were removed by using BMGE software (BLOSUM30 similarity matrix42). For each homologue, the genomic context was investigated using MGcV (Microbial Genomic context Viewer43). The domain composition and protein location of each homologue was also analysed using pfam version 27.0 (ref. 44), SignalP version 4.1 (ref. 45) and TMHMM server version 2.0 (ref. 46), respectively. For the YedY protein, preliminary phylogenetic analysis used FastTree version 2 and a gamma distribution with four categories47. On the basis of the resulting tree, the subfamily containing the sequence from E. coli was identified and selected for further phylogenetic investigations. The corresponding sequences were realigned using MAFFT version 7. The resulting alignment was trimmed with BMGE as previously described. Maximum likelihood trees were computed using PHYML version 3.1 (ref. 48) with the Le and Gascuel model (amino-acid frequencies estimated from the data set) and a gamma distribution (four discrete categories of sites and an estimated alpha parameter) to take into account variations in evolutionary rate across sites. Branch robustness was estimated by the non-parametric bootstrap procedure implemented in PhyML (100 replicates of the original data set with the same parameters). Bayesian inferences were performed using MrBayes 3.2 (ref. 49) with a mixed model of amino-acid substitution including a gamma distribution (four discrete categories). MrBayes was run with four chains for one million generations and trees were sampled every 100 generations. To construct the consensus tree, the first 2,000 trees were discarded as ‘burn in’.
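The Bayesian sampling figures quoted above imply a particular burn-in proportion, which is easy to check. The sketch below is only bookkeeping under stated assumptions: it counts trees for a single run, ignores the starting tree that some programs also write at generation 0, and uses a hypothetical function name. The number of independent runs is not stated in the text.

```python
from typing import Tuple


def burn_in_summary(generations: int, sample_every: int, burn_in_trees: int) -> Tuple[int, int, float]:
    """Return (trees sampled, trees kept, burn-in fraction) for one run."""
    sampled = generations // sample_every
    if burn_in_trees > sampled:
        raise ValueError("burn-in exceeds the number of sampled trees")
    kept = sampled - burn_in_trees
    return sampled, kept, burn_in_trees / sampled


# Values from the text: one million generations, sampling every 100, 2,000 trees discarded.
sampled, kept, fraction = burn_in_summary(1_000_000, 100, 2_000)
print(sampled, kept, fraction)
```

Under these assumptions the consensus tree rests on 8,000 of 10,000 sampled trees per run, that is, a 20% burn-in.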

No statistical methods were used to predetermine sample size. The investigators were not blinded to allocation during experiments and outcome assessment. Salmonella enterica serovar Typhimurium strain SL1344 constitutively expressing GFP from a chromosomal locus (strain JVS-3858) was previously described51 and is referred to as wild type throughout this study. The complete list of bacterial strains used in this study is provided in Supplementary Table 1. Routinely, bacteria were grown in Lennox broth (LB) medium at 37 °C with shaking at 220 r.p.m. When appropriate, 100 μg ml−1 ampicillin (Amp), 50 μg ml−1 kanamycin (Kan), or 20 μg ml−1 chloramphenicol (Cm) (final concentrations) were added to the liquid medium or agar plates. Chromosomal mutagenesis of Salmonella SL1344 was performed as previously described52. To construct a non-polar pinT mutant strain (YCS-034, GFP−; or JVS-10038, GFP+), the first ~60 nt of the gene were removed and replaced by a resistance cassette, while keeping the Rho-independent terminator intact. Then, the resistance cassette was eliminated using the FLP helper plasmid pCP20 at 42 °C52. All mutations were transduced into the wild-type background using P22 phage53. For plasmid transformation the respective Salmonella strains were electroporated with ~10 ng of DNA.
The following cell lines were used in this study: human cervix carcinoma cells (HeLa-S3; ATCC CCL-2.2), human epithelial colorectal adenocarcinoma cells (CaCo-2; ATCC HTB-37), human epithelial colorectal adenocarcinoma cells (HT29; DSMZ No. ACC-299), human stomach adenocarcinoma cells (AGS; ATCC CRL-1739), human epithelial colon metastatic cells (LoVo; ATCC CCL-229), human embryonic kidney 293 cells (HEK293; ATCC CRL-1573), human monocytic cells (THP-1; ATCC TIB-202), murine fibroblast cells (L929; ATCC CCL-1), murine embryonic fibroblast cells (MEF; ATCC SCRC-1040), mouse leukaemic monocyte/macrophage cells (RAW264.7; ATCC TIB-71), porcine intestinal epithelial cells (IPEC-J2)54, porcine macrophage-like cells (3D4/31)55. HeLa-S3, CaCo-2, THP-1, HEK293, RAW264.7 and MEF cells were obtained from the group of Thomas Rudel (Biocentre, Würzburg). AGS cells were provided by Cynthia Sharma (Research Center for Infectious Diseases, Würzburg). L929 cells were obtained from Thomas Meyer (Max Planck Institute for Infection Biology, Berlin). HT29, LoVo, IPEC-J2 and 3D4/31 cells were provided by Karsten Tedin (Centre for Infection Medicine, Berlin). Cell lines have not been authenticated in our laboratory, but were routinely tested for mycoplasma contamination (MycoAlert Mycoplasma Detection Kit, Lonza). HeLa-S3 cells were cultured according to the guidelines provided by the ENCODE consortium (http://genome.ucsc.edu/encode/protocols/cell/human/Stam_15_protocols.pdf). Briefly, cells were grown in DMEM (Gibco) supplemented with 10% fetal calf serum (FCS; Biochrom), 2 mM l-glutamine (Gibco) and 1 mM sodium pyruvate (Gibco) in T-75 flasks (Corning) in a 5% CO₂, humidified atmosphere, at 37 °C. Further cell lines used in this study (THP-1, CaCo-2, AGS, HT29, LoVo, HEK293, MEF, L929, RAW264.7, IPEC-J2 and 3D4/31) were cultured in RPMI (Gibco) supplemented with 10% FCS, 2 mM l-glutamine, 1 mM sodium pyruvate and 0.5% β-mercaptoethanol (Gibco) in a 5% CO₂, humidified atmosphere, at 37 °C.
To differentiate THP-1 monocytes, seeded cells (1 × 10⁶ cells per well; six-well format) were treated with 50 ng ml−1 (final concentration) of phorbol 12-myristate 13-acetate (PMA) (Sigma) for 72 h (after 48 h fresh PMA at the same concentration was added to the culture). For the differentiation of murine bone marrow derived macrophages (BMDMs), the marrow of femur and tibia was isolated from 8–12-week-old female C57BL/6 wild-type mice and stored in RPMI supplemented with 10% FCS. The cell suspension was centrifuged for 5 min at 250g and the leukocyte pellet was resuspended in differentiation medium consisting of X-vivo-15 medium (Lonza) supplemented with 10% FCS and 10% L929-conditioned DMEM medium (same composition as above). Cells were cultured at 3 × 10⁶ cells per 10 ml in a T-75 flask. At day 3, another 3 ml of differentiation medium were added and cells were further cultured until day 5. Successful macrophage differentiation was validated by microscopy before the cells were detached using a rubber scraper (Sarstedt) and seeded into six-well plates at 10⁵ cells per well in fresh differentiation medium. Infection was carried out on day 7 as described below. In vitro infection of HeLa-S3 cells was carried out following a previously published protocol56 with slight modifications. Two days before infection 2 × 10⁵ HeLa-S3 cells were seeded in 2 ml complete DMEM (six-well format). Overnight cultures of Salmonella were diluted 1:100 in fresh LB medium and grown aerobically to an OD of 2.0. Bacterial cells were harvested by centrifugation (2 min at 12,000 r.p.m., room temperature) and resuspended in DMEM. Infection of HeLa-S3 cells was carried out by adding the bacterial suspension directly to each well. If not mentioned otherwise, infections were performed at a multiplicity of infection (m.o.i.) of 5.
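As a worked illustration of the multiplicity of infection, the volume of bacterial suspension to add per well follows from the number of host cells and the density of the suspension. This is a minimal sketch with hypothetical inputs: neither the host cell count at the time of infection nor the density of the resuspended bacteria is given in the text, so both are placeholders to be replaced by measured values.

```python
def inoculum_volume_ul(host_cells: float, moi: float, bacteria_per_ml: float) -> float:
    """Volume (μl) of bacterial suspension that delivers `moi` bacteria per host cell."""
    if bacteria_per_ml <= 0:
        raise ValueError("bacterial density must be positive")
    bacteria_needed = host_cells * moi
    return bacteria_needed / bacteria_per_ml * 1000.0  # ml -> μl


# Hypothetical well: 5e5 host cells at infection, suspension at 1e9 bacteria per ml.
volume = inoculum_volume_ul(host_cells=5e5, moi=5, bacteria_per_ml=1e9)
print(f"{volume:.2f} μl per well")
```

In practice the suspension would usually be diluted first so that the pipetted volume is large enough to dispense accurately.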
Immediately after addition of bacteria, the plates were centrifuged for 10 min at 250g at room temperature followed by 30 min incubation in 5% CO₂, humidified atmosphere, at 37 °C. Medium was then replaced with gentamicin-containing DMEM (final concentration: 50 μg ml−1) to kill extracellular bacteria. After a further 30 min incubation step, medium was again replaced by fresh DMEM containing 10 μg ml−1 of gentamicin, and incubated for the remainder of the experiment. Time point 0 was defined as the time when gentamicin was first added to the cells. Further cell types were infected as described for HeLa-S3 cells except that infection was carried out in RPMI medium and that infection was with an m.o.i. of 10 (THP-1, CaCo-2, HT29, AGS, HEK293, MEF, L929 and RAW264.7) or 20 (IPEC-J2, 3D4/31), respectively. Infection of BMDMs was carried out with an m.o.i. of 10 and using X-vivo-15 medium (10% fetal calf serum, 10% L929-conditioned medium). For microscopy, infection was carried out as described above, except that HeLa-S3 cells had been seeded onto coverslips (24-well format). At the respective time point, coverslips with infected HeLa-S3 were washed twice with PBS (Gibco) and fixed in 4% paraformaldehyde (PFA) for 15 min in a wet chamber. After two additional PBS washing steps, cells were stained with Hoechst 33342 (Invitrogen; diluted 1:5,000 in PBS) for 15 min in a wet chamber and again washed twice with PBS. After coverslips had been air-dried, they were embedded in Vectashield Mounting Medium (Biozol) and analysed using the Leica SP5 confocal microscope (Leica) and the LAS AF Lite software (Leica). To stain human mitochondria, MitoTracker Orange CMTMRos (Life Technologies; kindly provided by V. Kozjak-Pavlovic, Biocentre, Würzburg) was used. The dye was added in the dark to a final concentration of 200 nM directly into the medium of the infected cells in the 37 °C incubator, 30 min before their harvest.
After the 30 min incubation with the dye, the plates were covered with aluminium foil to prevent bleaching during the following steps. The supernatant was aspirated and the cells were washed with PBS and fixed with 4% PFA at 4 °C overnight. Hoechst staining and sample preparation were performed as described above. For flow cytometry-based analyses, infected cultures were washed twice with PBS, detached from the bottom of the plate by trypsinization and resuspended in complete DMEM. Upon pelleting the cells (5 min at 250g, room temperature), they were resuspended in PBS and analysed by flow cytometry using a FACSCalibur instrument (BD Biosciences) and the Cyflogic (CyFlo Ltd; version 1.2.1) or Flowing (Cell Imaging Core, Turku Centre for Biotechnology, Finland; version 2.5.0) software, respectively. Selection of intact HeLa-S3 cells was achieved by gating based on cell diameter (forward-scatter) and granularity (side-scatter) (linear scale). Of those, infected (GFP-positive) and non-infected (GFP-negative) sub-fractions were defined based on GFP signal intensity (FITC channel) versus auto-fluorescence (PE channel) (logarithmic scale). For cell sorting, RNAlater-fixed cells (see below) were first passed through MACS Pre-Separation Filters (30 μm exclusion size; Miltenyi Biotec) and then analysed and sorted using the FACSAria III device (BD Biosciences) at 4 °C (cooling both the input tube holder and the collection tube rack) and at a medium flow rate using the same gating strategy as described above, except that the gates for GFP-positive and GFP-negative fractions were conservative in order to prevent cross-contamination (as exemplified in Extended Data Fig. 1d). Typically ~2 × 10⁵ cells of each fraction were collected for RNA isolation. To detect apoptotic cells, HeLa-S3 cells were washed twice with PBS and resuspended in 1× binding buffer (BD Pharmingen) to a concentration of 10⁶ cells per ml.
100 μl of this cell suspension were mixed with 5 μl of APC-labelled annexin V (BD Pharmingen) and 1 μl of 500 mg ml−1 propidium iodide (PI; lyophilized stock from Sigma). Upon incubation for 15 min at room temperature, (light-protected) cells were subjected to flow cytometry using the MACSQuant Analyzer (Miltenyi Biotec). Upon gating of the fraction of intact cells based on cell diameter (forward-scatter) and granularity (side-scatter), the annexin-positive/PI-negative sub-population was determined by comparison against the appropriate single-stained controls in the APC vs PerCP channels, and quantified. Necrosis was evaluated by quantifying released lactate dehydrogenase (LDH) via the Cytotox96 assay (Promega) according to the manufacturer’s instructions. The absorbance at 490 nm was measured using a Multiskan Ascent instrument (Thermo Fisher). In order to convert the measured absorbance values into the relative proportion of dead cells, the maximal absorbance was determined by using 1× lysis solution (Promega) following the manufacturer’s instructions and referred to as 100% cytotoxicity. For both apoptosis and cytotoxicity measurements each biological replicate comprised three technical replicates. To quantify bacterial intracellular replication (Extended Data Fig. 1b), infected host cells were analysed by flow cytometry as described above, except that the increase in GFP intensity (geometric mean) was measured in the GFP-positive sub-population over time and normalized to that of the non-infected population in the same sample (example in Extended Data Fig. 1c). Alternatively, infected HeLa-S3 cultures were solubilized with PBS containing 0.1% Triton X-100 (Gibco) at the respective time points. Cell lysates were serially diluted in PBS, plated onto LB plates and incubated at 37 °C overnight. The number of colony forming units (c.f.u.) recovered was compared to that obtained from the bacterial input solution used for infection. 
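The plating-based readout above reduces to two small calculations: converting colony counts to c.f.u. per millilitre, and expressing the recovered c.f.u. relative to the input. The following sketch assumes a hypothetical plated volume and dilution factor (neither is stated in the text) and averages three technical replicates, as the text describes for each biological replicate.

```python
from statistics import mean
from typing import Sequence


def cfu_per_ml(colonies: int, dilution_factor: float, plated_volume_ml: float) -> float:
    """c.f.u. per ml of the undiluted sample from one counted plate."""
    return colonies * dilution_factor / plated_volume_ml


def mean_cfu_per_ml(counts: Sequence[int], dilution_factor: float, plated_volume_ml: float) -> float:
    """Average over technical replicates plated at the same dilution."""
    return mean(cfu_per_ml(c, dilution_factor, plated_volume_ml) for c in counts)


def recovery_fraction(recovered_cfu_per_ml: float, input_cfu_per_ml: float) -> float:
    """Recovered c.f.u. relative to the bacterial input used for infection."""
    return recovered_cfu_per_ml / input_cfu_per_ml


# Hypothetical counts: three plates of a 1:1,000 dilution, 0.1 ml plated each.
recovered = mean_cfu_per_ml([40, 44, 42], dilution_factor=1e3, plated_volume_ml=0.1)
print(recovered, recovery_fraction(recovered, input_cfu_per_ml=2.1e6))
```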
In all cases, each biological replicate comprised three technical replicates. Infected cells were washed twice with PBS, trypsinized and pelleted. For ethanol fixations, cell pellets were re-dissolved in 0.1 volume of ice-cold PBS and then 0.9 volume of ice-cold ethanol (either 70% or 100%; as indicated) were added in single droplets during shaking (400 r.p.m., 4 °C) to avoid cell clumping. Fixation using stop solution (95% EtOH/5% water-saturated phenol)57 was performed by resuspending the cell pellet in PBS before the addition of 0.2 volume of stop solution and mixing. When PFA was used, the pellet was resuspended in the respective PFA concentration (0.5% or 4% PFA, pH 7.4, with or without 4% sucrose) and shaken for 15 min at 400 r.p.m., room temperature. PFA-induced crosslinks were reverted by an additional heating step for 15 min at 70 °C (refs 58, 59). For fixation with RNAlater (Qiagen), cell pellets were directly resuspended in RNAlater (1 ml per 5 × 10⁶ cells). For systematic evaluation of different fixation protocols (Extended Data Fig. 1e–g), fixed cells had not been sorted but were either directly analysed upon fixation (30 min) or stored at −20 °C (ethanol-based fixatives) or 4 °C (others), respectively, overnight. To prepare RNAlater-fixed samples for sorting, tubes containing ~5 × 10⁶ fixed cells were filled up with 10 ml of ice-cold PBS, centrifuged (5 min, 500g, 4 °C) and cell pellets resuspended in 2 ml of cold PBS. This cell suspension was filtered and sorted (as described above). In the dual RNA-seq experiments, as a reference for gene expression changes in host cells upon infection, a non-infected yet mock-treated control was included. The bacterial reference samples were derived from Salmonella grown in LB to an OD of 2.0, which either were then shifted to DMEM for 15 min, pelleted and fixed in RNAlater (see above) or were fixed directly (that is, without a medium exchange step) as indicated.
Fixed Salmonella cells were pelleted and lysed using the lysis/binding buffer of the mirVana kit (Ambion). In order to maintain the approximate ratio of bacterial to host transcripts during RNA isolation, Salmonella lysates were mixed with host cell lysate in a way that the calculated proportion of individual Salmonella cells per infected host cell at the latest time point (see Extended Data Fig. 1h) was matched. The resulting mixture was then processed collectively. RNA was extracted from cells using the mirVana kit (Ambion) following the manufacturer’s instructions for total RNA isolation. To remove contaminating genomic DNA, samples were treated with 0.25 U of DNase I (Fermentas) per 1 μg of RNA for 45 min at 37 °C. If applicable, RNA quality was checked on the Agilent 2100 Bioanalyzer (Agilent Technologies). For qRT–PCR experiments total RNA was isolated using the TRIzol LS reagent (Invitrogen) according to the manufacturer’s recommendations and treated with DNase I (Fermentas) as described above. qRT–PCR was performed with the Power SYBR Green RNA-to-CT 1-Step kit (Applied Biosystems) according to the manufacturer’s instructions. Fold changes were determined using the 2^(−ΔΔCt) method60. Primer sequences are given in Supplementary Table 1 and their specificity had been confirmed using Primer-BLAST (NCBI). For the estimation of Salmonella RNA within infection samples (Extended Data Fig. 1h), a dilution series of separately isolated Salmonella and HeLa-S3 total RNA was set up and in each case the ratio of rfaH/ACTB mRNAs was determined. The same was done for biological samples from infected cells as well as for the Salmonella reference controls. From the resulting trend-line equation the approximate proportion of the Salmonella transcriptome within mixed prokaryotic and eukaryotic total RNA samples could be deduced.
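The comparative threshold-cycle calculation used for the qRT–PCR fold changes can be written out explicitly. The sketch below implements the standard 2^(−ΔΔCt) formula, which assumes roughly equal and near-100% amplification efficiency for the target and reference amplicons; the Ct values in the example are invented for illustration and the function name is hypothetical.

```python
def fold_change_ddct(
    ct_target_sample: float,
    ct_reference_sample: float,
    ct_target_control: float,
    ct_reference_control: float,
) -> float:
    """Relative expression of a target gene by the 2^(-ΔΔCt) method."""
    delta_ct_sample = ct_target_sample - ct_reference_sample      # normalize to reference gene
    delta_ct_control = ct_target_control - ct_reference_control
    delta_delta_ct = delta_ct_sample - delta_ct_control            # compare with control condition
    return 2.0 ** (-delta_delta_ct)


# Invented Ct values: the target crosses threshold 3 cycles earlier in the sample
# than in the control, with an unchanged reference gene.
print(fold_change_ddct(22.0, 18.0, 25.0, 18.0))
```

A lower Ct means more template, so a negative ΔΔCt corresponds to upregulation; the example evaluates to an eightfold increase.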
Where indicated (Supplementary Table 1), Salmonella and eukaryotic host rRNA were removed using the Ribo-Zero Magnetic Gold Kit (Epidemiology) purchased from Epicentre/Illumina. Following the manufacturer’s instructions, ~500 ng of total, DNase-I-treated RNA from infection samples was used as an input to the ribosomal transcript removal procedure. rRNA-depleted RNA was precipitated in ethanol for 3 h at −20 °C. cDNA libraries for Illumina sequencing were generated by Vertis Biotechnologie AG, Freising-Weihenstephan, Germany. For dual RNA-seq of total RNA, at least 100 ng RNA were used for cDNA library preparation. DNase-I-treated total RNA samples were first sheared via ultra-sound sonication (4 pulses of 30 s at 4 °C each) to generate ~200–400 bp (average) fragmentation products. Fragments <20 nt were removed using the Agencourt RNAClean XP kit (Beckman Coulter Genomics). As an internal quality control for the pilot experiment (shown in Fig. 1), spike-in RNA (5′-AAAUCCGUUCGUACGGGCCC-3′; 5′-monophosphorylated and gel-purified) was added to a final concentration of 0.5%. The samples were poly(A)-tailed using poly(A) polymerase and the 5′ triphosphate (or eukaryotic 5′ cap) structures were removed using tobacco acid pyrophosphatase (TAP). Afterwards, an RNA adaptor was ligated to the 5′ monophosphate of the RNA fragments. First-strand cDNA synthesis was performed using an oligo(dT)-adaptor primer and the M-MLV reverse transcriptase (NEB). The resulting cDNA was PCR-amplified to about 20–30 ng μl−1 using a high fidelity DNA polymerase (barcode sequences for multiplexing were part of the 3′ primers). The cDNA library was purified using the Agencourt AMPure XP kit (Beckman Coulter Genomics) and analysed by capillary electrophoresis (Shimadzu MultiNA microchip electrophoresis system). cDNA libraries for dual RNA-seq on rRNA-depleted samples were constructed as described above, except for the following modifications. 
Upon RNA fragmentation, dephosphorylation with Antarctic Phosphatase (AP, NEB) and re-phosphorylation with T4 Polynucleotide Kinase (PNK, NEB) were performed. Oligonucleotide adapters were ligated to both the 5′ and 3′ ends of the RNA samples. First-strand cDNA synthesis was performed using M-MLV reverse transcriptase and the 3′ adaptor as primer. cDNA libraries from Salmonella-only samples were generated by fragmenting 5 μg of total RNA using ultrasound and RNAs <20 nt were removed using the Agencourt RNAClean XP kit (Beckman Coulter Genomics) as above. The RNA samples were poly(A)-tailed and 5′ppp structures were removed as before. RNA adapters were ligated to the 5′ monophosphate of the RNA and first-strand cDNA synthesis was performed using an oligo(dT)-adaptor primer and the M-MLV reverse transcriptase. The resulting cDNAs were PCR-amplified, purified using the Agencourt AMPure XP kit (Beckman Coulter Genomics) and analysed by capillary electrophoresis (Shimadzu MultiNA microchip). Generally, for sequencing cDNA samples were pooled in approximately equimolar amounts. The cDNA pool was size-fractionated in the size range of 150–600 bp using a differential clean-up with the Agencourt AMPure kit. For the dual RNA-seq pilot experiment (Fig. 1), single-end sequencing (100 cycles) was performed on an Illumina HiSeq 2000 machine at the Max Planck Genome Centre Cologne, Cologne, Germany. For dual RNA-seq on rRNA-free samples as well as for conventional RNA-seq of Salmonella-only samples, single-end sequencing (75 cycles) was performed on a NextSeq500 platform at Vertis Biotechnologie AG, Freising-Weihenstephan, Germany. All RNA-seq data discussed in this publication have been deposited in NCBI’s Gene Expression Omnibus and are accessible through GEO Series accession number GSE60144. For the accession numbers of individual experiments, see Supplementary Table 1. 
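Pooling libraries in approximately equimolar amounts, as mentioned above, requires converting each library's mass concentration into a molar one. The sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA; the library names, concentrations and mean fragment lengths are hypothetical, and the procedure actually used by the sequencing provider is not described in the text.

```python
from typing import Dict, Tuple

AVG_BP_MASS = 660.0  # g/mol per bp of double-stranded DNA (standard approximation)


def library_nM(ng_per_ul: float, mean_length_bp: float) -> float:
    """Molar concentration (nM, equivalently fmol per μl) of a dsDNA library."""
    return ng_per_ul * 1e6 / (AVG_BP_MASS * mean_length_bp)


def equimolar_volumes_ul(libraries: Dict[str, Tuple[float, float]], fmol_each: float) -> Dict[str, float]:
    """Volume (μl) of each library that contributes `fmol_each` to the pool."""
    return {
        name: fmol_each / library_nM(conc, length)
        for name, (conc, length) in libraries.items()
    }


# Hypothetical libraries: (concentration in ng per μl, mean fragment length in bp).
libraries = {"lib_A": (20.0, 300.0), "lib_B": (30.0, 300.0)}
for name, vol in equimolar_volumes_ul(libraries, fmol_each=100.0).items():
    print(f"{name}: {vol:.2f} μl")
```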
Total RNA prepared with TRIzol LS reagent (Invitrogen) was separated in 6% (vol/vol) polyacrylamide-8.3 M urea gels and blotted as described11. We loaded per lane either 5–10 μg of RNA from pure bacterial samples (Extended Data Figs 3d and 9a), 2 μg total RNA from sorted cell samples (Extended Data Fig. 8b), or 50 μg total RNA from unsorted infection samples (Fig. 2b). Hybond XL membranes (Amersham) were hybridized at 42°C with gene-specific [32P] end-labelled DNA oligonucleotides (see Supplementary Table 1 for sequences) in Hybri-Quick buffer (Carl Roth AG). The pinT promoter region was amplified by PCR using primers JVO-7036/-7037 and inserted via the AatII and NheI sites in the backbone of plasmid pAS093, resulting in plasmid pYC65. To identify the PhoP binding sites in a minimal fragment, the pinT promoter region was truncated by amplifying pYC65 using Phusion polymerase (NEB) with JVO-9393/-7387. The critical residues in the PhoP binding motif (T T ) were mutated to adenines by site-directed mutagenesis with JVO-12461/-12462 and Phusion polymerase (NEB). For pulse-expression of PinT in in vitro grown Salmonella, we used arabinose-induced overexpression of PinT from a pBAD plasmid previously described10, 51, 61 with minor modifications. Briefly, wild-type Salmonella that carried either a pKP8-35 (pBAD control), pYC5-34 (pBAD-PinT) or pYC60 (pBAD-PinT*) plasmid were grown overnight in LB and, the next day, the cultures were 1:100 diluted and further grown in LB to an OD of 2.0. l-arabinose (Sigma) was added to a final concentration of 0.2%; 5 min later RNA was extracted using TRIzol LS reagent (Invitrogen) and analysed by RNA-seq (~3–5 million reads/library). For the same experiment under SPI-2-inducing conditions, overnight cultures of the three strains were washed 2× with PBS and 1× with SPI-2 medium28, diluted 1:50 in SPI-2 medium and grown to an OD of 0.3 before PinT expression was induced as above. 
For the pulse-expression of PinT inside host cells (Extended Data Fig. 6d, e), HeLa-S3 cells were infected with the same three strains as above and, 4 h after infection, 0.2% l-arabinose was added directly to the DMEM medium. Activation of inducible sRNA expression in intracellular bacteria was confirmed by qRT–PCR over a time-course of 20 min (Extended Data Fig. 6d), which showed that full induction was already reached at 5 min. Thus, for Extended Data Fig. 6e the host cells were lysed at 5 min after induction with ice-cold 0.1% Triton X-100/PBS and further incubated for 30 min on ice, with occasional pipetting up and down to improve host cell lysis efficiency. Then the intact bacterial cells were pelleted by centrifugation for 2 min at 16,100g (4 °C) and resuspended in RNAlater (Qiagen). The fixed bacterial cells were further enriched against the host background via cell sorting (FACSAria III, BD Biosciences) and selective gating for the fraction of GFP+ bacterial cells released from their hosts. From those, total RNA was isolated and analysed by RNA-seq as above, except that sequencing was to a depth of ~20 million reads per library, as necessitated by remaining host-derived RNA fragments. Immunoblotting of Salmonella proteins was done as previously described62. Briefly, samples corresponding to 0.4 OD were taken from Salmonella in vitro cultures and centrifuged for 4 min at 16,100g at 4 °C, and pellets were resuspended in sample loading buffer to a final concentration of 0.01 OD per μl. After denaturation for 5 min at 95 °C, 0.05-OD equivalents of the sample were separated via SDS–PAGE. Gel-fractionated proteins were blotted for 90 min (0.2 mA per cm²; 4 °C) in a semi-dry blotter (Peqlab) onto a PVDF membrane (Perkin Elmer) in transfer buffer (25 mM Tris base, 190 mM glycine, 20% methanol). Blocking was for 1 h at room temperature in 10% dry milk/TBST20.
Appropriate primary antibodies (see Supplementary Table 1) were hybridized at 4 °C overnight and, following 3 × 10 min washing in TBST20, secondary antibodies (Supplementary Table 1) for 1 h at room temperature. For western blotting of human proteins, infected cells were harvested in sample loading buffer (500 μl per well; six-well format), transferred to 1.5 ml reaction tubes and boiled for 5 min at 95 °C, and 20 μl per lane was loaded onto a 10% PAA gel for SDS–PAGE as above. After blotting and blocking (as above), the membrane was probed with the respective primary antibody at 4 °C overnight and, upon washing (as above), with the secondary antibody for 1 h at room temperature (a full list with information about all antibodies and sera used is given in Supplementary Table 1). After three additional washing steps of 10 min each in TBST20, blots were developed using western lightning solution (Perkin Elmer) in a Fuji LAS-4000. In Fig. 3e, intensities of protein bands were quantified using the AIDA software (Raytest, Germany) and normalized to GroEL levels. To mimic the early stages of the infection of a host cell in vitro, the indicated Salmonella strains were grown in LB overnight, diluted 1:100 in LB and grown to an OD of 2.0 (that is, a condition under which SPI-1 is highly induced4, 11), washed twice with PBS and once with SPI-2 medium28 at room temperature, diluted 1:50 in pre-warmed SPI-2 medium (defined as t0) and grown further in Erlenmeyer flasks at 37 °C for the indicated time periods. At the respective time points, samples were taken for RNA-seq, western blotting, and GFP fluorescence measurements. To measure the GFP intensity of reporter strains, bacteria were grown in LB in the presence of Amp and Cm until an OD of 2.0 was reached. Salmonella cells corresponding to 1 OD were pelleted and fixed with 4% PFA. GFP fluorescence intensity was quantified for 100,000 events each by flow cytometry with the FACSCalibur instrument (BD Biosciences).
Data were analysed using the Cyflogic software (CyFlo). To monitor SPI-2 activation in real time, a transcriptional gfp reporter was constructed by inserting the SPI-2-dependent ssaG promoter into plasmid pAS0093 via AatII/NheI sites as previously described8. The resulting plasmid pYC104 was co-transformed with either the pBAD control or the pBAD-PinT plasmid into the indicated strain backgrounds. The resulting strains were grown overnight in LB (+Amp + Cm) and then diluted 1:100 and further grown in the same medium to an OD of 2.0. A volume of 1 ml of the culture was pelleted and the collected cells were shifted to SPI-2 medium28 (defined as t0) as described above, except that the growth experiment was conducted in 96-well plates (Nunc Microwell 96F, Thermo Scientific). After measuring the OD and GFP intensity at t0, l-arabinose was added to each well to a final concentration of 0.2% for sRNA induction and bacteria were grown for 20 h at 37 °C (with shaking), with both the OD and GFP fluorescence measured at 10-min intervals using the Infinite F200 PRO plate reader (Tecan). HeLa-S3 cells were infected with wild-type Salmonella, ΔpinT or pinT+ mutant strains at an m.o.i. of 5 as described above. Culture supernatant samples were taken at 20 h p.i. and analysed using the ELISA kit for human CXCL8/IL-8 (R&D Systems). Code availability. To document the details and parameters of the (dual) RNA-seq data analyses and to make the biocomputational approaches reproducible, we implemented the workflows as Unix shell scripts. These scripts are deposited at Zenodo (DOI: 10.5281/zenodo.34695, https://zenodo.org/record/34695). Please refer to Supplementary Table 1 for descriptions of the analyses. For all RNA-seq experiments listed in Supplementary Table 1, Illumina reads in FASTQ format were trimmed with a Phred quality score cut-off of 20 by the program fastq_quality_trimmer from FASTX toolkit version 0.0.13 (http://hannonlab.cshl.edu/fastx_toolkit/).
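As an illustration of what a Phred cut-off of 20 means in practice, the following minimal Python sketch removes low-quality bases from the 3′ end of a single read. It is not the FASTX implementation or the deposited shell workflow; the function name `trim_3prime` and the Phred+33 quality encoding are assumptions made for this example.

```python
from typing import Tuple


def trim_3prime(seq: str, qual: str, cutoff: int = 20, offset: int = 33) -> Tuple[str, str]:
    """Remove trailing bases whose Phred score is below `cutoff`.

    `qual` is the FASTQ quality string; `offset` = 33 assumes Phred+33
    encoding (an assumption, not stated in the text).
    """
    end = len(seq)
    # Walk back from the 3' end while the quality is below the cut-off.
    while end > 0 and ord(qual[end - 1]) - offset < cutoff:
        end -= 1
    return seq[:end], qual[:end]


if __name__ == "__main__":
    # 'I' encodes Phred 40, '#' encodes Phred 2 under Phred+33.
    print(trim_3prime("ACGTACGT", "IIIII###"))
```

A base with a score of exactly 20 is retained in this sketch, because only scores below the cut-off are trimmed.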
Reads shorter than 20 nt after adaptor- and poly(A)-trimming were discarded before the mapping. The reads were aligned to the Salmonella enterica SL1344 genome (NCBI RefSeq accession numbers: NC_016810.1, NC_017718.1, NC_017719.1, NC_017720.1) and—where applicable—the human (hg19 – GRCh37; retrieved from the 1000 Genomes Project63), the mouse (GENCODE M2, GRCm38.p2), or the porcine genome sequence (ENSEMBL, Sscrofa10.2), in parallel. The mapping was performed using the READemption pipeline (version 0.3.5)64 and the short read mapper segemehl and its remapper lack (version 0.2.0)65 allowing for split reads66. Mapped reads with an alignment accuracy <90% as well as cross-mapped reads, that is, reads which could be aligned equally well to both host and Salmonella reference sequences, were discarded. The resulting data were used for visualization (see for example, Fig. 1b and Extended Data Fig. 2b). Reads of the high resolution time-course experiment (cDNA libraries numbers 27–77 in Supplementary Table 1) that were detected as cross-mapped by READemption (see above) were further inspected: their median percentage over the entire time-course was 0.25% with increased fractions for the later time points, implying that those reads are mainly contributed by Salmonella cells. We observed that the majority of the cross-mapped reads aligned to Salmonella rRNA or tRNA loci, while on the human side no gene class preference was observed (data not shown). For dual RNA-seq experiments (cDNA libraries 1–184, 215–256 in Supplementary Table 1) after mapping differential expression analysis was carried out separately for the host and the pathogen. Strand-specific gene-wise quantifications for each data subset were performed by READemption64. Host transcript expression analyses are based on annotations from GENCODE (version 19)67, NONCODE (version 4)68 and miRBase (version 20)69 after removing redundant entries. 
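The two read-level filters described above (alignment accuracy below 90%, and cross-mapped reads) can be sketched as follows. This is a simplified illustration, not the READemption or segemehl logic: the `Alignment` record, the definition of accuracy as matched bases over aligned length, and the use of a single alignment score to decide "equally well" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Alignment:
    read_id: str
    organism: str        # e.g. "host" or "salmonella" (hypothetical labels)
    matches: int         # number of matching bases in the alignment
    aligned_length: int  # total alignment length
    score: int           # alignment score, higher is better (assumed)


def accuracy(aln: Alignment) -> float:
    """Alignment accuracy in percent (assumed definition)."""
    return 100.0 * aln.matches / aln.aligned_length


def filter_alignments(alignments: List[Alignment], min_accuracy: float = 90.0) -> List[Alignment]:
    """Drop low-accuracy alignments and reads that map equally well to several organisms."""
    kept = [a for a in alignments if accuracy(a) >= min_accuracy]

    # Best score per organism for every read.
    best: Dict[str, Dict[str, int]] = {}
    for a in kept:
        per_org = best.setdefault(a.read_id, {})
        per_org[a.organism] = max(per_org.get(a.organism, a.score), a.score)

    result = []
    for a in kept:
        scores = best[a.read_id]
        top = max(scores.values())
        top_orgs = [org for org, s in scores.items() if s == top]
        if len(top_orgs) > 1:
            continue  # cross-mapped: equally good in more than one organism
        if a.organism == top_orgs[0]:
            result.append(a)
    return result
```

In this sketch a read that aligns better to one organism than to the other is assigned to the better one; the text does not say how such reads were handled, so that choice is also an assumption.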
The annotation for Salmonella genes was retrieved from NCBI (under the above mentioned accession numbers) and manually extended with small RNA annotations4, 70. In either organism, multi-mapped reads were removed and only uniquely mapped reads were considered for the expression analysis. Differential gene expression analyses were performed with the edgeR package (version 3.10.2)71 using an upper-quartile normalization and a prior count of 1. Where needed (that is, to correct for batch effects in the comparisons between wild-type and mutant infections; the comparisons displayed in Figs 3 and 4 and Extended Data Figs 5, 7,8,9), sequencing data were further normalized using the RUVs correction method72 with k = 3. For this purpose, we treated the samples time-point-wise to remove unwanted nuisance factors. At each time point our covariate of interest was the pinT status of the infecting bacterium. This is constant within replicate blocks, which are used for the RUVs correction. Host or bacterial genes with at least 10 uniquely mapped reads in three replicates were considered detected. Genes with an adjusted P value < 0.05 were considered differentially expressed. Differential expression analysis for conventional (bacteria only) RNA-seq experiments (cDNA libraries numbers 185–214 in Supplementary Table 1) was done similarly, except that a cut-off of ≥50 uniquely mapped reads was used as a detection threshold. Based on the obtained BAM files, coverage files in wiggle format were generated by READemption64 in a strand-specific manner and split by organism. In each case, coverage files are based on uniquely mapped reads and normalized by the total number of uniquely aligned reads per organism. For Fig. 4e, wiggle files were visualized using the Integrated Genome Browser (version 8.4.4)73. 
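The detection thresholds mentioned above (at least 10 uniquely mapped reads for dual RNA-seq, at least 50 for bacteria-only RNA-seq) amount to a simple count filter. The Python sketch below shows one possible reading, in which a gene counts as detected when the threshold is met in at least three replicate libraries; that interpretation, the function name `detected_genes` and the toy counts are assumptions, not taken from the deposited workflow.

```python
from typing import Dict, List, Set


def detected_genes(
    counts: Dict[str, List[int]],
    min_reads: int = 10,
    min_replicates: int = 3,
) -> Set[str]:
    """Return genes with >= `min_reads` uniquely mapped reads in >= `min_replicates` libraries."""
    return {
        gene
        for gene, replicate_counts in counts.items()
        if sum(c >= min_reads for c in replicate_counts) >= min_replicates
    }


if __name__ == "__main__":
    # Toy counts per replicate; values are invented for illustration only.
    toy = {"geneA": [12, 30, 10], "geneB": [9, 50, 60], "geneC": [0, 0, 0]}
    print(detected_genes(toy))                # dual RNA-seq threshold
    print(detected_genes(toy, min_reads=50))  # bacteria-only threshold
```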
A database of pathways, regulons, and genomic islands was constructed using information obtained from the KEGG database74 (organism code sey), the SL1344 genome annotation70, and relevant literature sources (see Supplementary Table 1). Pearson correlation coefficients between changes in PinT expression and changes in expression of each gene within each regulon over the time-course of wild-type Salmonella infection (cDNA libraries number 27, 30, 33, 36, 39, 42, 44, 47, 50, 53, 56, 59, 61, 64, 67, 70, 73, 76 in Supplementary Table 1) were plotted in Fig. 2d. To assess enrichment of differentially expressed transcripts in pathways in the comparative infection experiments (cDNA libraries numbers 27–77 and 152–184 in Supplementary Table 1) and the in vitro assay (cDNA libraries numbers 185–202 in Supplementary Table 1), gene set enrichment analysis (GSEA; version 2.1.0) was run on the log fold changes reported by edgeR. The GSEA was performed in ranked list mode (with statistic classic) and gene sets containing less than 15 or more than 100 entries were excluded. Extended Data Fig. 5a reports all pathways significant at an FDR-corrected P value of at most 0.05 in at least one time point. Host pathway enrichment studies were performed consistently with bacterial analyses using GSEA on human pathways available in the KEGG database (downloaded January 22, 2014) using the same settings described above. Pathways with an adjusted P value ≤ 0.05 were considered to be significantly modulated. Data visualization for Extended Data Fig. 8a was produced using the Bioconductor package Pathview75. Genes displayed in Fig. 1d, that is, genes whose transcription is known or predicted to be regulated by the binding of nuclear factor κB (NF-κB) to their promoter or genes whose products have been shown to promote an NF-κB response, were retrieved from the GeneCards76 and Boston University Biology (http://www.bu.edu/nf-kb/gene-resources/target-gene) databases or refs 77, 78. 
STAT3 target genes denoted in Fig. 4b were retrieved from ref. 79. We used Cufflinks/Cuffdiff (version 2.2.1)80, 81 to test for differentially expressed isoforms in the high-resolution, comparative dual RNA-seq time-course data set (cDNA libraries 27–77 in Supplementary Table 1). In a first step, we used Cufflinks to quantify transcript isoforms in the mapped read data. Afterwards, all transcript annotations were merged using Cuffmerge and differentially expressed isoforms were called using Cuffdiff. To identify bacterial and human genes with similar expression kinetics across the time-course of the infection of HeLa-S3 cells (cDNA libraries 27–77 in Supplementary Table 1), we used RUVs-corrected, abundance-filtered and normalized read counts (see above). Absolute counts were then transformed into standard z-scores for each gene over all considered samples: for each gene, the z-score was calculated as the read count minus the mean read count over all samples, divided by the standard deviation of its counts over all samples. Genes with a standard deviation <2 were excluded from further analysis. Pearson correlation coefficients were calculated between all remaining bacterial genes and all remaining human genes, and P values were calculated using the function cor.test in R. To account for a possible temporal delay between Salmonella expression changes and effect manifestation in the host cell, a time-shift was allowed: the expression of Salmonella genes at each time point was compared to host expression at the subsequent time point. Human genes were considered to be correlated with a bacterial gene if they had a P value of less than 10⁻⁴ and a Pearson's r greater than 0.65.
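The z-score transformation and the time-shifted correlation described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' R code: the function names are hypothetical, the sample standard deviation (n − 1 denominator, as in R's sd) is assumed, and the P value from cor.test is not reproduced because it requires a t-distribution.

```python
import math
from statistics import mean, stdev
from typing import List


def zscores(counts: List[float]) -> List[float]:
    """(count - mean) / standard deviation for one gene over all samples."""
    m, s = mean(counts), stdev(counts)  # sample SD assumed
    return [(c - m) / s for c in counts]


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def shifted_correlation(bacterial: List[float], host: List[float], shift: int = 1) -> float:
    """Correlate bacterial expression at time t with host expression at time t + shift."""
    if shift > 0:
        return pearson(bacterial[:-shift], host[shift:])
    return pearson(bacterial, host)


if __name__ == "__main__":
    # Invented series ordered by time point: the host follows the bacterium one step later.
    bacterial = zscores([1, 2, 4, 8, 16])
    host = zscores([0, 1, 2, 4, 8])
    print(shifted_correlation(bacterial, host, shift=1))
```

Setting `shift=0` corresponds to the all-against-all bacterial comparison, for which no time-shift was allowed. With one value per time point the shift is a simple slice; replicate libraries would first need to be grouped by time point.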
This resulted in a total of 751 clusters of human genes showing correlation in expression with a bacterial gene, approximately half of which (see Supplementary Table 1) had at least one enriched GO term associated with them (adjusted P value < 0.05) as tested using the software tool Ontologizer 2.0 (build: 20100310-351)82 with the gene ontology definition obtained from the Gene Ontology Consortium (data-version: releases/2015-09-26) and the Universal Protein Resource (UniProt) gene annotation (generated: 2015-09-14). To account for the possibility that multiple bacterial genes might be associated with a human gene cluster, a correlation analysis was performed for all against all bacterial genes as described above, with the only exception that no time-shift was allowed. For this, we focused on seventeen gene clusters that were built on bacterial genes encoding secretion-associated gene products (according to UniProt; see Supplementary Table 1). Detailed inspection of these clusters revealed the one depicted in Fig. 4b (centred on the bacterial SPI-2 gene sseC), which contained many further (bacterial and human) genes with pronounced PinT-dependent expression changes, that is, genes that showed differential expression between wild-type and ΔpinT infection at several time points p.i. In all RNA-seq-based analyses, transcript expression changes that were associated with an adjusted P value < 0.05 (reported by edgeR) were considered significantly differentially expressed. For Fig. 3b, a Monte Carlo permutation test was performed on the median fold change of genes in the SPI-2 regulon, using 10⁵ randomly selected gene sets of the same size. This indicated the significant de-repression (P < 0.05) of the SPI-2 regulon in the absence of PinT at 2 and 8 h after the infection of HeLa cells, at 2, 6 and 16 h after the infection of 3D4/31 cells, and in the in vitro assay. Tests for the evaluation of increased host cell death in Extended Data Fig. 1a were performed using a one-tailed Student's t-test. *P values ≤ 0.05 were considered significant and ***P values ≤ 0.001 were considered very significant. The significance of gene activation in qRT–PCR results in Fig. 4c and Extended Data Figs 5b, c and 7c, d or the ELISA assay in Extended Data Fig. 7e was assessed using a one-tailed Mann–Whitney U-test. The significance of differences in intracellular replication between the ΔpinT strain and wild-type Salmonella (Extended Data Fig. 4d) was evaluated using a two-tailed Mann–Whitney U-test.
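The Monte Carlo permutation test on the median fold change of a regulon (used for Fig. 3b) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the test is taken to be one-sided (random sets with a median at least as high as the observed one), the P value uses the common (hits + 1)/(n + 1) estimator, and the function name and toy data are hypothetical.

```python
import random
from statistics import median
from typing import Dict, Iterable


def permutation_p_value(
    fold_changes: Dict[str, float],
    gene_set: Iterable[str],
    n_sets: int = 100_000,
    seed: int = 0,
) -> float:
    """Empirical P value for the median fold change of `gene_set`.

    Compares the observed median with medians of `n_sets` randomly drawn
    gene sets of the same size (one-sided, testing for de-repression).
    """
    rng = random.Random(seed)
    genes = list(fold_changes)
    members = list(gene_set)
    observed = median(fold_changes[g] for g in members)

    hits = 0
    for _ in range(n_sets):
        sample = rng.sample(genes, len(members))
        if median(fold_changes[g] for g in sample) >= observed:
            hits += 1
    # Add-one estimator so that the P value is never exactly zero.
    return (hits + 1) / (n_sets + 1)


if __name__ == "__main__":
    # Invented log fold changes: five "regulon" genes up, 100 background genes unchanged.
    toy = {f"bg{i}": 0.0 for i in range(100)}
    regulon = [f"reg{i}" for i in range(5)]
    toy.update({g: 3.0 for g in regulon})
    print(permutation_p_value(toy, regulon, n_sets=2000))
```

With 10⁵ random sets, as in the text, the smallest attainable P value under this estimator is about 10⁻⁵.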
