News Article | October 25, 2016
Strains of P. falciparum (Dd2, 3D7, D6, K1, NF54, V1/3, HB3, 7G8, FCB and TM90C2B) were obtained from the Malaria Research and Reference Reagent Resource Center (MR4). PfscDHODH, the transgenic P. falciparum line expressing S. cerevisiae DHODH19, was a gift from A. B. Vaidya. P. falciparum isolates were maintained with O-positive human blood in an atmosphere of 93% N₂, 4% CO₂, 3% O₂ at 37 °C in complete culturing medium (10.4 g l−1 RPMI 1640, 5.94 g l−1 HEPES, 5 g l−1 albumax II, 50 mg l−1 hypoxanthine, 2.1 g l−1 sodium bicarbonate, 10% human serum and 43 mg l−1 gentamicin). Parasites were cultured in medium until parasitaemia reached 3–8%. Parasitaemia was determined by checking at least 500 red blood cells from a Giemsa-stained blood smear. For the compound screening, a parasite dilution at 2.0% parasitaemia and 2.0% haematocrit was created with medium. 25 μl of medium was dispensed into 384-well black, clear-bottom plates and 100 nl of each compound in DMSO was transferred into assay plates along with the control compound (mefloquine). Next, 25 μl of the parasite suspension in medium was dispensed into the assay plates giving a final parasitaemia of 1% and a final haematocrit of 1%. The assay plates were incubated for 72 h at 37 °C. 10 μl of detection reagent consisting of 10× SYBR Green I (Invitrogen; supplied in 10,000× concentration) in lysis buffer (20 mM Tris-HCl, 5 mM EDTA, 0.16% (w/v) Saponin, 1.6% (v/v) Triton X-100) was dispensed into the assay plates. For optimal staining, the assay plates were left at room temperature for 24 h in the dark. The assay plates were read with 505 dichroic mirrors with 485 nm excitation and 530 nm emission settings on an EnVision (PerkinElmer). High-throughput screening hits were hierarchically clustered by structural similarity using average linkage on pairwise Jaccard distances43 between ECFP4 fingerprints44. 
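The clustering step described above can be illustrated with a minimal sketch. It assumes that each fingerprint is represented as a set of on-bit indices, and it stops merging at a distance threshold rather than building a full dendrogram; the toy fingerprints, the threshold and the function names are hypothetical and are not taken from the screening data or from the software used in the study.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two fingerprints given as sets of on-bit indices."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_linkage(fingerprints, threshold):
    """Naive agglomerative clustering with average linkage.

    Repeatedly merges the two clusters with the smallest mean pairwise
    Jaccard distance until that distance exceeds `threshold`.
    """
    clusters = [[i] for i in range(len(fingerprints))]

    def linkage(c1, c2):
        total = sum(
            jaccard_distance(fingerprints[i], fingerprints[j])
            for i in c1
            for j in c2
        )
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the closest pair of clusters (x < y is guaranteed by combinations).
        (x, y), d = min(
            (
                ((p, q), linkage(clusters[p], clusters[q]))
                for p, q in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if d > threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Hypothetical toy fingerprints: two pairs of structurally similar compounds.
toy_fingerprints = [
    {1, 2, 3, 4},
    {1, 2, 3, 5},
    {10, 11, 12},
    {10, 11, 13},
]
clusters = average_linkage(toy_fingerprints, threshold=0.7)
```

With these toy inputs the two similar pairs end up in separate clusters. A production analysis would normally rely on an established hierarchical clustering routine rather than this quadratic loop.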
Pipeline Pilot45 was used for fingerprint and distance calculation; clustering and heat-map generation were done in R (ref. 46). HepG2 cells (ATCC) were maintained in DMEM, 10% (v/v) FBS (Sigma), and 1% (v/v) antibiotic–antimycotic in a standard tissue culture incubator (37 °C, 5% CO₂). P. berghei (ANKA GFP–luc) infected A. stephensi mosquitoes were obtained from the New York University Langone Medical Center Insectary. For assays, ∼17,500 HepG2 cells per well were added to a 384-well microtitre plate in duplicate. After 18–24 h at 37 °C the medium was exchanged and compounds were added. After 1 h, parasites obtained from freshly dissected mosquitoes were added to the plates (4,000 parasites per well), the plates were spun for 10 min at 1,000 r.p.m. and then incubated at 37 °C. The final assay volume was 30 μl. After a 48-h incubation at 37 °C, Bright-Glo (Promega) was added to the parasite plate to measure relative luminescence. The relative signal intensity of each plate was evaluated with an EnVision (PerkinElmer) system. Micropatterned co-culture (MPCC) is an in vitro co-culture system of primary human hepatocytes organized into colonies and surrounded by supportive stromal cells. Hepatocytes in this format maintain a functional phenotype for up to 4–6 weeks without proliferation, as assessed by major liver-specific functions and gene expression47, 48, 49. In brief, 96-well plates were coated homogeneously with rat-tail type I collagen (50 μg ml−1) and subjected to soft-lithographic techniques to pattern the collagen into 500-μm-island microdomains that mediate selective hepatocyte adhesion. To create MPCCs, cryopreserved primary human hepatocytes (BioreclamationIVT) were pelleted by centrifugation at 100g for 6 min at 4 °C, assessed for viability using Trypan blue exclusion (typically 70–90%), and seeded on micropatterned collagen plates (each well contained ~10,000 hepatocytes organized into colonies of 500 μm) in serum-free DMEM with 1% penicillin–streptomycin. 
The cells were washed with serum-free DMEM with 1% penicillin–streptomycin 2–3 h later and replaced with human hepatocyte culture medium48. 3T3-J2 mouse embryonic fibroblasts were seeded (7,000 cells per well) 24 h after hepatocyte seeding. 3T3-J2 fibroblasts were courtesy of H. Green50. MPCCs were infected with 75,000 sporozoites (NF54) (Johns Hopkins University) 1 day after hepatocytes were seeded48, 49. After incubation at 37 °C and 5% CO for 3 h, wells were washed once with PBS, and the respective compounds were added. Cultures were dosed daily. Samples were fixed on day 3.5 after infection. For immunofluorescence staining, MPCCs were fixed with −20 °C methanol for 10 min at 4 °C, washed twice with PBS, blocked with 2% BSA in PBS, and incubated with mouse anti-P. falciparum Hsp70 antibodies (clone 4C9, 2 μg ml−1) for 1 h at room temperature. Samples were washed with PBS then incubated with Alexa 488-conjugated secondary goat anti-mouse for 1 h at room temperature. Samples were washed with PBS, counterstained with the DNA dye Hoechst 33258 (Invitrogen; 1:1,000), and mounted on glass slides with fluoromount G (Southern Biotech). Images were captured on a Nikon Eclipse Ti fluorescence microscope. Diameters of developing liver stage parasites were measured and used to calculate the corresponding area. All rhesus macaques (Macaca mulatta) used in this study were bred in captivity for research purposes, and were housed at the Biomedical Primate Research Centre (BPRC; AAALAC-certified institute) facilities under compliance with the Dutch law on animal experiments, European directive 86/609/EEC and with the ‘Standard for Humane Care and Use of Laboratory Animals by Foreign Institutions’ identification number A5539-01, provided by the Department of Health and Human Services of the US National Institutes of Health. The local independent ethical committee first approved all protocols. 
Non-randomized rhesus macaques (male or female; 5–14 years old; one animal per month) were infected with 1 × 10⁶ P. cynomolgi (M strain) blood-stage parasites, and bled at peak parasitaemia. Approximately 300 female A. stephensi mosquitoes (Sind-Kasur strain, Nijmegen University Medical Centre St Radboud) were fed with this blood as described previously51. Rhesus monkey hepatocytes were isolated from liver lobes as described previously52. Sporozoite infections were performed within 3 days of hepatocyte isolation. Sporozoite inoculation of primary rhesus monkey hepatocytes was performed as described previously53, 54. On day 6, intracellular P. cynomolgi malaria parasites were fixed, stained with purified rabbit antiserum reactive against P. cynomolgi Hsp70.1 (ref. 53), and visualized with FITC-labelled goat anti-rabbit IgG antibodies. Small ‘hypnozoite’ exoerythrocytic forms (1 nucleus, a small round shape, a maximal diameter of 7 μm) and large ‘developing parasite’ exoerythrocytic forms (more than 1 nucleus, larger than 7 μm and round or irregular shape) were quantified for each well using a high-content imaging system (Operetta, PerkinElmer). P. falciparum 3D7 stage IV–V gametocytes were isolated by discontinuous Percoll gradient centrifugation of parasite cultures treated with 50 mM N-acetyl-d-glucosamine for 3 days to kill asexual parasites. Gametocytes (1.0 × 10⁵) were seeded in 96-well plates and incubated with compounds for 72 h. In vitro anti-gametocyte activity was measured using CellTiter-Glo (Promega). A detailed description of the method is published elsewhere55. In brief, NF54pfs16-LUC-GFP highly synchronous gametocytes were induced from a single intra-erythrocytic asexual replication cycle. On day 0 of gametocyte development, spontaneously generated gametocytes were removed by VarioMACS magnetic column (MAC) technology. 
Early stage I gametocytes were collected on day 2 of development and late-stage gametocytes (stage IV) on day 8 using MAC columns. Percentage parasitaemia and haematocrit were adjusted to 10 and 0.1, respectively. 45 μl of parasite sample was added to PerkinElmer Cell carrier poly-d-lysine imaging plates containing 5 μl of test compound at 16 doses, including control wells containing 4% DMSO and 50 μM puromycin (0.4% and 5 μM final concentrations, respectively). The plates were sealed with a membrane (Breatheasy or 4ti-05 15/ST) and incubated for 72 h in standard incubation conditions of 5% CO₂, 5% O₂, 90% N₂ and 60% humidity at 37 °C. After incubation, 5 μl of 0.07 μg ml−1 MitoTracker Red CM-H2XRos (MTR) (Invitrogen) in PBS was added to each well, and plates were resealed with membranes and incubated overnight under standard conditions. The following day, the plates were brought to room temperature for at least one hour before being measured on the Opera QEHS Instrument. Image analysis was performed using an Acapella (PerkinElmer)-based algorithm that identifies gametocytes of the expected morphological shape with respect to degree of elongation and specifically those parasites that are determined as viable by the MitoTracker Red CM-H2XRos fluorescence size and intensity. IC50 values were determined using GraphPad Prism 4, using a 4-parameter log dose, nonlinear regression analysis, with sigmoidal dose–response (variable slope) curve fit. P. falciparum transmission-blocking activity of BRD7929 was assessed in a standard membrane feeding assay as previously described56. In brief, P. falciparum NF54 hsp70-GFP-luc reporter parasites were cultured up to stage V gametocytes (day 14). Test compounds were serially diluted in DMSO and subsequently in RPMI medium to reach a final DMSO concentration of 0.1%. Diluted compound was either pre-incubated with stage V gametocytes for 24 h (indirect mode) or directly added to the blood meal (direct mode). 
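Two calculations from the gametocyte imaging assay above lend themselves to a short sketch: the dilution of the control wells (5 μl of stock in a 50 μl final volume) and the shape of a four-parameter sigmoidal dose–response curve. The parametrization below is one common form and is an assumption; it is not necessarily the sign convention used by the fitting software, and the function names are hypothetical. Fitting the four parameters to data would additionally require a nonlinear optimizer, which is omitted here.

```python
def final_concentration(stock, added_volume, total_volume):
    """Concentration after diluting `added_volume` of stock into `total_volume`.

    Units of the result follow the units of `stock`; the two volumes must
    share a unit.
    """
    return stock * added_volume / total_volume


def four_parameter_logistic(log_conc, bottom, top, log_ic50, hill_slope):
    """Response at a log10 concentration for a decreasing sigmoidal curve.

    Assumed form: bottom + (top - bottom) / (1 + 10**((x - logIC50) * slope)),
    with a positive slope giving a response that falls from `top` to `bottom`.
    """
    return bottom + (top - bottom) / (
        1.0 + 10 ** ((log_conc - log_ic50) * hill_slope)
    )


# Control wells: 5 ul of stock added to 45 ul of parasite sample (50 ul total).
puromycin_final_uM = final_concentration(50.0, 5.0, 50.0)
dmso_final_percent = final_concentration(4.0, 5.0, 50.0)

# At the IC50 the modelled response sits halfway between top and bottom.
midpoint = four_parameter_logistic(-6.0, bottom=0.0, top=100.0,
                                   log_ic50=-6.0, hill_slope=1.0)
```

The dilution helper reproduces the final control concentrations stated in the text (5 μM puromycin and 0.4% DMSO).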
Gametocytes were adjusted to 50% haematocrit, 50% human serum and fed to A. stephensi mosquitoes. All compound dilutions were tested in duplicate in independent feeders. After 8 days, mosquitoes were collected and the relative decrease in oocysts density in the midgut was determined by measurement of luminescence signals in 24 individual mosquitoes from each cage. For each vehicle (control) cage, an additional 10 mosquitoes were dissected and examined by microscopy to determine the baseline oocyst intensity. In vitro resistance selections were performed as previously described15. In brief, approximately 1 × 109 P. falciparum Dd2 parasites were treated with 60 nM (EC ) or 150 nM (10 × EC ) of BRD1095 in each of four independent flasks for 3–4 days. After the compounds were removed, the cultures were maintained in compound-free complete RPMI growth medium with regular media exchange until healthy parasites reappeared. Once parasitaemia reached 2–4%, compound pressure was repeated and these steps were executed for about 2 months until the initial EC shift was observed. Three out of four independent selections pressured at 60 nM developed a phenotypic EC shift. None of the selections pressured at 150 nM resulted in resistant parasites. After an initial shift in the dose–response phenotype was observed, selection at an increased concentration was repeated in the same manner until at least a threefold shift in EC was observed. Selected parasites were then cloned by limiting dilution. BRD73842-resistant selections were conducted in a similar manner except that parasites were initially treated at 0.5 μM (10× EC ) for 4 days or 150 nM (EC ) for 2 days in each of two independent flasks. The Y1356N mutant was derived from a flask pressured at 0.5 μM and the L1418F mutant was developed from one of the flasks exposed to the 150 nM. DNA libraries were prepared for sequencing using the Illumina Nextera XT kit (Illumina), and quality-checked before sequencing on a Tapestation. 
Libraries were clustered and run as 100-bp paired-end reads on an Illumina HiSeq 2000 in RapidRun mode, according to the manufacturer’s instructions. Samples were analysed by aligning to the P. falciparum 3D7 reference genome (PlasmoDB v. 11.1). To identify SNVs and CNVs, a sequencing pipeline developed for P. falciparum (Plasmodium Type Uncovering Software, Platypus) was used as previously described, with the exception of an increase in the base quality filter from 196.5 to 1,000 (ref. 57). Substrate-dependent inhibition of recombinant P. falciparum DHODH protein was assessed in an in vitro assay in 384-well clear plates (Corning 3640) as described previously58. A 20-point dilution series of inhibitor concentrations were assayed against 2–10 nM protein with 500 μM l-dihydroorotate substrate (excess), 18 μM dodecylubiquinone electron acceptor (~K ), and 100 μM 2,6-dichloroindophenol indicator dye in assay buffer (100 mM HEPES pH 8.0, 150 mM NaCl, 5% glycerol, 0.5% Triton X-100). Assays were incubated at 25 °C for 20 min and then assessed via OD . Data were normalized to 1% DMSO and excess inhibitor (25 μM DSM265; ref. 7). Substrate-dependent inhibition of recombinant human DHODH protein was assessed in an in vitro assay in 384-well clear plates (Corning 3640) as described previously59. A 20-point dilution series of inhibitor concentrations was assayed against 13 nM protein with 1 mM l-dihydroorotate substrate (excess), 100 μM dodecylubiquinone electron acceptor, and 60 μM 2,6-dichloroindophenol indicator dye in assay buffer (50 mM Tris HCl pH 8.0, 150 mM KCl, 0.1% Triton X-100). Assays were incubated at 25 °C for 20 min and then assessed via OD . Data were normalized to 1% DMSO and no enzyme. The synthetic gene for full-length P. vivax PI4K (PVX_098050) was synthesized from GeneArt (ThermoScientific), and was expressed and purified as previously described20. Aliquots of P. vivax PI4Kβ were flash-frozen in liquid nitrogen and stored at −80°C. 
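The normalization used in the two DHODH assays above (1% DMSO as the uninhibited control; excess inhibitor or no enzyme as the fully inhibited control) can be sketched as a simple rescaling of the plate-reader signal. The function name and the example optical densities are hypothetical, and the sketch assumes that a single mean value is available for each control.

```python
def percent_activity(signal, uninhibited, fully_inhibited):
    """Rescale a raw signal to percent of uninhibited enzyme activity.

    `uninhibited` is the control signal with vehicle only (100% activity);
    `fully_inhibited` is the control signal with excess inhibitor or no
    enzyme (0% activity). The formula holds whether the raw signal rises or
    falls with enzyme activity.
    """
    span = uninhibited - fully_inhibited
    if span == 0:
        raise ValueError("controls are identical; cannot normalize")
    return 100.0 * (signal - fully_inhibited) / span


# Hypothetical optical densities: the indicator dye loses absorbance as it is
# reduced, so the uninhibited control reads lower than the inhibited one.
activity = percent_activity(signal=0.5, uninhibited=0.2, fully_inhibited=0.8)
```

A well whose signal lies halfway between the two controls is reported as 50% activity.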
Full-length human PI4KB (uniprot gene Q9UBF8-2) was expressed and purified as previously described60. 100 nM extruded lipid vesicles were made to mimic Golgi organelle vesicles (20% phosphatidylinositol, 10% phosphatidylserine, 45% phosphatidylcholine and 25% phosphatidylethanolamine) in lipid buffer (20 mM HEPES pH 7.5 (room temperature), 100 mM KCl, 0.5 mM EDTA). Lipid kinase assays were carried out using the Transcreener ADP2 FI Assay (BellBrook Labs) following the published protocol as previously described61. 4-μl reactions ran at 21 °C for 30 min in a buffer containing 30 mM HEPES pH 7.5, 100 mM NaCl, 50 mM KCl, 5 mM MgCl , 0.25 mM EDTA, 0.4% (v/v) Triton X-100, 1 mM TCEP, 0.5 mg ml−1 Golgi-mimic vesicles and 10 μM ATP. P. vivax PI4Kβ was used at 7.5 nM and human PI4KB was used at 200 nM. Fluorescence intensity was measured using a Spectramax M5 plate reader with excitation at 590 nm and emission at 620 nm (20-nm bandwidth). IC values were calculated from triplicate inhibitor curves using GraphPad Prism software. The model was built using the SWISS-MODEL online resource62, 63, 64 and Prime65 (Schrödinger Release 2015-2: Prime, version 4.0, Schrödinger), with human PheRS (PDB accession 3L4G) as a template for P. falciparum PheRS (PlasmoDB Gene ID: PF3D7_0109800). The template was chosen based on highest sequence identity and similarity identified via PSI-BLAST. Target-template alignment was made using ProMod-II and validated with Prime STA. Coordinates from residues that were conserved between the target and the template were copied from the template to the model, and remaining sites were remodelled using segments from known structures. The side chains were then rebuilt, and the model was finally refined using a force field. Protein sequences of both α- (PF3D7_0109800) and β- (PF3D7_1104000) subunits of cytoplasmic P. falciparum PheRS were obtained from PlasmoDB ( http://plasmodb.org/plasmo/). 
Full length α- and β-subunit gene sequences optimized for expression in E. coli were cloned into pETM11 (kanamycin resistance) and pETM20 (ampicillin resistance) expression vectors using Nco1 and Kpn1 sites and co-transformed into E. coli B834 cells. Protein expression was induced by addition of 0.5 mM isopropyl β-d-1-thiogalactopyranoside (IPTG) and cells were grown until an OD of 0.6–0.8 was reached at 37 °C. They were then allowed to grow at 18 °C for 20 h after induction. Cells were separated by centrifugation at 5,000g for 20 min and the bacterial pellets were suspended in a buffer consisting of 50 mM Tris–HCl (pH 7.5), 200 mM NaCl, 4 mM β-mercaptoethanol, 15% (v/v) glycerol, 0.1 mg ml−1 lysozyme and 1 mM phenylmethylsulfonyl fluoride (PMSF). Cells were lysed by sonication and cleared by centrifugation at 20,000g for 1 h. The supernatant was applied to a prepacked Ni-NTA column (GE Healthcare), and bound proteins were eluted by gradient-mixing with elution buffer (50 mM Tris–HCl (pH 7.5), 80 mM NaCl, 4 mM β-mercaptoethanol, 15% (v/v) glycerol, 1 M imidazole). Pure fractions were pooled and loaded on to a heparin column for further purification. Again, bound proteins were eluted using a gradient of heparin elution buffer (50 mM Tris–HCl (pH 7.5), 1 M NaCl, 4 mM β-mercaptoethanol, 15% (v/v) glycerol). Pure fractions were again pooled and dialysed overnight into a buffer containing 50 mM Tris–HCl (pH 7.5), 200 mM NaCl, 4 mM β-mercaptoethanol, 1 mM DTT and 0.5 mM EDTA. TEV protease (1:50 ratio of protease:protein) was added to the protein sample and incubated at 20 °C for 24 h to remove the polyhistidine tag. Protein was further purified via gel-filtration chromatography on a GE HiLoad 60/600 Superdex column in 50 mM Tris–HCl (pH 7.5), 200 mM NaCl, 4 mM β-mercaptoethanol, 1 mM MgCl₂. The eluted protein (a heterodimer of P. falciparum cPheRS) was collected, assessed for purity via SDS–PAGE and stored at −80 °C. Nuclear encoded tRNAPhe from P. 
falciparum was synthesized using an in vitro transcription method as described earlier22, 66. Aminoacylation and enzyme inhibition assays for P. falciparum cytosolic PheRS were performed as described earlier22, 67. Enzymatic assays were performed in buffer containing 30 mM HEPES (pH 7.5), 150 mM NaCl, 30 mM KCl, 50 mM MgCl , 1 mM DTT, 100 μM ATP, 100 μM l-phenylalanine, 15 μM P. falciparum tRNAPhe, 2 U ml−1 E. coli inorganic pyrophosphatase (NEB) and 500 nM recombinant P. falciparum PheRS at 3 °C. Reactions at different time points were stopped by the addition of 40 mM EDTA and subsequent transfer to ice. Recombinant maltose binding protein was used as negative control. The cPheRS inhibition assays were performed using inhibitor concentrations of 0.01 nM, 0.1 nM, 1 nM, 10 nM, 100 nM, 1 μM, 5 μM and 10 μM for strong binders and 1 nM, 10 nM, 100 nM, 1 μM, 10 μM, 100 μM and 500 μM for weaker binders in the assay buffer. Enzymatic and inhibition experiments were performed twice in triplicate. Mammalian cells (HepG2, A549, and HEK293) were obtained from the ATCC and cultured routinely in DMEM with 10% FBS and 1% (v/v) antibiotic–antimycotic. For cytotoxicity assays, 1 × 106 cells were seeded into 384-well plates 1 day before compound treatment. Cells were treated with ascending doses of compound for 72 h, and viability was measured using Cell-Titer Glo (Promega). All cell lines were tested for Mycoplasma contamination using Universal mycoplasma Detection Kit (ATCC). In vitro characterization assays (protein binding, microsomal stability, hepatocyte stability, cytochrome P450 (CYP) inhibition, and aqueous solubility) were performed according to industry-standard techniques. Ion channel inhibition studies were performed using the Q-Patch system using standard techniques. 
All animal experiments were conducted in compliance with institutional policies and appropriate regulations and were approved by the institutional animal care and use committees for each of the study sites (the Broad Institute, 0016-09-14; Harvard School of Public Health, 03228; Eisai, 13-05, 13-07, 14-C-0027). No method of randomization or blinding was used in this study. CD-1 mice (n = 4 per experimental group; female; 6–7-week-old; 20–24 g, Charles River) were intravenously inoculated with approximately 1 × 10⁵ P. berghei (ANKA GFP-luc) blood-stage parasites 24 h before treatment and compounds were administered orally (at 0 h). Parasitaemia was monitored by the in vivo imaging system (IVIS SpectrumCT, PerkinElmer) to acquire the bioluminescence signal (150 mg kg−1 of luciferin was intraperitoneally injected approximately 10 min before imaging). In addition, blood smear samples were obtained from each mouse periodically, stained with Giemsa, and viewed under a microscope for visual detection of blood parasitaemia. Animals with parasitaemia exceeding 25% were humanely euthanized. CD-1 mice (n = 4 per experimental group; female; 6–7-week-old; 20–24 g, Charles River) were inoculated intravenously with approximately 1 × 10⁵ P. berghei (ANKA GFP-luc) sporozoites freshly dissected from A. stephensi mosquitoes. Immediately after infection, the mice were treated with single oral doses of BRD7929; infection was monitored as described for the P. berghei erythrocytic-stage assay. For time-course experiments, the time of compound treatment (single oral dose of 10 mg kg−1) was varied from 5 days before infection to 2 days after infection. CD-1 (n = 3 per experimental group; female; 6–7-week-old; 21–24 g, Charles River) mice were infected with P. berghei (ANKA GFP-luc) for 96 h before treatment with vehicle or BRD7929 (day 0). On day 2, female A. stephensi mosquitoes were allowed to feed on the mice for 20 min. 
After 1 week (day 9), the midguts of the mosquitoes were dissected out and oocysts were enumerated microscopically (12.5× magnification). In vivo adapted P. falciparum (3D7HLH/BRD) were selected as described previously68. In brief, NSG mice (n = 2 per experimental group; female; 4–5-week-old; 19–21 g; The Jackson Laboratory) were intraperitoneally injected with 1 ml of human erythrocytes (O-positive, 50% haematocrit, 50% RPMI 1640 with 5% albumax) daily to generate mice with humanized circulating erythrocytes (huRBC NSG). Approximately 2 × 10⁷ blood-stage P. falciparum 3D7HLH/BRD (ref. 69) parasites were injected intravenously into huRBC NSG mice and >1% parasitaemia was achieved 5 weeks after infection. After three in vivo passages, the parasites were frozen and used experimentally. Approximately 48 h after infection with 1 × 10⁷ blood-stage P. falciparum 3D7HLH/BRD parasites, the mean parasitaemia was approximately 0.4%. huRBC NSG mice were orally treated with a single dose of compound and parasitaemia was monitored for 30 days by IVIS to acquire the bioluminescence signal (150 mg kg−1 of luciferin was intraperitoneally injected approximately 10 min before imaging). huRBC NSG mice (n = 2 per experimental group; female; 4–5-week-old; 18–20 g; Jackson Laboratory) were infected with blood-stage P. falciparum 3D7HLH/BRD for 2 weeks to allow the development of mature gametocytes. Subsequently, the mice were treated with a single oral dose of BRD7929. Blood samples were collected for 11 days. For molecular detection of parasite stages, 40 μl of blood was obtained from control and treated mice. In brief, total RNA was isolated from blood samples using RNeasy Plus Kit with genomic DNA eliminator columns (Qiagen). First-strand cDNA synthesis was performed from extracted RNA using SuperScript III First-Strand Synthesis System (Life Technologies). Parasite stages were quantified using a stage-specific qRT–PCR assay as described previously33, 69. 
Primers were designed to measure transcript levels of PF3D7_0501300 (ring stage parasites), PF3D7_1477700 (immature gametocytes) and PF3D7_1031000 (mature gametocytes). Primers for PF3D7_1120200 (P. falciparum UCE) transcript were used as a constitutively expressed parasite marker. The assay was performed using cDNA in a total reaction volume of 20 μl, containing primers for each gene at a final concentration of 250 nM. Amplification was performed on a Viia7 qRT–PCR machine (Life Technologies) using SYBR Green Master Mix (Applied Biosystems) with the following reaction conditions: 1 cycle × 10 min at 95 °C and 40 cycles × 1 s at 95 °C and 20 s at 60 °C. Each cDNA sample was run in triplicate and the mean C value was used for the analysis. C values obtained above the cut-off (negative control) for each marker were considered negative for the presence of specific transcripts. Blood samples from each mouse before parasite inoculation were also tested for ‘background noise’ using the same primer sets. No amplification was detected from any samples. FRG knockout on C57BL/6 (human repopulated, >70%) mice (huHep FRG knockout; n = 2 per experimental group; female; 5.5–6-month-old; 19–21 g; Yecuris) were inoculated intravenously with approximately 1 × 105 P. falciparum (NF54HT-GFP-luc) sporozoites and BRD7929 was administered as a single 10 mg kg−1 oral dose one day after inoculation31. Infection was monitored daily by IVIS. Daily engraftment of human erythrocytes (0.4 ml, O-positive, 50% haematocrit, 50% RPMI 1640 with 5% albumax) was initiated 5 days after inoculation. For qPCR analysis, blood samples (40 μl) were collected 7 days after inoculation. For molecular detection of the blood-stage parasite, 40 μl of blood was obtained from control and treated mice. In brief, total RNA was isolated from blood samples using RNeasy Plus Kit with genomic DNA eliminator columns (Qiagen). 
First-strand cDNA synthesis was performed from extracted RNA using SuperScript III First-Strand Synthesis System (Life Technologies). The presence of the blood-stage parasites was quantified using a highly stage-specific qRT–PCR assay as described previously33, 70. Primers were designed to measure transcript levels of PF3D7_1120200 (P. falciparum UCE). The assay was performed using cDNA in a 20 μl total reaction volume containing primers for each gene at a final concentration of 250 nM. Amplification was performed on a Viia7 qRT–PCR machine (Life Technologies) using SYBR Green Master Mix (Applied Biosystems) and the reaction conditions are as follows: 1 cycle × 10 min at 95 °C and 40 cycles × 1 s at 95 °C and 20 s at 60 °C. Each cDNA sample was run in triplicate and the mean C value was used for the analysis. C values obtained above the cut-off (negative control) for each marker were considered negative for presence of specific transcripts. Blood samples from each mouse were also tested for background noise using the same primer sets before parasite inoculation. No amplification was detected from any samples. In vitro cultures of P. falciparum Dd2, with the initial inocula ranging from 105 to 109 parasites, were maintained in complete medium supplemented with 20 nM of BRD7929 (EC against Dd2). Media was replaced with fresh compound added daily and cultures monitored for 60 days to identify propensity for recrudescent parasitaemia as described34. Atovaquone was used as a control (EC = 2 nM). Solubility was determined in PBS pH 7.4 with 1% DMSO. Each compound was prepared in triplicate at 100 μM in both 100% DMSO and PBS with 1% DMSO. Compounds were allowed to equilibrate at room temperature with a 750 r.p.m. vortex shake for 18 h. After equilibration, samples were analysed by UPLC–MS (Waters) with compounds detected by single-ion reaction detection on a single quadrupole mass spectrometer. 
The DMSO samples were used to create a two-point calibration curve to which the response in PBS was fit. Plasma protein binding was determined by equilibrium dialysis using the Rapid Equilibrium Dialysis (RED) device (Pierce Biotechnology) for both human and mouse plasma. Each compound was prepared in duplicate at 5 μM in plasma (0.95% acetonitrile, 0.05% DMSO) and added to one side of the membrane (200 μl) with PBS pH 7.4 added to the other side (350 μl). Compounds were incubated at 37 °C for 5 h with 350 r.p.m. orbital shaking. After incubation, samples were analysed by UPLC–MS (Waters) with compounds detected by SIR detection on a single quadrupole mass spectrometer. The potency required to inhibit the hERG channel in expressing cell lines was evaluated using an automated patch-clamp system (QPatch-HTX). Pharmacokinetic studies of BRD3444 and BRD1095 were performed by Shanghai ChemPartner Co. Ltd., following single intravenous and oral administrations to female CD-1 mice. BRD3444 and BRD1095 were formulated in 70% PEG400 and 30% aqueous glucose (5% in H₂O) for intravenous and oral dosing. Test compounds were dosed as a bolus solution intravenously at 0.6 mg kg−1 (dosing solution; 70% PEG400 and 30% aqueous glucose, 5% in H₂O) or dosed orally by gavage as a solution at 1 mg kg−1 (dosing solution; 70% PEG400 and 30% aqueous glucose, 5% in H₂O) to female CD-1 mice (n = 9 per dose route). Pharmacokinetic parameters of BRD7929 and BRD3316 were determined in CD-1 mice. BRD7929 and BRD3316 were formulated in 10% ethanol, 4% Tween, 86% saline for both intravenous and oral dosing. Pharmacokinetic parameters were estimated by a non-compartmental model using WinNonlin 6.2. Pharmacokinetic parameters for BRD7929 and BRD3316 were estimated by a non-compartmental model using proprietary Eisai software. Pharmacokinetic parameters of BRD7539 and BRD9185 were determined in CD-1 mice. 
Compounds were formulated in 70% PEG300 and 30% (5% glucose in H₂O) at 0.5 mg ml−1 for oral dosing, and 5% DMSO, 10% cremophor, and 85% H₂O at 0.25 mg ml−1 for intravenous formulation. Pharmacokinetic parameters were estimated by a non-compartmental model using WinNonlin 6.2. Pharmacokinetic studies of BRD7539 and BRD9185 were performed by WuXi AppTec. The protocol was approved by Eisai IACUC, 13-07, 13-05, and 14-C-0027. Compounds were evaluated in vitro to determine their metabolic stability in incubations containing liver microsomes or hepatocytes of mouse and human. In the presence of NADPH, liver microsomes (0.2 mg ml−1) from mouse (CD-1) and human were incubated with compounds (0.5 and 5 μM) for 0, 10 and 90 min. The depletion of compounds in the incubation mixtures, determined using liquid chromatography tandem mass spectrometry (LC–MS/MS), was used to estimate K and V values and determine half-lives for both mouse and human microsomes. Compounds were evaluated in vitro for the potential inhibition of human cytochrome P450 (CYP) isoforms using human liver microsomes. Two concentrations (1 and 10 μM) of compound were incubated with pooled liver microsomes (0.2 mg ml−1) and a cocktail mixture of probe substrates for selective CYP isoforms. The selective activities tested were CYP1A2-mediated phenacetin O-demethylation, CYP2C8-mediated rosiglitazone para-hydroxylation, CYP2C9-mediated tolbutamide 4′-hydroxylation, CYP2C19-mediated (S)-mephenytoin 4′-hydroxylation, CYP2D6-mediated (±)-bufuralol 1′-hydroxylation and CYP3A4/5-mediated midazolam 1′-hydroxylation. The positive controls tested were α-naphthoflavone for CYP1A2, montelukast for CYP2C8, sulfaphenazole for CYP2C9, tranylcypromine for CYP2C19, quinidine for CYP2D6, and ketoconazole for CYP3A4/5. The samples were analysed by LC–MS/MS. IC50 values were estimated using nonlinear regression. 
The time-dependent inactivation potential of compounds was assessed in human liver microsomes for CYP2C9, CYP2D6, and CYP3A4/5 by determining K and k values when appropriate. Two concentrations (6 and 30 μM) of compound were incubated in primary reaction mixtures containing phosphate buffer and 0.2 mg ml−1 human liver microsomes for 0, 5, and 30 min in a 37 °C water bath. The reactions were initiated by the addition of NADPH. For controls, phosphate buffer was substituted for the NADPH solution. At the respective times, 25 μl of primary incubation was diluted tenfold into pre-incubated secondary incubation mixture containing each CYP-selective probe substrate in order to assess residual activity. The second incubation time was 10 min. The probe substrates used for CYP1A, CYP2C9, CYP2C19, CYP2D6, and CYP3A4 were phenacetin (50 μM), tolbutamide (500 μM), (S)-mephenytoin (20 μM), bufuralol (50 μM), and midazolam (30 μM), respectively. The CYP time-dependent inhibitors used were furafylline, tienilic acid, ticlopidine, paroxetine and troleandomycin for CYP2C8, CYP2C9, CYP2C19, CYP2D6 and CYP3A, respectively, at two concentrations. The samples were analysed by LC–MS/MS.
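The text does not spell out how the residual activities from the time-dependent inactivation experiment were converted into kinetic parameters. A common first step, shown here purely as an assumption, is to estimate an observed inactivation rate as the negative slope of ln(residual activity) against pre-incubation time. The function name and the synthetic activities below are hypothetical; only the 0, 5 and 30 min time points come from the protocol.

```python
import math


def observed_inactivation_rate(times_min, residual_activity):
    """Least-squares slope of ln(activity) versus time, as a positive rate.

    Returns the rate in units of 1/min. Activities must be positive and may
    be expressed in any consistent unit (for example, percent of control).
    """
    logs = [math.log(a) for a in residual_activity]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    return -sxy / sxx


# Pre-incubation times from the protocol; activities are synthetic and
# generated from a first-order decay with a known rate for illustration.
times = [0.0, 5.0, 30.0]
true_rate = 0.05
activity = [100.0 * math.exp(-true_rate * t) for t in times]
k_obs = observed_inactivation_rate(times, activity)
```

Because the synthetic data follow an exact first-order decay, the regression recovers the rate used to generate them; real data would scatter around the fitted line.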
News Article | November 10, 2016
No statistical methods were used to predetermine sample size. The experiments were not randomized. The investigators were not blinded to allocation during experiments and outcome assessment. Patients at the Massachusetts General Hospital consented preoperatively to take part in the study in all cases, following Institutional Review Board Protocol 1999P008145. Fresh tumours were collected at the time of resection and the presence of malignant cells was confirmed in frozen sections on adjacent, representative pieces of tissue. Fresh tumour tissue was minced with a scalpel and enzymatically dissociated using a gentle papain-based brain tumour dissociation kit (Miltenyi Biotec). Large pieces of debris were removed with a 100 μm strainer, and dissociated cells were layered carefully onto a 5 ml density gradient (Lympholyte-H, Cedar Lane labs), which was centrifuged at 2,000 r.p.m. for 10 min at room temperature to pellet dead cells and red blood cells. The interface containing live cells was saved and used for staining and flow cytometry. Viability was measured using trypan blue exclusion, which confirmed >90% cell viability. For primary tumour sorting, tumour cells were blocked in 1% bovine serum albumin in Hanks buffered saline solution (BSA / HBSS), and then stained first with CD45-Vioblue direct antibody conjugate (Miltenyi Biotec) for 30 min at 4 °C. Cells were washed with cold PBS, and then resuspended in 1 ml of BSA / HBSS containing 1 μM calcein AM (Life Technologies) and 0.33 μM TO-PRO-3 iodide (Life Technologies) to co-stain for 30 min before sorting. Fluorescence-activated cell sorting was performed on a FACSAria Fusion Special Order System (Becton Dickinson) using 488 nm (calcein AM, 530/30 filter), 640 nm (TO-PRO-3, 670/14 filter), and 405 nm (Vioblue, 450/50 filter) lasers.
Fluorescence-minus-one controls were included with all tumours, as well as heat-killed controls in early pilot experiments, which were crucial to ensure proper identification of the TO-PRO-3 positive compartment and ensure sorting of the live cell population. Standard, strict forward scatter height versus area criteria were used to discriminate doublets and gate only singlets. Viable cells were identified by staining positive with calcein AM but negative for TO-PRO-3. Single cells were sorted into 96-well plates containing cold TCL buffer (Qiagen) with 1% β-mercaptoethanol, snap frozen on dry ice, and then stored at −80 °C before whole transcriptome amplification, library preparation and sequencing. Libraries from isolated single cells were generated based on the Smart-seq2 protocol (Picelli 2014) with the following modifications. RNA from single cells was first purified with Agencourt RNAClean XP beads (Beckman Coulter) before oligo-dT primed reverse transcription with Maxima reverse transcriptase and locked TSO oligonucleotide, which was followed by 20 cycles of PCR amplification using KAPA HiFi HotStart ReadyMix (KAPA Biosystems) with subsequent Agencourt AMPure XP bead purification as described. Libraries were tagmented using the Nextera XT Library Prep kit (Illumina) with custom barcode adapters (sequences available upon request). Libraries from 384 cells with unique barcodes were combined and sequenced using a NextSeq 500 sequencer (Illumina). We also analysed 96 cells from MGH60 with an alternative protocol that incorporates random molecular tags (RMTs, also known as unique molecular identifiers, or UMIs) in order to control for PCR amplification bias, as described previously29, and obtained similar results.
Paired-end, 38-base reads were mapped to the UCSC hg19 human transcriptome using Bowtie with parameters “-q --phred33-quals -n 1 -e 99999999 -l 25 -I 1 -X 2000 -a -m 15 -S -p 6”, which allows alignment of sequences with single base changes, such as point mutations in the IDH1 gene. Expression values were calculated from SAM files using RSEM v1.2.3 in paired-end mode using parameters “--estimate-rspd --paired-end --sam -p 6”, from which TPM values for each gene were extracted. Haematoxylin and eosin staining and single antibody staining (GFAP, Ki-67) were done by the clinical pathology laboratory at the Massachusetts General Hospital per routine protocol. For GFAP and Ki-67 double immunohistochemistry, paraffin-embedded sections were mounted on glass slides, deparaffinized in xylene, treated with 0.5% peroxide in methanol, and rehydrated. Antigen retrieval was done using sodium citrate-based, heat-induced antigen retrieval at pH 6.0. The Dako EnVision G/2 double stain system was used for blocking, staining, and development using rabbit anti-Ki67 antibody (Abcam ab15580 at 1:300) and mouse anti-GFAP antibody (Dako M0761 at 1:100). Human tissue was obtained from the Massachusetts General Hospital according to an Institutional Review Board-approved protocol (1999P008145) and informed consent was obtained from all patients. ViewRNA technology (Affymetrix) was used for manual format RNA in situ hybridization. Tissue sections mounted on glass slides were stored at −80 °C until they were used for hybridization. Slides were baked at 60 °C for 1 h, then denatured at 80 °C for 3 min, deparaffinized with Histoclear and dehydrated in ethanol. RNA targets in dewaxed sections were unmasked by treating with pre-treatment buffer at 95 °C for 10 min and digested with a 1:100 dilution of protease at 40 °C for 10 min, followed by fixation with 10% formalin for 5 min at room temperature.
Probe concentrations were 1:40 for both type 1 (red) and type 6 (blue) probe sets, except that the ApoE probe was used at 1:80 dilution. The probe was incubated on sections for 2 h at 40 °C and then washed serially. Affymetrix Panomics probes included ApoE (type 6, catalogue number VA6-16904 and type 1, catalogue number VA1-18265), OMG (type 1, catalogue number VA1-18161), Sox4 (type 6, catalogue number VA6-18162), CCND2 (type 6, catalogue number VA6-18266) and Ki-67 (type 1, catalogue number VA1-11033). Signal was amplified using PreAmplifier mix QT for 25 min at 40 °C followed by Amplifier mix QT for 15 min at 40 °C, and then signal was hybridized with labelled probe at 1:1,000 dilution for 15 min at 40 °C. Colour was developed using Fast Blue substrate for Type 6 probes and Fast Red substrate for Type 1 probes for 30 min at 40 °C. Tissue was counterstained with Gill’s haematoxylin for 25 s at room temperature followed by mounting with ADVANTAGE mounting media (Innovex). For quantification of compartments by ISH, at least 1,000 cells were counted in representative areas of the tumours. The FISH probes used in this study consisted of centromeric (CEP) and locus-specific identifier (LSI) probes. CEP probes included: CEP2 (2p11.1-q11.1, spectrum orange), CEP4 (4p11-q11, spectrum aqua), CEP9 (9p11-q11, spectrum aqua), CEP12 (12p11.1-q11, spectrum green), CEP17 (17p11.1-q11.1, spectrum aqua) and Y (Yp11.1-q11.1, spectrum green), all obtained from Abbott Molecular, Inc. (Des Plaines, IL). LSI probes were the 1p36/1q25 and 19q13/19p13 dual-colour probe sets (Abbott), and bacterial artificial chromosome RP11-351D16 (10q11.21, spectrum red or green; CHORI, Oakland, CA). FISH was performed as described previously30. Briefly, 5-μm sections of formalin-fixed, paraffin-embedded tumour material were deparaffinized, hydrated, and pretreated with 0.1% pepsin for 1 h.
Slides were then washed in 2× saline-sodium citrate buffer (SSC), dehydrated, air dried, and co-denatured at 80 °C for 5 min with a three-colour probe panel and hybridized at 37 °C overnight using the Hybrite Hybridization System (Abbott). Two 2-min post-hybridization washes were performed in 2× SSC/0.3% NP40 at 72 °C followed by one 1-min wash in 2× SSC at room temperature. Slides were mounted with Vectashield containing 4′,6-diamidino-2-phenylindole (Vector, Burlingame, CA, USA). Entire sections were observed with an Olympus BX61 fluorescent microscope equipped with a charge-coupled device camera and analysed with Cytovision software (Applied Imaging, Santa Clara, CA). Human NPCs were dissociated from the subventricular zone of 19-week fetal tissue and resulting neurospheres were expanded in a 1:1 mixture of DMEM/F12 and Neurobasal A (Invitrogen), supplemented with B27 lacking vitamin A, EGF, FGF, and heparin. Single live NPCs were isolated by FACS from a passage 8 culture and sorted into 96-well plates containing Buffer TCL (Qiagen) + 1% β-mercaptoethanol. For differentiation assays, NPCs were plated in chamber slides coated with poly-d-lysine and laminin, and proliferation media was exchanged over a period of 3 days with base media supplemented with either 1% FBS, 1% FBS + 60 ng ml−1 T3, or FBS + 100 nM trans-retinoic acid and 10 ng ml−1 NT3. Multipotency was confirmed by indirect immunofluorescence after 7 days of differentiation with GFAP (Abcam ab53554), Olig2 (Millipore AB9610), and Neurofilament (Aves). Expression levels were quantified as $E_{i,j} = \log_2(\mathrm{TPM}_{i,j}/10 + 1)$, where $\mathrm{TPM}_{i,j}$ refers to transcripts per million for gene i in sample j, as calculated by RSEM31.
TPM values are divided by 10, since we estimate the complexity of single-cell libraries to be in the order of 100,000 transcripts and would like to avoid counting each transcript ~10 times, as would be the case with TPM, which may inflate the difference between the expression level of a gene in cells in which the gene is detected and those in which it is not detected. For each cell, we quantified two quality measures: the number of genes for which at least one read was mapped, and the average expression level of a curated list of housekeeping genes. We then conservatively excluded all cells with either fewer than 3,000 detected genes or an average housekeeping expression (E, as defined above) below 2.5. For the remaining cells we calculated the aggregate expression of each gene as $\log_2(\mathrm{average}_j(\mathrm{TPM}_{i,j}) + 1)$, and excluded genes with an aggregate expression below 4, leaving a set of 8,008 analysed genes. For the remaining cells and genes, we defined relative expression by centring the expression levels, $Er_{i,j} = E_{i,j} - \mathrm{average}_j[E_{i,j}]$. Centring was performed within each tumour separately in order to decrease the impact of inter-tumoural variability on the combined analysis across tumours. Initial CNVs ($\mathrm{CNV}_0$) were estimated by sorting the analysed genes by their chromosomal location and applying a moving average to the relative expression values, with a sliding window of 100 genes within each chromosome, as previously described6. To avoid considerable impact of any particular gene on the moving average, we limited the relative expression values to [−3,3] by replacing all values above 3 by 3, and replacing values below −3 by −3. This was performed only in the context of CNV estimation.
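The quantification, quality filter, centring and moving-average steps described above can be sketched in a few lines. This is only an illustrative sketch: the function names are hypothetical, the logarithm is assumed to be base 2, and the window handling at chromosome ends (one value per full window) is an assumption rather than the authors' implementation.

```python
import math

def expression(tpm):
    """E = log2(TPM/10 + 1) for a single TPM value (base 2 is assumed)."""
    return math.log2(tpm / 10.0 + 1.0)

def passes_qc(n_detected_genes, mean_housekeeping_e,
              min_genes=3000, min_housekeeping=2.5):
    """Keep a cell only if it clears both quality thresholds given in the text."""
    return (n_detected_genes >= min_genes
            and mean_housekeeping_e >= min_housekeeping)

def relative_expression(e_by_gene):
    """Centre each gene (one row per gene, one column per cell) on its mean."""
    centred = []
    for row in e_by_gene:
        mean = sum(row) / len(row)
        centred.append([value - mean for value in row])
    return centred

def clip(value, low=-3.0, high=3.0):
    """Limit relative expression to [-3, 3] before averaging."""
    return max(low, min(high, value))

def initial_cnv(rel_expr_along_chromosome, window=100):
    """Moving average of clipped relative expression for one cell.

    The input holds one cell's relative expression values for the genes
    of a single chromosome, ordered by genomic position. One value is
    returned per full window (an assumption about edge handling).
    """
    values = [clip(v) for v in rel_expr_along_chromosome]
    if len(values) < window:
        return []
    total = sum(values[:window])
    averages = [total / window]
    for k in range(window, len(values)):
        total += values[k] - values[k - window]
        averages.append(total / window)
    return averages
```

Under this edge-handling assumption, a chromosome with 105 analysed genes yields six windows, which is why the smallest chromosomes need special treatment for visualization.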
For visualization purposes, in order to include the two chromosomes with fewest analysed genes (chromosomes 18 and 21, with 105 and 75 genes, respectively), we extended the moving average to include up to 50 genes from the flanking chromosomes (for example, the first window in chromosome 18 consisted of the last 50 genes of chromosome 17 and the first 50 genes of chromosome 18, whereas windows 51 through 56 in that chromosome consisted only of chromosome 18 genes). This initial analysis is based on the average expression of genes in each cell compared to the other cells and therefore does not have a proper reference, which is required to define the baseline. However, we detected a cluster of cells that have higher values at chromosome 1p and 19q, which we know are deleted in the six tumours, and that have consistent ‘CNV patterns’ across the genome, despite the fact that they originate from all three tumours. We thus defined these as the non-cancer cells and used the average CNV estimate at each gene across these cells as the baseline. The non-cancer cells included both microglia and oligodendrocytes, which differed in gene expression patterns and therefore also in CNV estimates (for example, the MHC region in chromosome 6 had consistently higher values in microglia than in oligodendrocytes and cancer cells). We therefore defined two baselines, as the average of all microglia and the average of all oligodendrocytes, and based on these the maximal (BaseMax) and minimal (BaseMin) baseline at each genomic position. The final CNV estimate of cell i at position j was defined relative to these baselines. We performed principal component analysis (PCA) for the relative expression values of all cancer cells (as defined by CNV analysis) from the three tumours combined. The covariance matrix used for PCA was generated using an approach outlined in ref. 32 to decrease the weight of less reliable ‘missing’ values in the data.
The basis of this approach is that due to the limited sensitivity of single-cell RNA-seq, many genes are not detected in particular cells despite being expressed. This is particularly pronounced for genes expressed at low levels, and for cells with low library complexity (that is, for which relatively few genes are detected), and results in non-random patterns in the data, whereby cells may cluster based on their complexity and genes may cluster based on their expression levels, rather than ‘true’ co-variation. To mitigate this effect, we assign weights to missing values, such that the weight of $E_{i,j}$ is proportional to the expectation that gene i will be detected in cell j given the average expression of gene i and the total complexity (number of detected genes) of cell j. To further verify that the PCA results are not driven by library complexity, we compared the PCA results to those of shuffled data. We iteratively swapped the expression of individual genes between pairs of cells with similar complexities, swapping each gene in each cell at least once. In that way we shuffled the data and removed the biological clustering, but maintained the distribution of complexities across cells, as well as the distribution of expression levels for each gene. PCA over the shuffled data defined the complexity-based effect, as evidenced by a Pearson correlation of 0.96 between the PC1 cell scores and their complexities (in the original data this correlation is only 0.41). We then compared PC1 gene scores between the original and the shuffled data (Extended Data Fig. 2f). Although PC1 gene scores of most genes are comparable between the two analyses, the loadings of the oligodendrocyte and astrocyte gene sets (Supplementary Table 1) were highly affected.
Oligodendrocyte genes were originally associated with highly positive PC1 scores, and their scores are significantly decreased upon shuffling (97% of the oligodendroglial genes were among the 5% of genes with the most decreased loadings, P < 10−32); similarly, astrocytic genes were originally associated with negative PC1 scores, and their scores are significantly increased upon shuffling (all astrocytic genes were among the 5% of genes with the most increased loadings, P < 10−32). As a result, none of the genes with highest and lowest PC1 scores (after shuffling) overlap with our oligodendroglial and astrocytic gene sets. Thus, complexity does not account for the association of PC1 with the differentiation programs. Similarly, complexity clearly does not account for the PC2 and PC3 stemness program, as PC2 cell scores are positively correlated with complexity (R = 0.27), whereas PC3 cell scores are negatively correlated with complexity (R = −0.24) and stemness genes were defined as those associated with both PC2 and PC3. The top correlated genes with PC1 scores (across all tumour cells) were defined as PC1-associated genes. We focused on the genes with an absolute correlation value above 0.35, but note that other thresholds gave similar results (not shown). Of those genes, the subset that was differentially expressed by at least threefold between oligodendrocyte (OC) and astrocyte (AC) mouse cells9, and for which the two comparisons were consistent (that is, PC1-positively correlated genes with higher OC expression, and PC1-negatively correlated genes with higher AC expression) were defined as the OC and AC lineage gene sets. Lineage scores were then calculated as the average relative expression of the lineage gene set minus the average relative expression of a control gene set, that is, $\mathrm{Lin}_j(i) = \mathrm{average}[Er(G_j, i)] - \mathrm{average}[Er(G_j^{\mathrm{cont}}, i)]$, in which $\mathrm{Lin}_j(i)$ is the score of cell i for lineage j, $G_j$ is the gene set for lineage j and $G_j^{\mathrm{cont}}$ is a control gene set for lineage j.
The control gene set was defined by first binning all 8,008 analysed genes into 25 bins of aggregate expression levels and then, for each gene in the lineage gene set, randomly selecting 100 genes from the same expression bin. In this way, the control gene set has a comparable distribution of expression levels to that of the lineage gene set and the control gene set is 100-fold larger, such that its average expression is analogous to averaging over 100 randomly selected gene sets of the same size as the lineage gene set. The final lineage score of each cell was defined as the maximal score over the two lineages, $\mathrm{LIN}(i) = \max(\mathrm{Lin}_{\mathrm{OC}}(i), \mathrm{Lin}_{\mathrm{AC}}(i))$. For visualization purposes where the two lineage scores are shown in a single axis, we first assigned random scores within (0–0.15) to all cells with LIN < 0, to avoid having many overlapping cells at x = 0. Second, we assigned negative scores to the cells with higher AC than OC scores (that is, a cell with AC and OC scores of 0.1 and 1, respectively, would be assigned a lineage score of 1, whereas a cell with AC and OC scores of 1 and 0.1 would be assigned a lineage score of −1). Both PC2 and PC3 were associated with intermediate values of PC1 (Extended Data Fig. 2c) and therefore with presumably less differentiated cells, and both were correlated with a shared set of genes, but distinguished by their correlation with cell ‘complexity’. We considered their sum as a potential stemness program. To detect potential stem-related genes, we chose the top 100 most positively correlated genes with PC2 + PC3 scores across all cancer cells from the three tumours. The 100 candidate genes were then restricted to (1) genes that are positively correlated with both PC2 and PC3, which primarily excluded ribosomal protein genes that were only correlated with PC2; (2) genes for which the average relative expression among the stem-like cells was above average.
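The expression-matched control gene set and the resulting lineage score can be illustrated with the sketch below. The function names are hypothetical, and two details are assumptions because the text does not specify them: the bins are taken to hold equal numbers of genes, and control genes are sampled without excluding the lineage genes themselves.

```python
import random

def bin_genes(aggregate_expr, n_bins=25):
    """Map each gene to a bin index by rank of aggregate expression.

    aggregate_expr: dict of gene -> aggregate expression level.
    Equal-sized bins are an assumption; the text only says '25 bins'.
    """
    ranked = sorted(aggregate_expr, key=aggregate_expr.get)
    n = len(ranked)
    return {gene: rank * n_bins // n for rank, gene in enumerate(ranked)}

def control_gene_set(gene_set, bins, n_per_gene=100, rng=None):
    """For each gene in gene_set, draw n_per_gene genes from its bin."""
    rng = rng or random.Random(0)
    genes_in_bin = {}
    for gene, b in bins.items():
        genes_in_bin.setdefault(b, []).append(gene)
    control = []
    for gene in gene_set:
        pool = genes_in_bin[bins[gene]]
        control.extend(rng.sample(pool, min(n_per_gene, len(pool))))
    return control

def lineage_score(rel_expr_cell, gene_set, control):
    """Mean relative expression of the gene set minus that of its control.

    rel_expr_cell: dict of gene -> relative expression in one cell.
    """
    def mean(genes):
        return sum(rel_expr_cell[g] for g in genes) / len(genes)
    return mean(gene_set) - mean(control)
```

The final lineage score of a cell would then be the larger of its OC and AC scores, for example `max(lineage_score(cell, oc_genes, oc_control), lineage_score(cell, ac_genes, ac_control))`, where the gene-set variables are placeholders.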
Stemness scores for each cell, stem(i), were then defined as the average relative expression of the stemness gene set minus the average of a control gene set and minus the lineage score of cell i, that is, $\mathrm{stem}(i) = \mathrm{average}[Er(G_{\mathrm{stem}}, i)] - \mathrm{average}[Er(G_{\mathrm{stem}}^{\mathrm{cont}}, i)] - \mathrm{LIN}(i)$. Cells were scored for the three programs defined above (two lineage scores and a stemness score) and assigned to the subpopulation that corresponds to their highest scoring program, if the maximal score was above 0.5 and was higher by 0.5 than the score for the other programs. Cells in which the maximal score did not pass these thresholds were assigned to the undifferentiated subpopulation, for which we did not detect a specific expression program. We note that the expression programs are continuous and thus it is difficult to assign every cell to a discrete subpopulation. Nevertheless, most cells are highly biased towards one of the three states, and the overall estimates are consistent between analysis of single-cell RNA-seq data and tissue staining experiments (Extended Data Fig. 8c and Supplementary Table 3). Furthermore, very few cells (~1% on average, and 5% at most) scored for two programs simultaneously (with the same threshold of 0.5, Supplementary Table 3). Analysis of single-cell RNA-seq in human (293T) and mouse (3T3) cell lines20, and in mouse haematopoietic stem cells21 revealed in each case two prominent cell cycle expression programs that overlap considerably with genes that are known to function in replication and mitosis, respectively, and that have also been found to be expressed at G1/S phases and G2/M phases, respectively, in bulk samples of synchronized HeLa cells33. We thus defined a core set of 43 G1/S and 55 G2/M genes that included those genes that were detected in the corresponding expression clusters in all four datasets from the three studies described above (Supplementary Table 2).
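The rule for assigning a cell to a subpopulation from its three program scores can be written as a short function. This is a sketch with hypothetical names; whether the comparisons are strict or inclusive at exactly 0.5 is not stated in the text and is an assumption here.

```python
def assign_subpopulation(scores, min_score=0.5, min_margin=0.5):
    """Assign a cell to its top-scoring program if it clears both thresholds.

    scores: dict of program name -> score, for example
            {"OC": 1.2, "AC": 0.1, "stem": 0.3}.
    The top score must exceed min_score and beat every other program
    by at least min_margin; otherwise the cell is 'undifferentiated'.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best, best_score), (_, runner_up) = ranked[0], ranked[1]
    if best_score > min_score and best_score - runner_up >= min_margin:
        return best
    return "undifferentiated"
```

Comparing against the runner-up is sufficient, because a margin over the second-highest score implies at least that margin over all lower scores.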
As expected, the genes in each of those expression programs were highly co-regulated in a small fraction of the oligodendroglioma cells, such that some cells expressed only the G1/S or the G2/M programs and other cells expressed both programs (Extended Data Fig. 6a). Plotting the average expression of these programs revealed an approximate circle (Fig. 3a), which we hypothesize describes the progression along the cell cycle. Putative cycling cells were identified by at least a twofold upregulation and a t-test P value < 0.01 for either the G1/S or the G2/M gene set compared to the average of all cells. Although we cannot confidently define the regions that correspond to each phase of the cell cycle in an automatic way, we manually defined four regions in the apparent circle and assigned them to approximate cell cycle phases. Output from Illumina software was processed by the Picard processing pipeline to yield BAM files containing aligned reads (bwa version 0.5.9, to the NCBI Human Reference Genome Build hg19) with well-calibrated quality scores34, 35. Sample contamination by DNA originating from a different individual was assessed using ContEst36. Somatic single nucleotide variations (sSNVs) were then detected using MuTect37. Following this standard procedure, we filter sSNVs by (1) removing potential DNA oxidation artefacts38; (2) removing events seen in sequencing data of a large panel of ~8,000 TCGA normal samples; (3) realigning identified sSNVs with NovoAlign (http://www.novocraft.com) and performing an additional iteration of MuTect with the newly aligned BAM files. sSNVs were finally annotated using Oncotator39. Sample purity and ploidy, as well as Cancer Cell Fraction (CCF) of identified sSNVs were determined by ABSOLUTE25. Genome-wide copy-ratio profiles were inferred using CapSeg. Read depth at capture targets in tumour samples was calibrated to estimate copy ratio using the depths observed in a panel of normal genomes. 
Next, we performed allelic copy analysis using reference and alternate counts at germline heterozygous SNP sites. sSNVs that were identified by WES were examined in single-cell RNA-seq data by the mpileup command of SAMtools40. The fraction of cells in which we identified these mutations was, on average, only 1.3% of the expected fraction estimated by ABSOLUTE. This low sensitivity primarily reflects the low coverage of the RNA-seq reads over the transcriptome of single cells. Accordingly, sensitivity was correlated with the expression levels of the genes that harbour the mutations, and reached 20.4% for the top 10% most highly expressed genes. Sensitivity was also affected by heterozygosity and allele-specific expression, as in some heterozygous mutant cells we might only sequence the wild-type allele. We used a targeted sequencing approach to increase our sensitivity for three specific mutations in MGH54 which were identified by WES but detected in very few cells by single-cell RNA-seq. We designed primers flanking these three mutations (in ZEB2, EEF1B2 and DNAJC4), PCR-amplified single cell cDNAs (frozen stocks of product from the pre-amplification reaction of the Smart-seq2 protocol) and sequenced the amplified material. This approach was applied to 1,056 cells from MGH54. Mutant cells were defined as those with at least 50 reads that mapped to the mutant allele as defined by WES, and for which the fraction of mutant reads was at least 20% of all reads and fivefold higher than the overall rate of mutant reads (in order to exclude a low rate of mutant reads due to PCR or sequencing errors). The mutations detected by these criteria were highly consistent with those identified from single-cell RNA-seq (P < 10−5, hypergeometric test) and uncovered 19 additional mutant calls (three for ZEB2, three for EEF1B2 and 13 for DNAJC4).
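The three criteria for calling a mutant cell from targeted sequencing reads translate directly into a predicate. The sketch below uses hypothetical names, and it assumes the "overall rate of mutant reads" is supplied as a single background fraction computed elsewhere (for example across all cells); the text does not say how that rate was derived.

```python
def is_mutant_cell(mutant_reads, total_reads, background_rate,
                   min_mutant_reads=50, min_fraction=0.20, min_fold=5.0):
    """Apply the read-count, fraction and fold-over-background criteria.

    mutant_reads:    reads supporting the mutant allele in this cell
    total_reads:     all reads covering the site in this cell
    background_rate: overall fraction of mutant reads (assumed input)
    """
    if total_reads == 0:
        return False
    fraction = mutant_reads / total_reads
    return (mutant_reads >= min_mutant_reads
            and fraction >= min_fraction
            and fraction >= min_fold * background_rate)
```

For instance, a cell with 60 mutant reads out of 200 passes at a background rate of 1%, but the same cell fails at a background rate of 10%, since 30% is less than fivefold above 10%.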
We next focused on the 23 subclonal mutations for which (1) the estimated clonal fraction by ABSOLUTE was at most 60%; (2) at least three cells were identified as harbouring the mutation; and (3) at least one cell was identified as having a wild-type allele of the mutant gene. For each of those 23 mutations we plotted the lineage and stemness scores of all mutant cells to examine their distribution of expression states (Fig. 4 and Extended Data Fig. 9). Note that for these mutations we detected on average 9.4% of the expected fraction by ABSOLUTE. To estimate the frequency of false-positive errors we defined, for each mutation that is detected by WES and analysed by RNA-seq mutation calling, (i) ‘expected mutations’: the number of events in which we find the exact mutation reported by WES; and (ii) ‘false mutations’, the number of events in which we find a mismatch in the same exact site but to a different base than expected by WES (there are 2 such possible bases). This approach focuses on the exact genomic context of the real mutations to obtain a reliable estimate of the false positive rate. This estimate is half the number of false mutations divided by the number of expected mutations (given 4 bases, one of which is wild type, there are two types of false mutations but only one type of expected mutations). The result of this analysis was an estimated average false positive rate of 0.85%, suggesting that the confidence of each detected mutation is, on average, higher than 99%. Accordingly, even in the most extreme case (for example, ZEB2) where only a single mutant cell is detected in one of the compartments of the hierarchy, we still have a 99% confidence that the mutation is represented in that compartment. 
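The false-positive estimate described above is a simple ratio, shown here as a sketch. The function names are hypothetical and the counts in the usage note are invented purely to illustrate the arithmetic; they are not values from the study.

```python
def false_positive_rate(expected_mutations, false_mutations):
    """Half the false-mutation count divided by the expected-mutation count.

    The division by two reflects that, at a given site, two of the three
    non-wild-type bases count as 'false' and only one as 'expected'.
    """
    return (false_mutations / 2.0) / expected_mutations

def call_confidence(expected_mutations, false_mutations):
    """Average confidence of a single detected mutation, 1 - FPR."""
    return 1.0 - false_positive_rate(expected_mutations, false_mutations)
```

With hypothetical counts of 200 expected events and 4 false events, the estimated false-positive rate would be 1% and the per-call confidence 99%.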
To detect CIC mutations in single cells from MGH53, we performed qPCR using SuperSelective PCR primers, which are highly specific to single base changes due to a loop-out sequence adjacent to the mutant base (http://legacy.labroots.com/user/webinars/details/id/95). The following qPCR primers were designed to target the c.4543 C > T, p.1515 R > C mutation on CIC cDNA, which had been identified as subclonal in MGH53 via whole-exome sequencing analysis. Wild-type-specific forward primer: 5′-CCCTCCAAGGTTTGTCTGCAGccattcGAGGTGC-3′; mutant-specific forward primer: 5′-CCCTCCAAGGTTTGTCTGCAGccattcGAGGTGT-3′; universal reverse primer: 5′-tcgGGCAGCCTGCATGATCTT-3′. The specificity of the single cell qPCR primers was validated by two approaches: (1) qPCR on artificial templates differing by only the mutant base; and (2) qPCR on cDNA of single MGH53 tumour cells for which RNA-seq already detected mutant or wild-type reads. These positive control reactions were highly consistent between duplicates and with the mutation status as inferred from RNA-seq: qPCR identified 7 out of 7 mutant cells and 12 out of 15 wild-type cells, while the remaining three cells had no qPCR signal, and therefore all qPCR signal was consistent with RNA-seq data. We also took advantage of the fact that CIC is located on chr19q, which is deleted in MGH53 cancer cells, and therefore each cell only contains one CIC allele (loss-of-heterozygosity, LOH). Thus, in a single MGH53 cancer cell, we expect evidence of either mutant or wild-type CIC, but not both. Indeed, all cells with a signal in the positive control assay showed a difference in Cp values of at least 5 between mutant and wild-type reactions, consistent with LOH. cDNA was taken from frozen stocks of product from the preamplification reaction of the Smart-seq2 protocol. 1 μl from each well of cDNA was used as template for a second round of Smart-seq2 preamplification and bead purification in order to increase overall signal downstream.
qPCR was performed with the Fast Plus EvaGreen qPCR Master Mix Low Rox (Biotium 31014-1) according to the manufacturer’s instructions with the sole modification of adding EDTA to a final reaction concentration of 1.6 mM to enhance primer selectivity. Cp values ≥ 33 were considered negative signal; Cp values < 33 were considered positive signal. We performed SuperSelective qPCR on cDNA from 467 single MGH53 tumour cells. Of these, 61 cells had signal in both replicates for either mutant or wild-type primers, but never for both. These were used to define 28 mutant CIC cells and 27 wild-type CIC cells, after excluding 6 cells which did not pass the single cell RNA-seq quality control filters. To identify genes regulated by the CIC mutation, we compared the 28 mutant CIC cells and 27 wild-type CIC cells and identified genes with at least twofold average expression difference and P < 0.01 (before correction for multiple hypothesis testing) based both on a permutation test and a t-test. To further filter the list of differentially expressed genes we also compared the mutant CIC cells to the 671 unresolved cells (in which we did not detect signal for either mutant or wild-type alleles by either qPCR or RNA-seq). As the fraction of CIC mutants was estimated as 30% by ABSOLUTE, we expect the unresolved cells to be a mixture of about one-third mutant CIC cells and about two-thirds wild-type CIC cells, and thus CIC-regulated genes should also differ between this mixture and the CIC mutants but to a lesser extent; we used a threshold of 1.5-fold difference between the average expression in CIC mutants and in unresolved cells. The resulting set of differentially expressed genes is given in Supplementary Table 5.
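The Cp cutoff and the both-replicates requirement can be combined into a small calling function. This is a sketch with hypothetical names; representing a reaction with no amplification as `None` and labelling cells with signal for both primer sets as "ambiguous" are assumptions made for illustration (the text reports that no such cells were observed).

```python
CP_CUTOFF = 33.0  # Cp below this value counts as positive signal

def has_signal(cp_replicates, cutoff=CP_CUTOFF):
    """True only if every replicate amplified with Cp below the cutoff.

    cp_replicates: list of Cp values, with None for no amplification.
    """
    return all(cp is not None and cp < cutoff for cp in cp_replicates)

def call_cic_status(mutant_cps, wildtype_cps):
    """Classify one cell from its mutant- and wild-type-primer reactions."""
    mutant = has_signal(mutant_cps)
    wildtype = has_signal(wildtype_cps)
    if mutant and wildtype:
        return "ambiguous"
    if mutant:
        return "mutant"
    if wildtype:
        return "wild-type"
    return "unresolved"
```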
We simulated this analysis with 1,000 randomly selected sets of cells (to replace the mutant CIC and wild-type CIC cells) and found an average of only five upregulated genes by the same criteria, suggesting a false discovery rate lower than 0.1 for the genes upregulated by the CIC mutation. Data generated for this study are available through the Gene Expression Omnibus (GEO) under accession number GSE70630.
News Article | January 20, 2016
No statistical methods were used to predetermine sample size. Male and female Ptc+/−/Math1-SB11/T2Onc or T2Onc2 mice (12 to 20 weeks of age, at the time they developed signs of medulloblastoma) were used. We did not perform a formal sample size estimate for the study but based our experimental plan on our previous experience with Sleeping Beauty mutagenesis screening. When mice showed early clinical signs of brain tumours, they were anaesthetized with isoflurane, ophthalmic ointment was applied to the eyes and the scalp was antiseptically prepared. A 1.5-cm-long midline incision was made to expose the skull from the coronal suture to the cranio-cervical junction. Using a high-speed drill and a 2.5 mm trephine bit, a cranial defect was drilled 2 mm posterior to lambda to avoid the transverse sinuses. The skull and the dura were lifted with micro-dissecting forceps; the bulk of the tumour was then removed using harmon forceps with teeth, while smaller sections of tumour were removed with a microcurette (2 mm). Surgical samples were saved on dry ice, and bleeding from the tumour site was controlled with direct pressure and Gelfoam. Once haemostasis was obtained, the surgical wound was closed with interrupted absorbable sutures. Animals received analgesia and dexamethasone post-operatively to contain brain oedema. Male and female Ptc+/−/Math1-SB11/T2Onc or T2Onc2 control mice were monitored for early clinical signs of brain tumours but not subjected to surgery and craniospinal irradiation (CSI); no formal randomization was used. All procedures involving animals were approved by the institutional Animal Care Committee; in no case were tumour-bearing animals allowed a tumour burden compromising normal behaviour, food and water intake or exceeding the approved volume of 1,700 mm3.
Mice that had recovered from tumour resection were anaesthetized with isoflurane and placed in the brain irradiation bed of the image-guided small animal irradiator (X-Rad 225CX, Precision Xray, North Branford, CT, USA). Correct animal setup was confirmed using 2D fluoroscopic images with and without the brain collimator (2 × 2 cm) in place; all images were acquired at 40 kVp, 0.5 mA, using the same X-ray tube that is used for radiation treatment. 3D volumetric cone-beam CT images were used for visualization of bone and soft tissues within the animal and for isocentre placement. The imaging capabilities of the unit were described previously43; the imaging dose to the animal was estimated to be less than 1 cGy. The delivered dose per fraction was 2 Gy, administered three times a week for the first week to prevent brain oedema, followed by five times a week for the following 3 weeks. Each daily dose was delivered with two parallel-opposed lateral beams to correct for tissue attenuation of the dose, for a total daily dose of 2 Gy. Dose rate for the brain collimator was measured at 3.2 Gy per min at 225 kVp, 13 mA, on a 0.3 mm Cu filter (HVL: 0.93 mm Cu, added filtration: 0.3 mm Cu). The tube was calibrated at these settings following the TG61 protocol44. The spine treatment was introduced in the second week of CSI irradiation; we used a 4.76 Gy per 6 fractions schedule, and the mice received 2 spinal fractions per week. Radiation to the spinal cord was delivered to mice placed supine on the irradiator stage; the irradiation was done with single or multiple posterior beams. The same imaging strategy with 2D and volumetric 3D imaging was adopted for spinal cord targeting, using a 0.5 × 5 cm collimator or multiple fields of 0.5 × 2 cm; for the spine treatment a dose correction was applied to compensate for the different depth of the cervical spine compared to the lumbosacral spine. Treatment dose was administered at 2.8 Gy per min at 225 kVp, 13 mA settings on a 0.3 mm Cu filter.
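As a worked check of the brain irradiation schedule, the arithmetic implied by the stated fractionation can be sketched as follows. The cumulative dose and beam-on time are derived here, not reported in the text; the function names are hypothetical, and the beam-on time is a nominal figure that assumes the measured dose rate applies uniformly to the whole 2 Gy fraction. The spinal schedule is not modelled.

```python
DOSE_PER_FRACTION_GY = 2.0          # brain dose per fraction (from the text)
BRAIN_DOSE_RATE_GY_PER_MIN = 3.2    # measured brain collimator dose rate

def brain_fraction_count(first_week=3, later_weekly=5, later_weeks=3):
    """Three fractions in week 1, then five per week for three weeks."""
    return first_week + later_weekly * later_weeks

def cumulative_dose_gy(n_fractions, dose_per_fraction=DOSE_PER_FRACTION_GY):
    """Total dose implied by the fraction count (derived, not reported)."""
    return n_fractions * dose_per_fraction

def beam_on_seconds(dose_gy, dose_rate_gy_per_min):
    """Nominal beam-on time needed to deliver dose_gy at a fixed rate."""
    return dose_gy / dose_rate_gy_per_min * 60.0
```

Under these assumptions the schedule corresponds to 18 brain fractions and 36 Gy in total, with roughly 37.5 s of nominal beam-on time per 2 Gy fraction.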
The end-point date of the control and CSI-treated groups was assessed by independent veterinary technicians blinded to the experimental group. Medulloblastoma-free survival from the time of diagnosis was assessed for control mice and for mice that underwent surgery and radiation. No animal was removed from the study, and mice euthanized during the study for reasons other than medulloblastoma were censored in the Kaplan–Meier estimate for tumour-free survival. Genomic DNA was isolated and purified from mouse tissues with a PureLink genomic DNA extraction kit (Invitrogen). Libraries for Illumina HiSeq sequencing were prepared as described previously25 with minor modifications. 2 μg of gDNA was digested and ligated to the adapters; after a BamHI digest to eliminate the untransposed copies of the concatamer, an enrichment PCR followed by a barcoding PCR was performed25. The barcode PCR was modified to incorporate a paired-end (PE) sequencing adaptor; the sequence of the PE adaptor was: CAAGCAGAAGACGGCATACGAGATCGGTCTCGGCATTCCTGCTGAACCGCTCTTCCGATCTTAGGGCTCCGCTTAAGGGAC. Libraries were purified and pooled as previously described and sequenced on an Illumina HiSeq 2000 (ref. 25). Sequenced libraries were demultiplexed and aligned as described previously25. Demultiplexing and trimming of SB transposon sequences were performed using custom scripts, and alignment of reads was performed with Novoalign to mouse assembly NCBI37/mm9 (July 2007). A chi-squared test was used to assess statistical enrichment of the integration events within each transcription unit considering the following: the number of TA dinucleotide sites within the gene relative to the number of TA sites in the genome, the number of integration sites within each tumour, and the total number of tumours in each cohort. This gCIS analysis produced a P value for each of the 19,000 mouse RefSeq genes, and a Bonferroni correction was therefore used to adjust for multiple hypothesis testing. 
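The enrichment test and multiple-testing correction described above can be illustrated with a simplified sketch. This is not the gCIS implementation: it pools all insertions in a cohort into one count, uses a one-degree-of-freedom chi-squared goodness-of-fit test, and every number in the usage example is hypothetical. Note that a chi-squared test is two-sided, so depletion would also give a small P value.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def gene_enrichment_p(observed, total_insertions, ta_in_gene, ta_in_genome):
    """Compare insertions inside vs outside a gene with the TA-site expectation."""
    frac = ta_in_gene / ta_in_genome
    if not 0.0 < frac < 1.0:
        raise ValueError("gene TA fraction must lie strictly between 0 and 1")
    expected_in = total_insertions * frac
    expected_out = total_insertions - expected_in
    observed_out = total_insertions - observed
    chi2 = ((observed - expected_in) ** 2 / expected_in
            + (observed_out - expected_out) ** 2 / expected_out)
    return chi2_sf_1df(chi2)


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical gene: 12 insertions observed where about 0.1 would be expected.
p = gene_enrichment_p(observed=12, total_insertions=5000,
                      ta_in_gene=2000, ta_in_genome=100_000_000)
print(bonferroni(p, n_tests=19_000) < 0.05)
```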
gCIS predictions were manually curated to filter out ambiguities, artefacts and local hopping. BioProject ID PRJNA306269. The model assumes that tumour cell division and growth are initiated by a founding transposon insertion event, and that additional insertion events can subsequently occur in daughter cells. According to the model, insertion events in the transformed daughter cells are expected to decrease by a factor of 2^n relative to the initial transformed cell, where n is the number of intervening cell divisions. Details of the model are described in Supplementary Note 1. As with any model, it is important to note its limitations. First, there is a limit to the degree to which distinct lineages can be separated. If two lineages are governed by two sufficiently close values of the parameter G, the components will be superimposed. If the value of d is also the same, the identification of the initiating insertions will not be affected; otherwise, the lineage with lower d will incorrectly identify its initiating mutation as a passenger. The extent of this issue depends on the closeness of G and the depth at which a sample has been sequenced. It is almost certainly true that other lineages are present in the data but arose relatively late and/or have relatively low growth rates. Therefore, the model is best described as identifying the clearest and least ambiguous lineages. Second, a lineage that has undergone multiple gene disruptions that affect growth rate at different generations can appear as two separate lineages. For example, if a disruption of gene A causes rampant cell division/growth, and is followed two generations later by a disruption in gene B that further increases the growth rate, this will appear as two lineages with putative genotypes A- and B-. In reality, the genotypes are A- and A-;B-. Importantly, this does not affect the ultimate identification of both of these genes as initiators. 
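The dilution relation stated earlier in this paragraph, reading the factor as 2 raised to the power n, can be sketched in a few lines. This covers only that single relation; the full model with its parameters G and d is in Supplementary Note 1 and is not reproduced here. The function names and read counts are hypothetical.

```python
import math


def relative_abundance(n_divisions):
    """Expected abundance, relative to the founder insertion, of an insertion
    that arose n_divisions cell divisions after the founding event."""
    if n_divisions < 0:
        raise ValueError("n_divisions must be non-negative")
    return 2.0 ** -n_divisions


def divisions_from_reads(founder_reads, later_reads):
    """Invert the relation: estimate n from the read support of two insertions.
    Assumes read counts are proportional to the number of cells carrying each insertion."""
    return math.log2(founder_reads / later_reads)


print(relative_abundance(3))           # one eighth of the founder abundance
print(divisions_from_reads(800, 100))  # three intervening divisions
```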
Relative clonal prevalence was calculated for the genes predicted as driver as:2d G and normalized to the total number of predicted drivers for each sample. Driver events predicted to happen in the founder clones (highest G) for each sample, or showing relative cell abundance >10% were selected for pathway enrichment analysis. The primers for amplifying Sleeping Beauty transposon insertion sites were designed based on the chromosomal location of each insertion site and its orientation to the transcription of the gene hosting the insertion. The primers at the inverted repeats/direct repeats (left) (IRL) and inverted repeats/direct repeats (right) (IRR) of the transposon were 5′-AAATTTGTGGAGTAGTTGAAAAACGA-3′ and 5′-GGATTAAATGTCAGGAATTGTGAAAA-3′, respectively. The input represents genomic DNA with Sleeping Beauty transposition, which was illustrated by Sleeping Beauty excision PCR that detected the transposon post-transposition26. Three points of input (1×, 5× and 25×) were used. Mice showing signs of late stage brain tumours were euthanized and tissue harvested for genomic DNA extraction as well as histological examination. Extent and location of recurrences was evaluated by standard haematoxylin and eosin staining, Trp53 pathway status was evaluated by p21 staining performed at the Paediatric Laboratory Medicine Department, The Hospital for Sick Children, (Toronto, Canada) using the Ventana BenchMark XT model. The conditions were as follows: HIER: 40 min in a Tris based buffer (pH 8.5) Ventana CC1 (http://www.ventana.com/product/203?type=204), primary antibody p21 (1:50) (BD bioscience 556431, clone SXM30) was incubated for 1 h at 37 °C. The signal was detected using Ventana OptiView DAB IHC Detection Kit. The following fly stocks were used: UAS-mCD8-GFP to label cell membrane; insc-Gal4 (Gal41407 inserted in an inscuteable promoter) to drive gene expression in the neuroblast lineage; UAS-dpn for overexpression of dpn. Flies were mated and maintained at 25 °C. 
Fly larvae were retrieved at late third instar stage for whole body irradiation at 40 Gy. The larval brains were dissected 4 h after irradiation, followed by fixation and immunohistochemistry analysis. Larval brains were dissected, fixed, and stained as previously described29. Briefly, third instar larvae brains were dissected in PBS, fixed in 4% paraformaldehyde solution for 20 min at room temperature, and incubated with the primary antibody (rabbit anti-phospho-histone 3, Millipore, 1:200) overnight at 4 °C and secondary antibody for 2 h at room temperature. Fluorescence images were acquired using a Leica SP5 confocal microscope. Representative images of the dorsal brain lobes were shown in Fig. 1d, e and Extended Data Fig. 1h. All patients gave informed consent to the samples collection; unless indicated otherwise, the samples were sequenced and analysed at Canada’s Michael Smith Genome Sciences Centre at the BC Cancer Agency (GSC). Libraries for whole-genome sequencing were constructed using either the plate-based or SPRI-TE library construction protocol. 2 μg of genomic DNA in a 96-well format was fragmented by Covaris E210 sonication for 30 s using a ‘duty cycle’ of 20% and ‘intensity’ of 5. The paired-end sequencing library was prepared following the BC Cancer Agency’s Genome Sciences Centre 96-well genomic ~350–450 bp insert Illumina Library Construction protocol on a Biomek FX robot (Beckman-Coulter, USA). Briefly, the DNA was purified in a 96-well microtitre plate using Ampure XP SPRI beads (40–45 μl beads per 60 μl DNA), and was subject to end-repair, and phosphorylation by T4 DNA polymerase, Klenow DNA Polymerase, and T4 polynucleotide kinase respectively in a single reaction, followed by cleanup using Ampure XP SPRI beads and 3′ A-tailing by Klenow fragment (3′ to 5′ exo minus). 
After cleanup using Ampure XP SPRI beads, PicoGreen quantification was performed to determine the amount of Illumina PE adapters used in the next step of adaptor ligation reaction. The adaptor-ligated products were purified using Ampure XP SPRI beads, then PCR-amplified with Phusion DNA Polymerase (Thermo Fisher Scientific Inc. USA) using Illumina’s PE indexed primer set, with cycle conditions: 98 °C for 30 s followed by 6 cycles of 98 °C for 15 s, 62 °C for 30 s and 72 °C for 30 s, and a final extension at 72 °C for 5 min. The PCR products were purified using Ampure XP SPRI beads, and checked with Caliper LabChip GX for DNA samples using the High Sensitivity Assay (PerkinElmer, USA). PCR products of the desired size range were gel purified (8% PAGE or 1.5% Metaphor agarose in an in-house custom built robot), and the DNA quality was assessed and quantified using an Agilent DNA 1000 series II assay and Quant-iT dsDNA HS Assay Kit using Qubit fluorometer (Invitrogen), then diluted to 8 nM. The final concentration was confirmed by Quant-iT dsDNA HS Assay before generating 100 bp paired-end reads on the Illumina HiSeq 2000/2500 platform using v3 chemistry. Whole-genome libraries of patient samples medulloblastoma-Rec-03, -04, -06, -11, -12, -18, -19, -22–24, -26-33 have been constructed using the Spri-TE 300-600 bp fragment protocol as follows. Genome libraries with fragment size ranges of approximately 400 bp were constructed on a SPRI-TE robot (Beckman Coulter, USA) according to the manufacturer’s instructions (SPRIworks Fragment Library System I Kit, A84801). Briefly, 1 μg of genomic DNA in a 60 μl volume, and 96-well format, was fragmented by Covaris E210 sonication for 30 s using a ‘duty cycle’ of 20% and ‘intensity’ of 5. Up to 10 paired-end genome sequencing libraries were prepared in parallel using the SPRI-TE 300–600 bp size-selection program. 
Following completion of the SPRI-TE run, the adaptor-ligated library templates were quantified using a Qubit fluorometer. 5 ng of adaptor-ligated template was PCR amplified using Phusion DNA Polymerase (Thermo Fisher Scientific, USA) and Illumina’s PE indexed primer set, with cycle conditions: 98 °C for 30 s followed by 10 cycles of 98 °C for 15 s, 62 °C for 30 s and 72 °C for 30 s, and a final amplicon extension at 72 °C for 5 min. The PCR products were purified using Ampure XP SPRI beads, and checked with Caliper LabChip GX for DNA samples using the High Sensitivity Assay (PerkinElmer, USA). PCR products of the desired size range were purified using gel electrophoresis (8% PAGE or 1.5% Metaphor agarose gels in a custom built robot) and the DNA quality was assessed and quantified using an Agilent DNA 1000 series II assay and Quant-iT dsDNA HS Assay Kit using Qubit fluorometer (Invitrogen), then diluted to 8 nM. The final concentration was verified by Quant-iT dsDNA HS Assay before generating 100 bp paired-end reads on the Illumina HiSeq 2000/2500 platform using v2 or v3 chemistry. Alignment. After marking chastity-failed reads, paired-end 100 bp raw reads were aligned to the reference genome GRCh37-lite (http://www.bcgsc.ca/downloads/genomes/9606/hg19/1000genomes/bwa_ind/genome) with the Burrows–Wheeler Aligner (BWA; version 0.5.7)45. Bam files were sorted with SAMTools (version 0.1.13) and merged using Picard MarkDuplicates.jar (version 1.71). The merged bam files were subsequently indexed with SAMTools index (version 0.1.17) and submitted to the European Genome-phenome Archive (EGAD00001000946). German Cancer Research Centre (DKFZ). Patient samples medulloblastoma-REC-13-16 and medulloblastoma-REC-34-35 were processed at the DKFZ in Heidelberg as previously described2. Analysed DNA was isolated using a Qiagen Allprep DNA/RNA/Protein Mini Kit. 
On average 125 mg of homogenized (TissueLyser, Qiagen) tumour tissue was used for isolation of analytes. The manufacturer’s protocol was adapted to allow for DNA and total RNA (including miRNA) isolation. DNA from matching blood samples was extracted using Qiagen Blood and Cell Culture Midi Kit according to the manufacturer’s protocol. After quality control of isolated DNA (gel electrophoresis), extracted nucleic acids were submitted for sequencing. Paired-end (PE) DNA library preparation was carried out using Illumina Inc. v2 protocols. In brief, 1–5 μg of genomic DNA was fragmented to ~300 bp (PE) insert-size with a Covaris device, followed by size selection through agarose gel excision. Deep sequencing was carried out with Illumina HiSeq2000 instruments. Whole-exome sequencing at McGill. Patient samples medulloblastoma-REC-36-38 and medulloblastoma-REC-48-55 were prepared and sequenced by the Genome Quebec Innovation Centre and analysed at the McGill University Health Centre as follows. Paired-end libraries were prepared with Illumina’s Nextera Rapid Capture Exome kit. Captured exome DNA fragments were then sequenced on Illumina HiSeq 2500 (rapid-run mode) generating 100-bp paired-end reads. Adaptor sequences were removed and low-quality reads were trimmed using the FASTX toolkit. Quality trimmed reads were aligned to the human genome reference library (hg19) using Burrows–Wheeler Aligner (BWA) version 0.5.9 (ref. 45). Indels were realigned using the Genome Analysis Toolkit (GATK)46 and duplicate reads were marked using Picard. SNVs from WGS data were analysed using all three methods described below, whereas SNVs from exome-seq data were analysed only with MutationSeq. SNVs were analysed with SAMtools mpileup v.0.1.17 either on single or paired libraries. Each chromosome was analysed separately using the -C50 -DSBuf parameters. 
The resulting vcf files were merged and filtered to remove low-quality SNVs by using samtools varFilter (with default parameters) as well as to remove SNVs with a QUAL score of less than 20. Finally, SNVs were annotated with gene annotations from Ensembl v66 using snpEff and the dbSNP v137 db membership assigned using SnpSift47. To analyse compartment specific SNVs and indels, samples were analysed pair-wise with the default settings of Strelka v0.4.7 (ref. 48). Primary tumour samples and relapse/met were compared against the germline sample. In the absence of a germline sample, the relapse/met samples were compared against the primary tumour sample. Variant allele frequencies (VAF) of somatic damaging SNVs (called by Strelka in 14 patients with matched germline samples) were classified into distinct clusters using the R package mclust, which uses finite mixture estimation via iterative expectation maximization steps (EM) and the Bayesian Information Criterion (BIC). Each cluster was manually categorized as either ‘homozygous’, ‘clonal’, or ‘subclonal’, depending on the cluster VAF and the uncertainty separating it from the next cluster. Multiple subclonal populations were numbered sequentially, starting with the most highly prevalent population. SNVs were analysed pair-wise with SAMtools mpileup v.0.1.17 (ref. 49). Each chromosome was analysed separately using the -C50 -DSBuf parameters. Before merging the resulting vcf files, they were filtered to remove all indels and low quality SNVs by using samtools varFilter (with default parameters) as well as to remove SNVs with a QUAL score of less than 20 (vcf column 6). 
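The QUAL and indel filtering step can be illustrated with a minimal sketch that works directly on tab-separated VCF records. It does not reproduce samtools varFilter; it only drops records whose QUAL (column 6) is below 20 and records that are not single-base substitutions. The function names and the example records are hypothetical.

```python
BASES = {"A", "C", "G", "T"}


def keep_snv(vcf_line, min_qual=20.0):
    """Return True for a data line that is an SNV with QUAL >= min_qual."""
    fields = vcf_line.rstrip("\n").split("\t")
    ref, alts, qual, info = fields[3], fields[4].split(","), fields[5], fields[7]
    if qual == "." or float(qual) < min_qual:
        return False
    # An SNV has a single-base REF and only single-base ALT alleles.
    if ref not in BASES or not all(alt in BASES for alt in alts):
        return False
    # samtools-style VCFs also flag indels with an INDEL key in INFO.
    return "INDEL" not in info.split(";")


def filter_vcf(lines, min_qual=20.0):
    """Yield header lines unchanged and only the data lines that pass keep_snv."""
    for line in lines:
        if line.startswith("#") or keep_snv(line, min_qual):
            yield line


example = [
    "##fileformat=VCFv4.1",
    "chr1\t100\t.\tA\tG\t55.0\t.\tDP=30",
    "chr1\t200\t.\tC\tT\t12.0\t.\tDP=8",
    "chr1\t300\t.\tAT\tA\t80.0\t.\tINDEL;DP=25",
]
for record in filter_vcf(example):
    print(record)
```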
The SNVs in the resulting vcf files were further filtered and scored using mutationSeq v1.0.2 and annotated with gene annotations from Ensembl v66 using SnpEff and the dbSNP v137 and Cosmic 64 db membership using SnpSift. Indels were called in the low quality exomes using VarScan version 2.3.6, using the following parameters: P value 95 × 10−2, --strand-filter 1, --min-avg-qual 20. The indels in the resulting vcf files were annotated with gene annotations from Ensembl 66 using SnpEff as described above, and screened against dbSNP137 using SnpSift. EMu was used to define mutation spectra for 11 samples with germline (that is, excluding the DKFZ samples), using the expectation-maximization algorithm50. To assess significant changes in the distributions of mutation spectra across primary, local and distal recurrences from each medulloblastoma patient, we used the chi-squared test. Changes in (1) the number of compartment-specific mutations and (2) the frequencies of transversion mutations were tested with the Wilcoxon rank-sum test. Changes in the frequency of C > T and T > G transversions between primary and recurrent tumours were tested using factorial ANOVA with rank transformation. The techniques outlined in ref. 51 were followed to analyse copy number changes. Sequence quality filtering was used to remove all reads of low mapping quality (Q < 10). Because the number of sequence reads varied between samples, aligned reference reads were first used to define genomic bins of equal reference coverage, against which the depths of aligned sequence from each of the tumour samples were compared. This resulted in a measurement of the relative number of aligned reads from the tumours and reference in bins of variable length along the genome, where bin width is inversely proportional to the number of mapped reference reads. 
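The idea of variable-width bins holding equal numbers of reference reads can be sketched as follows. This is an illustration under simplifying assumptions (one chromosome, reads reduced to a single start coordinate, depth normalized by total read count) and is not the method of ref. 51; all names and coordinates are hypothetical.

```python
from bisect import bisect_right


def equal_coverage_edges(ref_positions, reads_per_bin):
    """Bin edges chosen so that each bin holds about reads_per_bin reference reads."""
    ref_sorted = sorted(ref_positions)
    edges = [ref_sorted[0]]
    for i in range(reads_per_bin, len(ref_sorted), reads_per_bin):
        if ref_sorted[i] > edges[-1]:  # skip zero-width bins caused by ties
            edges.append(ref_sorted[i])
    edges.append(ref_sorted[-1] + 1)   # right edge is exclusive
    return edges


def count_in_bins(positions, edges):
    """Count positions falling in each half-open bin [edges[k], edges[k+1])."""
    counts = [0] * (len(edges) - 1)
    for pos in positions:
        idx = bisect_right(edges, pos) - 1
        if 0 <= idx < len(counts):
            counts[idx] += 1
    return counts


def depth_ratios(tumour_positions, ref_positions, reads_per_bin):
    """Tumour/reference read ratio per bin, scaled by total library size."""
    edges = equal_coverage_edges(ref_positions, reads_per_bin)
    ref_counts = count_in_bins(ref_positions, edges)
    tum_counts = count_in_bins(tumour_positions, edges)
    scale = len(ref_positions) / len(tumour_positions)
    return edges, [scale * t / r for t, r in zip(tum_counts, ref_counts)]


ref = [0, 10, 20, 30, 100, 200, 300, 400]
tumour = [5, 15, 25, 35, 45, 55, 150, 250]
print(depth_ratios(tumour, ref, reads_per_bin=4))
```

In the toy data the densely covered region yields a narrow bin and the sparse region a wide one, which is the inverse relation between bin width and reference read density described above.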
A hidden Markov model (HMM) was used to classify and segment continuous regions of copy number loss, neutrality, or gain using methodology outlined previously52. The five states reported by the HMM were: loss (1), neutral (2), gain (3), amplification (4), and high-level amplification (5). In cases with germline, copy number gains and losses were called against the germline sample. In cases without germline, CNV calls were made using the primary instead of the germline sample, such that gains and losses reported in the recurrent tumour are relative to the copy number state in the primary. The limitations of this approach are that (1) when both primary and recurrent tumours share an event, the CNV output looks normal, and (2) when a gain (or loss) is called in the recurrent tumour versus the primary tumour, we cannot distinguish between the two scenarios that can give rise to such a result. The first scenario is that there is a gain in the recurrence versus the primary, and the second is that there is a loss in the primary only. To resolve this uncertainty for particular chromosomes of interest in a subset of patients without germline, we additionally ran the Control-FREEC algorithm53. Control-FREEC was run using default parameters, with the following exceptions: breakPointType = 4, telocentromeric = 75,000, minimalCoveragePerPosition = 5. Structural variant detection was performed using ABySS (v1.3.2). Genome (WGS) libraries were assembled in single-end mode using k-mer values of k24 and k44. The contigs and reads were then reassembled at k64 in single-end mode and finally at k64 in paired-end mode. Large-scale rearrangements and gene fusions were identified using BWA (v0.6.2-r126) alignments. Evidence for the alignments was provided by aligning reads back to the contigs and by aligning reads to genomic coordinates. Events were then filtered on read thresholds. 
Insertions and deletions were identified by gapped alignment of contigs to the human reference using BWA. Confidence in the event was calculated from the alignment of reads back to the event breakpoint in the contigs. The events were then screened against dbSNP and other variation databases to identify putative novel events. To verify SNVs, samples were subjected to targeted deep amplicon sequencing of the tumour and normal DNA. Primers were designed with the Primer3 software54 with a GC clamp and an optimal Tm of 64 °C to ensure specificity. Primers aligned against the human reference genome were tested with a combination of UCSC’s in silico PCR tool and custom in-house scripts to obtain unique hits. The primer pairs were designed such that the variant is located within a maximum of 250 bp of the 5′ or 3′ amplicon end. The primers were tagged with Illumina adapters eliminating the need for adaptor ligation during sample preparation. The Illumina adaptor tags are as follows: 5′-CGCTCTTCCGATCTCTG on the forward amplicon primer and 5′-TGCTCTTCCGATCTGAC on the reverse amplicon primers. Genomic DNA templates or library construction intermediates were used as starting material to generate PCR products using Phusion DNA polymerase (Fisher Scientific, catalogue number F-540L). The amplicons ranged in size from 188–625 bp. Amplicons were pooled by template for direct sequencing. Preparation for sequencing involved a second round of amplification (6 cycles with Phusion DNA polymerase) with PE primer 1.0-DS (5′-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTCTG-3′) and a custom PCR primer (5′-CAAGCAGAAGACGGCATACGAGATNNNNNNGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTGAC-3′) containing an unique six-nucleotide ‘index’ shown here as the letter N. PCR products of the desired size range were purified using 8% PAGE gels. 
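The relationship between the amplicon primer tags and the second-round primers can be checked with a short sketch: the 3′ end of each second-round primer appears to match the corresponding 17-nucleotide tag, which is what would allow it to anneal to first-round products. The sequences are copied from the text above; the function name and the example index are hypothetical.

```python
FWD_TAG = "CGCTCTTCCGATCTCTG"   # tag on forward amplicon primers
REV_TAG = "TGCTCTTCCGATCTGAC"   # tag on reverse amplicon primers
PE_PRIMER_1_DS = (
    "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTCTG"
)
INDEXED_PRIMER_TEMPLATE = (
    "CAAGCAGAAGACGGCATACGAGATNNNNNNGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTGAC"
)


def indexed_primer(index):
    """Fill the six-nucleotide index placeholder (NNNNNN) with a concrete index."""
    if len(index) != 6 or set(index) - set("ACGT"):
        raise ValueError("index must be six nucleotides drawn from A, C, G, T")
    return INDEXED_PRIMER_TEMPLATE.replace("NNNNNN", index)


# Both second-round primers end with the tag carried by the first-round primers.
print(PE_PRIMER_1_DS.endswith(FWD_TAG))
print(INDEXED_PRIMER_TEMPLATE.endswith(REV_TAG))
print(indexed_primer("ACGTAC"))  # hypothetical index sequence
```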
DNA quality was assessed using the Agilent DNA 1000 series II assay (Agilent, Santa Clara CA, USA) and DNA quantity was measured by Quant-iT dsDNA HS assay on a Qubit fluorometer (Life Technologies, Grand Island, NY, USA). The indexed libraries were pooled together and sequenced on the Illumina MiSeq platform with paired-end 250 bp reads using v2 reagents. An in-house generated PhiX sequencing control library was spiked in to the samples at a molar ratio of 1:100. Reads were aligned using BWA-SW, and SNVs called with Samtools mpileup with the following parameters: -d 1000000 -B -C50 -DES. Indels were called using VarScan and the following parameters: mpileup2indel --min-var-freq 0 --p-value 1 --strand-filter 0. SNVs with allelic frequencies greater than 15% in recurrent tumours were considered clonal. To find evidence for rare subclones (<5%) of these SNVs in the primary samples, we generated base quality (baseQ) distributions supporting the reference and all alternate alleles in the primary (and the recurrent) compartments. Due to our amplification and sequencing strategy, all reads start at the same position, and the target SNV is always at a specific position in the read (that is, a given mutation covered by 2,000 reads will be at base position 40 in all reads). Thus, unlike shotgun protocols where read starts are random, the SNVs are never affected by sequencing errors at the end of the read (where errors tend to happen more often), and cumulative sequencing error rates for whole reads are not applicable in estimating local error rates at a specific base. Instead, detection of a real mutation is only confounded by the subset of sequencing errors at the same position in the read that causes a base change to match the mutation; sequencing errors matching the other two possible bases (that is, non-reference and non-mutation) are a non-ambiguous measure of the error rate at a particular position. 
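The position-specific background argument can be illustrated with read counts alone. The sketch below estimates a per-base error rate from the two bases that are neither reference nor mutation and compares it with the fraction of reads carrying the mutation. It is not the base-quality procedure the authors applied; it assumes errors are equally likely to produce each non-reference base, and the counts are hypothetical.

```python
def position_error_summary(base_counts, ref_base, mut_base):
    """Return (background error rate per base, fraction of reads with the mutation).

    base_counts maps each base to the number of reads showing it at the
    fixed read position of the target SNV.
    """
    depth = sum(base_counts.values())
    if depth == 0:
        raise ValueError("no reads cover this position")
    other = [b for b in "ACGT" if b not in (ref_base, mut_base)]
    # Reads showing the two 'other' bases can only be sequencing errors.
    error_reads = sum(base_counts.get(b, 0) for b in other)
    error_rate = error_reads / (len(other) * depth)
    mut_fraction = base_counts.get(mut_base, 0) / depth
    return error_rate, mut_fraction


# Hypothetical position covered by 2,000 reads; reference A, candidate mutation T.
counts = {"A": 1950, "T": 40, "C": 6, "G": 4}
background, observed = position_error_summary(counts, ref_base="A", mut_base="T")
print(background, observed)
```

A mutation fraction well above the background rate is suggestive only; the criteria actually used in the study are based on base-quality distributions.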
Thus, to distinguish sequencing errors from real subclonal mutations, for each allele (that is, the reference allele and all three alternate alleles), we generated base quality (baseQ) distributions from all reads covering the position of the mutation; the reference base was further used as the benchmark distribution of a base without appreciable sequencing errors (Extended Data Fig. 8). The non-reference alleles that had the highest (1) mean baseQ value, (2) max baseQ value, and (3) highest number of reads with baseQ values >30, were considered real events. When all three criteria were not matched, the subclonal presence of the mutation could not be confirmed. At positions where these criteria were matched, the baseQ distributions of the alternate allele closely matched the baseQ distribution of the positive control reference base, could be easily distinguished from sequencing errors, and nearly always matched the expected mutation at that position, confirming the subclonal presence of the mutation in the diagnostic sample. The allelic ratios are modelled using a binomial distribution and incorporated into the HMM Titan calculations, where the output is a list of copy number and LOH events. The Titan run for a tumour sample that has the lowest SDbw score is the optimal result and the corresponding number of clonal clusters is the optimal one—this copy number information was then chosen for use in further analysis. Minor and major copy number counts calculated from the optimal TitanCNA zygosity states were attached to the allele frequency information for each SNV and was used as input for Pyclone 0.12.3. PyClone was used to infer subclonal populations for all samples in each case. It introduces a framework that can analyse all samples from a single case in the same run improving accuracy of the inference. 
Pyclone outputs cellular frequencies and clonal cluster membership for each genomic position, accounting for confounding factors such as mutational genotype in the context of copy number changes. All Pyclone analyses were done using a multi-sample model and a beta-binomial distribution, with pre-calculated parental copy number inferred by TitanCNA. Copy number and LOH information was called for 14 patients with matched germline samples using Control-FreeC53, an algorithm that provides fractional copy number level for segments. Sensitive mutation calling was performed using muTect55 and clonal and subclonal somatic mutations were shortlisted if there was adequate sequence coverage in both primary and relapse tumour compartments (10 reads minimum). Shortlisted mutations and copy number segments in areas of neutral heterozygosity were used as input to EXPANDS36. Phylogenetic relationships between the subpopulations inferred by the EXPANDS algorithm in primary and recurrent tumours were generated using both SNV and copy number segments and the BIONJ algorithm. The inferred cellular prevalence values of each subpopulation was used to generate a Shannon Index value for each compartment37. We identified 14q associated genes in Shh medulloblastoma using ANOVA in the Partek Genomics Suite. Gene expression profiles were analysed according to 14q status in samples from a previously published Toronto data set containing only SHH medulloblastoma samples (n = 82)56 in a subset of cases with available SNP6 data5, 57. The top 20 ranking signature genes were applied using k-means clustering using the R2 platform (http://hgserver1.amc.nl/cgi-bin/r2/main.cgi) on a non-overlapping, independent gene expression profiling cohort from Boston58 sub-selecting only SHH medulloblastomas. Survival differences were analysed using log-rank statistics and Kaplan–Meier estimates. 
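A Kaplan–Meier estimate with censoring can be sketched in plain Python as below. This is a generic product-limit estimator, not the software used in the study, and the follow-up times and event flags are hypothetical. The log-rank comparison is not shown.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up duration for each subject
    events : True if the event was observed, False if the subject was censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # censored subjects leave the risk set without an event
    return curve


followup = [2, 3, 3, 5, 8]
observed = [True, True, False, True, False]
for t, s in kaplan_meier(followup, observed):
    print(t, round(s, 3))
```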
Two micrograms of total RNA samples were arrayed into a 96-well plate and polyadenylated (Poly(A)+) messenger RNA (mRNA) was purified using the 96-well MultiMACS mRNA isolation kit on the MultiMACS 96 separator (Miltenyi Biotec, Germany) with on-column DNaseI-treatment as per the manufacturer’s instructions. The eluted poly(A)+ mRNA was ethanol precipitated and resuspended in 10 μl of DEPC-treated water with 1:20 SuperaseIN (Life Technologies, USA). First-strand cDNA was synthesized from the purified poly(A)+ mRNA using the Superscript cDNA Synthesis kit (Life Technologies, USA) and random hexamer primers at a concentration of 5 μM along with a final concentration of 1 μg ul−1 actinomycin D, followed by Ampure XP SPRI beads on a Biomek FX robot (Beckman-Coulter, USA). The second strand cDNA was synthesized following the Superscript cDNA Synthesis protocol by replacing the dTTP with dUTP in dNTP mix, allowing the second strand to be digested using UNG (Uracil-N-Glycosylase, Life Technologies, USA) in the post-adaptor ligation reaction and thus achieving strand specificity. The cDNA was quantified in a 96-well format using PicoGreen (Life Technologies, USA) and VICTOR3V Spectrophotometer (PerkinElmer, Inc. USA). The quality was checked on a random sampling using the High Sensitivity DNA chip assay (Agilent). The cDNA was fragmented by Covaris E210 (Covaris, USA) sonication for 55 s, using a duty cycle of 20% and intensity of 5. Plate-based libraries were prepared following the BC Cancer Agency’s Michael Smith Genome Sciences Centre (BCGSC) paired-end (PE) protocol on a Biomek FX robot (Beckman-Coulter, USA). Briefly, the cDNA was purified in 96-well format using Ampure XP SPRI beads, and was subject to end-repair and phosphorylation by T4 DNA polymerase, Klenow DNA Polymerase, and T4 polynucleotide kinase respectively in a single reaction, followed by cleanup using Ampure XP SPRI beads and 3′ A-tailing by Klenow fragment (3′ to 5′ exo minus). 
After cleanup using Ampure XP SPRI beads, PicoGreen quantification was performed to determine the amount of Illumina PE adapters used in the next step of adaptor ligation reaction. The adaptor-ligated products were purified using Ampure XP SPRI beads, then PCR-amplified with Phusion DNA Polymerase (Thermo Fisher Scientific USA) using Illumina’s PE primer set, with cycle conditions of 98 °C 30 s followed by 10–15 cycles of 98 °C for 10 s, 65 °C for 30 s and 72 °C for 30 s, and then 72 °C for 5 min. The PCR products were purified using Ampure XP SPRI beads, and checked with a Caliper LabChip GX for DNA samples using the High Sensitivity assay (PerkinElmer, USA). PCR products with a desired size range were purified using a 96-channel size selection robot developed at the BCGSC, and the DNA quality was assessed and quantified using an Agilent DNA 1000 series II assay and Quant-iT dsDNA HS Assay Kit using Qubit fluorometer (Invitrogen), then diluted to 8 nM. The final concentration was verified by Quant-iT dsDNA HS assay. The libraries, 2×100 PE lanes, were sequenced on the Illumina HiSeq 2000/2500 platform using v3 chemistry and HiSeq Control Software version 2.0.10. Illumina paired-end RNA sequencing data was aligned to GRCh37-lite genome-plus-junctions using BWA (version 0.5.7)49, 59. This reference is a combination of GRCh37-lite assembly and exon–exon junction sequences with coordinates defined based on transcripts in Ensembl (v61), Refseq and known genes from the UCSC genome browser (both were downloaded from UCSC in November 2011; The GRCh37-lite assembly is available at http://www.bcgsc.ca/downloads/genomes/9606/hg19/1000genomes/bwa_ind/genome). BWA “aln” and “sampe” were run with default parameters, except for the inclusion of the (-s) option to disable the Smith-Waterman alignment, which is unsuitable for insert size distribution in paired-end RNA-seq data. 
Finally, reads failing the Illumina chastity filter were flagged with a custom script, and duplicated reads were flagged with Picard Tools (version 1.31). After the alignment, the junction-aligned reads that mapped to exon–exon junctions were repositioned to the genome as large-gapped alignments and tagged with “ZJ:Z”59. We compared the expression values (RPKM) of genes in the primary and recurrent tissues of each tumour with data in both compartments (n = 7 patients). A gene was considered differentially expressed when the absolute difference between compartments was greater than 10 and the log fold-change was greater than 2. Gene set enrichment analysis was run on differentially expressed genes that were observed in at least two patients by subgroup, using mSigDB60.
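The differential-expression thresholds can be sketched as a simple filter. Several details are assumptions because the text does not specify them: the logarithm is taken as base 2, a pseudocount of 1 is added to avoid division by zero, the fold-change is treated as absolute (either direction), and the per-subgroup grouping is omitted. Patient and gene names are hypothetical.

```python
import math


def is_differential(rpkm_primary, rpkm_recurrent,
                    min_abs_diff=10.0, min_log_fc=2.0, pseudocount=1.0):
    """Apply both thresholds: absolute RPKM difference and log fold-change."""
    abs_diff = abs(rpkm_recurrent - rpkm_primary)
    # Assumption: log2 ratio with a pseudocount; the text does not give the base.
    log_fc = abs(math.log2((rpkm_recurrent + pseudocount)
                           / (rpkm_primary + pseudocount)))
    return abs_diff > min_abs_diff and log_fc > min_log_fc


def recurrent_de_genes(per_patient, min_patients=2):
    """Genes passing the filter in at least min_patients patients.

    per_patient maps patient -> {gene: (primary RPKM, recurrent RPKM)}.
    """
    hits = {}
    for genes in per_patient.values():
        for gene, (primary, recurrent) in genes.items():
            if is_differential(primary, recurrent):
                hits[gene] = hits.get(gene, 0) + 1
    return sorted(gene for gene, n in hits.items() if n >= min_patients)


cohort = {
    "patient_1": {"GENE_A": (2.0, 50.0), "GENE_B": (100.0, 115.0)},
    "patient_2": {"GENE_A": (1.0, 40.0), "GENE_B": (0.5, 8.0)},
}
print(recurrent_de_genes(cohort))
```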
News Article | September 7, 2016
All mice were bred and maintained under pathogen-free conditions at an American Association for the Accreditation of Laboratory Animal Care accredited animal facility at the University of Pennsylvania or Yale University. Mice were housed in accordance with the procedures outlined in the Guide for the Care and Use of Laboratory Animals under an animal study proposal approved by an institutional Animal Care and Use Committee. Male and female mice between 4 and 12 weeks of age were used for all experiments. Littermate controls were used whenever possible. C57BL/6 (wild type) and B6.SJL-Ptprca Pepcb/Boy (B6.SJL) mice were purchased from The Jackson Laboratory. We generated Morrbid-deficient mice and the in cis and in trans double heterozygous mice (Morrbid+/−, Bcl2l11+/−) mice using the CRISPR/Cas9 system as previously described26. In brief, to generate Morrbid-deficient mice, single guide RNAs (sgRNAs) were designed against regions flanking the first and last exon of the Morrbid locus (Extended Data Fig. 1g). Cas9-mediated double-stranded DNA breaks resolved by non-homologous end joining (NHEJ) ablated the intervening sequences containing Morrbid in C57BL/6N one-cell embryos. The resulting founder mice were Morrbid−/+, which were then bred to wild-type C57BL/6N and then intercrossed to obtain homozygous Morrbid-/- mice. One Morrbid-deficient line was generated. To control for potential off-target effects, mice were crossed for at least 5 generations to wild-type mice and then intercrossed to obtain homozygosity. Littermate controls were used when possible throughout all experiments. To generate the in cis and in trans double heterozygous mice (Morrbid+/−, Bcl2l11+/−) mice, we first obtained mouse one-cell embryos from a mating between Morrbid−/− female mice and wild-type male mice. As such, the resulting one-cell embryos were heterozygous for Morrbid (Morrbid+/−). 
We then micro-injected sgRNAs designed against intronic sequences flanking the second exon of Bcl2l11, which contains the translational start site/codon, into Morrbid−/+ one-cell embryos (Extended Data Fig. 9). Cas9-mediated double-stranded DNA breaks resolved by NHEJ ablated the intervening sequences containing the second exon of Bcl2l11 in Morrbid+/− (C57BL/6N) one-cell embryos, generating founder mice that were heterozygous for both Bcl2l11 and Morrbid (Bcl2l11+/−; Morrbid−/+). Founder heterozygous mice were then bred to wild-type C57BL/6N to interrogate for the segregation of the Morrbid-deficient and Bcl2l11-deficient alleles (Extended Data Fig. 9). Pups that segregated such alleles were named in trans and pups that did not segregate were labelled in cis. One line each of in cis and in trans double heterozygous mice (Bcl2l11+/−; Morrbid−/+) was generated. To control for potential off-target effects, mice were crossed for at least 5 generations to wild-type (C57BL/6N) mice (for in cis) and to Morrbid−/− mice (for in trans) to maintain heterozygosity. To determine genetic rescue, samples from mice containing different permutations of Morrbid and Bcl2l11 alleles (Fig. 4g–j) were analysed in a blinded manner by a single investigator not involved in the breeding or coding of these samples. Cells were isolated from the indicated tissues (blood, spleen, bone marrow, peritoneal exudate, adipose tissue). Red blood cells were lysed with ACK. Single-cell suspensions were stained with CD16/32 and with indicated fluorochrome-conjugated antibodies. If run live, cells were stained with 7-AAD (7-amino-actinomycin D) to exclude non-viable cells. Otherwise, before fixation, Live/Dead Fixable Violet Cell Stain Kit (Invitrogen) was used to exclude non-viable cells. Active caspase staining using Z-VAD-FMK (CaspGLOW, eBiosciences) was performed according to the manufacturer's specifications. 
Apoptosis staining by annexin V+ (Annexin V Apoptosis Detection kit) was performed according to the manufacturer’s recommendations. BrdU staining was performed using BrdU Staining Kit (eBioscience) according to the manufacturer’s recommendations. For BCL2L11 staining, cells were fixed for 15 min in 2% formaldehyde solution, and permeabilized with flow cytometry buffer supplemented with 0.1% Triton X-100. All flow cytometry analysis and cell-sorting procedures were done at the University of Pennsylvania Flow Cytometry and Cell Sorting Facility using BD LSRII cell analysers and a BD FACSAria II sorter running FACSDiva software (BD Biosciences). FlowJo software (version 10, TreeStar) was used for data analysis and graphic rendering. All fluorochrome-conjugated antibodies used are listed in Supplementary Table 2. 1 × 10^6 wild-type and Morrbid-deficient neutrophils sorted from mouse bone marrow were assayed for BCL2L11 protein expression by western blotting (Bim C34C5 rabbit monoclonal antibody, Cell Signaling), as previously described. 2 × 10^6 wild-type and Morrbid-deficient neutrophils sorted from mouse bone marrow were cross-linked in a 1% formaldehyde solution for 5 min at room temperature while rotating. Crosslinking was stopped by adding glycine (0.2 M in 1× PBS (phosphate buffered saline)) and incubating on ice for 2 min. Samples were spun at 2,500g for 5 min at 4 °C and washed 4 times with 1× PBS. The pellets were flash frozen and stored at −80 °C. Cells were lysed, and nuclei were isolated and sonicated for 8 min using a Covaris S220 (105 Watts, 2% duty cycle, 200 cycles per burst) to obtain approximately 200–500 bp chromatin fragments. Chromatin fragments were pre-cleared with protein G magnetic beads (New England BioLabs) and incubated with pre-bound anti-H3K27me3 (Qiagen), anti-EZH2 (eBiosciences), or mouse IgG1 (Santa Cruz Biotechnology) antibody-protein G magnetic beads overnight at 4 °C. 
Beads were washed once in low-salt buffer (20 mM Tris, pH 8.1, 2 mM EDTA, 50 mM NaCl, 1% Triton X-100, 0.1% SDS), twice in high-salt buffer (20 mM Tris, pH 8.1, 2 mM EDTA, 500 mM NaCl, 1% Triton X-100, 0.1% SDS), once in LiCl buffer (10 mM Tris, pH 8.1, 1 mM EDTA, 0.25 mM LiCl, 1% NP-40, 1% deoxycholic acid) and twice in TE buffer (10 mM Tris-HCl, pH 8.0, 1 mM EDTA). Washed beads were eluted twice with 100 μl of elution buffer (1% SDS, 0.1 M NaHCO3) and de-crosslinked (0.1 mg ml−1 RNase, 0.3 M NaCl and 0.3 mg ml−1 Proteinase K) overnight at 65 °C. The DNA samples were purified with Qiaquick PCR columns (Qiagen). qPCR was carried out on a ViiA7 Real-Time PCR System (ThermoFisher) using the SYBR Green detection system and indicated primers. Expression values of target loci were directly normalized to the indicated positive control loci, such as MyoD1 for H3K27me3 and EZH2 ChIP analysis, and Actb for Pol II ChIP analysis. ChIP–qPCR primer sequences are listed in Supplementary Table 1. 50,000 wild-type and knockout cells, in triplicate, were spun at 500g for 5 min at 4 °C, washed once with 50 μl of cold 1× PBS and centrifuged in the same conditions. Cells were resuspended in 50 μl of ice-cold lysis buffer (10 mM Tris-HCl, pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.1% IGEPAL CA-630). Cells were immediately spun at 500g for 10 min at 4 °C. Lysis buffer was carefully pipetted away from the pellet, which was then resuspended in 50 μl of the transposition reaction mix (25 μl 2× TD buffer, 2.5 μl Tn5 Transposase (Illumina), 22.5 μl nuclease-free water) and then incubated at 37 °C for 30 min. DNA purification was performed using a Qiagen MinElute kit and eluted in 12 μl of Elution buffer (10 mM Tris buffer, pH 8.0). 
To amplify library fragments, 6 μl of the eluted DNA was mixed with NEBnext High-Fidelity 2× PCR Master Mix, 25 μM of customized Nextera PCR primers 1 and 2 (Supplementary Table 1), 100× SYBR Green I and used in PCR as follows: 72 °C for 5 min; 98 °C for 30 s; and thermocycling 4 times at 98 °C for 10 s; 63 °C for 30 s; 72 °C for 1 min. 5 μl of the 5-cycle PCR-amplified DNA was used in a qPCR reaction to estimate the additional number of amplification cycles. Libraries were amplified for a total of 10–11 cycles and were then purified using a Qiagen PCR Cleanup kit and eluted in 30 μl of Elution buffer. The libraries were quantified using qPCR and bioanalyser data, and then normalized and pooled to 2 nM. Each 2 nM pool was then denatured with a 0.1 N NaOH solution in equal parts then further diluted to form a 20 pM denatured pool. This pool was then further diluted down to 1.8 pM for sequencing using the NextSeq500 machine on V2 chemistry and sequenced on a 1 × 75 bp Illumina NextSeq flow cell. ATAC sequencing was done on an Illumina NextSeq at a sequencing depth of ~40–60 million reads per sample. Libraries were prepared in triplicate. Raw reads were deposited under GSE85073. 2 × 75 bp paired-end reads were mapped to the mouse mm9 genome using the ‘bwa’ algorithm with the ‘mem’ option. Only reads that uniquely mapped to the genome were used in subsequent analysis. Duplicate reads were eliminated to avoid potential PCR amplification artifacts and to eliminate the high numbers of mtDNA duplicates observed in ATAC–seq libraries. Post-alignment filtering resulted in ~26–40 million uniquely aligned singleton reads per library and the technical replicates were merged into one alignment BAM file to increase the power of open chromatin signal in downstream analysis. Depicted tracks were normalized to total read depth. 
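The library dilution steps described above (2 nM pool, equal-volume NaOH denaturation, 20 pM, then 1.8 pM) follow the usual C1·V1 = C2·V2 relationship. The sketch below shows the arithmetic only; the final volumes used in the usage note are hypothetical, since the text gives concentrations but not volumes, and the function name is an illustrative choice.

```python
def dilution_volumes(stock_conc, target_conc, final_volume):
    """Return (stock_volume, diluent_volume) from C1 * V1 = C2 * V2.

    Concentrations must share one unit (for example pM) and volumes another
    (for example microlitres).
    """
    if stock_conc <= 0 or target_conc <= 0 or final_volume <= 0:
        raise ValueError("concentrations and volume must be positive")
    if target_conc > stock_conc:
        raise ValueError("target concentration exceeds stock concentration")
    stock_volume = target_conc * final_volume / stock_conc
    return stock_volume, final_volume - stock_volume


def concentration_after_equal_parts(stock_conc):
    """Mixing a pool with an equal volume of NaOH halves its concentration."""
    return stock_conc / 2.0
```

For instance, a 2 nM (2,000 pM) pool mixed with an equal volume of NaOH is at 1,000 pM; `dilution_volumes(1000.0, 20.0, 1000.0)` then gives the volumes needed for a hypothetical 1,000 μl of 20 pM pool, and `dilution_volumes(20.0, 1.8, 1300.0)` gives those for a hypothetical 1,300 μl loading volume at 1.8 pM.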
ATAC–seq enriched regions (peaks) in each sample were identified using MACS2 using the below settings: 10 × 10^6 wild-type and knockout mouse neutrophils were cross-linked in a 1% formaldehyde solution for 10 min at room temperature while rotating. Crosslinking was stopped by adding glycine (0.2 M in 1× PBS) and incubating on ice for 2 min. Samples were spun at 2,500g for 5 min at 4 °C and washed 4 times with 1× PBS. The pellets were flash frozen and stored at −80 °C. Cells were lysed and sonicated (Branson Sonifier 250) for 9 cycles (30% amplitude; time, 20 s on, 1 min off). Lysates were spun at 18,400g for 10 min at 4 °C and resuspended in 3 ml of lysis buffer. A sample of 100 μl was kept aside as input and the rest of the samples were divided by the number of antibodies to test. Chromatin immunoprecipitation was performed with 10 μg of antibody-bound beads (anti-H3K27ac, H3K4me3, H3K4me1, H3K36me3 (Abcam) and anti-rabbit IgG (Santa Cruz), Dynal Protein G magnetic beads (Invitrogen)) and incubated overnight at 4 °C. Bead-bound DNA was washed, reverse cross-linked and eluted overnight at 65 °C, shaking at 950 r.p.m. Beads were removed using a magnetic stand and eluted DNA was treated with RNase A (0.2 μg μl−1) for 1 h at 37 °C shaking at 950 r.p.m., then with proteinase K (0.2 μg μl−1) for 2 h at 55 °C. 30 μg of glycogen (Roche) and 5 M of NaCl were added to the samples. DNA was extracted with 1 volume of phenol:chloroform:isoamyl alcohol and washed out with 100% ethanol. Dried DNA pellets were resuspended in 30 μl of 10 mM Tris HCl, pH 8.0, and DNA concentrations were quantified using Qubit. Starting with 10 ng of DNA, ChIP–seq libraries were prepared using the KAPA Hyper Prep Kit (Kapa Biosystems, Inc.) with 10 cycles of PCR. The libraries were quantified using qPCR and bioanalyser data then normalized and pooled to 2 nM. Each 2 nM pool was then denatured with a 0.1 N NaOH solution in equal parts then further diluted to form a 20 pM denatured pool. 
This pool was then further diluted down to 1.8 pM for sequencing using the NextSeq500 machine on V2 chemistry and sequenced on a 1 × 75 bp Illumina NextSeq flow cell. ChIP sequencing was done on an Illumina NextSeq at a sequencing depth of ~30–40 million reads per sample. Raw reads were deposited under GSE85073. 75 bp single-end reads were mapped to the mouse mm9 genome using the ‘bowtie2’ algorithm. Duplicate reads were eliminated to avoid potential PCR amplification artifacts and only reads that uniquely mapped to the genome were used in subsequent analysis. Depicted tracks were normalized to control IgG input sample. ChIP–seq-enriched regions (peaks) in each sample were identified using MACS2 using the below settings: 10^7 immortalized BMDMs were collected by trypsinization and resuspended in 2 ml PBS, 2 ml nuclear isolation buffer (1.28 M sucrose; 40 mM Tris-HCl, pH 7.5; 20 mM MgCl2; 4% Triton X-100), and 6 ml water on ice for 20 min (with frequent mixing). Nuclei were pelleted by centrifugation at 2,500g for 15 min. Nuclear pellets were resuspended in 1 ml RNA immunoprecipitation (RIP) buffer (150 mM KCl, 25 mM Tris, pH 7.4, 5 mM EDTA, 0.5 mM DTT, 0.5% NP40; 100 U ml−1 SUPERaseIn, Ambion; complete EDTA-free protease inhibitor, Sigma). Resuspended nuclei were split into two fractions of 500 μl each (for mock and immunoprecipitation) and were mechanically sheared using a dounce homogenizer. Nuclear membrane and debris were pelleted by centrifugation at 15,800g for 10 min. Antibody to EZH2 (Cell Signaling 4905S; 1:30) or normal rabbit IgG (mock immunoprecipitation, SantaCruz; 10 μg) was added to the supernatant and incubated for 2 hours at 4 °C with gentle rotation. 25 μl of protein G beads (New England BioLabs S1430S) were added and incubated for 1 hour at 4 °C with gentle rotation. 
Beads were pelleted by magnetic field, the supernatant was removed, and beads were resuspended in 500 μl RIP buffer; this was repeated for a total of three RIP buffer washes, followed by one wash in PBS. Beads were resuspended in 1 ml of Trizol. Co-precipitated RNAs were isolated, reverse-transcribed to cDNA, and assayed by qPCR for Hprt and Morrbid-isoform1. Primer sequences are listed in Supplementary Table 1. EZH2 PAR–CLIP dataset (GSE49435) was analysed as previously described22. Adapter sequences were removed from total reads and those longer than 17 bp were kept. The Fastx toolkit was used to remove duplicate sequences, and the resulting reads were mapped using BOWTIE allowing for two mismatches. The four independent replicates were pooled and analysed using PARalyzer, requiring at least two T→C conversions per RNA–protein contact site. lncRNAs were annotated according to Ensembl release 67. 13 × 10^6 wild-type bone-marrow-derived mouse eosinophils were fixed with 1% formaldehyde for 10 minutes at room temperature, and quenched with 0.2 M glycine on ice. Eosinophils were lysed for 3–4 hours at 4 °C (50 mM Tris, pH 7.4, 150 mM NaCl, 0.5% NP-40, 1% Triton X-100, 1× Roche complete protease inhibitor) and dounce-homogenized. Lysis was monitored by Methyl-green pyronin staining (Sigma). Nuclei were pelleted and resuspended in 500 μl 1.4× NEB3.1 buffer, treated with 0.3% SDS for one hour at 37 °C, and 2% Triton X-100 for another hour at 37 °C. Nuclei were digested with 800 units BglII (NEB) for 22 hours at 37 °C, and treated with 1.6% SDS for 25 minutes at 65 °C to inactivate the enzyme. Digested nuclei were suspended in 6.125 ml of 1.25× ligation buffer (NEB), and were treated with 1% Triton X-100 for one hour at 37 °C. Ligation was performed with 1,000 units T4 DNA ligase (NEB) for 18 hours at 16 °C, and crosslinks were reversed by proteinase K digestion (300 μg) overnight at 65 °C. 
The 3C template was treated with RNase A (300 μg), and purified by phenol-chloroform extraction. Digested and undigested DNA were run on a 0.8% agarose gel to confirm digestion. To control for PCR efficiency, two bacterial artificial chromosomes (BACs) spanning the region of interest were combined in equimolar quantities and digested with 500 units BglII at 37 °C overnight. Digested BACs were ligated with 100 units T4 Ligase HC (Promega) in 60 μl overnight at 16 °C. Both BAC and 3C ligation products were amplified by qPCR (Applied Biosystems ViiA7) using SYBR fast master mix (KAPA Biosystems). Products were run side by side on a 2% gel, and images were quantified using ImageJ. Intensity of 3C ligation products was normalized to intensity of respective BAC PCR product. Mice were infected with 30,000 CFUs of Listeria monocytogenes (strain 10403s, obtained as a gift from E. J. Wherry) intravenously (i.v.). Mice were weighed and inspected daily. Mice were analysed at day 4 of infection to determine the CFUs of L. monocytogenes present in the spleen and liver. Papain was purchased from Sigma Aldrich and resuspended at 1 mg ml−1 in PBS. Mice were intranasally challenged with 5 doses of 20 μg papain in 20 μl of PBS or PBS alone every 24 hours. Mice were killed 12 hours after the last challenge. Bronchoalveolar lavage was collected in two 1 ml lavages of PBS. Cellular lung infiltrates were collected after 1 hour digestion in RPMI supplemented with 5% FCS, 1 mg ml−1 collagenase D (Roche) and 10 μg ml−1 DNase I (Invitrogen) at 37 °C. Homogenates were passed through a cell strainer and infiltrates separated with a 27.5% Optiprep gradient (Axis-Shield) by centrifugation at 1,175g for 20 min. Cells were removed from the interface and treated with ACK lysis buffer. 
Congenic C57BL/6 (wild-type) bone marrow expressing CD45.1 and CD45.2 and Morrbid-deficient bone marrow expressing CD45.2 was mixed in a 1:1 ratio and injected into C57BL/6 hosts irradiated twice with 5 Gy 3 hours apart that express CD45.1 (B6.SJL-Ptprca Pepcb/BoyJ). Mice were analysed between 4 and 9 weeks after injection. Bone marrow was isolated and cultured as previously described9. Briefly, unfractionated bone marrow cells were cultured with 100 ng ml−1 stem cell factor (SCF) and 100 ng ml−1 FLT3-ligand (FLT3-L). At day 4, the media was replaced with media containing 10 ng ml−1 interleukin-5 (IL-5). Mature bone-marrow-derived eosinophils were analysed between days 10 and 14. Bone marrow cells were isolated and cultured in media containing recombinant mouse M-CSF (10 ng ml−1) for 7–8 days. On day 7–8, cells were re-plated for use in experimental assays. Bone-marrow-derived macrophages were stimulated with LPS (250 ng ml−1) for the indicated periods of time. Briefly, 40 × 10^7 immortalized bone-marrow-derived macrophages were fixed with 40 ml of 1% glutaraldehyde for 10 min at room temperature. Crosslinking was quenched with 0.125 M glycine for 5 min. Cells were rinsed with PBS, pelleted for 4 min at 2,000g, snap-frozen in liquid nitrogen, and stored at −80 °C. Cell pellets were thawed at room temperature and resuspended in 800 μl of lysis buffer (50 mM Tris-HCl, pH 7.0, 10 mM EDTA, 1% SDS, 1 mM PMSF, complete protease inhibitor (Roche), 0.1 U ml−1 Superase In (Life Technologies)). Cell suspension was sonicated using a Covaris S220 machine (Covaris; 100 W, duty factor 20%, 200 cycles per burst) for 60 minutes until DNA was in the size range of 100–500 bp. After centrifugation for 5 min at 16,100g at 4 °C, the supernatant was aliquoted, snap-frozen in liquid nitrogen, and stored at −80 °C. 1 ml of chromatin was diluted in 2 ml hybridization buffer (750 mM NaCl, 1% SDS, 50 mM Tris HCl, pH 7.0, 1 mM EDTA, 15% formamide) and input RNA and DNA aliquots were removed. 
100 pmoles of probes (Supplementary Table 1) were added and mixed by rotation at 37 °C for 4 h. Streptavidin paramagnetic C1 beads (Invitrogen) were equilibrated with lysis buffer. 100 μl washed C1 beads were added, and the entire reaction was mixed for 30 min at 37 °C. Samples were washed five times with 1 ml of washing buffer (SSC 2×, 0.5% SDS and fresh PMSF). 10% of each sample was removed from the last wash for RNA isolation. RNA aliquots were added to 85 μl RNA PK buffer, pH 7.0, (100 mM NaCl, 10 mM TrisCl, pH 7.0, 1 mM EDTA, 0.5% SDS, 0.2 U μl−1 proteinase K) and incubated for 45 min with end-to-end shaking. Samples were spun down, and boiled for 10 min at 95 °C. Samples were chilled on ice, added to 500 μl TRIzol, and RNA was extracted according to the manufacturer’s recommendations. Equal volume of RNA was reverse-transcribed and assayed by qPCR using Hprt and Morrbid-exon1-1 primer sets (Supplementary Table 1). DNA was eluted from remaining bead fraction twice using 150 μl DNA elution buffer (50 mM NaHCO3, 1% SDS, 200 mM NaCl, 100 μg ml−1 RNase A, 100 U ml−1 RNase H) incubated for 30 min at 37 °C. DNA elutions were combined and treated with 15 μl (20 mg ml−1) Proteinase K for 45 min at 50 °C. DNA was purified using phenol:chloroform:isoamyl alcohol and assayed by qPCR using the indicated primer sequences (Supplementary Table 1). shRNAs of indicated sequences (Supplementary Table 1) were cloned into pGreen shRNA cloning and expression lentivector. Pseudotyped lentivirus was generated as previously described, and 293T cells were transfected with a packaging plasmid, envelope plasmid, and the generated shRNA vector plasmid using Lipofectamine 2000. Virus was collected 14–16 h and 48 h after transfection, combined, 0.4-μm filtered, and stored at −80 °C. For generation of in vivo BM chimaeras, virus was concentrated 6 times by ultracentrifugation using an Optiprep gradient (Axis-Shield). 
For transduced BM-derived eosinophils, cultured BM cells on day 3 of previously described culture conditions were mixed 1:1 with indicated lentivirus and spinfected for 2 h at 260g at 25 °C with 5 μg ml−1 polybrene. Cultures were incubated overnight at 37 °C, and media was exchanged for IL-5-containing media at day 4 of culture as previously described9. Cells were sorted for GFP+ cells on day 5 of culture, and then cultured as previously described for eosinophil generation. Cells were assayed on day 11 of culture. For transduced in vivo BM chimaeras, BM cells were cultured at 2.5 × 10^6 cells per ml in mIL-3 (10 ng ml−1), mIL-6 (5 ng ml−1) and mSCF (100 ng ml−1) overnight at 37 °C. Culture was readjusted to 2 ml at 2.5 × 10^6 cells per ml in a 6-well plate, and spinfected for 2 h at 260g at 25 °C with 5 μg ml−1 polybrene. Cells were incubated overnight at 37 °C. On the day before transfer, recipient hosts were irradiated twice with 5 Gy 3 hours apart. Mice were analysed between 4 and 5 weeks following transfer. Bone marrow-derived macrophages (BMDMs) were transfected with pooled Morrbid or scrambled locked nucleic acid (LNA) antisense oligonucleotides of equivalent total concentrations using Lipofectamine 2000. Morrbid LNA pools contained Morrbid LNA 1-4 sequences at a total of 50 or 100 nM (Supplementary Table 1). After 24 h, the transfection media was replaced. The BMDMs were incubated for an additional 24 h and subsequently stimulated with LPS (250 ng ml−1) for 8−12 h. Eosinophils were derived from mouse BM as previously described. On day 12 of culture, 1 × 10^6 to 2 × 10^6 eosinophils were transfected with 50 nM of Morrbid LNA 3 or scrambled LNA (Supplementary Table 1) using TransIT-oligo according to the manufacturer’s protocol. RNA was extracted 48 h after transfection. 
Guide RNAs (gRNAs) targeting the 5′ and 3′ flanking regions of the Morrbid promoter were cloned into Cas9 vectors pSPCas9(BB)-2A-GFP(PX458) (Addgene plasmid 48138) and pSPCas9(BB)-2A-mCherry (a gift from the Stitzel lab, JAX-GM) respectively. gRNA sequences are listed in Supplementary Table 1. The cloned Cas9 plasmids were then transfected into RAW 264.7, a mouse macrophage cell line, using Lipofectamine 2000 according to the manufacturer’s protocol. Forty-eight hours post transfection, the double-positive cells expressing GFP and mCherry and the double-negative cells lacking GFP and mCherry were sorted. The bulk sorted cells were grown in complete media containing 20% FBS and assayed for deletion by PCR, as well as for Morrbid and Bcl2l11 transcript expression by qPCR. BM-derived eosinophils, or neutrophils or Ly6Chi monocytes sorted from mouse BM, were rested for 4–6 hours at 37 °C in complete media. Cells were subsequently stimulated with IL-3 (10 ng ml−1, Biolegend), IL-5 (10 ng ml−1, Biolegend), GM-CSF (10 ng ml−1, Biolegend), or G-CSF (10 ng ml−1, Biolegend) for 4–6 h. RNA was collected at each time-point using TRIzol (Life Technologies). Wild-type and Bcl2l11−/− BM-derived eosinophils were generated as previously described9. On day 8 of culture, the previously described IL-5 media was supplemented with the indicated concentrations of the EZH2-specific inhibitor GSK126 (Toronto Research Chemicals). Media was exchanged for fresh media containing IL-5 and GSK126 every other day. Cells were assayed for numbers and cell death by flow cytometry every day for 6 days following GSK126 treatment. Total RNA was extracted from TRIzol (Life Technologies) according to the manufacturer’s instructions. Glycogen (ThermoFisher Scientific) was used as a carrier. Isolated RNA was quantified by spectrophotometry, and RNA concentrations were normalized. cDNA was synthesized using SuperScript II Reverse Transcriptase (ThermoFisher Scientific) according to the manufacturer’s instructions. 
Resulting cDNA was analysed by SYBR Green (KAPA SYBR Fast, KAPABiosystems) or TaqMan-based (KAPA Probe Fast, KAPABiosystems) qPCR using indicated primers. Primer sequences are listed in Supplementary Table 1. All reactions were performed in duplicate using a CFX96 Touch instrument (BioRad) or ViiA7 Real-Time PCR instrument (ThermoFisher Scientific). Reads generated from mouse (Gr1+) granulocytes (previously published GSE53928), human neutrophils (previously published GSE70068), and bovine peripheral blood leukocytes (previously published GSE60265) were filtered, normalized, and aligned to the corresponding host genome. Reads mapping around the Morrbid locus were visualized. For visualization of the high level of Morrbid expression in short-lived myeloid cells, reads from sorted mouse eosinophils (previously published GSE69707) were filtered, aligned to mm9, normalized using RPKM, and gene expression was plotted in descending order. For each human sample corresponding to the indicated stimulation conditions, the number of reads mapping to the human MORRBID locus per total mapped reads was determined. For conservation across species, the genomic loci and surrounding genomic regions for the species analysed were aligned with mVista and visualized using the rankVista display generated with mouse as the reference sequence. Green highlights annotated mouse exonic regions and corresponding regions in other indicated species. Single molecule RNA fluorescence in situ hybridization (FISH) was performed as previously described. A pool of 44 oligonucleotides (Biosearch Technologies) was labelled with Atto647N (Atto-Tec). For validation purposes, we also labelled subsets consisting of odd and even numbered oligonucleotides with Atto647N and Atto700, respectively, and looked for colocalization of signal. We designed the oligonucleotides using the online Stellaris probe design software. Probe oligonucleotide sequences are listed in Supplementary Table 1. 
Thirty Z-sections with a 0.3-μm spacing were taken for each field of view. We acquired all images using a Nikon Ti-E widefield microscope with a 100× 1.4NA objective and a Pixis 1024BR cooled CCD camera. We counted the mRNA in each cell by using custom image processing scripts written in MATLAB. For nuclear and cytoplasmic fractionation, 5 × 10^6 BMDMs were stimulated with 250 ng ml−1 LPS for 4 hours. Cells were collected and washed once with cold PBS. Cells were pelleted, resuspended in 100 μl cold NAR A buffer (10 mM HEPES, pH 7.9, 10 mM KCl, 0.1 mM EDTA, 1× complete EDTA-free protease inhibitor, Sigma; 1 mM DTT, 20 mM β-glycerophosphate, 0.1 U μl−1 SUPERaseIn, Life Technologies), and incubated at 4 °C for 20 min. 10 μl 1% NP-40 was added, and cells were incubated for 3 min at room temperature. Cells were vortexed for 30 seconds, and centrifuged at 3,400g for 1.5 min at 4 °C. Supernatant was removed, centrifuged at full speed for 90 min at 4 °C, and remaining supernatant was added to 500 μl Trizol as the cytoplasmic fraction. The original pellet was washed 4 times in 100 μl NAR A with short spins of 6,800g for 1 min. The pellet was resuspended in 50 μl NAR C (20 mM HEPES, pH 7.9, 400 mM NaCl, 1 mM EDTA, 1× complete EDTA-free protease inhibitor, Sigma, 1 mM DTT, 20 mM β-glycerophosphate, 0.1 U μl−1 SUPERaseIn, Life Technologies). Cells were vortexed every 3 min for 10 s for a total of 20 min at 4 °C. The sample was centrifuged at maximum speed for 20 min at room temperature. Remaining supernatant was added to 500 μl Trizol as the nuclear fraction. Equivalent volumes of cytoplasmic and nuclear RNA were converted to cDNA using gene specific primers and Super Script II RT (Life Technologies). Fractionation was assessed by qPCR for Morrbid-exon1-1 and other known cytoplasmic and nuclear transcripts. Primer sequences are listed in Supplementary Table 1. 
For cytoplasmic, nuclear, and chromatin fractionation, 5 × 10^6 to 10 × 10^6 immortalized macrophages were activated with 250 ng ml−1 LPS (Sigma) for 6 hours at 37 °C. Cells were washed 2× with PBS, and then resuspended in 380 μl ice-cold HLB (50 mM Tris-HCl, pH 7.4, 50 mM NaCl, 3 mM MgCl2, 0.5% NP-40, 10% glycerol), supplemented with 100 U SUPERase In RNase Inhibitor (Life Technologies). Cells were vortexed 30 s and incubated on ice for 30 min, followed by a final 30 s vortex and centrifugation at 1,000g for 5 min at 4 °C. Supernatant was collected as the cytoplasmic fraction. Nuclear pellets were resuspended by vortexing in 380 μl ice-cold MWS (50 mM Tris-HCl, pH 7.4, 4 mM EDTA, 0.3 M NaCl, 1 M urea, 1% NP-40) supplemented with 100 U SUPERase In RNase Inhibitor. Nuclei were lysed on ice for 10 min, vortexed for 30 s, and incubated on ice for 10 more min to complete lysis. Chromatin was pelleted by centrifugation at 1,000g for 5 min at 4 °C. Supernatant was collected as the nucleoplasmic fraction. RNA was collected as described previously and cleaned up using the RNeasy kit (Qiagen). Equivalent volumes of cytoplasmic, nucleoplasmic, and chromatin-associated RNA were converted to cDNA using random hexamers and Super Script III RT (Life Technologies). Fractionation was assessed by qPCR for Morrbid-exon1-2 and other known cytoplasmic and nuclear transcripts. Primer sequences are listed in Supplementary Table 1. Morrbid cDNA was cloned into reference plasmid (pCDNA3.1) containing a T7 promoter. The plasmid was linearized and Morrbid RNA was in vitro transcribed using the MEGAshortscript T7 kit (Life Technologies), according to the manufacturer’s recommendations, and purified using the MEGAclear kit (Life Technologies). RNA was quantified using spectrophotometry and serial dilutions of Morrbid RNA of calculated copy number were spiked into Morrbid-deficient RNA isolated from Morrbid-deficient mouse spleen. 
Samples were reverse transcribed in parallel with wild-type-sorted neutrophil RNA and B-cell RNA isolated from known cell number using gene-specific Morrbid primers, and the Morrbid standard curve and wild-type neutrophils and B cells were assayed using qPCR with Morrbid-exon 1 primer sets (Supplementary Table 1). Cohorts of mice were given a total of 4 mg bromodeoxyuridine (BrdU; Sigma Aldrich) in 2 separate intraperitoneal (i.p.) injections 3 h apart and monitored over the subsequent 5 days, unless otherwise noted. For analysis, cells were stained according to the manufacturer’s protocol (BrdU Staining Kit, eBioscience; anti-BrdU, BioLegend). A one-phase exponential curve was fitted from the peak labelling frequency to 36 h after peak labelling within each genetic background, and the half-life was determined from this curve. Study subjects were recruited and consented in accordance with the University of Pennsylvania Institutional Review Board. Peripheral blood was separated by Ficoll–Paque density gradient centrifugation, and the mononuclear cell layer and erythrocyte/granulocyte pellet were isolated and stained for fluorescence-activated cell sorting as previously described. Sorted populations were defined as follows: neutrophils (live, CD16+F4/80intCD3−CD14−CD19−), eosinophils (live, CD16−F4/80hiCD3−CD14−CD19−), T cells (live, CD3+CD16−), monocytes (live, CD14+CD3−CD16−CD56−), natural killer (NK) cells (live, CD56+CD3−CD16−CD14−), B cells (live, CD19+CD3−CD16−CD14−CD56−). Samples from human subjects were collected on NIAID IRB-approved research protocols to study eosinophilic disorders (NCT00001406) or to provide controls for in vitro research (NCT00090662). All participants gave written informed consent. Eosinophils were purified from peripheral blood by negative selection and frozen at −80 °C in TRIzol (Life Technologies). Purity was >97% as assessed by cytospin. RNA was purified according to the manufacturer’s instructions. 
Expression analysis by qPCR was performed in a blinded manner by an individual not involved in sample collection or coding of these samples. Plasma IL-5 levels were measured by suspension array in multiplex (Millipore). The minimum detectable concentration was 0.1 pg ml−1. RAW 264.7 cells were obtained from ATCC and were not authenticated, but were tested for mycoplasma contamination biannually. Immortalized C57/B6 macrophages were obtained as a generous gift from I. Brodsky. These cells were not authenticated, but were tested for mycoplasma contamination biannually. Sample sizes were estimated based on our preliminary phenotyping of Morrbid-deficient mice. Preliminary cell number analysis of eosinophils, neutrophils, and Ly6Chi monocytes suggested that there were very large differences between wild-type and Morrbid-deficient samples, which would allow statistical interpretation with relatively small numbers, and no statistical methods were used to predetermine sample size. No animals were excluded from analysis. All experimental and control mice and human samples were run in parallel to control for experimental variability. The experiments were not randomized. Experiments corresponding to Fig. 3g–i and Fig. 4g–j were performed and analysed in a single-blinded manner. All other experiments were not blinded to allocation during experiments and outcome assessment. Correlation was determined by calculating the Spearman correlation coefficient. Half-life was estimated by calculating the one-phase exponential decay constant from the peak of labelling frequency to 36 h after peak labelling. P values were calculated using a two-way t-test, Mann–Whitney U-test, one-way ANOVA with Tukey post-hoc analysis, Kaplan–Meier Mantel–Cox test, and false discovery rate (FDR) as indicated. FDR was calculated using trimmed mean of M-values (TMM)-normalized read counts and the DiffBind R package as described in Extended Data Fig. 7c, d. 
All error bars indicate mean plus and minus the standard error of mean (s.e.m.).
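The half-life estimate described above can be illustrated with a minimal sketch. It assumes a decay to a plateau of zero and fits the decay constant by least squares on log-transformed labelling frequencies; the original analysis may have used a non-linear fit with a plateau term, and the function name and the example data are hypothetical.

```python
import math

def fit_decay_half_life(times_h, fractions):
    """Fit y = y0 * exp(-k * t) by linear least squares on log(y).

    Assumes a plateau of zero and strictly positive labelling fractions.
    Returns the decay constant k (per hour) and the half-life ln(2) / k.
    """
    if len(times_h) != len(fractions) or len(times_h) < 2:
        raise ValueError("need at least two paired observations")
    logs = [math.log(y) for y in fractions]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_l = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(times_h, logs))
    k = -sxy / sxx  # the slope of log(y) against t is -k
    return k, math.log(2) / k

# Hypothetical labelling frequencies from the peak (t = 0) to 36 h after the peak.
times = [0, 12, 24, 36]
labelled = [0.80 * math.exp(-0.05 * t) for t in times]
k, half_life = fit_decay_half_life(times, labelled)
print(f"k = {k:.3f} per hour, half-life = {half_life:.2f} h")
```

The fit would be run separately for each genetic background, since the text states that the curve was fitted within each background.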
News Article | October 12, 2016
No statistical methods were used to predetermine sample size. The experiments were not randomized, and investigators were not blinded to allocation during experiments and outcome assessment. An immortalized lymphoblastoid cell line was established from the AK1 individual through Epstein–Barr virus transformation of mononuclear cells (Seoul Clinical Laboratories Inc.). Full pathogen testing was performed, and the cell line was maintained in a mycoplasma-free facility. The AK1 lymphoblastoid cell line was cultured in RPMI 1640 media containing 15% FBS at 37 °C in a humidified 5% CO2 environment. The approval number C-0806-023-246 for the AK1 individual was assigned based on the guidelines from the Institutional Review Board of Seoul National University. Genomic DNA was extracted from AK1 cells using the Gentra Puregene Cell Kit (Qiagen). Large-insert PacBio library preparation was conducted following the Pacific Biosciences recommended protocols. In brief, a total of 60 μg AK1 genomic DNA was sheared to ~20 kb targeted size by using Covaris g-TUBEs (Covaris). Each shearing processed 10 μg input DNA and a total of 6 shearings were performed. The sheared genomic DNA was examined by Agilent 2100 Bioanalyzer DNA12000 Chip (Agilent Technologies Inc.) for size distribution and underwent DNA damage repair/end repair, blunt-end adaptor ligation followed by exonuclease digestion. The purified digestion products were loaded onto pre-cast 0.6% agarose for 7–50 kb size selection using the BluePippin Size Selection System (Sage Science), and the recovered size-selected library products were purified using 0.5× pre-washed Agencourt AMPure XP beads (Beckman Coulter). The final libraries were examined by Agilent 2100 Bioanalyzer DNA12000 Chip for size distribution and the library concentration was determined with Qubit 2.0 Fluorometer (Life Technologies). We sequenced with the PacBio RSII instrument with P6 polymerase binding and C4 chemistry kits (P6C4). 
A total of 380 SMRT Cells were used to yield 101-fold whole-genome sequence data. AK1 cells were pelleted and washed with PBS; the final cell pellet was re-suspended in cell-suspension buffer using the CHEF Mammalian Genomic DNA Plug Kit (Bio-Rad). Cells were then embedded in CleanCut low-melt Agarose (Bio-Rad) and spread into a thin layer on a custom support (in development at BioNano Genomics). Cells were lysed using IrysPrep Lysis Buffer (BioNano Genomics), protease-treated with Puregene Proteinase K (Qiagen), followed by brief washing in Tris with 50mM EDTA and then washing in Tris with 1 mM EDTA before RNase treatment with Puregene RNase (Qiagen). DNA was then equilibrated in Tris with 50 mM EDTA and incubated overnight at 4 °C before extensive washing in Tris with 0.1 mM EDTA followed by equilibration in NEBuffer 3 (New England BioLabs) at 1× concentration. Purified DNA in the thin layer agarose was labelled following the IrysPrep Reagent Kit protocol with adaptations for labelling in agarose. In brief, 1.25 μg of DNA was digested with 0.7 U Nt.BspQI nicking endonuclease per microlitre of reaction volume in NEBuffer 3 (New England BioLabs) for 130 min at 37 °C, then washed with TE Low EDTA Buffer (Affymetrix), pH 8.0, followed by equilibration with 1× ThermoPol Reaction Buffer (New England BioLabs). Nick-digested DNA was then incubated for 70 min at 50 °C using the IrysPrep Labelling mix (BioNano Genomics) and Taq DNA Polymerase (New England BioLabs) at a final concentration of 0.4 U μl−1. Nick-labelled DNA was incubated for 40 min at 37 °C using the IrysPrep Repair mix (BioNano Genomics) and Taq DNA Ligase (New England BioLabs) at a final concentration of 1 U μl−1. Labelled-repaired DNA was then recovered from the thin layer agarose by digesting with GELase and counterstained with IrysPrep DNA Stain (BioNano Genomics) before data collection on the Irys System. 
The fragile site rescue process protects fragile sites by reducing the temperature of the labelling reaction and minimizes shear forces by restraining DNA in agarose until nicks are repaired. In this case, only the closest opposite-strand nick-pairs break. Sample indexing and partition barcoded libraries were prepared using GemCode Gel Bead and Library Kit (10× Genomics)4. Sequencing was conducted with Illumina Hiseq2500 to generate linked reads. Libraries were generated with PCR-free protocols. gDNA was sheared twice using Covaris S2 with cycling conditions of 10% duty cycle, Cycles/Burst 200, and Time 100 s. The sheared DNA was processed using the Illumina TruSeq DNA PCR-Free LT Library Kit protocol to generate 550 bp inserts, which includes end repair, SPRI bead size selection, A-tailing, and Y-adaptor ligation. Library concentration was measured by qPCR and loaded on HiSeq X Ten instruments (PE-150) to generate 72-fold sequence coverage. A total of 32,026 BAC clones were selected from the 252 384-well plates and re-plated into 96-well plates. Clones were grown overnight, and the cultures were used to prepare two additional replicates for the two 384-well plates that were stored at −80 °C in LB medium containing 20% glycerol. A total of 32,026 clone cultures with growth at ODs ranging from 0.6 to 1.0 were pooled, pelleted and the DNA was extracted using the standard alkaline lysis method. In this procedure, a cell pellet was resuspended in 150 μl of Qiagen buffer P1 with RNase and lysed with 150 μl of 0.2 M NaOH, 1% SDS solution for 5 min. Lysis was neutralized with the addition of 150 μl of 3 M sodium acetate, pH 4.8. Neutralized lysate was incubated on ice for 30 min, and DNA was collected by centrifugation for 15 min at 15.7g at 4 °C, concentrated by standard ethanol precipitation and resuspended in 25 μl of 10 mM Tris-HCl, pH 8.5. DNA from approximately 150 BAC clones with roughly equimolar concentration was combined into a single pool. 
A total of 10 μg of DNA from each pool was sheared and fragments of insert size ranging from 10 to 15 kb were selected. Two libraries were prepared from the pooled DNA using a PacBio SMRTbell library preparation kit v1.0. The libraries were quantified using a Qubit 2.0 fluorometer and each library was sequenced using two SMRT cells with P6C4 chemistry. DNA from approximately 290 BAC clones with roughly equimolar concentration was combined into a single BAC pool. One nanogram of DNA from each pool was digested and fragments of insert size ranging from 500 to 550 bp were selected. In total, 109 libraries were prepared from the pooled DNA using Illumina-compatible Nextera XT DNA sample prep kit and sequenced with HiSeq2500. We extracted RNA from tissue using RNAiso Plus (Takara Bio), followed by purification using RNeasy MinElute (Qiagen). RNA was assessed for quality and was quantified using RNA 6000 Nano LabChip on a 2100 Bioanalyzer (Agilent). The RNA sequencing (RNA-seq) libraries were prepared as previously described20. RNA library was sequenced with Illumina TruSeq SBS Kit v3 on a HiSeq 2000 sequencer (Illumina) to obtain 100 bp paired-end reads. The image analysis and base calling were performed using the Illumina pipeline with default settings. Total RNA extracted from AK1 cells with RNA integrity number (RIN) > 8.0 was used for library preparation. The library was constructed following the Clontech SMARTer-PCR cDNA Synthesis Sample Preparation Guide. 1–2 kb, 2–3 kb, 3–6 kb and >5 kb libraries were selected by Sage, ELF purified, end-repaired and blunt-end SMRTbell adapters were ligated. The fragment size distribution was confirmed on a Bioanalyzer HS chip (Agilent) and quantified on a Qubit fluorometer (Life Technologies). The sequencing was carried out on the PacBio RSII instrument using P6C4. 
Around 31 million subreads were used for assembly with FALCON v0.3.0 (ref. 21) given length_cutoff parameter of 10 kb for initial mapping to build pre-assembled reads (preads), and preads over 15 kb were used (length_cutoff_pr) to maximize the assembled contig N50 (Extended Data Fig. 2). Primary and associated contigs were polished using Quiver5. Optical maps were de novo assembled into genome maps using BioNano assembler software (Irys System, BioNano Genomics). Single molecules longer than 150 kb with at least 8 fluorescent labels were used to find possible overlaps (P < 1 × 10−10). Next, these maps were constructed to consensus maps by recursively refining and extending them by mapping single molecules (P < 1 × 10−5). The consensus maps were compared and merged into genome maps when patterns matched (P < 1 × 10−10). A second set of optical maps was obtained thereafter, and generated into genome maps with the same criteria. Primary contigs were in silico digested into cmaps and were compared with genome maps for scaffolding. The scaffolding was visualized and performed with the Irys Viewer. When conflict occurred, the contigs were edited with the guidance of genome map. Paired-end reads from Illumina platform were aligned to the assembly using bwa22 mem, followed with duplication removal using Picard tools23. Base-pair correction of the assembly was performed using Pilon24. Pilon mostly corrected single insertions and deletions in regions enriched with homopolymer. Contigs or scaffolds shorter than 10 kb were excluded from the overall analysis to avoid results from spurious misassembly. Scaffolding accuracy of the AK1 assembly was assessed using the AK1 BAC library1. AK1 BAC end sequences (BES) were aligned to GRCh37, GRCh38 and AK1 assemblies using BWA. The BES placements were categorized by the alignment, orientation and separation of BES with respect to the assembly. 
The BES placement was determined to be concordant: (1) if both end sequences were placed in the same assembly unit; (2) if the paired end sequences were properly oriented; and (3) if the in silico insert size was between 50,000 and 250,000 bp. If the BES placements did not meet these conditions, the BES placement was defined to be discordant. In addition, if only one of the paired end sequences was aligned to the assembly, the BES placement was defined to be an orphan placement. If both paired-end sequences were unaligned to the assembly, the BES was defined to be unmapped. If either of the paired-end sequences was aligned to different positions of the assembly multiple times, the BES was defined to have multiple placements. To identify the precise genomic location of each assembly unit, we used LASTZ25 with parameters (--gapped --gap=600,150 --hspthresh=4500 --seed=12of19 --notransition --ydrop=15000 --chain) to align each assembly unit to each chromosome in the human reference genome. A chaining procedure was followed to join the neighbouring local alignments into a single cohesive alignment. The chained alignments of each assembly unit were processed to obtain a single alignment with the best alignment score. If the selected alignment was not fully representative of the assembly unit, we selected a set of alignments that was better representative of the assembly unit. A netting procedure was then followed with the selected chained alignments. The chaining and netting procedures were applied using UCSC Kent tools26 and parallel processing was used when possible to increase computational speed. Gaps were classified into telomeric, centromeric, heterochromatic, acrocentric and euchromatic regions according to the agp file and cytoband information provided by the Genome Reference Consortium (GRC) and UCSC genome browser. In total, 190 euchromatic gaps were targeted for gap closure with AK1 assembly. 
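The BES placement categories defined above can be sketched as a small decision function. This is an illustration under stated assumptions, not the authors' code: the `Hit` type and `classify_bes` name are hypothetical, "properly oriented" is taken to mean the two ends face each other (forward strand on the left, reverse strand on the right), the in silico insert size is approximated by the distance between leftmost aligned positions, and the order in which the categories are tested is a guess.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class Hit:
    unit: str    # assembly unit (contig, scaffold or chromosome) name
    pos: int     # leftmost aligned position on that unit
    strand: str  # "+" or "-"

def classify_bes(end1: List[Hit], end2: List[Hit],
                 min_insert: int = 50_000, max_insert: int = 250_000) -> str:
    """Classify one BAC end-sequence pair from the alignments of its two ends."""
    if not end1 and not end2:
        return "unmapped"            # neither end aligned
    if len(end1) > 1 or len(end2) > 1:
        return "multiple"            # an end aligned to several positions
    if not end1 or not end2:
        return "orphan"              # only one end aligned
    a, b = end1[0], end2[0]
    left, right = (a, b) if a.pos <= b.pos else (b, a)
    same_unit = a.unit == b.unit
    # Assumption: proper orientation means the ends point towards each other.
    properly_oriented = left.strand == "+" and right.strand == "-"
    insert_size = right.pos - left.pos
    if same_unit and properly_oriented and min_insert <= insert_size <= max_insert:
        return "concordant"
    return "discordant"

# Hypothetical pair: both ends on the same unit, 150 kb apart, facing each other.
print(classify_bes([Hit("scaffold_1", 1_000, "+")], [Hit("scaffold_1", 151_000, "-")]))
```

Applying the same function to alignments against GRCh37, GRCh38 and the AK1 assembly would give directly comparable category counts per assembly.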
The gaps that could not be closed or extended with the AK1 assembly were subjected to closure through local assembly using Canu27 or a contiguous subread. Subreads mapped 10 kb upstream or downstream of the gap were chosen for local assembly. Alignment was performed with BLASR28 -bestn 3, and primary aligned reads with mapping quality of 254 were used. The assembled contigs were thereafter aligned to their respective gap position to precisely identify the added sequences. Subreads used to close the gaps were chosen following criteria described in the Supplementary Information. The alignments of the assembly to the reference genome were parsed to obtain SNPs, indels and SVs, which we defined as insertion, deletion, inversion and complex variants with event size equal to or greater than 50 bp. The complex SVs are the same as ‘double-sided insertion’ defined previously29. We used GRCh37 instead of GRCh38 for the main analysis for compatibility and comparison with previously reported structural variations. Repeat elements were annotated using RepeatMasker (-species human -no_is) and Tandem Repeats Finder (TRF) (2 7 7 80 10 50 2000 -f -m -h -d). An SV was classified under a repeat type if at least 70% of its sequence was masked by that single type. An SV was defined as complex if it had either several annotated repeat elements or at least 30% of the remaining sequence not annotated as repeat. Novelty was identified by comparing the breakpoints with 50% reciprocal overlap criterion. Functional annotation was performed using both GENCODE release v19 (GRCh37) and v21 (GRCh38)30 and the Ensembl Regulatory Build31. For those SVs that occurred within gene regulatory domains, we annotated with the nearest gene name. SVs located within pericentromeric regions (5 Mb flanking annotated centromeres) and subtelomeric regions (150 kb from the annotated telomeric sequence) were annotated as heterochromatin. 
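The 50% reciprocal overlap criterion used for novelty can be sketched as follows. This is a minimal illustration assuming half-open reference intervals and a simple linear scan; the function names are hypothetical, and the source does not say how zero-length reference intervals (for example pure insertions) were handled, so they are not treated specially here.

```python
def reciprocal_overlap(a_start, a_end, b_start, b_end, min_fraction=0.5):
    """True if the overlap covers at least min_fraction of BOTH intervals.

    Intervals are half-open: [start, end).
    """
    overlap = min(a_end, b_end) - max(a_start, b_start)
    if overlap <= 0:
        return False
    len_a = a_end - a_start
    len_b = b_end - b_start
    return overlap >= min_fraction * len_a and overlap >= min_fraction * len_b

def is_novel(sv, known_svs, min_fraction=0.5):
    """An SV (chrom, start, end) is novel if no known SV reciprocally overlaps it."""
    chrom, start, end = sv
    return not any(
        chrom == k_chrom and reciprocal_overlap(start, end, k_start, k_end, min_fraction)
        for k_chrom, k_start, k_end in known_svs
    )

# Hypothetical call set: one previously reported deletion on chr1.
known = [("chr1", 150, 250)]
print(is_novel(("chr1", 100, 200), known))  # overlaps half of each interval
print(is_novel(("chr1", 100, 200), [("chr1", 150, 400)]))  # covers too little of the known SV
```

For genome-scale call sets an interval tree or a sorted sweep would replace the linear scan, but the acceptance rule stays the same.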
Both pilot and strict accessibility genome mask regions (version 20141020) were downloaded from ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/supporting/accessible_genome_masks/. Segmental duplication sites were downloaded from the UCSC table browser. To simplify categorization of the SVs that lie within multiple functional regions, they were classified according to the following order of priority: coding sequence, untranslated region, intron, transcription-factor-binding site, promoter, enhancer, CTCF (transcriptional repressor), and intergenic. To annotate whether the SVs called from GRCh37 were also shared with GRCh38 SV sets, we compared the AK1 breakpoints with 50% reciprocal overlap criterion. In addition, we assessed whether the SVs called from GRCh38 were also represented in the alternative contigs by measuring the concordance against the SV regions including the surrounding 50 bp from the breakpoints. Population allele frequency of SVs was obtained by aligning reads from 38 high-coverage samples from five different ancestral backgrounds (African, American, European, East Asian, and South Asian) to the AK1 assembly. We obtained whole-genome sequencing data of 23 individuals from the 1000 Genomes Project and we additionally sequenced 15 East Asian individuals (5 Japanese, 5 Chinese and 5 Koreans). Analysis candidates were selected from the insertions with less than 70% of repeats. We excluded any duplications among the insertions that are mapped to GRCh37 using BLAST (-evalue 1e-10 -perc_identity 90 -qcov_hsp_perc 90). The regions that were recognized as mobile elements or tandem repeats by the RepeatMasker and TRF software were masked for analysis. Normalized read depth within the unique sequence was obtained by dividing the read depth, which was calculated using samtools bedcov, by the median genome coverage. 
The insertions were determined to be highly polymorphic if the variant frequency differed by 0.3 or more across the different populations. Asian-specific insertions were chosen by selecting the insertions with an allele frequency difference of 0.3 or more between Asian and non-Asian populations and a non-Asian allele frequency of 0.5 or less. Asian linkage disequilibrium blocks were obtained from East Asian samples in the 1000 Genomes Project phase 3 using S-MIG++ algorithm32 (-maf 0.05 -ci AV -probability 0.95). Linkage disequilibrium blocks with a haplotype diversity index below 0.8 were excluded. We performed phasing against the de novo assembly. SNPs and short indels called from whole-genome sequencing (72×) of short reads were phased with linked reads. The non-redundant set of PacBio subreads was aligned to the assembly, and corrections were applied by calculating the maximum likely variant allele for the phased variants based on the read depth. A phased block was defined as the region spanning two markers which had a subread or linked read information providing phasing. Similar to the linked reads, Illumina sequenced BAC phase information was used to correct phasing markers and extend phased blocks. Correction and other bioinformatics methods were performed using an in-house script, described in the Supplementary Information. Long-range switch error measurements were obtained using BAC end sequences. The end sequences were aligned to the AK1 assembly with bwa mem, and the base allele of the phasing marker site was called with the corresponding BAC information. When switching occurred for more than two marker sites in a phased block, it was defined as a long-range switch. The long-range switch error rate was calculated as: no. of long-range switches/no. of phasing markers. Using the final set of phasing markers, subreads were classified into sets of haplotype A or B when >85% of the phasing markers agreed. 
When a subread contained no marker, it was classified as homozygous. Through the read depth, phasing markers that were missed in previous steps were additionally called for homozygous regions adjacent to phased blocks. Subreads in haplotype A or homozygous regions were assembled into haplotig A, and haplotype B into haplotig B with Canu27. Haplotigs for MHC class I and II were assembled separately to avoid misassemblies owing to high sequence homology between HLA genes. In this case, subreads phased as homozygous were used with subreads of haplotype A and B. Homozygously phased subreads flanked on each side of a sequencing gap belonged on haplotype A and B, respectively, and were re-classified during assembly. Haplotype-specific variants were called following the assembly-based variation calling method. Owing to possibilities of false positives introduced by misassembly, phased variants that agreed with initial variants called with whole genome sequencing reads were used for further analysis. After functional annotation using GENCODE v19 (ref. 30), disease risk alleles were screened using ClinVar33. Haplotyping of CYP2D6 was done by comparing haplotigs to M33388 following CYP2D6 nomenclature. BACs identified to be discordant in size (>1 kb) were pooled and sequenced with the SMRT platform. The subreads were assembled using Canu27 after screening and removing Escherichia coli or vector sequences with CrossMatch34. The assembled BAC contigs were polished with Quiver. The BAC contigs were, thereafter, used to validate AK1 assembly-based or phase-specific SVs by assessing the concordance between the assembly and the BAC contig at sites of detected SVs. On the basis of the alignments of haplotigs to GRCh37, haplotig A and B were localized to compare partner sequences. The number of differing bases was summed over every 5 Mb and converted to percentiles for plotting in Fig. 3a. 
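The subread classification rule (>85% marker agreement, no markers meaning homozygous) and the long-range switch error rate described above can be sketched as below. The function names are hypothetical, each marker call is assumed to be already resolved to "A" or "B", and the "unassigned" label for subreads below the threshold is an assumption, since the text does not say how such reads were treated.

```python
from typing import Sequence

def classify_subread(marker_calls: Sequence[str], threshold: float = 0.85) -> str:
    """Assign a subread to haplotype A or B from its phasing-marker calls.

    marker_calls holds one "A" or "B" per phasing marker covered by the subread.
    """
    if not marker_calls:
        return "homozygous"          # no marker covered by the subread
    frac_a = sum(1 for call in marker_calls if call == "A") / len(marker_calls)
    if frac_a > threshold:
        return "A"
    if 1 - frac_a > threshold:
        return "B"
    return "unassigned"              # assumption: ambiguous reads are set aside

def long_range_switch_error_rate(n_long_range_switches: int, n_phasing_markers: int) -> float:
    """Rate = number of long-range switches / number of phasing markers."""
    if n_phasing_markers <= 0:
        raise ValueError("number of phasing markers must be positive")
    return n_long_range_switches / n_phasing_markers

# Hypothetical subreads and counts.
print(classify_subread(["A"] * 9 + ["B"]))      # 90% of markers support A
print(classify_subread([]))                     # no markers
print(long_range_switch_error_rate(3, 1000))
```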
RNA-seq reads were trimmed and aligned to GRCh37 using STAR aligner35 with the two-pass mapping strategy as recommended. Duplicates were removed using Picard tools, and variants were called using HaplotypeCaller and VariantFiltration following GATK best practices on RNA-seq36. Sites with supportive evidence of altered variation in RNA-seq were extracted from the final vcf file, and ASEReadCounter37 was applied to remove reads with low base quality. Read counts were annotated to the phase-specific variants called from haplotigs using in-house scripts. When read depth for one allele was over 30, it was considered as ‘expressed’.
News Article | February 24, 2017
A senior energy official at the U.S. Chamber of Commerce recently warned that there will be “hell to pay” if the Trump administration tries to rescind the EPA’s science-based endangerment finding for greenhouse gas emissions. In typical U.S. Chamber fashion, Christopher Guith dismissed current concerns about climate change as based on “religion,” not “scientific facts,” while speaking at a January 26th event in the coal state of Kentucky. Guith is the senior vice president for policy at the U.S. Chamber’s Institute for 21st Century Energy. But Guith conceded that carbon dioxide emissions are likely to ultimately be regulated under the Clean Air Act. He also said that “soccer moms and soccer dads” will make the Trump administration pay if it goes after the EPA’s endangerment finding. Guith’s comments belie the U.S. Chamber of Commerce’s official policy priorities for 2017, which include plans to, “Oppose EPA efforts to regulate greenhouse gases under the existing Clean Air Act, including the endangerment finding.” His remarks came last month during a question and answer session on the future of energy policy under the Trump administration at an event hosted by the Kentucky Chamber of Commerce. Guith’s comments were captured by a representative of the Energy and Policy Institute who attended the event. The U.S. Chamber’s position on climate has put the powerful trade group at odds with some leading members who support EPA limits on carbon dioxide emissions, including board member Florida Power & Light. Other board members, including Peabody Energy and Southern Company, oppose EPA action on climate change. The U.S. Chamber is also a top contributor to the Republican Attorneys General Association, which counted Oklahoma attorney general Scott Pruitt among its leading members before he was nominated by President Trump to serve as the next EPA administrator. Pruitt and other RAGA members sided with the U.S. 
Chamber on legal challenges targeting the EPA’s endangerment finding and, more recently, the Clean Power Plan. His bid to lead the EPA has been backed by the Chamber. Audience Question: You mentioned the endangerment finding earlier. There’s some thought that revisiting the science behind the endangerment finding, which you probably know was highly dependent on the IPCC models, and that enough time has now passed to potentially argue that the models the IPCC came up with have flaws and need to be revisited. Is there any momentum behind that thought? Guith: I think there absolutely is momentum, but the one thing I’ll say is that rescinding the endangerment finding, and this is something Ted Cruz talked about quite a bit when he ran for president. I think people here can appreciate how much political capital that would cost. It’s not … climate has never been, well at least in the last 10 years, about scientific fact. It’s been about religion. And if you are going to go out there and say, “We’re going to pull this back,” I mean there is going to be hell to pay, not just from those people out there who are protesting those plants. There’s going to be hell to pay from, you know, soccer moms and soccer dads all throughout the country. People who probably voted for Donald Trump. [emphasis added] And I don’t put that past them, but what I will say is that will turn into a huge, huge buzzsaw, when perhaps a more elegant solution of slow-rolling the implementation would be only slightly more onerous that actually rescinding that, but would take much less political capital. Guith: This goes back to my point about Congress actually repealing the Clean Power Plan. I firmly believe that sometime in the next 10 years we are going to see another stage of Clean Air Act Amendments, and that’s ultimately because we’re sort of at this the point where carbon is not going to go away. Because of the endangerment finding it has to get regulated, unless Congress actually repeals that. 
And I don’t see a Congress saying, “No we’re not going to regulate carbon” because I don’t think there’s the votes there, nor do I anticipate it being there. The reality is there is an absolute incentive for the environmentalists to cut a compromise because they need some sort of codified regulation. Right now you have the sort of fiat of the Clean Power Plan, and you’ll see what happens when the White House changes over and it’ll just “Thpppt!” … go away. Or you have one bad court ruling, and it just goes away. Also, you have industry. Industry, utilities specifically wants to know, “What are the rules of the road going to be over the 20 years?” And so having that certainty of what it’s going to look like, there is something in it from both parties. And I think there is a way to build CO2 into the Clean Air Act. I am not necessarily arguing that we should do it, but it’s likely to happen in an incremental way that gives the utility sector and the manufacturing sector decades and decades to plan around. But it’s not going to happen this Congress. NextEra Energy told the U.S. Court of Appeals for the District of Columbia Circuit in 2015 that the company’s “interests will be impaired” if opponents of the Clean Power Plan prevail in their legal challenges. Those opponents include the U.S. Chamber and some members of its board of directors, including Southern Company and Peabody Energy. Eric Silagy, the president of Nextera subsidiary Florida Power & Light (FPL), serves on the U.S. Chamber’s board of directors. FPL said last year that its transition from coal-fired power plants to cleaner sources of electricity benefits both the climate and its customers. Nonetheless, those same FPL customers are on tap to pay a total of $816,518 for the utility’s funding of the Chamber for 2015–2018. Utility shareholders are not immune to the damage that funding the controversial political activities of industry trade groups can do to a company’s public image. 
NextEra faced a near revolt at its annual meeting in Oklahoma last spring, where 42 percent of shareholders voted in favor of full disclosure of the company’s political spending — including industry association dues used for political purposes. Several other major electric utilities quit the U.S. Chamber several years ago after a spokesperson for the group called on the EPA to hold a “Scopes Monkey Trial of the 21st Century” and “put climate science on trial.” This is a guest post by Dave Anderson, cross-posted from Energy and Policy Institute. Main image: Scientists and supporters held a “Rally to Stand up for Science” at the American Geophysical Union Annual Meeting in San Francisco in December 2016. Credit: Ashley Braun
News Article | November 16, 2016
No statistical methods were used to predetermine sample size. The experiments were not randomized and the investigators were not blinded to allocation during experiments and outcome assessment. Gas-rich hydrothermally-heated sediments covered with dense mats of Beggiatoa were obtained in the Guaymas Basin vent area (27° 00.437′ N, 11° 24.548′ W; 2,000 m water depth). Samples were collected by push coring using the submersible Alvin (dive 4570) on RV Atlantis during November/December 2009. The sediments were stored anaerobically in butyl rubber stopper-sealed glass vials. In the laboratory sediments were 1:4 diluted with anoxic artificial seawater (ASW) medium49, initially provided with methane as a substrate, and incubated at 50 °C. These incubations showed immediate methane-dependent sulfate reduction. After 3 months a subsample was incubated with butane (0.2 MPa). Initially, we did not detect butane-dependent sulfide production. However, after 2 months of incubation sulfide production set in. When sulfide concentrations exceeded 15 mM the culture was diluted (1:5) in fresh ASW medium (semi-continuous cultivation) and resupplied with butane. This procedure was repeated several times and resulted in a virtually sediment-free culture after 2 years of cultivation. For quantitative growth experiments, cultures were set up in 150 ml serum bottles containing 80 ml ASW medium, and inoculated with a 20 ml aliquot of a grown culture. Parallel cultures with different starting amounts of butane (5 and 7.5 ml butane in the culture headspace) were prepared. As controls, we used sterile cultures receiving butane, and inoculated cultures lacking butane. All cultures were incubated at 50 °C without shaking. Measurements of sulfide production and butane were performed in triplicate. Sulfide concentrations were determined by transferring 0.1 ml culture into 4 ml acidified copper sulfate (5 mM) solution. 
The formation of colloidal copper sulfide was determined photometrically at 480 nm (ref. 50). To quantify butane concentration, volumes of 0.1 ml headspace gas were withdrawn using N2-flushed, gas-tight syringes. The gas samples were injected without a split into a Shimadzu GC-14B gas chromatograph, equipped with a Supel-Q PLOT column (30 m × 0.53 mm, 30 μm film thickness; Supelco, Bellefonte, USA) and a flame ionization detector. The oven temperature was maintained at 140 °C, and the injection and detection temperatures were maintained at 150 °C and 280 °C, respectively. The carrier phase was N2 at a flow rate of 3 ml min−1. Samples were analysed in triplicate. Butane concentrations were calculated based on an external calibration curve. Consortia from the Butane50 culture were visualized by Confocal Laser Scanning Microscopy (LSM 780, Zeiss, Germany) with an excitation light of 405 nm and an emission filter >463 nm, and by recording the maximum autofluorescence at 470 nm wavelength. Total DNA was extracted from 10 ml of the Butane50 culture pelleted via centrifugation (4,000 r.p.m. for 15 min; Eppendorf Centrifuge 5810R) using the FastDNA Spin Kit for Soil (MP Biomedicals) following the manufacturer’s protocols. Bacterial and archaeal 16S rRNA gene fragments were amplified using the primer pairs GM3/GM451 and Arch20F52/1492R53. Furthermore, genes encoding canonical anaerobic hydrocarbon-activating enzymes including assA/masD (primer pairs 7757F-1, 7757F-2/8543R54) and bssA (primer pair 1213F/1987R55) were targeted for amplification. For amplification of assA, a mixture of forward primers was applied to improve diversity coverage54. Polymerase chain reactions (PCR) were performed in 20-μl volumes containing 0.5 μM of each primer solution, 7.5/6 μg bovine serum albumin solution, 250 μM deoxynucleoside triphosphate (dNTP) mixture, 1 × PCR reaction buffer (5Prime, Germany), 0.25 U Taq DNA polymerase (5Prime) and 1 μl DNA template (25–50 ng). 
PCR reactions (Mastercycler; Eppendorf) included an initial denaturation step of 95 °C for 5 min followed by 34 cycles of denaturation (95 °C for 1 min.), annealing (1.5 min at 44 °C for bacterial 16S primers, or at 58 °C for archaeal 16S primers), and extension (72 °C for 3 min) and a final 72 °C step for 10 min. For amplification of genes encoding canonical hydrocarbon-activating enzymes, the protocol consisted of an initial denaturation step (95 °C for 5 min) followed by 34 cycles of denaturation (96 °C for 1 min), annealing (58 °C for assA primers and 55 °C for bssA primers, both for 1 min) and extension (72 °C for 2 min) ending with a final extension (72 °C for 10 min). All products were checked on 1% agarose gels, stained with ethidium bromide and visualized with UV light. Amplicons (archaeal and bacterial 16S rRNA gene) were purified (QIAquick PCR Purification Kit; Qiagen) and cloned in Escherichia coli (TOPO TA cloning Kit for sequencing; Invitrogen). Clones were screened by standard PCR procedure and positive inserts were sequenced using Taq cycle sequencing with ABI BigDye Terminator chemistry and an ABI377 sequencer (Applied Biosystems, Foster City, CA, USA). Representative full-length sequences were used for phylogenetic analysis using the ARB software package56 and the SSURef_NR99_115 SILVA database57. Phylogenetic trees of 16S rRNA genes were constructed with RAxML (version 7.7.2) using a 50% similarity filter and the GTRGAMMA model. An extended phylogenetic tree is provided as Supplementary Fig. 3. Branch support values were determined using 100 bootstrap replicates. From the Butane50 culture no masD/assA and bssA genes could be amplified. Cell aliquots were fixed for 2 h in 2% formaldehyde, washed and stored in phosphate buffered saline (PBS; pH = 7.4): ethanol 1:1. Samples were sonicated (30 s; Sonoplus HD70; Bandelin) and incubated in 0.1 M HCl (1 min) to remove potential carbonate precipitates. 
Aliquots were filtered on GTTP polycarbonate filters (0.2 μm pore size; Millipore, Darmstadt, Germany). CARD-FISH was performed according to Pernthaler et al.58 including the following modifications: cells were permeabilized with a lysozyme solution (0.5 M EDTA pH 8.0, 1 M Tris-HCl pH 8.0, 10 mg ml−1 lysozyme; Sigma-Aldrich) at 37 °C for 30 min and with a proteinase K solution (0.5 M EDTA, 1 M Tris/HCl, 5 M NaCl, 7.5 μM of proteinase K; Merck, Darmstadt, Germany) for 5 min at room temperature; endogenous peroxidases were inactivated by incubation in a solution of 0.15% H2O2 in methanol for 30 min at room temperature. Specific 16S rRNA-targeting oligonucleotide probes used were SYNA-407 and HotSeep-1-145626, both applied at 20% formamide concentration. SYNA-407 was developed during this project using the probe design tool within the ARB software package to specifically detect Ca. Syntrophoarchaeum. The probe is highly specific for Ca. Syntrophoarchaeum and has at least one mismatch to non-target group sequences in the current database. The stringency of probe SYNA-407 was experimentally tested on the Butane50 culture using 10% to 40% formamide in the hybridization buffer. The sequence of the probe is: 5′-AGTCGACACAGGTGCCGA-3′. Three helpers were necessary: hSYNA-388 (5′-ACTCGGAGTCCCCTTATC-3′), hSYNA-369 (5′-CACTTGCGTGCATTGTAA-3′) and hSYNA-426 (5′-TATCCGGACAGTCGACAC-3′). Probes were purchased from Biomers (Ulm, Germany). In case of double hybridization, the peroxidases from the first hybridization were inactivated by incubating the filters in 0.30% H2O2 in methanol for 30 min at room temperature. The hybridized archaeal and bacterial cells were stained by addition of the fluorochromes Alexa Fluor 594 and Alexa Fluor 488 for the two target organisms. Finally, the filters were stained with DAPI (4′,6-diamidino-2-phenylindole) and analysed by epifluorescence microscopy (Axiophot II Imaging, Zeiss, Germany). 
Selected filters were analysed by confocal laser scanning microscopy (LSM 780, Zeiss, Germany). Genomic DNA was extracted from 15 ml of the Butane50 culture using the FastDNA Spin Kit for Soil (MP Biomedicals, Illkirch, France). For paired-end library preparation the TruSeq DNA PCR-Free Sample Prep Kit (Illumina) was used including the following modifications of the manufacturer’s guidelines. A total amount of 700 ng DNA (in 50 μl volume) was fragmented in 500 μl nebulization buffer (50% glycerol v/v, 35 mM Tris-HCl, 5 mM EDTA), using a Nebulizer (Roche), with a fragmentation time of 3 min, and applied pressure of 32 p.s.i. The fragmented DNA was purified via a MinElute purification column (Qiagen). Following end repair, the first size-selection step (removal of large DNA fragments) was done with a sample purification bead/H2O mixture of 6/5 (v/v). For mate-pair library construction, genomic DNA was extracted from 35 ml Butane50 culture following the protocol after Zhou et al.59 with the following modifications: cells were collected by centrifugation of the culture aliquot (3,000g for 5 min). The pellet was resuspended in 450 μl of extraction buffer, homogenized in a tissue grinder and the mixture was freeze–thawed three times. Subsequently 1,350 μl of fresh extraction buffer and 60 μl of Proteinase K were added. In total, 1,370 ng of DNA were obtained and used for mate-pair library construction with the Illumina Nextera Mate Pair Sample Preparation Kit following the manufacturer’s guidelines with the following modifications: a total amount of 1.3 μg DNA was used and the fragmentation time was reduced to 15 min. Fragments of lengths between 4 kb and 9 kb were obtained on an agarose gel which were then used for further library preparation. Sequencing of both libraries was performed on a MiSeq 2500 instrument (Illumina; 2 × 300 cycles) using v3 sequencing chemistry. 
In total 4,460,548 and 21,182,518 reads were obtained for the paired-end and mate-pair library respectively. The paired-end Illumina reads were quality-trimmed after adaptor and contaminant removal using the bbduk tool in BBMap (version 34; http://sourceforge.net/projects/bbmap; minimum quality value of 20; minimum read length ≥50 bp). Overlapping paired-end reads were merged using bbmerge when overlap exceeded 20 bases without mismatches for reads ≥150 bp. The 16S rRNA based phylogenetic composition of the paired-end library was estimated using the software phyloFlash (https://github.com/HRGV/phyloFlash), which classifies reads taxonomically by mapping reads against the SSU SILVA 119 database using bbmap. For quantification, only unambiguously mapped reads were counted. For the mate-pair library, junctions, contaminants and external adaptors were removed using bbduk. Afterwards, the reads were quality trimmed (quality value ≥20 and minimum sequence length 50 bp). Bulk assembly of processed libraries was done with SPAdes (version 3.5.0 (ref. 60)) including the BayesHammer error correction step and using default k-mer size recommended for the read length (21, 33, 55, 77, 99, 127). The resulting scaffolds were analysed and binned using the Metawatt software (version 2.1 (ref. 61)), which analyses the GC content, coverage, open reading frames (ORF) and tetranucleotide pattern for each scaffold. The subsequent binning of the scaffolds was based on three different criteria: highly similar tetranucleotide frequency (98% confidence level), coherent taxonomic classification according to BlastP search of the translated ORFs and similar GC content and read coverage in the metagenome. Using the software RNAmmer62, the 16S rRNAs present in the bulk assembly were extracted to classify the different bins of the bulk assembly phylogenetically. Bins corresponding to the GoM-Arch87 group were selected and refined. 
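One of the binning criteria above, similarity of tetranucleotide frequency between scaffolds, can be illustrated with a short Python sketch. This is not Metawatt's implementation: the function names and the use of a plain Pearson correlation as the similarity measure are assumptions made for illustration only.

```python
from collections import Counter
from itertools import product
import math

# All 256 possible tetranucleotides over the unambiguous DNA alphabet.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(seq):
    """Return the relative frequency of each of the 256 tetranucleotides.

    Windows containing ambiguous bases (for example N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in TETRAMERS)
    if total == 0:
        return {k: 0.0 for k in TETRAMERS}
    return {k: counts[k] / total for k in TETRAMERS}


def pearson(freq_a, freq_b):
    """Pearson correlation between two tetranucleotide frequency profiles."""
    xs = [freq_a[k] for k in TETRAMERS]
    ys = [freq_b[k] for k in TETRAMERS]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a flat profile
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy scaffold; real scaffolds are kilobases long.
profile = tetranucleotide_frequencies("ACGTACGTAC")
```

In practice, two scaffolds with highly correlated profiles would be candidates for the same bin, subject to the other criteria (taxonomic classification, GC content and coverage).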
The refinement started with a mapping of the raw reads (from complete libraries) to the selected bins (with a minimum identity of 90% the first time and 97% the next ones) using the bbmap tool from the BBMap package. The mapped reads were reassembled using SPAdes (same settings as for the bulk assembly), followed by binning in Metawatt. Contigs smaller than 1 kb were removed from the bin. The mate-pair read mapping information of the bin was used to create connectivity graphs using Cytoscape63, 64 and to remove poorly connected contigs. After bin refinement, its completeness was checked using AMPHORA265, which screens for 104 archaeal single copy genes; CheckM66, which analyses completeness and contamination based on lineage-specific marker sets, in our case Euryarchaeota and tRNAscan67, which screens for the different tRNA sequences. The final bins were used as draft genome of Ca. S. butanivorans and Ca. S. caldarius for automated gene annotation in RAST68 and genDB69 after gene prediction using Glimmer3.02 (ref. 70). After selecting the best annotation for each ORF using the automated annotation tool MicHanThi71, the GenDB results were visualized using the JCoast frontend72. All presented genes were manually curated afterwards. A HotSeep-1 bin was retrieved and annotated as described above for Ca. Syntrophoarchaeum. To compare our HotSeep-1 bin and the published draft genome of Ca. D. auxilii (CP013015), JSpecies1.2.1 (ref. 73) was used, which analyses the average nucleotide identity and the tetranucleotide frequency between two genomes. This method was also used to compare the two genome bins of Ca. Syntrophoarchaeum. 
Furthermore, the two HotSeep-1 strains were compared by checking the identity of the following genes: 16S rRNA, 23S rRNA, sulfate adenylyltransferase (sat), adenylylsulfate reductase subunit alpha (apr alpha), adenylylsulfate reductase subunit beta (apr beta) and dissimilatory sulfite reductase subunit alpha (dsr alpha) and of the internal transcribed spacer (ITS) region. To study genes encoding pili and cytochromes of HotSeep-1, genes of interest were identified. This selection was manually curated using Blastp and Pfam search. The subcellular localization of cytochromes was predicted using PSORTb (version 3.0.2 (ref. 74)). To search for canonical genes of hydrocarbon oxidation in the metagenome and the bins of Ca. S. butanivorans and Ca. S. caldarius, a protein database of anaerobic hydrocarbon oxidation genes was constructed. Full-length sequences from hydrocarbon degrading enzymes present in the Uniprot database were combined with recently published masD sequences47. These enzymes were AssA, BssA, MasD, the alpha subunit from naphthylmethylsuccinate synthase (Nms), the alpha subunit from a ring cleaving hydrolase (BamA), and pyruvate formate lyase (Pfl). The bulk assembly and the Ca. Syntrophoarchaeum draft genomes were searched against this database using Blastx with an E-value of 10^−5. Triplicate Butane50 cultures and duplicates of Ca. D. auxilii cultures were grown on their respective substrates (butane or hydrogen). Two active Butane50 cultures were incubated with bromoethanesulfonate (BES, 5 mM final concentration) and as growth control, one culture remained untreated. To check the effect of BES on the bacterial partner alone, hydrogenotrophically grown Ca. D. auxilii cultures were also treated with 5 mM of BES. Sulfate-reducing activity was determined by sulfide measurements as described above. The McrA amino acid sequences in the genomes of Ca. S. butanivorans and Ca. S. 
caldarius were extracted from the genomic data, and used for a phylogenetic reconstruction. 124 reference McrA protein sequences longer than 450 amino acids from public databases were aligned with Muscle3.7 (ref. 75); accession numbers of these sequences are provided in the Supplementary Table 4. After manual refinement of the alignment a masking filter accounting for the alignment ambiguity of each column was designed using the ZORRO software65. Phylogenetic trees were calculated using maximum likelihood algorithm RAxML (version 8.2.6 (ref. 76)) with the masking filter and the PROTGAMMA model with LG as amino acid substitution model and empirical base frequencies. These were the best-fitting conditions according to RAxML using both Akaike and Bayesian information criteria. To find the optimal tree topology, 149 bootstraps were calculated according to the bootstrap convergence criterion of RAxML. To verify results of the presented phylogenetic affiliation, the phylogenetic analyses were repeated using IQ-TREE77 with LG+I+F+C20 as substitution model on the same alignment (Supplementary Fig. 1a). To avoid the possibility of long branch attraction, further partial McrA sequences of Bathyarchaeota (Supplementary Table 4) were included and only the McrA sequence regions common between the partial McrAs of Bathyarchaeota and our previous set of full-length sequences (>300 residues) were considered for phylogenetic analysis. First, it was confirmed that using these regions for phylogenetic analysis resulted in a similar tree topology as using the full-length sequences by calculating a phylogenetic tree using RAxML (PROTGAMMALG+I+F) with the respective parts of all full-length sequences (the data set used in the previous phylogenetic analysis; Supplementary Fig. 1b). Then the partial sequences of Bathyarchaeota were included into the set to perform phylogenetic analysis of the common McrA sequence parts using both RAxML (PROTGAMMALG+I+F, Supplementary Fig. 
1c) and IQ-tree (LG+I+F+C20, Supplementary Fig. 1d). Finally, to check if the overall tree topology was influenced by the deeply-branching SCAL_000352 sequence, a tree using RAxML (PROTGAMMALG+I+F) with only full-length sequences but excluding the SCAL_000352 sequence was constructed (Supplementary Fig. 1e). All resulting trees were plotted using the iTol webserver78. To test the correct genome assembly and to confirm the presence of four mcrA genes per bin, an mcrA clone library was constructed. For each of the eight mcrA genes found in the two Ca. Syntrophoarchaeum bins primer sets were developed, which were used for PCR amplification from Butane50 culture DNA (Supplementary Table 5). PCR reactions (20 μl volume) were performed containing 1 μM primer each, 200 μM dNTPs, 1 × PCR buffer, and 0.5 U DNA polymerase (TaKaRa Taq, TaKaRa Bio Europe, France) under the following conditions: initial denaturation at 95 °C for 5 min, followed by 39 cycles of denaturation (96 °C, 1 min), annealing for 1 min, elongation (72 °C, 2 min), and a final elongation step (72 °C, 10 min). For two primer sets, amplification was done with Phusion High-Fidelity DNA Polymerase (Thermo Fisher Scientific, Germany) using 50 μl reactions containing 1.5 mM MgCl2, 3% (v/v) DMSO, 0.4 μM primer each, 50 μM dNTPs, 1 × PCR buffer, and 1 U DNA polymerase under the following conditions: initial denaturation at 98 °C for 30 s, followed by 39 cycles of denaturation (98 °C, 10 s), annealing for 30 s, elongation (72 °C, 50 s), and a final elongation step (72 °C, 10 min). For annealing temperatures for the individual primer sets see Supplementary Table 5. PCR resulted in multiple bands; therefore, amplicons of expected size were excised from a 1% agarose gel and purified using the MinElute Gel extraction kit (Qiagen, Germany). DNA was ligated in a pGEM T-Easy vector (Promega, Madison, WI) and transformed into E. coli TOP10 cells (Invitrogen, Carlsbad, CA) according to the manufacturer’s recommendations. 
Sequencing was performed by Taq cycle sequencing using a vector-specific primer (M13F or M13R) with a model ABI377 sequencer (Applied Biosystems). Sequence data were analysed with the ARB software package56. Total RNA was extracted from 100 ml of an active Butane50 culture, which was kept at 50 °C during the whole procedure: first most medium (>90%) was replaced by butane gas, whereas the biomass remained at the bottom of the bottle. Then RNA was preserved by adding 90 ml preheated RNAlater (Sigma-Aldrich; 10:1 RNAlater vs sample) for 1 h. Subsequently this mixture was filtered through an RNA-free cellulose nitrate filter (pore size 0.45 μm; Sartorius; Göttingen, Germany). The filter was extracted in an RNase-free tube with glass beads and 600 μl of RNA Lysis Buffer (Quick-RNA MiniPrep, Zymoresearch, USA) applying bead beating (2 cycles of 6 m s−1 for 20 s). The lysate was cleared by centrifugation (10,000g; 1 min) and the supernatant was used for RNA extraction with the Quick-RNA MiniPrep Kit (Zymoresearch, Irvine, CA, USA) according to the manufacturer’s guidelines but omitting the on-column DNase treatment step. The RNA extract was cleaned from DNA by incubating it at 37 °C for 40 min with 10 μl of DNase I (DNase I recombinant, RNA-free; Roche Diagnostics, Mannheim, Germany), 7 μl of 10 × incubation buffer (Roche) and 2 μl of RNase-Inhibitor (Protector RNase Inhibitor, Roche Diagnostics, Mannheim, Germany). DNases were inactivated by heating for 10 min to 56 °C. Subsequently the RNA was purified with the RNeasy MinElute Cleanup Kit (QIAGEN, Hilden, Germany). In total, 450 ng of high-quality RNA was obtained. The TruSeq Stranded Total RNA Kit (Illumina) was used for RNA library preparation. The rRNA depletion step was omitted. Of the total RNA, 80 ng (in 5 μl volume) was combined with 13 μl of ‘Fragment, Prime and Finish mix’, for the RNA fragmentation step according to the Illumina TruSeq stranded mRNA sample preparation guide. 
Subsequent steps were performed as described in the sample preparation guide. The library was sequenced on a MiSeq instrument with v3 sequencing chemistry in 2 × 75 cycles paired-end runs. The resulting reads were pre-processed including removal of adaptors and contaminants and quality trimming to Q10 using bbduk v34 from the BBMap package. Trimmed reads were used to quantify the 16S rRNA gene based phylogenetic composition of the library by phyloFlash as described above for the DNA paired-end library. Trimmed reads were also mapped to the bins of interest (Ca. S. butanivorans, HotSeep-1) using bbmap with a minimum identity of 97%. The expression level of each gene was quantified by counting the number of unambiguously mapped reads per gene using featureCounts79 with the -p option to count fragments instead of reads. To compare expression levels between genes, absolute fragment counts per gene were converted into fragments per kilobase of transcript per million mapped reads (FPKM80) as follows: $\mathrm{FPKM}_i = \dfrac{C_i \times 10^{9}}{L_i \times \sum_j C_j}$, where i denotes any specific gene, the sum over j runs over all the transcribed genes, C denotes counts and L denotes length (bp). For total protein analysis, the cells from 50 ml of grown (approximately 10 mM sulfide) Butane50 enrichment culture were harvested by centrifugation, frozen in liquid nitrogen and stored at −20 °C until analysis. The cell pellets were suspended in 30 μl of 50 mM ammonium bicarbonate buffer, and lysed by three 60 s freeze–thaw cycles between liquid nitrogen and +40 °C (thermal shaker, 1,400 r.p.m.). The cell lysate was incubated with 50 mM dithiothreitol at 30 °C for 1 h, followed by alkylation with 200 mM iodoacetamide for 1 h at room temperature, in the dark, and trypsin digestion (0.6 μg trypsin, Promega) overnight at 37 °C. 
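The FPKM conversion described above for the transcriptome data can be sketched in Python. The sketch assumes the standard definition (fragment count of gene i, times 10^9, divided by the gene length in bp times the total fragment count over all transcribed genes); the function name and the gene identifiers are hypothetical, and this is not the authors' script.

```python
def fpkm(fragment_counts, gene_lengths_bp):
    """Convert per-gene fragment counts into FPKM values.

    fragment_counts: dict mapping gene id -> number of mapped fragments (C)
    gene_lengths_bp: dict mapping gene id -> gene length in bp (L)
    Assumed formula: FPKM_i = C_i * 1e9 / (L_i * sum_j C_j)
    """
    total_fragments = sum(fragment_counts.values())
    if total_fragments == 0:
        raise ValueError("no mapped fragments; FPKM is undefined")
    return {
        gene: count * 1e9 / (gene_lengths_bp[gene] * total_fragments)
        for gene, count in fragment_counts.items()
    }


# Hypothetical example: geneB has three times the fragments of geneA
# but is also three times longer, so both get the same FPKM.
example = fpkm(
    {"geneA": 100, "geneB": 300},
    {"geneA": 1000, "geneB": 3000},
)
```

Normalizing by both gene length and library size is what makes the values comparable between genes of different lengths within the same library.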
Peptides were desalted using C18 Zip Tip columns (Millipore), and analysed by nLC–MS/MS using an LTQ-Orbitrap mass spectrometer (Thermo Fisher Scientific) equipped with a nanoUPLC system (nanoAquity, Waters) as described previously81. Peptide identification was conducted by Proteome Discoverer (version 184.108.40.206, Thermo Fisher Scientific) using the Mascot search engine with the annotated metagenome of Ca. Syntrophoarchaeum as a database81. Peptides were considered to be identified by Mascot when a probability of 0.05 (probability-based ion score threshold of 40) was achieved. emPAI values calculated by Mascot for identified proteins were used as semi-quantitative measure to estimate the abundance of proteins in the analysed sample82. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium83 via the PRIDE partner repository84. To synthesize 1-butyl-CoM and 2-butyl-CoM, 5 g of coenzyme M (Na 2-mercaptoethanesulfonate, purity 98%; Sigma Aldrich) were dissolved in 40 ml of a 30% (v/v) ammonium hydroxide solution, in serum vials. Twice the molar amount of 1-bromobutane (purity 99%; Sigma Aldrich) or 2-bromobutane (purity 98%; Sigma Aldrich) were added, the serum bottles were closed with butyl rubber septa and incubated at room temperature with vigorous shaking (500 r.p.m.) for 4 h. The aqueous phase was separated from the excess hydrophobic 1- or 2-bromobutane via separatory funnels. Residual, dissolved 1- or 2-bromobutane was removed by bubbling with nitrogen. The solutions were analysed for the presence of 1-butyl-CoM or 2-butyl-CoM by FT-ICR-MS analysis without further purification. Both solutions contained a major m/z peak at 197.0311; no m/z peaks were indicative of free CoM, CoM dimers, 1- or 2-bromobutane being detected. Both standards were stable and no interconversion of isomers was observed. 
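As a plausibility check on the m/z 197.0311 peak reported for the butyl-CoM standards, the monoisotopic m/z of the singly charged anion can be computed from its elemental composition. The sketch below assumes the composition C6H13O3S2 for the butyl-CoM anion (inferred here from the structure of coenzyme M carrying a butyl thioether; it is not stated in the text), and the function name is hypothetical. The calculated value agrees with the reported peak to within about 0.1 mDa.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotopes.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503223,
    "N": 14.00307400443,
    "O": 15.99491461957,
    "S": 31.9720711744,
}
ELECTRON_MASS = 0.000548579909


def monoisotopic_mz(formula, charge):
    """Return the monoisotopic m/z of an ion.

    formula: simple elemental formula without brackets, e.g. "C6H13O3S2"
    charge:  signed integer charge, e.g. -1 for a singly charged anion
    """
    if charge == 0:
        raise ValueError("charge must be non-zero")
    mass = 0.0
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC_MASS[element] * (int(number) if number else 1)
    # An anion carries extra electrons; a cation has lost electrons.
    mass -= charge * ELECTRON_MASS
    return mass / abs(charge)


# Assumed composition of the butyl-CoM anion (both isomers share it).
butyl_com_mz = monoisotopic_mz("C6H13O3S2", -1)
```

Because 1-butyl-CoM and 2-butyl-CoM are isomers, they share this m/z, which is why chromatographic separation and fragmentation are needed to tell them apart.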
For preparation of cell extracts, volumes of 20 ml were collected from grown Butane50 cultures (sulfide concentrations of 14–15 mM) under anoxic conditions. The cells were harvested by centrifugation (10 min, 10,000 r.p.m., 4 °C), washed twice with a 100 mM ammonium bicarbonate solution, and finally suspended in 1 ml of acetonitrile/methanol/water solution (40:40:20 v/v). Glass beads (0.1 mm diameter, Roth) were added (0.3 g per tube), and the cells were lysed with a PowerLyzer 24 bench top bead-based homogenizer (MO BIO Laboratories, Carlsbad, CA) using 5 cycles of 2,000 r.p.m. for 50 s, with a 15 s pause between cycles. Prior to use, the glass beads were treated with 1N HCl solution and washed twice with deionized water. Glass beads and cell debris were removed by centrifugation, and the aqueous cell extracts were stored in glass vials at 4 °C until analysis. Authentic standards and cell extract samples were measured with ultra-high-resolution mass spectrometry (SolariX XR 12T Fourier transform ion cyclotron resonance mass spectrometer, Bruker Daltonics Inc., Billerica, MA) with negative electrospray ionization (capillary voltage: 4.5 kV) in direct infusion mode (4 μl min−1 and 0.1 s accumulation time). Spectra were recorded with a 2 MWord time domain (0.42 s transient length) between m/z 74 and 3,000 resulting in a mass resolution of approximately 250,000 at m/z 200. Instrument mass accuracy was linearly calibrated with low-molecular mass fatty acids (C4–C12) between 88 and 199 Da, resulting in an average root-mean square error of the calibration masses of 39 p.p.b. (n = 7). For each measurement, 64 (Butane50 samples), or 128 (controls) spectra were co-added (lock mass: 143.10775 m/z) and internally recalibrated with naturally present fatty acids. Collision induced fragmentation of m/z 197 was carried out after quadrupole isolation (10 Da window) with 12 V collision energy and 128 scans per measurement (lock mass: 199.17035 m/z). 
The 1-butyl-CoM and 2-butyl-CoM standards were diluted to approximately 10 μg ml−1 and checked for appropriate collision energy and fragment pattern. Fragment masses 89.0430 (C4H9S−) and 80.9652 (HSO3−) were then used as indicative fragments for butyl-CoM in the cell extracts. The formation of an even-electron fragment HSO3− from bisulfite is favoured when a beta H atom is present85. However, SO3−• (m/z = 79.9674) was also produced upon fragmentation of the standards. Fragmentation information of the butyl-CoM standards was used to implement a UPLC–MS/MS method to validate the isomeric form of m/z 197.031 in the samples. A triple quadrupole mass spectrometer (Xevo TQ-S, Waters Corporation, Manchester, UK) in negative electrospray ionization mode was used in multiple reaction monitoring (MRM) mode. Indicative butyl-CoM transitions (m/z 197 > 89 and m/z 197 > 81) were initially optimized (cone voltage and collision energy) by direct infusion of standard solutions into the mass spectrometer. The mass spectrometer was coupled to a UPLC (ACQUITY I-Class, Waters Corporation, Milford, MA, USA) equipped with a reversed phase column (HSS T3, 25 cm, Waters) and run with a binary gradient (1% methanol in water to 90% methanol) at a flow rate of 0.3 ml min−1. For each analysis, 10 μl were injected into the UPLC. Retention time, presence of both MRM transitions and relative ion ratios as compared to the standards were used as quality criteria. Hydrogen production in the Butane50 culture was measured by analysing the headspace of replicate incubations which were constantly agitated on a shaking table in a 50 °C incubator. The butane-dependent sulfide production (and therefore potential hydrogen production) was determined by tracking the sulfide production (as above) for 4 weeks. 
Gas phase (1 ml) was sampled with a gas-tight syringe to determine hydrogen concentrations (i) before changing the headspace, (ii) after exchanging the headspace in 30 min intervals for 6 h, and (iii) the next day, before and after addition of sodium molybdate solution (10 mM final concentration) to the culture to stop potential hydrogen-dependent sulfate reduction. Gas phase was immediately injected into a Peak Performer 1 gas chromatograph (Peak Laboratories, Palo Alto, CA) equipped with a reducing compound photometer. Changes in hydrogen concentration were converted into hydrogen production rates and compared with potential hydrogen production rates according to a stoichiometry of 4:1 (H2 production vs sulfate reduction). A 100 ml grown Butane50 culture was concentrated by centrifugation at 2,000 r.p.m. using a Stat Spin Microprep 2 table-top centrifuge. Aliquots were placed in aluminium platelets of 150 μm depth containing 1-hexadecene86. The platelets were frozen using a Leica EM HPM100 high-pressure freezer (Leica Mikrosysteme, Wetzlar, Germany). The frozen samples were transferred to an Automatic Freeze Substitution Unit (Leica EM AFS2) and substituted at −90 °C in a solution containing anhydrous acetone, 0.1% tannic acid for 24 h and in anhydrous acetone, 2% OsO4, 0.5% anhydrous glutaraldehyde (Electron Microscopy Sciences, Ft. Washington, USA) for an additional 8 h. After a further incubation over 20 h at −20 °C samples were warmed up to +4 °C and washed with anhydrous acetone subsequently. The samples were embedded at room temperature in Agar 100 (Epon 812 equivalent) at 60 °C over 24 h. Thin sections (80 nm) were examined using a Philips CM 120 BioTwin transmission electron microscope (Philips Inc. Eindhoven, The Netherlands). Images were recorded with a TemCam F416 CMOS camera (TVIPS, Gauting, Germany); for additional images see Supplementary Fig. 4. All sequence data are archived in NCBI database under the BioSample number SAMN05004607. 
Representative full-length 16S rRNA gene sequences of the clone library of the Butane50 culture have been submitted to NCBI under accession numbers KX812780–KX812802. Draft genomes of the Ca. Syntrophoarchaeum organisms can be found under the BioProject accession numbers PRJNA318983 (Ca. S. butanivorans) and PRJNA319143 (Ca. S. caldarius). Metagenomic and metatranscriptomic reads have been submitted to the short read archive under accession number SRS1505411. The mass spectra of the proteomic data set have been deposited to the ProteomeXchange Consortium with the data set identifier PXD005038.
News Article | November 10, 2016
All strains were grown at 21–22 °C on nematode growth-medium plates seeded with E. coli OP50 bacteria32. For OP50 cultures, a single colony was inoculated into 100 ml of LB and grown for 48 h at 21–22 °C. Transgenic lines were generated by standard injection methods and included the desired transgene, a fluorescent co-injection marker and an empty vector, bringing the total DNA concentration up to 100 ng μl−1. For each transgene, three independent extrachromosomal lines that propagated the transgene at high rates were tested in parallel to account for variability typical of such strains. All mutagenized strains were back-crossed 5–7 times before characterization. Strain CX12311 bears ancestral alleles of the npr-1 and glb-5 genes, which affect oxygen sensitivity and are mutated in the N2 laboratory strain17; it is therefore used as a comparison strain for wild-type strains bearing the ancestral alleles. CX14697-CX14712, CX14731-CX14748, CX14750-CX14757, CX14783, CX14784, CX14786-CX14820, CX14822-CX14839. Genotypes inferred from low-coverage genomic sequence and behavioural data are included as Supplementary Table 1. DNAs are N2-derived unless otherwise noted. 
CX16884 kyIR163 V; kyEx5851 (Psrx-43N2::srx-43N2::sl2::GFP, 2.5 ng μl−1; Pmyo3::mcherry, 5 ng μl−1), CX17202 kyIR163 V; kyEx6012 (Psrx-43N2::srx-43N2(nonsense)::sl2::GFP, 2.5 ng μl−1, Pmyo3::mcherry, 5 ng μl−1), CX16881 srx-43(gk922634) V; kyEx5848 (srx-43N2, 2.5 ng μl−1, Pmyo3::mcherry, 5 ng μl−1); gk922634 changes R160 to an opal stop codon, CX17204 kyEx6013 (Psrx-43N2::srx-43N2::GFP, 50 ng μl−1, Pelt-2::GFP, 5 ng μl−1), CX16943 kyIR163 V; kyEx5894 (Psrx-43MY14::srx-43MY14::sl2::GFP, 2.5 ng μl−1, Pmyo3::mcherry, 5 ng μl−1), CX16425 kyIs602 (Psra-6::GCaMP3.0, 75 ng μl−1; Pcoel::GFP, 10 ng μl−1); kyEx5594 (Psra-6::srx-43N2, 50 ng μl−1, Pmyo3::mcherry, 5 ng μl−1), CX16931 kyIs602; kyEx5885 (Psra-6::srx-43MY14, 50 ng μl−1, Pmyo3::mcherry, 5 ng μl−1), CX17196 kySi66 (MosSCI Psrx-43N2::srx-43N2) II; srx-43(gk922634) V, outcrossed 4×, CX17198 kySi68 (MosSCI Psrx-43MY14::srx-43MY14) II; srx-43(gk922634) V, outcrossed 4×, CX17201 kySi71 (MosSCI Psrx-43N2::srx-43MY14) II; srx-43(gk922634) V, outcrossed 4×,CX17203 kySi72 (MosSCI Psrx-43MY14::srx-43N2) II; srx-43(gk922634) V, outcrossed 4×, FK181 ksIs2 (Pdaf-7::GFP + rol-6(su1006)), CX16958 kyIR163 V; ksIs2 CX16849 srx-43(gk922634) V, outcrossed 5× to N2. gk922634 is R160opal. This mutation was provided by the Million Mutation Project23. CX16935 kyIR163 srx-43(ky1019) V. ky1019 is a CRISPR/Cas9-induced indel mutation that causes a frame-shift mutation after the first transmembrane domain (insertion (TCACTGAGTTCGAAT), deletion (CCCCG), final sequence TCGCAGCTCTCAAGT TTCGGAATTCTC; insertion is underlined). We used the coCRISPR protocol described previously33. Young adults were injected with a mix of plasmids containing Cas9, guide RNA targeting rol-6, and guide RNA targeting the location of the desired mutation, as well as a ssDNA template for inducing a dominant rol-6(su1006) mutation. 
F1 animals with a roller phenotype were isolated and allowed to lay eggs before secondary screening for the target mutation by Sanger sequencing. Exploration assays8 were conducted on 35-mm Petri dishes evenly seeded with 100 μl of OP50 E. coli bacteria 24 h before the start of the assay. Individual two-day-old L4 hermaphrodites were placed in the centre of the plate. After 16 h, plates were placed on a grid containing 86 squares and the number of full or partial squares that contained tracks was quantified by an investigator blinded to the genotype. Pheromones or control solvent were mixed into the agar. A pheromone response for each animal on an ascaroside plate was determined with respect to the behaviour of control animals that were tested on ascaroside-free plates on the same day. Individual pheromone response was defined as the mean number of squares entered by controls tested on the same day minus the number of squares entered by the test animal. Group pheromone response was defined as the mean pheromone response of all individuals tested across days. For statistical analysis, n was defined as the total number of animals in the ascaroside group. N2-derived strains were tested in 21% oxygen (Figs 1, 2e, f and Extended Data Figs 1, 5). All naturally isolated strains and CX12311-derived strains bearing ancestral alleles of npr-1 and glb-5 were tested in 8% oxygen to suppress the oxygen-dependent roaming behaviour of ancestral npr-1 alleles12, 34 (Figs 2a, b, d and Extended Data Fig. 2). Direct examination of roaming and dwelling was modified from previously reported techniques8. At 14.5 h before testing, 25 L4 larvae were picked and placed onto 150-mm test plates thinly seeded with 1.5 ml of OP50 bacteria with or without synthetic pheromone. Video recording was conducted under red light to minimize behavioural response to imaging conditions. 1.5-h-long videos were recorded at 3 frames s−1 using Streampix software (Norpix Inc.) 
and a 6.6 MP PL-B781F CMOS camera (PixeLINK.). Custom Matlab scripts8 were employed to determine worm trajectories and conduct a two-state hidden Markov model determining the most probable state path for each animal and thereby measure roaming- and dwelling-state durations. The low basal exploration rate in daf-7(lf) mutants7 prevented a direct assessment of the effect of icas#9 on foraging behaviour. Instead, we examined daf-7(lf) daf-3(lf) double mutants, as daf-7 canonically acts by antagonizing daf-3, which encodes a co-SMAD transcriptional regulator. We found that daf-3 mutations suppressed the low basal exploration rate of daf-7 mutants. N2 daf-7(lf) daf-3(lf) animals explored control plates to a greater degree than the wild type, so larger (10 cm) exploration assay plates were used to score these strains. No statistical methods were used to predetermine sample size. Most experiments were repeated on three separate days. For exploration assays, the standard group size on a single day was six; this ensured sufficient power to detect moderate effects, while also limiting the influence of daily variation. All plates with a healthy adult animal at the end of the assay were scored and included in the analysis. Randomization was ensured using the following or similar approach: at the start of each exploration assay, six animals were placed on a pick at a time. They were then transferred individually to three control plates and then to three icas#9 plates in the order the animals came off the pick. Assays were scored by an experimenter blinded to the condition or genotype. Most statistical comparisons were performed using ANOVA with Dunnett’s correction for multiple comparisons or using a (two-sided) t-test, as noted in the figure legends. The normality of the data was tested with D’Agostino–Pearson omnibus test. Bartlett’s test was used to check for differences in variance between groups being statistically compared. N2 groups in Fig. 
3c and Extended Data Figure 5e did not pass a normality test. As n was large (>15), ANOVA was still an appropriate test. Moreover, the findings were still significant when a nonparametric test was used (Kruskal–Wallis with Dunn multiple comparison test). We grew 150 ml unsynchronized worm cultures for 9 days, fed on E. coli (HB101 or OP50), as described27. Extracts were generated from the culture medium and analysed by liquid chromatography–tandem mass spectrometry (LC–MS/MS) on a Thermo Scientific TSQ Quantum Access MAX, as described previously27, with the collision gas pressure set to 1 mTorr. Ascaroside concentrations present in the culture were quantified using the corresponding synthetic standards, with the following exceptions: synthetic ascr#18 was used to quantify ascr#22 and ascr#26 and synthetic icas#3 was used to quantify icas#1 and icas#10. The MY14–CX12311 recombinant inbred lines (RILs) were generated by crossing MY14 males with CX12311 hermaphrodites and CX12311 males with MY14 hermaphrodites, to ensure that the mitochondrial DNA from both strains was equally represented in the RILs. In total, 94 F2 animals were individually picked, placed onto plates and inbred through self-fertilization for 10 generations. RIL genotyping was conducted by low-coverage shotgun sequencing14. Genomic DNA was fragmented and attached to sequencing adapters with a Nextera DNA Library Prep Kit (Illumina). Samples were pooled and sequenced on an Illumina HiSeq 2000. Sequencing reads from each strain were mapped to the WS235 release of the C. elegans genome using the Burrows–Wheeler Aligner to create bam files for further analysis35. The set of MY14/N2 single nucleotide variants (SNVs) identified in the Million Mutation Project was used for genotyping purposes23. Each genetic variant was genotyped in each strain. Owing to the low coverage, the majority of SNVs were not genotyped.
To improve the data coverage, we grouped 200 neighbouring SNV genotypes together to create a consensus genotype for 540 bins (either N2, MY14 or heterozygous). These genotypes were used for QTL mapping. The pheromone response index was used as the phenotype in combination with the 540 genotype bins from above. R/qtl was used to perform a one-dimensional scan using marker regression on all 540 markers. The significance threshold was determined using 1,000 permutation tests. The effect size of the roam-1 locus was estimated using the fitqtl function with a single QTL. The peak of the roam-1 locus (chromosome V: 16,451,686–16,579,457) was used as an additive and interactive covariate for additional one-dimensional scans, assuming a normal model. The significance threshold for these two tests was also determined using 1,000 permutation tests. Before the detailed QTL mapping by sequencing described above, the roam-1 QTL was localized to 2.5 Mb (chrV: 14.3–16.8 Mb) by examining 14 high-confidence phenotypically extreme RILs (Supplementary Table 1). This result, which was confirmed by the full analysis, guided the initial generation of near-isogenic lines (NILs). The NIL kyIR142 was produced by backcrossing the RIL CX14816 nine times to MY14, maintaining N2 alleles at chrV: 14.3 and chrV: 16.8 Mb at each generation. The NIL kyIR139 was produced by backcrossing the RIL CX14708 nine times to CX12311, maintaining MY14 alleles at chrV: 14.3 and chrV: 16.8 Mb. The NIL kyIR144 was produced by crossing kyIR139 with N2 and isolating recombinants with the N2 allele of glb-5 (chrV: 5.56 Mb), the MY14 alleles at chrV: 14.3 and 16.8 Mb, and the N2 allele of npr-1 on chrX. The NILs kyIR147 and kyIR153 were created by crossing kyIR144 with N2 and identifying progeny with the N2 allele at chrV: 14.3 Mb and the MY14 allele at chrV: 16.8 Mb. We crossed kyIR147 with males from CX16290, an N2 strain with an integrated fluorescent marker at chrV: 15.83 Mb.
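The binning step described above (200 neighbouring SNV genotypes collapsed into one consensus call per bin) can be illustrated with a minimal Python sketch. The text does not say how the consensus was computed, so the majority-fraction rule, the 0.9/0.1 thresholds and the function name `consensus_bins` are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter

def consensus_bins(genotypes, bin_size=200, homozygous_frac=0.9):
    """Collapse per-SNV calls into one consensus call per bin.

    genotypes: list ordered by genomic position; each entry is "N2",
    "MY14", or None (SNV not covered by any read at low coverage).
    homozygous_frac is a hypothetical threshold, not from the source.
    """
    bins = []
    for start in range(0, len(genotypes), bin_size):
        # Keep only SNVs that were actually genotyped in this bin.
        calls = [g for g in genotypes[start:start + bin_size] if g is not None]
        if not calls:
            bins.append(None)  # no informative SNV in this bin
            continue
        n2_frac = Counter(calls)["N2"] / len(calls)
        if n2_frac >= homozygous_frac:
            bins.append("N2")
        elif n2_frac <= 1 - homozygous_frac:
            bins.append("MY14")
        else:
            bins.append("het")  # mixed evidence: call heterozygous
    return bins

# Three bins: sparse N2 calls, full MY14 calls, and an even mixture.
example = ["N2"] * 3 + [None] * 197 + ["MY14"] * 200 + ["N2", "MY14"] * 100
print(consensus_bins(example))
```

Pooling sparse calls in this way lets a bin be genotyped even when most of its individual SNVs have no read coverage, which is the motivation given in the text.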
F1 progeny were identified by fluorescence, picked to growth plates, and allowed to lay eggs for 12 h. Following 3 days of growth, ~2,600 non-fluorescent F2 animals were sorted individually into wells of 96-well plates by a worm sorter (COPAS Biosort Systems, Union Biometrica). These F2 were grown in 200 μl of S Basal buffer (5.85 g NaCl, 1 g K2HPO4, 6 g KH2PO4, 5 mg cholesterol per litre), supplemented with OP50 bacteria. A fraction of the F3 progeny from each isolate were lysed and genotyped at chrV: 16.069 Mb. Those with an N2 allele at chrV: 16.069 Mb were genotyped at chrV: 15.861 Mb. Twelve recombinants with an N2 allele at chrV: 16.069 Mb and a MY14 allele at chrV: 15.861 Mb were isolated and characterized behaviourally, among which were kyIR163 and kyIR157 (Fig. 2f). The N2 NIL with kyIR163 (182 kb of MY14 sequence) is referred to as roam-1. Calcium imaging experiments were performed and analysed as described previously36. In brief, young adult animals were placed into custom-made 3 mm2 microfluidic polydimethylsiloxane devices that permit rapid changes in stimulus solution. Each device contains two arenas, allowing for simultaneous imaging of two genotypes with approximately ten animals each. Animals were transferred to the arenas in S Basal buffer and paralysed for 80–100 min in 1 mM (−)-tetramisole hydrochloride. Experiments consisted of four pulses of 10 s of stimulus separated by 30 s of buffer, with an additional 60 s between stimulus types. Tiff stacks were acquired at 10 frames s−1 at 5× magnification (Hamamatsu Orca Flash 4 sCMOS), with 10 ms pulsed illumination every 100 ms (Sola, Lumencor; 470/40 nm excitation). Fluorescence levels were analysed using a custom ImageJ script that integrates and background-subtracts fluorescence levels of the ASH neuron cell body (4 × 4 pixel region of interest).
Using MATLAB, the calcium responses were normalized for each stimulus type by dividing fluorescence levels by the baseline fluorescence, defined as the average fluorescence of the 10 s preceding the first pulse of the stimulus. Each experiment was performed a total of four times over two separate days. Animals were pooled together by strain to calculate population mean and standard error (N2 srx-43 allele, 23 animals; MY14 srx-43 allele, 30 animals; array negative control, 19 animals). For GFP expression studies, live adult animals were mounted on 2% agarose pads containing 5 mM sodium azide. Images were collected with a 100× objective on a Zeiss Axio Imager.Z1 Apotome microscope with a Zeiss AxioCam MRm CCD camera. For daf-7 reporter studies, expression was quantified 16–24 h after L4 animals were placed on exploration assay plates. Images were processed in Metamorph and ImageJ to generate a maximum-intensity Z-projection. Reporter values were assessed as the mean grey value for a 16-pixel-radius circle centred over the cell body minus the mean background intensity. Both ASI neurons were analysed in each animal; experiments were performed over three days. Digital PCR was conducted on a QuantStudio 3D digital PCR platform (Thermo Fisher), and analysed on the QuantStudio 3D AnalysisSuite Cloud. The srx-43 mRNA expression studies were conducted on synchronized L4 worms 48 h after laying. RNA was collected on RNeasy Mini columns (Qiagen) and treated with DNase (Qiagen). SuperScript III First-Strand Synthesis System (Thermo Fisher) was used to create cDNA libraries. Custom TaqMan Expression Assays (Thermo Fisher) were used for srx-43 quantification, and the tubulin gene, tbb-1, was used for normalization of digital PCR. For quantitative analysis of the competition experiments, DNA was extracted with a standard phenol–chloroform protocol.
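The calcium-response normalization described above (fluorescence divided by the mean of the 10 s preceding the first stimulus pulse, at 10 frames s−1) can be sketched as follows. The authors used MATLAB; this Python version, the function name `normalize_trace` and its parameters are illustrative assumptions, not their script.

```python
def normalize_trace(fluorescence, first_pulse_frame, fps=10, baseline_s=10):
    """Return F/F0 for one background-subtracted fluorescence trace.

    fluorescence: list of per-frame values for one neuron.
    first_pulse_frame: index of the frame at which the first pulse starts.
    F0 is the mean of the baseline_s seconds just before that frame.
    """
    n_baseline = fps * baseline_s  # 100 frames at 10 frames per second
    start = first_pulse_frame - n_baseline
    if start < 0:
        raise ValueError("trace has less than the required baseline before the pulse")
    window = fluorescence[start:first_pulse_frame]
    f0 = sum(window) / len(window)
    if f0 == 0:
        raise ValueError("baseline fluorescence is zero; cannot normalize")
    return [f / f0 for f in fluorescence]

# Synthetic trace: flat baseline of 2.0, then a response that doubles it.
trace = [2.0] * 100 + [4.0] * 100
normalized = normalize_trace(trace, first_pulse_frame=100)
print(normalized[0], normalized[150])
```

A normalized value of 1.0 corresponds to baseline, so traces from animals with different reporter brightness can be pooled by strain on a common scale.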
Custom TaqMan SNP Genotyping Assays (Thermo Fisher) were used to determine the relative ratio of N2 versus roam-1 DNA by digital PCR. The assay was validated with known ratios of N2 to roam-1 DNA (Extended Data Fig. 9). To create the gene and organism phylogenies, we used SNV data downloaded from the Million Mutation Project (http://genome.sfu.ca/mmp/) or the CeNDR resource (http://www.elegansvariation.org). For the CeNDR dataset, MY14 was assumed to be clonal or near-clonal with MY23, as was suggested by RAD sequencing. Software was written in Python using the Biopython module to create a neighbour-joining tree. For the roam-1 locus, SNVs on chrV between 16,010,000 and 16,030,000 were used. For the glc-1 locus, SNVs on chrV between 16,181,000 and 16,222,000 were used. All SNVs were used to construct the whole-genome strain tree. The number of genetic variants and Tajima’s D were calculated on 5-kb bins using vcftools37. dN/dS was calculated by counting, using custom Python scripts that analysed variants between MY23 and the N2 reference. Phylogenies of srx-43 and closely related genes were constructed using protein sequences obtained previously38. Competition experiments consisted of three boom–bust cycles. During the boom phase, population growth led to the rapid depletion of food, initiating the bust phase, which lasted for two days. Simple lawn competition experiments were conducted on 100-mm NGM agar plates with a single lawn formed from 800 μl of saturated OP50 culture. Patchy lawn competition experiments were conducted on 150-mm NGM agar plates with a 200 μl ring-shaped OP50 lawn in the centre of the plate surrounded by 15 small 40-μl lawns (Fig. 5a); at the assay start and at transfers, animals were placed in the centre of the plate. Populations were initiated from 20 N2-type and 20 roam-1-type age-synchronized young adult animals. The initial population depleted food within 4 days, and on day 6 animals were washed into M9 media.
20% of the suspension was transferred to a new plate and the remainder was lysed for quantitative DNA analysis. For the second and third boom–bust cycles, food resources were depleted in 2 days and the plates were kept starved for an additional 2 days. Following the second bust phase, 20% of the animals were transferred to a new plate; following the third bust phase, the entire population was harvested for DNA extraction.
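As a rough illustration of how the digital PCR readout from the competition experiments might be summarized, the sketch below converts allele-specific copy counts into a roam-1 fraction for each boom–bust cycle. The function name `roam1_fraction` and the copy counts are hypothetical placeholders chosen only to make the example run; they are not data from the study.

```python
def roam1_fraction(n2_copies, roam1_copies):
    """Fraction of roam-1 DNA among all allele copies detected."""
    total = n2_copies + roam1_copies
    if total <= 0:
        raise ValueError("no allele copies detected")
    return roam1_copies / total

# Placeholder (N2, roam-1) copy counts after each of three cycles.
# These numbers are invented for illustration only.
cycles = [(500, 500), (450, 550), (400, 600)]
for cycle_number, (n2, roam1) in enumerate(cycles, start=1):
    print(f"cycle {cycle_number}: roam-1 fraction = {roam1_fraction(n2, roam1):.2f}")
```

Because populations were started from 20 N2-type and 20 roam-1-type animals, a fraction that moves away from 0.5 over successive cycles would indicate that one genotype is outcompeting the other under that lawn condition.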
Nextera AS | Date: 2010-02-24
The present invention provides an alternative scaffold for peptides displayed on filamentous phages through novel fusion proteins primarily originating from pIX. Libraries of filamentous phages can be created from fusion proteins, and a phage display system comprising a phagemid and a helper phage is a part of the invention. An aspect of the invention is a kit containing a phage display system comprising a phagemid that contains a nucleic acid encoding the fusion protein of the invention and a helper phage.
Nextera AS | Date: 2010-02-26
The present invention relates to a method for screening phage display libraries against each other. In particular, the invention relates to a method for screening at least two phage display libraries against each other to identify and/or select one or more interacting binding partners or binding molecules making up such interacting binding partners. Kits providing two bispecific phage display libraries are also provided.