Abcam | Date: 2014-05-20
The invention provides a rabbit-derived immortal B-lymphocyte capable of fusion with a rabbit splenocyte to produce a hybrid cell that produces an antibody. The immortal B-lymphocyte does not detectably express endogenous immunoglobulin heavy chain and may contain, in certain embodiments, an altered immunoglobulin heavy chain-encoding gene. A hybridoma resulting from fusion between the subject immortal B-lymphocyte and a rabbit antibody-producing cell is provided, as is a method of using that hybridoma to produce an antibody. The subject invention finds use in a variety of different diagnostic, therapeutic and research applications.
Abcam | Date: 2014-02-10
In certain embodiments, the method may comprise: a) obtaining the antibody sequences from a population of B cells; b) grouping the antibody sequences to provide a plurality of groups of lineage-related antibodies; c) testing a single antibody from each of the groups in a bioassay and, after the first antibody has been identified, d) testing further antibodies that are in the same group as the first antibody in a second bioassay. In another embodiment, the method may comprise: a) testing a plurality of antibodies obtained from a first portion of an antibody producing organ of an animal; b) obtaining the sequence of a first identified antibody; c) obtaining from a second portion of said antibody producing organ the sequences of further antibodies that are related by lineage to said first antibody; and d) testing the further antibodies in a second bioassay.
Abcam | Date: 2014-01-16
A method for producing a library of engineered-antibody producing cells is provided. In certain cases, the method includes isolating nucleic acid sequences encoding IgH variable regions and IgL variable regions from a plurality of antibody producing cells, and introducing the nucleic acids into host cells to obtain cells that produce antibodies comprising non-naturally paired IgH and IgL variable chains.
The use and care of animals complied with the guidelines of the Biomedical Research Ethics Committee at the Shanghai Institutes for Biological Science (CAS), which approved the application entitled ‘Reproductive physiology of cynomolgus monkey and establishment transgenic monkey’ (#ER-SIBS-221106P). Laparoscopy was used for oocyte collection. Oocytes were aspirated from follicles 2–8 mm in diameter, about 32–36 h after hCG stimulation31. The collected oocytes were cultured in the pre-equilibrated maturation medium32. Metaphase II arrested oocytes were selected for perivitelline space injection32 of lentiviruses and ICSI. The lentivirus concentration for injection was 1 × 10^10 viral genomes (vg) per ml. After microinjection, the oocytes were cultured in the maturation medium at 37 °C (in 5% CO2) for about 1 h, until fertilization by ICSI. Monkey semen was collected by penile electro-ejaculation. For ICSI, a single sperm was immobilized and aspirated with the tail first. A single oocyte was fixed by the holding pipette, and the injection pipette was pushed through the zona pellucida and subsequently through the oolemma to release the spermatozoon32. After ICSI, the oocytes were cultured in pre-equilibrated Hamster Embryo Culture Medium 9 (HECM-9) at 37 °C (in 5% CO2) until the next morning33, 34. Menstrual cycles of females were recorded daily. To synchronize the developmental stage of embryos with the recipient, monkeys were chosen for tubal embryo transfer at 0–3 days after ovulation, when a stigma or a new corpus luteum on the ovary could be observed by laparoscopy. About 2–3 pronuclear-stage embryos were selected for tubal transfer to each surrogate female31. Hair-root samples collected from newborn monkey pups were used to extract DNA.
Samples were digested with proteinase K overnight at 65 °C and the DNA was precipitated. PCR with specific primers against GFP and mCherry was used for initial genotyping analysis, as follows: mCherry-R: 5′-TGCTTGATCTCGCCCTTCAG-3′, mCherry-F: 5′-GCCATCATCAAGGAGTTCATGC-3′; GFP-F: 5′-AAGTTCATCTGCACCACCG-3′, GFP-R: 5′-TCCTTGAAGAAGATGGTGCG-3′. A total of 15 μg of genomic DNA was prepared and digested with BamHI and EcoRI, which released the transgenes. Genomic DNAs were separated on a 1% agarose gel and transferred to Nippon N+ membrane (GE). DNA probes from hMECP2-2a-GFP were prepared using the ready-to-go DNA label kit (279240D-20, GE Life Sciences). 32P-labelled probes were hybridized with blots of genomic DNAs and exposed to a phosphor-imager after extensive washing. Decisions on whether euthanasia procedures would be carried out for sick or aborted newborn monkeys were made by veterinarians after consulting with the principal investigators, following the approved protocol (#ER-SIBS-221106P). Aborted or sick MECP2 TG and WT monkeys were deeply anaesthetized with ketamine hydrochloride (5–10 mg kg−1) to avoid possible pain and then perfused with 0.9% saline with 2–4% paraformaldehyde (PFA) for further immunohistochemistry experiments. The procedure was approved by the Biomedical Research Ethics Committee at the Shanghai Institutes for Biological Science (CAS), as described in the protocol entitled ‘Reproductive physiology of cynomolgus monkey and establishment transgenic monkey’ (#ER-SIBS-221106P). After perfusion, the hemispheres of the brain were dissected, cut into small blocks, fixed with 4% PFA in phosphate buffer, and equilibrated in 30% sucrose. Fixed and equilibrated brain tissue blocks were cut into 30-μm cortical sections with a Microm HM525 cryostat.
Sections were washed for 5 min in PBS containing 5% bovine serum albumin (BSA) and 0.3% Triton X-100, and incubated with primary antibodies (in PBS with 3% BSA and 0.3% Triton X-100) overnight at 4 °C and subsequently with corresponding secondary antibodies (Alexa-Fluor-conjugated, Invitrogen, at 1:1,000). DAPI was used to label the nuclei and sections were mounted with 75% glycerol. Other antibodies used: HA antibody (Covance, MMS-101R), NeuN antibody (Millipore, MAB377), MeCP2 antibody (Cell Signaling, 3456S) and GFP antibody (Abcam, ab6673). Four sets of primers targeted to MECP2 were designed. One set (mecp2_1) was a cross-intron primer targeted to transgenic cDNA fragments, representing the copy number of transgenic DNA; the second (mecp2_2) was targeted to one exon of transgenic cDNA fragments, representing the total MECP2 copy number; and the other two primer sets (mecp2IN_1 and mecp2IN_2) were targeted to introns of the monkey MECP2 gene, representing the endogenous MECP2 copy number. Two sets of EGFP primers (EGFP_1 and EGFP_2) were designed to verify the copy number of the transgene, and one set of mCherry primers was designed as a negative control. The copy number of these target DNA fragments was measured using a custom-designed Multiplex AccuCopy kit (Geneskies Biotechnologies, CN0105). For each DNA fragment amplified, a piece of synthesized competitive double-stranded DNA of known concentration and with insertions or deletions of a few base pairs was added to the PCR reaction mix. Each PCR reaction was carried out by mixing the synthesized competitive double-stranded DNAs for target and reference genes (POP1, RPP14 and POLR2A) together with a defined amount of sample DNAs.
A multiplex competitive PCR was then performed to simultaneously amplify all reference and target genes from both sample and competitive DNAs using multiple fluorescence-labelled primer pairs. In brief, the 20-μl PCR reaction for each sample contained 1× AccuCopy PCR Master Mix, 1× Fluorescence Primer Mix, 1× Competitive DNA mix and ~10 ng sample DNA. The PCR program used was: 95 °C for 10 min; 11 cycles of 94 °C for 20 s, 65 °C (decreasing by 0.5 °C per cycle) for 40 s, 72 °C for 1.5 min; 24 cycles of 94 °C for 20 s, 59 °C for 30 s, 72 °C for 1.5 min; 60 °C for 60 min. PCR products were diluted 20-fold before being loaded on an ABI3730XL sequencer (Applied Biosystems) to separate amplicons of different sizes by capillary electrophoresis. Raw data were analysed using GeneMapper 4.0, and the peak ratios of sample DNA to competitive DNA (S/C ratio) for all target and reference fragments were exported to Excel. The S/C ratio of each target fragment was first normalized to the S/C ratio of the reference genes, and then further normalized to the median copy number of the entire data set. The final normalized ratio was averaged for each MECP2 primer and EGFP primer, and the similarity between the two ratios further confirmed the copy number of the transgene. Lentiviruses were produced by standard protocols and provided at a titre of 10^10 vg ml−1 by the Shanghai SBO Medical Biotechnology Co. Ltd. A total of 2 μg genomic DNA was used to construct a DNA library for each case35, 36, 37, 38. Sequencing linkers were further added onto genomic segments (length around 500–700 base pairs (bp)) (Extended Data Fig. 1b). After end repair and 3′ A-addition, the fragmented DNAs were ligated with a Y-shaped adaptor. Amplification was performed with the adaptor primers. Asymmetry-primer PCR (APP) was used to enrich the viral integration sites in each library. The APP method includes two PCR systems. The first PCR system includes only the LTR-specific primer.
After 12 cycles of linear amplification, the adaptor-specific primer was added to the PCR system, followed by 12 cycles of exponential amplification. PCR products were purified using 0.7 × AMPure beads (Beckman, A63882). The second PCR system uses a pair of primers nested within the primers of the first PCR system. After 12 cycles of linear amplification and 15 cycles of exponential amplification, the PCR products of 500–700 bp in size were isolated by agarose gel electrophoresis before being used to construct libraries with Illumina paired-end adapters according to the manufacturer's protocol and sequenced by Illumina MiSeq V3 (2 × 300 base paired ends). Only the paired-end reads showing fusions of viral sequences and cynomolgus (Macaca fascicularis) genome segments were selected, with two mismatches allowed. The reads showing the same integration position were merged and treated as a unique integration site. Experiments were repeated three times independently with different sequencing linkers. Insertion sites were determined using the following criteria: (1) the total insert count across the three experiments was greater than 100; (2) the site was detected at least twice in the three experiments. The cynomolgus monkey genome was taken from the following database: http://www.ncbi.nlm.nih.gov/genome/?term=crab+eating+monkey. Target sequences containing the LTR of transgene cassettes and genomic segments flanking the transgenes were analysed (Supplementary Tables 2 and 6). Monkey brain tissues were homogenized in RIPA buffer (containing 50 mM Tris-HCl, pH 7.4, 150 mM NaCl, 1% Triton X-100, 0.1% SDS, 1% sodium deoxycholate, protease inhibitor cocktail and phosphatase inhibitor cocktail) on ice and then centrifuged at 1,000g for 10 min at 4 °C. The supernatant was stored at −80 °C until use. Protein concentration was measured with the BCA method. Approximately 30 μg protein of each sample was loaded on a 10% SDS–PAGE gel and run at 120 V constant voltage.
A constant current of 0.36 mA was used for transblotting. Blots were probed with primary antibodies (1:1,000) overnight at 4 °C. After washing three times, blots were then incubated with goat anti-rabbit secondary antibody (1:3,000) at room temperature for 2 h. Chemiluminescence was used to visualize protein bands. Antibodies used: HA antibody (Abcam, ab9110), MeCP2 antibody (Cell Signaling, 3456S) and GFP antibody (Invitrogen, A11122). Fresh whole blood from the upper arms of monkeys was taken by a professional veterinarian in the morning before feeding. Whole blood (200–400 μl) was dropped onto filter paper immediately. After air drying, filter papers were stored at −20 °C before mass spectrometry analysis. An API2000 from AB SCIEX was used for analysing fatty acids and amino acids. Data were obtained from three independent rounds of blood collection. Behaviour observation and analysis were performed by two independent trained observers, with demonstrated inter-observer reliability of at least 80%. All observers were blinded to the genotypes of the monkeys. The cages used for living and behaviour monitoring measured 1.5 × 1 × 1.1 m. Monkeys were observed individually in an observation cage (1.5 × 1 × 1.1 m) after they had become accustomed to community living following weaning. The observation cage was similar to their home cage. All locomotion behaviours were video-recorded without interruption for 20 min each day for 5 days. Data from 5 days were pooled. Social behaviours of TG and WT monkeys with familiar and unfamiliar monkeys were studied by examining the interactions of monkeys from the same and different home cages, respectively. To study the interaction with familiar monkeys, we housed three groups of monkeys, each consisting of three WT and two TG monkeys of the same age, in three separate cages for 6 months before the observation (at about 1.5 years old).
In this analysis, the observer followed the time each monkey spent sitting together with another monkey for a duration of 1 h each day for 5 consecutive days. We defined two monkeys as sitting together when there were obvious interactions between the two for more than 3 s, during which the monkeys may exhibit touching and grooming behaviours or lean against each other. To study interaction with unfamiliar monkeys, we regrouped the females from the same cohorts after the above observation for another 8 months in four separate cages (see Supplementary Table 4a, b). (Males were kept together separately owing to their proximity to sexual maturity, and thus were not used for observation.) For each observation of social interaction, we paired two monkeys from different groups, and observation was made in the same manner as that described above for the interaction between familiar monkeys. To study the interaction with F1 TG monkeys, we housed two groups of monkeys (for group information see Supplementary Table 7), each consisting of three WT and two TG F1 monkeys of similar age (at 10–11 months old), in two separate cages before the observation. In this analysis, the observer followed the time each monkey spent sitting together with another monkey for a duration of 1 h each day for 5 consecutive days. The TAD behavioural model was used to assay the monkey’s response to human gaze (Extended Data Fig. 5a). In each session of observation, an individual monkey from either the transgenic or WT group was placed in an observation cage (1.5 × 1 × 1.1 m), and allowed to adapt to the cage alone for 9 min. An observer then sat in front of the cage at a distance of 2 m, showing the face profile to the monkey without eye contact for 9 min (‘non-gaze period’). This was followed by the relaxation period (3 min) without the human presence, and the ‘gaze period’ (9 min) in which the observer sat in front of the cage and gazed at the monkey with a neutral face. Behaviour and vocalizations were recorded on videotape39, 40, 41.
WGTA tests were performed on 8 TG and 6 WT monkeys at the age of 1.5 years, in accordance with the WGTA protocol25, 26, by trained technicians. The WGTA apparatus includes a testing box for observing the subject’s activity, a presentation board with food wells for placing rewards, a trial door and an access door connected by pulley cord to separate the subject and presentation board, and a camera for recording. All tests were carried out in a quiet room with standard lighting. This test includes three stages: adaptation, discrimination and reversal. For the adaptation step, each monkey was tested for the ability to take the food reward that was placed on the presentation board by the experimenter. Before the adaptation step, the monkey needed to pass several pre-test steps: the reward was placed in front of the food well, in the food well, in the food well next to the adaptation block, and in the food well half covered by the adaptation block. Finally, for the adaptation step, the monkey had to take the food from the food well completely covered by the block. Each monkey received a maximum of 25 trials per day, and was considered to have passed when showing correct responses on 23 out of 25 trials. During the discrimination step, each monkey needed to choose the only reward in the food well that was covered by either a black or white block, with an empty well covered by a block of the opposite colour. The same monkey was always rewarded with either black or white but with random location, with assignment of monkeys by the Gellerman order. Each monkey received 25 trials per day and was considered to have passed when showing correct responses on 23 out of 25 trials. For the reversal step, the procedure was the same as the discrimination step, except that the monkey was rewarded for black if white was rewarded during the discrimination step, and vice versa. This test includes four steps: adaptation, Hamilton search, Hamilton search set-breaking, and Hamilton search forced set-breaking.
The adaptation step was similar to that for the black/white test, with the same criterion for passing. For the Hamilton search step, four small boxes that represented the different positions from the experimenter’s left to right on the presentation board were used for testing. The only reward was randomly placed in one of the four closed boxes in each trial. The monkey was allowed to find the reward from these four closed boxes. A trial was terminated when the monkey opened the correct box. Each monkey performed 25 trials per day for 5 consecutive days. For the Hamilton search set-breaking step, the box that was the least preferred was first determined from the above step, and was always rewarded when chosen by the monkey. A trial was terminated when the subject opened the correct box. Each monkey performed 25 trials per day for 5 days. For the Hamilton search forced set-breaking step, the procedure was the same as the set-breaking test, except that the monkey was allowed to make only one choice for finding the reward that was placed in the least preferred box. The monkey was scored for the rate of correct choice over 25 trials each day for 5 consecutive days. The monkeys were tested for the ability to distinguish 240 pairs of toys. The toys in each pair were labelled A or B to cover the two food wells, one of which had food. For each monkey, either A or B was always rewarded. Each pair of toys was presented for 6 trials and 6 pairs were tested each day. Six different pairs were used on different days, with the test lasting 8 weeks until all 240 pairs were used. The monkey was scored for the rate of correct choice, averaged over 180 trials (5 days). Total RNA was extracted separately from three independent pieces of cortical tissue from the brains of T05, T07, T09 and T14 and four WT monkeys with Trizol reagent (Invitrogen). The RNA quality was checked by Bioanalyzer 2200 (Agilent) and the RNA was kept at −80 °C.
RNA with an RIN (RNA integrity number) > 8.0 was considered acceptable for cDNA library construction. RNA-seq and bioinformatic data analysis were performed by Shanghai Novelbio Ltd. The cDNA libraries for single-end sequencing were prepared using the Ion Total RNA-Seq Kit v2.0 (Life Technologies) according to the manufacturer’s instructions. The cDNA libraries were then processed for Proton sequencing according to the commercially available protocols. Samples were diluted and mixed; the mixture was processed on a OneTouch 2 instrument (Life Technologies) and enriched on a OneTouch 2 ES station (Life Technologies) to prepare the template-positive Ion PI Ion Sphere Particles (Life Technologies) according to the Ion PI Template OT2 200 Kit v2.0 (Life Technologies). After enrichment, the mixed template-positive Ion PI Ion Sphere Particles of the samples were loaded onto one P1v2 Proton Chip (Life Technologies) and sequenced on Proton Sequencers according to the Ion PI Sequencing 200 Kit v2.0 (Life Technologies). Before read mapping, clean reads were obtained from the raw reads by removing the adaptor sequences, reads with >5% ambiguous bases (noted as N) and low-quality reads containing more than 20% of bases with qualities of <13. The clean reads were then aligned to the crab-eating macaque genome (version: Mfa5.0) using the MapSplice program (v2.1.6). Preliminary experiments were performed to optimize the alignment parameters (-s 22 -p 15 --ins 6 --del 6 --non-canonical) to provide the most information on the AS events42. We applied the DEseq algorithm to filter the differentially expressed genes, after the significance analysis and false discovery rate (FDR) analysis, under the following criteria: (1) fold change > 1.5 or < 0.667; (2) FDR < 0.05 (ref. 43).
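The two filtering criteria above can be sketched as follows. This is only an illustration of the thresholding step, not the DEseq analysis used in the study: fold changes and FDR values are assumed to be already computed, and the function name and the example records are hypothetical.

```python
def is_differentially_expressed(fold_change, fdr,
                                fc_up=1.5, fc_down=0.667, fdr_cutoff=0.05):
    """Apply the two stated criteria: fold change > 1.5 or < 0.667, and FDR < 0.05."""
    passes_fold_change = fold_change > fc_up or fold_change < fc_down
    return passes_fold_change and fdr < fdr_cutoff


# Hypothetical (gene, TG/WT fold change, FDR) records, for illustration only.
records = [
    ("geneA", 2.10, 0.010),  # upregulated and significant
    ("geneB", 0.50, 0.030),  # downregulated and significant
    ("geneC", 1.20, 0.001),  # fold change within the excluded range
    ("geneD", 3.00, 0.200),  # FDR above the cutoff
]

selected = [gene for gene, fc, fdr in records
            if is_differentially_expressed(fc, fdr)]
print(selected)  # ['geneA', 'geneB']
```

Both thresholds are strict inequalities here, matching the way the criteria are written; a gene with a fold change of exactly 1.5 would not pass.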
A Volcano plot was drawn by P value based on the differential gene analysis, and the colour was determined by the filtering criteria (red, log (P value) > 1.5; blue, log (P value) < 1.5; black, log (FC(TG/WT)) < ±0.5). The F1 offspring was generated by ICSI using sperm obtained from testicular tissue xenografts of the T07 monkey. The method of testicular xenografting greatly shortened the time required for sexual maturation of TG monkey44.
No statistical methods were used to predetermine sample size. Experiments were not randomized, and investigators were not blinded to allocation during experiments and outcome assessment. The recombineering technique21 was adapted to construct all targeting vectors for homologous recombination in ES cells. Retrieval vectors were obtained by combining 5′ miniarm (NotI/SpeI), 3′ miniarm (SpeI/BamHI) and the plasmid PL253 (NotI/BamHI). SW102 cells21 containing a BAC encompassing the carboxy-terminal part of the gene encoding the remodeller, were electroporated with the SpeI-linearized retrieval vector. This allowed the subcloning of genomic fragments of approximately 10 kilobases (kb) comprising the last exon of the gene encoding each remodeller. The next step was the insertion of a TAP-tag into the subcloned DNA, immediately 3′ to the coding sequence. The TAP-tag was (Flag) -TEV-HA for Chd1, Chd2, Chd4, Chd6, Chd8, Ep400, Brg1 and 6His-Flag-HA for Chd9. We first inserted the TAP-tag and an AscI site into the PL452 vector, to clone 5′ homology arms as SalI/AscI fragments into the PL452TAP-tag vector. 46C ES cells were electroporated with NotI-linearized targeting constructs and selected with G418. In all cases, G418-positive clones were screened by Southern blot. Details on the Southern genotyping strategy, as well as sequences of primers and plasmids used in this study are available on request. Correctly targeted ES cell clones were karyotyped, and the expression of each tagged remodeller was controlled by western blot analysis, using antibodies against Flag and haemagglutinin (HA) epitopes (see Extended Data Fig. 6). We also verified by immunofluorescence, using monoclonal antibodies anti-Flag (M2, Sigma F1804) and anti-HA (HA.11, Covance MMS-101P) epitopes, that each tagged remodeller was properly localized in the nucleus of ES cells. ES cell lines expressing a tagged remodeller were all indistinguishable in culture from their mother cell line (46C). 
Pluripotency of tagged ES cell lines was verified by detecting alkaline phosphatase activity on ES cell colonies 5 days after plating, using the Millipore alkaline detection kit, following manufacturer’s instructions. In addition, we verified by immunofluorescence using an antibody against Oct4 (also known as Pou5f1) (Abcam ab19857, lot 943333) that expression of this pluripotency-associated transcription factor was uniform in each tagged ES cell line. Mouse 46C ES cells have been described previously22. 46C ES cells and their tagged derivatives were cultured at 37 °C, 5% CO2, on mitomycin C-inactivated mouse embryonic fibroblasts, in DMEM (Sigma) with 15% fetal bovine serum (Invitrogen), l-glutamine (Invitrogen), MEM non-essential amino acids (Invitrogen), penicillin/streptomycin (Invitrogen), β-mercaptoethanol (Sigma), and a saturating amount of leukaemia inhibitory factor (LIF), as described previously23. Mouse ES nucleosomal tags were acquired from a published MNase-seq data set7 to make the reference map shown in Fig. 2. Reference nucleosomes were called using MACS 2.0 before assigning the first MNase-resistant nucleosome upstream and downstream of TSSs as −1 and +1, respectively. Because long NFRs may actually contain MNase-sensitive nucleosome-like structures or histone-containing complexes, defining the first downstream MNase-resistant nucleosome as ‘+1’ is problematic, and so we refer to it as the ‘first stable nucleosome’. Regions between the associated −1 and +1 (or first stable) nucleosomes were defined as NFRs. We further defined narrow and wide NFR categories, which have median widths of 28 bp and 808 bp, respectively. We define HFRs as lacking histones as defined by ChIP-seq. The list of 14,623 genes used in Figs 1 and 2 was obtained by filtering all mm9 RefSeq genes24.
We removed redundancies (that is, genes having the same start and end sites), unmappable genes, blacklisted genomic regions (those with artefact signal regardless of which NGS techniques were used), and genes shorter than 2 kb. The purpose of this last filtering step was to unambiguously distinguish the promoter region from the end of the genes in heat maps. Lists of genes defined as having H3K4me3 and bivalent promoters: we first defined, among the 14,623 RefSeq genes, those with a promoter that was positive for H3K4me3 (accession number: GSM590111). This was accomplished by operating with the seqMINER platform. Tag densities from this data set were collected in a −500/+1,000-bp window around the TSS, and subjected to three successive rounds of k-means clustering, to remove all genes with a promoter that was clustered with low H3K4me3. We next conducted on this series of H3K4me3-positive promoters three successive rounds of k-means clustering, using several published data sets for H3K27me3. The genes with a promoter positive for H3K27me3 in four distinct H3K27me3 data sets (accession numbers: GSM590115, GSM590116, GSM307619 and GSM392046/GSM392047) were considered as bivalent. We eventually obtained a list of 6,481 genes with H3K4me3-only promoters, and a list of 3,411 bivalent genes. A detailed version of this protocol is available on the protocol exchange website: http://dx.doi.org/10.1038/protex.2014.040. In brief, about 400 million ES cells were fixed either with formaldehyde, or with a combination of disuccinimidyl glutarate (DSG) and formaldehyde (Supplementary Table 1), then permeabilized with IGEPAL, and incubated with 2,800 units of micrococcal nuclease (MNase, New England Biolabs) in order to fragment the genome into mononucleosomes (Extended Data Fig. 1). This nucleosome preparation was next incubated with agarose beads coupled with an antibody anti-HA or anti-Flag. Anti-HA-agarose (ref. A2095) and anti-Flag-agarose (ref. 
A2220) beads were purchased from Sigma. After a series of washes, tagged remodeller–nucleosome complexes were eluted, either by TEV protease cleavage or by peptide competition (Supplementary Table 1). The eluted complexes were then subjected to a second immunopurification step, using beads coupled to the antibody specific for the second HA or Flag epitope. After elution, DNA was extracted from the highly purified mononucleosome fraction, and processed for high-throughput sequencing (see below). As a negative control, chromatin from untagged ES cells was subjected to the same protocol to define background signal. Two biological replicates were used for each tagged and control ES cell line, using independent cell cultures and chromatin preparations. After crosslink reversal, phenol–chloroform extraction and ethanol precipitation, the DNA from remodeller–nucleosome complexes was quantified using the picogreen method (Invitrogen) or by running 1/20 of the ChIP material on a high sensitivity DNA chip on a 2100 Bioanalyzer (Agilent). Approximately 5–10 ng of ChIP DNA was used for library preparation according to the Illumina ChIP-seq protocol (ChIP-seq sample preparation kit). Following end-repair and adaptor ligation, fragments were size-selected on an agarose gel in order to purify nucleosome-sized genomic DNA fragments between 140 and 180 bp. Purified fragments were next amplified (18 cycles) and verified on a 2100 Bioanalyzer before clustering and single-read sequencing on an Illumina Genome Analyzer (GA) or GA II, according to manufacturer’s instructions. Sequencing characteristics are shown in Supplementary Table 1. Chd1, Chd2, Chd4, Chd6, Chd8, Chd9, Ep400 and Brg1 MNase remodeller ChIP-seq short reads were mapped to the mouse mm9 genome using Bowtie 0.12.7 with the following settings: -a -m 1 --best --strata -v 2 -p 3. Data sets were next converted to BED format files, and data analysis was performed using the seqMINER platform25 (Fig. 1c).
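The conversion of mapped reads to BED format mentioned above can be sketched as follows. The study does not say which tool was used for this step, so this is only an illustration under one assumption: that alignments were written in Bowtie 1's default tab-delimited output (read name, strand, reference name, 0-based leftmost offset, read sequence, qualities, and further fields). The function name and the example read are hypothetical.

```python
def bowtie_line_to_bed(line):
    """Convert one line of Bowtie 1 default output into a BED6 record."""
    fields = line.rstrip("\n").split("\t")
    read_name, strand, chrom, offset, sequence = fields[:5]
    start = int(offset)          # Bowtie reports a 0-based leftmost offset
    end = start + len(sequence)  # BED end coordinates are exclusive
    # The BED score column is set to 0 because no score is carried over.
    return "\t".join([chrom, str(start), str(end), read_name, "0", strand])


# Hypothetical alignment line, for illustration only.
example = "read_1\t+\tchr1\t1000\tACGTACGTAC\tIIIIIIIIII\t0\t"
print(bowtie_line_to_bed(example))
```

Because both formats use 0-based start coordinates, no offset correction is needed; only the end coordinate has to be derived from the read length.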
To examine the distribution of remodellers at individual genes, we used WigMaker3 (default settings) to convert BED files into wig files, which were uploaded onto the IGV genome browser (Extended Data Fig. 2). Nucleosome calls were made from MNase remodeller ChIP-seq tags using GeneTrack26 with the following parameters: sigma = 20, exclusion = 146. We then globally shifted tags to the median value of half distances of all nucleosome calls. GRO-seq tags10 sharing the same or opposite orientation with the TSS were assigned as ‘sense’ and ‘divergent’ tags, respectively. The orientation of each NFR was arranged so that sense transcription proceeds to the right. ES nucleosomal tags, globally shifted tags from MNase remodeller ChIP-seq (this current study), tags from DHS regions (Mouse ENCODE), GRO-seq oriented tags from transcriptionally engaged Pol II and CpG islands (UCSC, mm9 build) were then aligned to the midpoint of each NFR. Promoter regions were then sorted by NFR length and visualized by Java TreeView (Fig. 2a, b). CpG island information was retrieved from UCSC (mm9 build) and assigned to the closest TSS by using bedtools. We noticed that promoters with wide NFRs were mostly CpG island (CpGI)-rich, while those with narrow NFRs were globally CpGI-poor, in agreement with a previous report showing that CpGIs induce nucleosome exclusion9 (Fig. 2b). Tags from reference nucleosomes7, remodeller-interacting nucleosomes (this study) and transcriptionally engaged Pol II (GRO-seq)10 were aligned to nucleosome −1 and +1 (or the first stable nucleosome) dyad positions. The direction of each dyad was assigned according to the orientation of its associated TSS, the orientation of which was arranged so that the transcription proceeds to the right. After normalization to the gene count in the two different NFR subclasses, tags were plotted from the NFR midpoint to 500 bp distal to the reference nucleosome. 
An x axis gap in the NFR was introduced to normalize variations in NFR length inside each class. We used DNaseI-Seq data from the mouse ENCODE consortium (GSM1004653) for the identification of DHS regions in the mouse ES cell genome. DHS regions were defined using MACS 2.0 (ref. 27) (default setting), which resulted in the identification of 139,454 DHS regions. Each of these DHS regions was represented as a 500-bp window (−250 bp/+250 bp) centred on the midpoint of the DHS peak. DHS regions overlapping with the blacklisted (high background signal) genomic areas (mm9) were removed, resulting in a final list of 138,582 DHS regions. Tags from each tested ChIP-seq data set were summed up for each DHS region before pair-wise Pearson correlation comparison. The R2 value from each pair-wise Pearson correlation was then visualized by heat map (Fig. 1a). Pearson correlation analysis at promoter-like DHS regions. Operating with the seqMINER platform, we retrieved, from the 138,582 DHS regions list, those positive for H3K4me3, TBP and Pol II S5ph. We obtained 16,300 promoter-like DHS regions befitting the criteria. Pair-wise Pearson correlation was performed and plotted (Fig. 1b) as described for Fig. 1a. We used the pHYPER shRNA vector for remodeller depletion in ES cells, as previously described28. shRNA design was performed using DSIR software (http://biodev.extra.cea.fr/DSIR/DSIR.html). Below are the shRNAs selected for each remodeller. The sense strand sequence is given; the rest of the shRNA sequence is as described previously28. 
Chd1 shRNA 1: 5′-GCAAAGACGGCGACTAGAAGA-3′; Chd1 shRNA 2: 5′-GACAGTGCTTAATCAAGATCG-3′; Chd4 shRNA 1: 5′-GGACGACGATTTAGATGTAGA-3′; Chd4 shRNA 2: 5′-GCTGACGTCTTCAAGAATATG-3′; Chd6 shRNA 1: 5′-GTACTATCGTGCTATCCTAGA-3′; Chd6 shRNA 2: 5′-CAGTCAGAACCCACAATAACT-3′; Chd8 shRNA 1: 5′-GCAGTTACACTGACGTCTACA-3′; Chd8 shRNA 2: 5′-GACTTTCTGTACCGCTCAAGA-3′; Chd9 shRNA 1: 5′-TATACCAATTGAACAAGAGCC-3′; Chd9 shRNA 2: 5′-AGTTAAAGTCTACAGATTAGT-3′; Ep400 shRNA 1: 5′-GGTAAAGAGTCCAGATTAAAG-3′; Ep400 shRNA 2: 5′-GGTCCACACTCAACAACGAGC-3′; Smarca4 shRNA 1: 5′-ACTTCTTGATAGAATTCTACC-3′; Smarca4 shRNA 2: 5′-CCTTCGAACAGTGGTTCAATG-3′. Each shRNA was transfected in its corresponding tagged ES cell line, to follow remodeller depletion by western blotting using monoclonal antibodies anti-Flag (M2, Sigma F1804), or anti-HA (H7, Sigma H3663) epitopes (Extended Data Fig. 6), in comparison with the signal obtained with a control antibody anti-Gapdh (Abcam ab9485). The pHYPER shRNA vectors were transfected in ES cell by electroporation, using an Amaxa nucleofector (Lonza). Twenty-four hours after transfection, puromycin (2 μg ml−1) selection was applied for an additional 48 h period, before cell collection and RNA preparation, except for Chd4, for which cells were collected after 30 h of selection. Total RNA was extracted using an RNeasy kit (Qiagen). Total RNA yield was determined using a NanoDrop ND-100 (Labtech). Total RNA profiles were recorded using a Bioanalyzer 2100 (Agilent). For each remodeller, RNA was prepared from three independent transfection experiments, and processed for transcriptome analysis. 46C ES cells were amplified on feeder cells except for the last passage, at which point cells were plated onto 60-mm dishes coated with gelatine, and grown to 70% confluence in D15 medium with LIF. Total RNA was extracted using an RNeasy Kit (Qiagen). The RNA quality was verified on a 2100 Bioanalyzer. 
Library preparation was performed using the Illumina mRNaseq sample preparation kit according to manufacturer’s instructions. Briefly, the total RNA was depleted of ribosomal RNA using the Sera-mag Magnetic Oligo (dT) Beads (Illumina) and after mRNA fragmentation, reverse transcription and second strand cDNA synthesis the Illumina specific adaptors were ligated. The ligation product was then purified and enriched with 15 cycles of PCR to create the final library for single-read sequencing of 75 bp carried out on an Illumina GAIIx. To keep only sequences of good quality, we retained the first 40 bp of each read and discarded all sequences with more than 10% of bases having a quality score below 20, using FASTX-Toolkit (http://hannonlab.cshl.edu/fastx_toolkit/). Mapping of these sequences onto the mm9 assembly of mouse genome and RPKM computation were then performed using ERANGE v3.1.0 (ref. 29) and bowtie v0.12.0 (ref. 30). In brief, a splice file was created with UCSC known genes and maxBorder = 36. We created an expanded genome containing genomic and splice-spanning sequences using bowtie-build and bowtie was used to map the reads onto this expanded genome. Then the ERANGE runStandardAnalysis.sh script was used to compute RPKM values following steps previously described29, using a consolidation radius of 20 kb. Random-primed reverse transcription was performed at 52 °C in 20 μl using Maxima First strand cDNA synthesis kit (Thermo Scientific) with 1 μg of total RNA isolated from ES cells (Qiagen), quantified with NanoDrop instrument (Thermo Scientific). Reverse transcription products were diluted 40-fold before use. Composition of quantitative PCR assay included 2.5 μl of the diluted RT reaction, 0.2–0.5 mM forward and reverse primers, and 1× Maxima SYBR Green qPCR Master Mix (Thermo Scientific). Reactions were performed in a 10 μl total volume. 
Amplification was performed as follows: 2 min at 95 °C, followed by 40 cycles of 95 °C for 15 s and 60 °C for 60 s, in the ABI/Prism 7900HT real-time PCR machine (Applied Biosystems). The real-time fluorescent data from qPCR were analysed with the Sequence Detection System 2.3 (Applied Biosystems). Each qPCR reaction was performed using the set of primer pairs listed in Supplementary Table 2, validated for their specificity and efficiency of amplification. All reactions were performed in triplicate, using RNA prepared from three independent cell transfection experiments. Control reactions without enzyme were verified to be negative. Relative expression was calculated after normalization with three reference genes (Actb, Nmt1 and Ddb1), validated for this study. cRNA was synthesized, amplified and purified using the Illumina TotalPrep RNA Amplification Kit (Life Technologies) following the manufacturer’s instructions. In brief, 200 ng of RNA was used to prepare double-stranded cDNA using a T7 oligonucleotide (dT) primer. Second-strand synthesis was followed by in vitro transcription in the presence of biotinylated nucleotides. cRNA samples were hybridized to the Illumina BeadChips Mouse WG-6v2.0 arrays. These BeadChips contain 45,281 unique 50-mer oligonucleotides in total, with hybridization to each probe assessed at 30 different beads on average. A total of 26,822 probes (59%) are targeted at RefSeq transcripts, and the remaining 18,459 (41%) are for other transcripts. BeadChips were scanned on the Illumina iScan scanner using Illumina BeadScan image data acquisition software (version 2.3). Data were then normalized using the ‘normalize quantiles’ function in the GenomeStudio Software (version 1.9.0). Subsequent analyses were done using Genespring software (version 13.0-GX). For Brg1, we used a previously published transcriptome data set, in which loss of Brg1 function was obtained by genetic ablation18. 
All array analyses were undertaken using the Limma package from the R/Bioconductor software (R-Development-Core-Team, 2007). Microarray spot intensities were normalized using the RMA method as implemented in the R affy package. Normalized measures served to compute the log ratios for each gene between the wild-type strain and the Brg1 knockout mutant. Then, to identify genes with a log ratio significantly different between the mutant and wild-type strain, P values were calculated for each gene using a moderated t-test. The moderated t-test applied here was based on an empirical Bayes analysis and was equivalent to shrinkage (or expansion) of the estimated sample variances towards a pooled estimate, resulting in a more stable inference. Finally, adjusted P values were calculated using the false discovery rate (FDR)-controlling procedure of Benjamini and Hochberg. We identified deregulated genes using thresholds of 0.05 for the P value and 1.5 for the fold change (FC 1.5). This FC 1.5 threshold was chosen based on a previous study on Brg1 (ref. 18), and also because it was compatible with the analysis of the remodellers more modestly involved in transcriptional control in ES cells, such as Chd1, Chd6 and Chd8. Note that seemingly modest fold changes might arise from many sources, including a response lag, residual remodelling activity and relatively high experimental background. Using an FC 2 threshold, we could, however, confirm that Ep400, Chd4 and Brg1 are important transcriptional regulators in ES cells, with 535, 293 and 570 genes deregulated, respectively. This level of deregulation is indicative of a context-specific function of remodellers in transcriptional activation or repression, which is distinct from the function of general transcription factors, whose depletion is expected to affect most genes. 
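The multiple-testing correction and thresholding described above can be sketched in a few lines of Python. This is an illustrative stand-in, not the Limma workflow used in the study: it takes P values as given (the moderated t-test is not reproduced), and it assumes that the 0.05 cut-off is applied to the Benjamini-Hochberg-adjusted P value and that expression changes are stored as log2 ratios. The function names are hypothetical.

```python
import math

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def deregulated(genes, log2_ratios, pvalues, fold_change=1.5, alpha=0.05):
    """Genes passing both the adjusted-P and the fold-change thresholds.

    Assumption: the fold-change cut-off applies symmetrically to up- and
    downregulation, that is |log2 ratio| >= log2(fold_change).
    """
    adjusted = benjamini_hochberg(pvalues)
    cutoff = math.log2(fold_change)
    return [g for g, lr, p in zip(genes, log2_ratios, adjusted)
            if p < alpha and abs(lr) >= cutoff]

adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
hits = deregulated(["a", "b", "c", "d"], [1.0, 0.2, -0.7, 0.5],
                   [0.01, 0.04, 0.03, 0.005])
```

Raising `fold_change` to 2 corresponds to the stricter FC 2 threshold mentioned in the text.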
Statistical analysis of the differences in transcriptional activation and repression by remodellers was performed using a two-sample test for equality of proportions with continuity correction. For the generation of GC-content-based lists of promoters, we used the list of promoters defined in figure 3 of ref. 15, which we crossed with the 14,623 promoter list, to obtain a list of 6,317 promoters rank ordered according to GC content. In Fig. 3b, we compared the percentages of genes either down- or upregulated by loss of function of each remodeller in the following two groups: (1) NFR length classes: genes from the narrow and wide NFR classes shown in Fig. 2a were each further divided into two subclasses, which resulted in the following four categories: narrow NFR subclass 1 (NFR < 15 bp), narrow NFR subclass 2 (15–115 bp NFR), wide NFR subclass 1 (116–504 bp) and wide NFR subclass 2 (505–1,500 bp). Genes in these groups were further subdivided into H3K4me3 and bivalent subgroups. (2) GC content classes: genes were divided into four quartiles based on GC content at promoters and further subdivided into H3K4me3 and bivalent subclasses. The number of genes analysed in Fig. 3b is indicated in brackets for the following subgroups. H3K4me3 genes: narrow NFR subclass 1 (739), subclass 2 (1,829), wide NFR subclass 1 (2,613), subclass 2 (1,253), GC content quartile 1 (low GC content) (450), quartile 2 (719), quartile 3 (644), quartile 4 (high GC content) (430). Bivalent genes: narrow NFR subclass 1 (271), subclass 2 (866), wide NFR subclass 1 (2,266), subclass 2 (1,184), GC content quartile 1 (220), quartile 2 (485), quartile 3 (750) and quartile 4 (1149). FAIRE was performed as described31 with modifications. 46C ES cells were amplified as described above for RNA preparation. Formaldehyde was added directly to the growth media (final concentration 1%), and cells were fixed for 5 min at room temperature. 
After quenching with glycine (125 mM) and several washes, cells were collected, resuspended in 500 μl of cold lysis buffer (2% Triton X-100, 1% SDS, 100 mM NaCl, 10 mM Tris-HCl, pH 8.0 and 1 mM EDTA) and disrupted using glass beads for five 1-min sessions with 2-min incubations on ice between disruption sessions. Samples were then sonicated for 16 sessions of 1 min (30 s on/30 s off) using a Bioruptor (Diagenode) at maximum intensity, at 4 °C. After centrifugation, the supernatant was extracted twice with phenol–chloroform. The aqueous fractions were collected and pooled, and a final phenol–chloroform extraction was performed before DNA precipitation. FAIRE experiments were performed in triplicate, using independent ES cell cultures. Before sequencing, FAIRE DNA was analysed and quantified by running 1/25 of the FAIRE material on a high sensitivity DNA chip on a 2100 Bioanalyzer (Agilent, USA). Approximately 20 ng of FAIRE DNA was used for library preparation according to the manufacturer’s instructions using the ChIP-seq sample preparation kit (Illumina). Single-read sequencing (36 bp) was performed on a Genome Analyzer II (Illumina). ES cells were grown and transfected with shRNA vectors as described for RNA analysis. Biological replicates were obtained by performing two independent transfection experiments for each shRNA vector. ATAC-seq libraries were constructed by adapting a published protocol20. In brief, 50,000 cells were collected, washed with cold PBS and resuspended in 50 μl of ES buffer (10 mM Tris, pH 7.4, 10 mM NaCl, 3 mM MgCl2). Permeabilized cells were resuspended in 50 μl transposase reaction (1× tagmentation buffer, 1.0–1.5 μl Tn5 transposase enzyme (Illumina)) and incubated for 30 min at 37 °C. Subsequent steps of the protocol were performed as previously described20. Libraries were purified using a Qiagen MinElute kit and Ampure XP magnetic beads (1:1.6 ratio) to remove remaining adapters. 
Libraries were checked using a 2100 Bioanalyzer, and an aliquot of each library was sequenced at low depth on a MiSeq platform to assess the duplicate level and estimate DNA concentration. Each library was then paired-end sequenced (2 × 100 bp) on a HiSeq instrument (Illumina). As ATAC-seq libraries are composed in large part of short genomic DNA fragments, reads were cropped to 50 bp using trimmomatic-0.32 to optimize paired-end alignment. Reads were aligned to the mouse genome (mm9) using Bowtie with the parameters -m 1 --best --strata -X 2000, with two mismatches permitted in the seed (default value). The -X 2000 option allows fragments of <2 kb to align, and the -m 1 parameter keeps only uniquely aligning reads. Duplicated reads were removed with picard-tools-1.85. To perform differential analysis, libraries were adjusted to 33 million aligned reads using samtools-1.2 and by making a random permutation of the initial input libraries (shuf Linux command). Adjusted BAM data sets were next converted to BED. We used the seqMINER platform with the lists of 6,481 H3K4me3-only and 3,411 bivalent genes described above, to collect tag densities from ATAC-seq data sets, in a window of −2 kb/+2 kb around the TSS. Output tag density files were analysed using R software to establish the average ATAC-seq signal profiles shown in Extended Data Fig. 8. ES cells were grown and transfected with shRNA vectors as described above. Biological replicates were obtained by performing two independent transfection experiments for each shRNA vector. For each experiment, 1 million cells were fixed for 10 min in ES cell culture medium containing 1% formaldehyde, quenched with glycine (125 mM), washed with PBS buffer, collected in 175 μl of solution I (15 mM Tris-HCl, pH 7.5, 0.3 M sucrose, 60 mM KCl, 15 mM NaCl, 5 mM MgCl2 and 0.1 mM EGTA), and stored on ice. Cells were permeabilized by adding 175 μl of solution II (solution I with 0.8% Igepal CA-630 (Sigma)) and incubating for 15 min on ice. 
We next added 700 μl of MNase digestion buffer (50 mM Tris-HCl, pH 7.5, 0.3 M sucrose, 15 mM KCl, 60 mM NaCl, 4 mM MgCl2 and 2 mM CaCl2) and 4 U of MNase, and incubated for 10 min at 37 °C. MNase digestion was stopped by adding 10 mM EDTA (final concentration) and storing on ice. Cells were then disrupted by 15 passages through a 25 G needle, followed by a 10 min centrifugation at 18,000g. The supernatant was collected and incubated for 1 h at 65 °C with 15 μg of RNase A. We next added 10 μg of proteinase K, adjusted each sample to 0.1% SDS (final concentration) and incubated for 2 h at 55 °C. NaCl concentration was then adjusted to 200 mM and the samples were incubated overnight at 65 °C for crosslink reversal. DNA was purified from each sample by phenol–chloroform extraction followed by ethanol precipitation. Purified DNA (20 ng) was used for library preparation according to the manufacturer’s instructions, using the Ultralow Ovation library system (Nugen). Following end-repair and adaptor ligation, fragments were size-selected on an agarose gel in order to purify genomic DNA fragments between ~60 and 220 bp. Libraries were verified using a 2100 Bioanalyzer before clustering and paired-read sequencing. Sequencing of each sample was performed in a single lane of a HiSeq instrument (Illumina). The midpoint of each paired-end sequenced fragment was used to represent the dyad location of each nucleosomal tag. We assumed that remodeller depletion has no bulk effect on nucleosome occupancy; hence, the total reads of control and remodeller-depleted cells were adjusted to be the same. The adjusted tags were aligned to −1 nucleosome dyads (determined by the first MNase-defined peak upstream of the annotated RefSeq TSS), or to the first stable (MNase-defined) nucleosome dyad position downstream of the TSS, for the different NFR categories. These tags were further normalized to the number of genes in each NFR class. 
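The two bookkeeping steps described above, taking fragment midpoints as dyads and equalizing the total tag count between control and depleted samples, can be sketched as follows. The text does not say whether totals were matched by scaling or by downsampling; this sketch assumes simple multiplicative scaling, and the function names and data layouts are hypothetical.

```python
def fragment_midpoints(fragments):
    """Map paired-end fragments (chrom, start, end) to (chrom, dyad) tags."""
    return [(chrom, (start + end) // 2) for chrom, start, end in fragments]

def scale_to_control(control_counts, depleted_counts):
    """Rescale depleted-sample counts so both samples have the same total.

    Both arguments are {position: tag count} dictionaries. Scaling (rather
    than random downsampling) is an assumption made for this sketch.
    """
    factor = sum(control_counts.values()) / sum(depleted_counts.values())
    return {pos: n * factor for pos, n in depleted_counts.items()}

dyads = fragment_midpoints([("chr1", 100, 246), ("chr1", 300, 450)])
scaled = scale_to_control({0: 10, 5: 30}, {0: 5, 5: 15})
```

Equalizing totals encodes the stated assumption that depletion has no bulk effect on occupancy, so only the redistribution of tags between positions remains visible.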
The normalized tags were then binned (5 bp) and smoothed (10-bin moving average) before plotting (Fig. 3c). Distances (bp) are indicated relative to these reference points. An x axis gap in the NFR was introduced to normalize variations in NFR length inside each class. ES cells were grown and transfected with shRNA vectors as described above. Biological replicates were obtained by performing two independent transfection experiments for each shRNA vector. Following a 10 min fixation with 1% formaldehyde in ES cell culture medium, chromatin was prepared from 5–10 million cells and sonicated as described32. ChIP-exo experiments were carried out essentially as described33. This included an immunoprecipitation step using antibodies against Pol II (sc-899, Santa Cruz Biotechnology) attached to magnetic beads, followed by DNA polishing, A-tailing, Illumina adaptor ligation (ExA2), and lambda and recJ exonuclease digestion on the beads. After elution, a primer was annealed to ExA2 and extended with phi29 DNA polymerase, then A-tailed. A second Illumina adaptor was then ligated, and the products were PCR-amplified and gel-purified. Sequencing was performed using a NextSeq500. Sequence tags were mapped to the mouse genome (mm9) using BWA-MEM (version 0.7.9a-r786)34, and only uniquely aligned tags were used for downstream analysis. The 5′ ends of mapped tags, representing exonuclease stop sites, were consolidated into peak calls (sigma = 5, exclusion = 20) using GeneTrack26, and peak pairs were matched when found on opposite strands and 0–100 bp apart in the 3′ direction. Tags were globally shifted by the median value of the half distance between all peak pairs. These globally shifted tags were then aligned relative to the annotated RefSeq TSSs for H3K4me3-only and bivalent promoters separately, before the remodeller-affected genes were extracted. 
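The peak-pairing and global-shift steps for the ChIP-exo tags can be illustrated with a short Python sketch. It does not reproduce GeneTrack; in particular, pairing each forward-strand peak with the nearest reverse-strand peak in range is an assumption, as are the function names. Only the 0–100 bp pairing window and the use of the median half distance come from the text.

```python
from statistics import median

def match_peak_pairs(forward_peaks, reverse_peaks, max_gap=100):
    """Pair forward-strand peaks with a reverse-strand peak 0..max_gap bp
    downstream (3' direction). Nearest-partner choice is an assumption."""
    reverse_sorted = sorted(reverse_peaks)
    pairs = []
    for fwd in sorted(forward_peaks):
        candidates = [rev for rev in reverse_sorted if 0 <= rev - fwd <= max_gap]
        if candidates:
            pairs.append((fwd, candidates[0]))
    return pairs

def global_shift(pairs):
    """Median of the half distances between all matched peak pairs."""
    return median([(rev - fwd) / 2 for fwd, rev in pairs])

def shift_tags(tags, shift):
    """Move each (position, strand) tag by `shift` bp in its 3' direction."""
    return [(pos + shift if strand == "+" else pos - shift, strand)
            for pos, strand in tags]

pairs = match_peak_pairs([100, 300, 1000], [140, 360, 2000])
shift = global_shift(pairs)
shifted = shift_tags([(100, "+"), (140, "-")], shift)
```

After shifting, tags from the two strands of a paired peak move towards the midpoint between the two exonuclease stop sites.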
We assumed that remodeller depletion caused no bulk change in Pol II occupancy; hence, total tags in wild type and in all remodeller mutants were normalized to be the same. To allow direct comparison between different gene groups, we further normalized tags to the number of genes within each group. These normalized tags were then binned (5 bp) and smoothed (10-bin moving average) before plotting (Extended Data Fig. 9a). To examine changes in Pol II occupancy in remodeller mutants among different promoter groups, we first calculated total Pol II occupancy by summing tags from transcript start to end sites (annotated RefSeq TSS and TES, respectively24) for the tested genes. The change in Pol II occupancy was calculated by dividing the total Pol II occupancy of the mutant by that of the wild type, followed by log transformation and bar graph plotting (Extended Data Fig. 9b). Genes were rank-ordered according to reads per kb of transcript per million mapped reads (RPKM) and divided into four quartiles (highest: Q4, second: Q3, third: Q2 and lowest: Q1). Using the k-means clustering function of seqMINER, genes in each quartile were further subdivided into H3K4me3-only and bivalent genes, as described above. Using these lists of genes, tag densities from remodeller ChIP-seq data sets were collected in a window of −2 kb/+2 kb around the TSS, except for Chd2, for which densities were collected from the TSS to +4 kb. Output tag density files were first analysed using R software to establish average binding profiles. Statistical comparisons were performed between remodeller distributions at H3K4me3 promoters, to test for a significant increasing trend among distributions. Differences between successive pairs of quartiles (Q4 − Q3, Q3 − Q2 and Q2 − Q1) were compared against a null distribution using a one-sided t-test. The respective P values are reported for each remodeller: Chd1, Q4 − Q3 P = 1.371138 × 10⁻²⁷; Q3 − Q2 P = 1.728126 × 10⁻¹⁶; Q2 − Q1 P = 7.985217 × 10⁻²³. 
Chd2, Q4 − Q3 P = 7.543473 × 10⁻³³; Q3 − Q2 P = 1.115223 × 10⁻²⁵; Q2 − Q1 P = 3.283427 × 10⁻³⁸. Chd4, Q4 − Q3 P = 0.2094255; Q3 − Q2 P = 0.1081455; Q2 − Q1 P = 0.07202865. Chd6, Q4 − Q3 P = 0.4168748; Q3 − Q2 P = 0.1534144; Q2 − Q1 P = 0.01138035. Chd8, Q4 − Q3 P = 4.031959 × 10⁻¹⁵; Q3 − Q2 P = 1.231527 × 10⁻⁶; Q2 − Q1 P = 1.34455 × 10⁻⁹. Chd9, Q4 − Q3 P = 9.484578 × 10⁻⁴⁴; Q3 − Q2 P = 1.059783 × 10⁻¹⁴; Q2 − Q1 P = 4.646352 × 10⁻²⁸. Ep400, Q4 − Q3 P = 3.046796 × 10⁻²⁰; Q3 − Q2 P = 1.215304 × 10⁻¹⁴; Q2 − Q1 P = 6.462667 × 10⁻¹¹. Brg1, Q4 − Q3 P = 3.512021 × 10⁻²⁴; Q3 − Q2 P = 2.515217 × 10⁻⁷; Q2 − Q1 P = 0.977422. We concluded from this analysis that Chd1, Chd2, Chd9 and Ep400 binding at promoters is tightly linked to gene expression level. By contrast, Brg1, Chd4 and Chd6 deposition showed little correlation with gene expression level (the statistical test failed for at least one comparison for these remodellers). Although statistical analysis of Chd8 distributions indicated significant differences between quartiles, inspection of the distributions in Extended Data Fig. 3 showed that the Chd8 binding profile was intermediate between these two categories.
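The quartile comparison summarized above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' R analysis: the exact form of their one-sided t-test is not given, so the sketch substitutes a Welch t statistic with a normal approximation for the P value (reasonable only for large groups), and all names and the input layout are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance

def rpkm_quartiles(rpkm_by_gene):
    """Split genes into Q1 (lowest RPKM) to Q4 (highest RPKM)."""
    ranked = sorted(rpkm_by_gene, key=rpkm_by_gene.get)
    n = len(ranked)
    bounds = [0, n // 4, n // 2, (3 * n) // 4, n]
    return {f"Q{i + 1}": ranked[bounds[i]:bounds[i + 1]] for i in range(4)}

def one_sided_welch(higher, lower):
    """Test whether mean(higher) > mean(lower).

    Returns (t, p). The P value uses a normal approximation to the
    t distribution, which is an assumption of this sketch.
    """
    se = math.sqrt(variance(higher) / len(higher) + variance(lower) / len(lower))
    t = (mean(higher) - mean(lower)) / se
    return t, 1.0 - NormalDist().cdf(t)

quartiles = rpkm_quartiles({f"g{i}": float(i) for i in range(1, 9)})
t_up, p_up = one_sided_welch([5, 6, 7, 8], [1, 2, 3, 4])
t_down, p_down = one_sided_welch([1, 2, 3, 4], [5, 6, 7, 8])
```

In practice the two lists would hold per-gene remodeller tag densities for successive quartiles (for example Q4 and Q3), so a small P value indicates higher binding in the more highly expressed group.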