Freedman L.P.,Global Biological Standards Institute |
Gibson M.C.,Global Biological Standards Institute |
Ethier S.P.,Medical University of South Carolina |
Soule H.R.,Milken Institute |
And 2 more authors.
Nature Methods | Year: 2015
Quality control of cell lines used in biomedical research is essential to ensure reproducibility. Although cell line authentication has been widely recommended for many years, misidentification, including cross-contamination, remains a serious problem. We outline a multi-stakeholder, incremental approach and policy-related recommendations to facilitate change in the culture of cell line authentication.
News Article | September 8, 2016
When I was a student working in a cancer research lab, I ate, breathed and slept cell culture. This is the art of growing tissue (in my case, human skin cells) on the bottoms of plastic petri dishes, where they are kept alive in a specially-concocted brew of nutrients and growth factors. I became obsessed with the technique because I had to: many types of human tissue cells are fickle to grow and require near-constant attention. They can also easily become contaminated. Because tissue culture cells grow at human body temperature, invasion by bacteria was a constant annoyance. To avoid this, I constantly bathed my hands and forearms in disinfecting isopropyl alcohol. All the work was done inside a ventilation device called a fumehood. In short: you have to keep things very clean. Despite my fastidious efforts, many experiments were ruined, either because the tissue culture plate became contaminated with bacteria or mould, or because a rogue cell from another experiment slipped into the confines of the petri dish. Because cells grow exponentially in culture, any cells of dubious provenance can easily take over the whole dish. Suddenly, you’re dealing with a whole new type of culture, without even knowing it. Such mistakes may seem like schoolboy errors, but it turns out that having the wrong kind of cells in your culture dish is distressingly common. In August, the journal Science Translational Medicine released a paper showing that a line of brain cancer cells that’s been a stalwart of biological research for more than 50 years does not consist of the type of brain tissue they thought it was made from—it’s a completely different kind of brain cell. This is a huge problem, because different types of cells can react differently to the various drugs and chemicals they’re used to test. Experts worried it could invalidate results. And this isn’t the first time a problem like this has been identified. 
“[A] substantial proportion of cell lines is mislabeled or replaced by cell lines derived from a different individual, tissue or species,” notes a 2010 report, which states that this glaring problem has been known to the research community since 1950. Science is already being wracked by a reproducibility crisis, in which published and peer-reviewed results just can’t be replicated. A recent survey of more than 1,500 researchers in Nature, for example, revealed that more than 70 per cent tried and failed to reproduce another scientist’s experiments. To understand the scope of this, you need to step back and look at how these cell lines are made. They’re usually taken from a tumour, then replicated over and over, sometimes for decades. They are workhorses of biological research. Many of them are part of a library of cell cultures called the American Type Culture Collection (ATCC). They are used as test beds for everything from cancer drugs to new cosmetics. “[Misidentification of cell lines] is a somewhat common thing,” Vuk Stambolic, a professor in the Department of Medical Biophysics at the University of Toronto, who uses cell lines extensively in his research, told me in an interview. “There is a famous ‘triple-negative breast cancer’ cell line MDA-MB-435 that has been in the literature forever, that turns out to be a melanoma [skin cancer] line.” Problems with MDA-MB-435 have cropped up since 2000. Unwilling to give up on this very necessary form of research, scientists are now hunting for new methods to maintain the integrity of a cell line. The current standard is SNP (single nucleotide polymorphism) genotyping: a genetic fingerprinting technique that looks for unique, single-letter differences in the genetic code as they exist between cell types to provide a reliable identification system for their origins. 
That technique is also serving to underline the scope of the problem. “Only with SNP genotyping are we beginning to see the gravity of these errors,” said Stambolic. A consortium uses SNP genotyping to monitor the quality of cell lines: the International Cell Line Authentication Committee, or ICLAC. It hosts a database that lists more than 400 misidentified lines. But this is a voluntary organization and can’t hope to genetically fingerprint each cell line in existence. The solution may lie in self-regulation. Since 2013, some Nature journals require verification of a cell line’s identity through DNA fingerprinting before research results based on work done on them can be published. “People don’t know,” said Stephane Angers, a professor in the Faculty of Pharmacy at the University of Toronto. Angers uses cell lines to study how cells communicate with one another. “What people call a particular cell line in their lab is often very different from another lab. Now there’s more and more guidelines about running a few basic tests to validate what people are working with. We just got asked that for one of our papers. They asked that we provide identification for each cell line we used.” Although cells grown in dishes continue to be used extensively in biological research, their days are likely numbered. The advent of personalized medicine means that, increasingly, doctors and scientists will be performing research on a patient’s own cells, instead of anonymous (and possibly mislabeled) cell lines that have been the standard for decades. “While we are by no means abandoning cell lines,” said Stambolic, “most labs, including mine, are growing tissue directly out of patient biopsies and surgical specimens in the form of organoids [miniature, lab-grown organs] and find them more representative of the cancer problem, albeit much harder to manipulate than cell lines. 
This is an evolution of the cell line idea, and deeply rooted in the tremendous work done in cell lines so far.”
No statistical methods were used to predetermine sample size. The experiments were not randomized. The investigators were blinded to allocation of mice for assessment of histopathology and readouts of inflammation. E. coli strains were routinely cultured aerobically at 37 °C in lysogeny broth (LB) and on LB agar plates. B. abortus was cultured in tryptic soy broth or on tryptic soy agar (TSA) plates. Chlamydia muridarum strain Nigg II was purchased from ATCC (Manassas, VA). Bacteria were cultured in HeLa 229 cells in DMEM supplemented with 10% FBS. Elementary bodies (EBs) were purified by discontinuous density gradient centrifugations as described previously23 and stored at −80 °C. The HEK293 cell line was maintained in Dulbecco’s modified Eagle’s medium (DMEM) containing 10% FBS at 37 °C in a 5% CO₂ atmosphere. HEK293 cells (ATCC CRL-1573) were obtained from ATCC and were grown in 48-well tissue culture plates in DMEM containing 10% FBS until ~40% confluency was reached. HEK293 cells were transfected with a total of 250 ng of plasmid DNA per well, consisting of 25 ng of the reporter construct pNF-κB-luc, 25 ng of the normalization vector pTK-LacZ, and 200 ng of the different combinations of mammalian expression vectors carrying the indicated gene (empty control vector, pCMV-HA-VceC5, pCMV-HA-TRAF2DN (this study), hNOD1-3×Flag, hNOD2-3×Flag, pCMV-HA-hRip2, hNOD1DN-3×Flag, hNOD2DN-3×Flag or pCMV-HA-Rip2DN24 and pCMV-myc-CDC42DN25). The dominant-negative form of TRAF2, lacking an amino-terminal RING finger domain26, was PCR amplified from cDNA prepared from HEK293 cells and cloned into the mammalian expression vector pCMV-HA (BD Biosciences Clontech). Forty-eight hours after transfection, cells were either lysed without any treatment or stimulated with C12-iE-DAP (1,000 ng ml−1, InvivoGen) and MDP (10 μg ml−1, InvivoGen). After five hours of treatment the cells were lysed and analysed for β-galactosidase and luciferase activity (Promega). 
FuGene HD (Roche) was used as a transfection reagent according to the manufacturer’s instructions. Cell lines were monitored for mycoplasma contamination. Bone-marrow-derived macrophages (BMDMs) were differentiated from bone marrow precursors from femurs and tibiae of C57BL/6 mice obtained from The Jackson Laboratory (Bar Harbor, ME), Nod1+/−Nod2+/− (wild-type littermates) and Nod1−/−Nod2−/− (NOD1/NOD2-deficient) mice (generated at UC Davis) as described previously27. For BMDM experiments, 24-well microtitre plates were seeded with macrophages at a concentration of 5 × 10⁵ cells per well in 0.5 ml of RPMI media (Invitrogen, Grand Island, NY) supplemented with 10% FBS and 10 mM L-glutamine (complete RPMI) and incubated for 48 h at 37 °C in 5% CO₂. BMDMs were stimulated with C12-iE-DAP (1,000 ng ml−1, InvivoGen), MDP (10 μg ml−1, InvivoGen), thapsigargin (1 μM and 10 μM, Sigma-Aldrich), dithiothreitol (DTT) (1 mM, Sigma-Aldrich), and LPS (10 ng ml−1, InvivoGen) with or without pre-treatment (30 min) of the cells with IRE1α kinase inhibitor KIRA6 (1 μM, Calbiochem), IRE1α endonuclease inhibitor STF-083010 (50 μM, Sigma-Aldrich), PERK inhibitor GSK2656157 (500 nM, Calbiochem) and tauroursodeoxycholate TUDCA (200 μM, Sigma-Aldrich) in the presence of 1 ng ml−1 of recombinant mouse IFNγ (BD Bioscience, San Jose, CA). After 24 h of stimulation, samples for ELISA and gene expression analysis were collected as described below. Preparation of the B. abortus wild-type strain 2308 and the ∆vceC mutant inoculum and BMDM infection was performed as previously described27. Approximately 5 × 10⁷ bacteria in 0.5 ml of complete RPMI were added to each well containing 5 × 10⁵ BMDMs. Microtitre plates were centrifuged at 210g for 5 min at room temperature in order to synchronize infection. Cells were incubated for 20 min at 37 °C in 5% CO₂, and free bacteria were removed by three washes with PBS, and the zero-time-point sample was taken as described below. 
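The inoculum figures above (about 5 × 10⁷ bacteria added in 0.5 ml to wells holding 5 × 10⁵ BMDMs) imply a bacteria-to-macrophage ratio that the text does not state outright. The following is a minimal arithmetic sketch of that ratio; the function names are illustrative and are not taken from the paper.

```python
def multiplicity_of_infection(bacteria_per_well: float, cells_per_well: float) -> float:
    """Return the number of bacteria added per host cell in one well."""
    if cells_per_well <= 0:
        raise ValueError("cells_per_well must be positive")
    return bacteria_per_well / cells_per_well


def inoculum_concentration(bacteria_per_well: float, volume_ml: float) -> float:
    """Return the bacterial concentration (per ml) of the inoculum added to a well."""
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    return bacteria_per_well / volume_ml


# Values quoted in the text: ~5e7 bacteria in 0.5 ml onto 5e5 BMDMs per well.
moi = multiplicity_of_infection(5e7, 5e5)
concentration = inoculum_concentration(5e7, 0.5)
print(f"Bacteria per macrophage: {moi:.0f}")
print(f"Inoculum concentration: {concentration:.1e} bacteria per ml")
```

Under these numbers each well receives roughly 100 bacteria per macrophage; the count is described as approximate in the text, so the ratio is approximate as well.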
After the PBS wash, complete RPMI plus 50 mg ml−1 gentamicin and 1 ng ml−1 of recombinant mouse IFNγ (BD Bioscience, San Jose, CA) was added to the cells, and incubated at 37 °C in 5% CO₂. For cytokine production assays, supernatant from each well was sampled at 24 h after infection. In order to determine bacterial survival, the medium was aspirated at the time point described above, and the BMDMs were lysed with 0.5 ml of 0.5% Tween 20, followed by rinsing each well with 0.5 ml of PBS. Viable bacteria were quantified by serial dilution in sterile PBS and plating on TSA. For gene expression assays, BMDMs were suspended in 0.5 ml of TRI-reagent (Molecular Research Center, Cincinnati) at the time points described above and kept at −80 °C until further use. At least three independent assays were performed with triplicate samples, and the standard error of the mean for each time point was calculated. All mouse experiments were approved by the Institutional Animal Care and Use Committees at the University of California, Davis, and were conducted in accordance with institutional guidelines. Sample sizes were determined based on experience with infection models and were calculated to use the minimum number of animals possible to generate reproducible results. C57BL/6 wild-type mice and Rip2−/− mice (The Jackson Laboratory), Nod1+/−Nod2+/− (wild-type littermates) and Nod1−/−Nod2−/− (NOD1/NOD2-deficient) mice (generated at UC Davis) were injected intraperitoneally (i.p.) with 100 μl of 2.5 mg per kg body weight of thapsigargin (Sigma-Aldrich) at 0 and 24 h, and 4 h after the second injection the mice were euthanized and serum and tissues collected for gene expression analysis and detection of cytokines. Where indicated, mice were treated i.p. at 12 h before the first thapsigargin dose and 12 h before the second thapsigargin dose with the ER stress inhibitor TUDCA (250 mg per kg body weight). 
Female and male C57BL/6, Nod1+/−Nod2+/−, Nod1−/−Nod2−/− and Rip2−/− mice, aged 6–8 weeks, were held in micro-isolator cages with sterile bedding and irradiated feed in a biosafety level 3 laboratory. Groups of five mice were inoculated i.p. with 0.2 ml of PBS containing 5 × 10⁵ CFU of B. abortus 2308 or its isogenic mutant ∆vceC, as previously described28. At 3 days post-infection, mice were euthanized by CO₂ asphyxiation and their serum and spleens were collected aseptically at necropsy. The spleens were homogenized in 2 ml of PBS, and serial dilutions of the homogenate were plated on TSA for enumeration of CFU. Spleen samples were also collected for gene expression analysis as described below. When necessary, mice were treated i.p. at day one and two post-infection with a daily dose of 250 mg per kg body weight of the ER stress inhibitor TUDCA (Sigma-Aldrich), or 10 mg per kg body weight of the IRE1α kinase inhibitor KIRA6 (Calbiochem) or vehicle control. For the placentitis mouse model, C57BL/6, Nod1+/−Nod2+/− and Nod1−/−Nod2−/− mice, aged 8–10 weeks, were held in micro-isolator cages with sterile bedding and irradiated feed in a biosafety level 3 laboratory. Female Nod1+/−Nod2+/− mice were mated with male C57BL/6 mice (control mice) and female Nod1−/−Nod2−/− mice were mated with male Nod1−/−Nod2−/− mice (NOD1/NOD2-deficient), and pregnancy was confirmed by presence of a vaginal plug. At 5 days of gestation, groups of pregnant mice were mock infected or infected i.p. with 1 × 10⁵ CFU of Brucella abortus 2308 or its isogenic mutant ∆vceC (day 0). At 3, 7 and 13 days after infection mice were euthanized by CO₂ asphyxiation and the spleen and placenta of dams were collected aseptically at necropsy. At day 13 after infection (corresponding to day 18 of gestation), viability of pups was evaluated based on the presence of fetal movement and heartbeat, and fetal size and skin colour. 
Fetuses were scored as viable if they exhibited movement and a heartbeat, visible blood vessels, bright pink skin, and were of normal size for their gestational period. Fetuses were scored as non-viable if fetal movement, heartbeat, and visible blood vessels were absent, skin was pale or opaque, and their size for gestational period or compared to littermates was small, or they showed evidence of fetal reabsorption. Percentage of viability was calculated as [(number viable pups per litter/total number pups per litter) × 100]. At each time point, the placenta samples were collected for bacteriology, gene expression analysis and blinded histopathological analysis (Extended Data Fig. 6d). When indicated, mice were treated i.p. at days 5, 7 and 9 post-infection with a daily dose of 250 mg per kg body weight of the ER stress inhibitor tauroursodeoxycholate TUDCA (Sigma-Aldrich) or vehicle control. RNA was isolated from BMDMs and mouse tissues using Tri-reagent (Molecular Research Center) according to the instructions of the manufacturer. Reverse transcription was performed on 1 μg of DNase-treated RNA with Taqman reverse transcription reagent (Applied Biosystems). For each real-time reaction, 4 μl of cDNA was used combined with primer pairs for mouse Actb, Il6, Hspa5 and Chop. Real time transcription-PCR was performed using Sybr green and an ABI 7900 RT–PCR machine (Applied Biosystems). The fold change in mRNA levels was determined using the comparative threshold cycle (CT) method. Target gene transcription was normalized to the levels of Actb mRNA. Cytokine levels in mouse serum and supernatants of infected BMDMs were measured using either a multiplex cytokine/chemokine assay (Bio-Plex 23-plex mouse cytokine assay; Bio-Rad), or via an enzyme-linked immunosorbent assay (IL-6 ELISA; eBioscience), according to the manufacturer’s instructions. Cytotoxicity was determined by using a LDH release assay in supernatant of BMDMs treated as described above. 
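Two calculations described above lend themselves to a short sketch: the percentage of viable pups per litter, and the fold change from the comparative threshold cycle method with normalization to Actb. The text does not spell out the fold-change formula, so the sketch assumes the common 2^(−ΔΔCT) form, which presumes roughly 100% amplification efficiency. The function names and the example threshold cycle values are hypothetical.

```python
def percent_viability(viable_pups: int, total_pups: int) -> float:
    """Percentage of viable pups in a litter: (viable / total) x 100."""
    if total_pups <= 0:
        raise ValueError("total_pups must be positive")
    return viable_pups / total_pups * 100.0


def fold_change_ddct(ct_target_treated: float, ct_ref_treated: float,
                     ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change by the comparative threshold cycle method.

    Assumption: 2 ** -(ddCT), i.e. product doubles every cycle. The reference
    gene (Actb in the text) normalizes each sample before the comparison.
    """
    delta_treated = ct_target_treated - ct_ref_treated    # dCT, treated sample
    delta_control = ct_target_control - ct_ref_control    # dCT, control sample
    return 2.0 ** -(delta_treated - delta_control)


# Hypothetical litter: 6 viable pups out of 8.
print(percent_viability(6, 8))

# Hypothetical CT values: Il6 crosses threshold 3 cycles earlier after treatment,
# while Actb is unchanged.
print(fold_change_ddct(24.0, 18.0, 27.0, 18.0))
```

A target that reaches threshold three cycles earlier relative to the reference corresponds to an eightfold increase under this assumption.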
LDH release assay was performed using a CytoTox 96 Non-Radioactive Cytotoxicity Assay (Promega), following the manufacturer’s protocol. The percentage of LDH release was calculated as follows: Percentage of LDH release = 100 × (absorbance reading of treated well − absorbance reading of untreated control)/(absorbance reading of maximum LDH release control − absorbance reading of untreated control). The kit-provided lysis buffer was used to achieve complete cell lysis and the supernatant from lysis-buffer-treated cells was used to determine maximum LDH release control. HeLa 229 cells (ATCC CCL-2.1) were cultured in 96-well tissue culture plates at a concentration of 4 × 10⁴ cells per well in Dulbecco’s Modified Eagle Medium (DMEM) (Life Technologies, Grand Island, NY) supplemented with 10% FBS. HeLa 229 cells were transfected with a total of 125 ng of pCMV-HA-Rip2DN or empty control vector per well. At 24 h post-transfection, HeLa 229 cells were treated with Dextran to enhance infection efficacy before they were infected with 1.7 × 10⁵ Chlamydia bacteria per well. The plates were centrifuged at 2,000 r.p.m. for 60 min at 37 °C, then incubated for 30 min at 37 °C in 5% CO₂. Supernatant was discarded and replaced with DMEM containing 1 μg ml−1 cycloheximide (Sigma Aldrich) and, where indicated, 1 μM KIRA6, 10 μM thapsigargin or 10 μg ml−1 MDP was added to cultures before incubation at 37 °C in 5% CO₂ for 40 h. For gene expression assays, HeLa 229 cells were suspended in Tri-reagent (Molecular Research Center, Cincinnati) and RNA was isolated. Infection efficiency was confirmed in separate plates by staining Chlamydia-infected HeLa 229 cells with anti-Chlamydia MOMP antibody and counting bacteria under a fluorescent microscope. Four independent assays were performed and the standard error of the mean calculated. 
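The LDH release formula given above can be sketched as a small function. This is a minimal illustration of the stated arithmetic only; the function name and the absorbance readings in the example are hypothetical.

```python
def percent_ldh_release(treated: float, untreated: float, maximum: float) -> float:
    """Percentage of LDH release from three absorbance readings.

    100 x (treated - untreated) / (maximum - untreated), where `maximum` is the
    reading from wells lysed completely with the kit lysis buffer.
    """
    window = maximum - untreated
    if window <= 0:
        raise ValueError("maximum release control must exceed the untreated control")
    return 100.0 * (treated - untreated) / window


# Hypothetical absorbance readings for one treated well and its two controls.
print(percent_ldh_release(treated=0.5, untreated=0.25, maximum=1.25))
```

Subtracting the untreated control from both numerator and denominator means a treated well that matches the untreated control scores 0% and one that matches the full-lysis control scores 100%.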
BMDMs stimulated where indicated with 10 μM thapsigargin for 24 h were lysed in lysis buffer (4% SDS, 100 mM Tris, 20% glycerol) and 10 μg of protein was analysed by western blot using antibodies raised against rabbit TRAF2 (C192, #4724, Cell Signaling), rabbit HSP90 (E289, #4875, Cell Signaling), mouse SGT1 (ab60728, Abcam) and rabbit α/β-tubulin (#2148, Cell Signaling). For tissue culture experiments, statistical differences were calculated using a paired Student’s t-test. To determine statistical significance in animal experiments, an unpaired Student’s t-test was used. To determine statistical significance of differences in total histopathology scores, a Mann–Whitney U-test was used. A two-tailed P value of <0.05 was considered to be significant.
Biomedical scientists are often urged to check that their cell lines are not contaminated or mislabelled. But as a new study shows, any effort to authenticate a cell line is only as good as the reference standard against which the cells are compared. A cell line that is widely used to study brain cancer does not match the cells used to create the line nearly 50 years ago, or the tumour purported to be its source, researchers report on 31 August in Science Translational Medicine1. In fact, no one is quite sure of the true provenance of the cell line distributed by most cell repositories. “It is a good cautionary tale to say, ‘Question your assumptions and do as many appropriate controls as you can to make sure you really have what you think you have,’” says Jon Lorsch, director of the US National Institute of General Medical Sciences in Bethesda, Maryland. And because few cell lines are ever verified against their primary-source material, “this paper is probably just the tip of the iceberg”, says Christopher Korch, a geneticist at the University of Colorado Denver. Many groups are trying to tackle the problem of misidentified cell lines to improve the reproducibility of research findings. This year, the US National Institutes of Health started requiring grant applicants to describe how they will authenticate their cell lines. And journals such as Nature have recently begun to ask authors to check their cells against a database of 475 lines (and counting) that are known to be mixed up. But no organizations have called for the kind of archival sleuthing that produced the new study. “It’s hard enough to get people to do the standard authentication,” says Leonard Freedman, president of the Global Biological Standards Institute, a non-profit organization in Washington DC that has found that most life scientists never authenticate their cells2. 
“This is much more elaborate.” The cell line in question, U87, was established in 1966 at Uppsala University in Sweden, using tissue from a 44-year-old woman with an aggressive brain cancer known as glioblastoma. U87 has since become a workhorse of brain-cancer research, subject to countless investigations that have yielded around 2,000 scientific papers. The enthusiasm for U87 initially puzzled Bengt Westermark, a tumour biologist at Uppsala. “I couldn’t understand why people would work with such boring cells,” he says. As a graduate student in the 1970s, Westermark studied eight different brain-cancer cell lines. U87 was “hopeless to work with”, he says, because it grew so much more slowly than the others. Years later, Westermark got his hands on the version of U87 that is distributed by the American Type Culture Collection (ATCC), a cell repository in Manassas, Virginia, that houses the world’s largest collection of biological materials. He could see from the cells’ growth properties that this U87 was clearly different from the cells that had given him so much grief in graduate school. Westermark decided to do a formal comparison. Fortunately, Uppsala still had the preserved tumour tissue that spawned the original cell line. This enabled Westermark’s team to verify the identity of the archival U87 sample in their freezer. The researchers then used DNA-fingerprinting techniques to show that the ATCC’s U87 was different — and that it didn’t match any other cell lines created at Uppsala, either. According to Mindy Goldsborough, ATCC’s chief science and technology officer, the repository acquired its U87 line in 1982 from the Memorial Sloan Kettering Cancer Center in New York City, which itself had received the cell line from Uppsala in 1973. And by the time it arrived at the ATCC, U87 had a Y chromosome — despite the fact that it was supposed to have come from a female patient. 
This suggests that the mix-up probably happened at Sloan Kettering or during one of the hand-offs. In light of the new revelations, the ATCC now plans to update the background details in its listing for U87, which it describes as male. But the origin of the U87 line remains a mystery. A comparison of gene-expression profiles conducted by Westermark's team suggests that the ATCC cell line probably came from a brain tumour. “It’s bad news that it’s not what it should be,” Westermark says, “but it’s good news that it’s probably a glioblastoma.” This means that studies of U87 still reflect brain-cancer biology and don’t need to be tossed out, he adds. Still, many cancer researchers think that it is time to move beyond U87 and other “classical” cell lines — regardless of where they came from — because the culture conditions historically used to grow the cells change their biological nature. Westermark and others now favour newer cell lines that have been propagated on the types of growth medium that ensure genetic and epigenetic stability. Through its Human Glioma Cell Culture biobank, Uppsala provides these sorts of cells to other researchers for a small processing fee. “There is an increasing understanding that what we’ve historically used is so poorly representative of the human disease,” says Howard Fine, a neuro-oncologist at the Weill Cornell Brain Tumor Center in New York City. “So, any time someone can shoot down the [U87] cell line, I’m happy.”
No statistical methods were used to predetermine sample size. The investigators were not blinded to allocation during experiments and outcome assessment. A constitutively stabilized mutant of HIF2α (HIF2α-TM) was obtained from Christina Warnecke20. The HIF2α-TM (triple mutant) construct harbours the following mutations in the prolyl and asparagyl hydroxylation sites: P405A, P530G and N851A. Polypeptide fragments of DYRK1B were cloned into pcDNA3-HA and include DYRK1B N terminus, N-Ter (amino acids 1–110), DYRK1B kinase domain, KD (amino acids 111–431), and DYRK1B C terminus, C-Ter (amino acids 432–629). cDNAs for RBX1, Elongin B and Elongin C were kindly provided from Michele Pagano (New York University) and cloned into the pcDNA vector by PCR. HA-tagged HIF1α and HIF2α were obtained from Addgene. GFP-tagged DYRK1A and DYRK1B were cloned into pcDNA vector. pcDNA-HA-VHL was provided by Kook Hwan Kim (Sungkyunkwan University School of Medicine, Korea). Site-directed mutagenesis was performed using QuickChange or QuickChange Multi Site-Directed mutagenesis kit (Agilent) and resulting plasmids were sequence verified. Lentivirus was generated by co-transfection of the lentiviral vectors with pCMV-ΔR8.1 and pMD2.G plasmids into HEK293T cells as previously described9, 42. ShRNA sequences are: ID2-1: GCCTACTGAATGCTGTGTATACTCGAGTATACACAGCATTCAGTAGGC; ID2-2: CCCACTATTGTCAGCCTGCATCTCGAGATGCAGGCTGACAATAGTGGG; DYRK1A: CAGGTTGTAAAGGCATATGATCTCGAGATCATATGCCTTTACAACCTG; DYRK1B: GACCTACAAGCACATCAATGACTCGAGTCATTGATGTGCTTGTAGGTC. IMR-32 (ATCC CCL-127), SK-N-SH (ATCC HTB-11), U87 (ATCC HTB-14), NCI-H1299 (ATCC CRL-5803), HRT18 (ATCC CCL-244), and HEK293T (ATCC CRL-11268) cell lines were acquired through American Type Culture Collection. U251 (Sigma, catalogue number 09063001) cell line was obtained through Sigma. Cell lines were cultured in DMEM supplemented with 10% fetal bovine serum (FBS, Sigma). 
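The four shRNA sequences listed above appear to share one layout: a 21-nucleotide sense stem, a CTCGAG loop, and the reverse complement of the stem. The authors do not describe the layout, so this is an inference from the sequences themselves. The sketch below checks that reading for each sequence; the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
LOOP = "CTCGAG"  # assumed loop, inferred from the centre of each listed oligo

SHRNA = {
    "ID2-1": "GCCTACTGAATGCTGTGTATACTCGAGTATACACAGCATTCAGTAGGC",
    "ID2-2": "CCCACTATTGTCAGCCTGCATCTCGAGATGCAGGCTGACAATAGTGGG",
    "DYRK1A": "CAGGTTGTAAAGGCATATGATCTCGAGATCATATGCCTTTACAACCTG",
    "DYRK1B": "GACCTACAAGCACATCAATGACTCGAGTCATTGATGTGCTTGTAGGTC",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def split_hairpin(oligo: str, loop: str = LOOP) -> tuple[str, str]:
    """Split an oligo into (sense, antisense) stems around a central loop."""
    stem_len, remainder = divmod(len(oligo) - len(loop), 2)
    if remainder or stem_len <= 0:
        raise ValueError("oligo length does not fit stem-loop-stem")
    middle = oligo[stem_len:stem_len + len(loop)]
    if middle != loop:
        raise ValueError(f"expected loop {loop} at the centre, found {middle}")
    return oligo[:stem_len], oligo[stem_len + len(loop):]


def is_valid_hairpin(oligo: str) -> bool:
    """True if the second stem is the reverse complement of the first."""
    sense, antisense = split_hairpin(oligo)
    return antisense == reverse_complement(sense)


for name, oligo in SHRNA.items():
    sense, _ = split_hairpin(oligo)
    print(name, sense, is_valid_hairpin(oligo))
```

A check of this kind is a quick way to catch transcription errors when hairpin sequences are copied from a methods section into an ordering sheet.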
Cells were routinely tested for mycoplasma contamination using Mycoplasma Plus PCR Primer Set (Agilent, Santa Clara, CA) and were found to be negative. Cells were transfected with Lipofectamine 2000 (Invitrogen) or calcium phosphate. Mouse NSCs were grown in Neurocult medium (StemCell Technologies) containing 1× proliferation supplements (StemCell Technologies), and recombinant FGF-2 and EGF (20 ng ml−1 each; Peprotech). GBM-derived glioma stem cells were obtained from de-identified brain tumour specimens from excess material collected for clinical purposes at New York Presbyterian-Columbia University Medical Center. Donors (patients diagnosed with glioblastoma) were anonymous. Specimens were coded with progressive numbers in order to preserve the confidentiality of the subjects. Work with these materials was designated as IRB exempt under paragraph 4 and it is covered under IRB protocol #IRB-AAAI7305. GBM-derived GSCs were grown in DMEM:F12 containing 1× N2 and B27 supplements (Invitrogen) and human recombinant FGF-2 and EGF (20 ng ml−1 each; Peprotech). Cells at passage (P) 4 were transduced using lentiviral particles in medium containing 4 μg ml−1 of polybrene (Sigma). Cells were cultured in a hypoxic chamber with 1% O₂ (O₂ Control Glove Box, Coy Laboratory Products, MI) for the indicated times or treated with a final concentration of 100–300 μM CoCl₂ (Sigma) as specified in figure legends. Mouse neurosphere assay was performed by plating 2,000 cells in 35 mm dishes in collagen containing NSC medium to ensure that distinct colonies were derived from single cells and therefore clonal in origin43. We determined neurosphere formation over serial clonal passages in limiting dilution semi-solid cultures and the cell expansion rate over passages, which is considered a direct indication of self-renewing symmetric cell divisions44. 
For serial sub-culturing we mechanically dissociated neurospheres into single cells in bulk and re-cultured them under the same conditions for six passages. The number of spheres was scored after 14 days. Only colonies >100 μm in diameter were counted as spheres. Neurosphere size was determined by measuring the diameters of individual neurospheres under light microscopy. Data are presented as percent of neurospheres obtained at each passage (number of neurospheres scored/number of NSCs plated × 100) in three independent experiments. P value was calculated using a multiple t-test with Holm–Sidak correction for multiple comparisons. To determine the expansion rate, we plated 10,000 cells from 3 independent P1 clonal assays in 35 mm dishes and scored the number of viable cells after 7 days by Trypan Blue exclusion. Expansion rate of NSCs was determined using a linear regression model and difference in the slopes (P value) was determined by the analysis of covariance (ANCOVA) using Prism 6.0 (GraphPad). Limiting dilution assay (LDA) for human GSCs was performed as described previously45. Briefly, spheres were dissociated into single cells and plated into 96-well plates in 0.2 ml of medium containing growth factors at increasing densities (1–100 cells per well) in triplicate. Cultures were left undisturbed for 14 days, and then the percent of wells not containing spheres for each cell dilution was calculated and plotted against the number of cells per well. Linear regression lines were plotted, and we estimated the minimal frequency of glioma cells endowed with stem cell capacity (the number of cells required to generate at least one sphere in every well = the stem cell frequency) based on the Poisson distribution and the intersection at the 37% level using Prism 6.0 software. Data represent the means of three independent experiments performed in different days for the evaluation of the effects of ID2, ID2(T27A) in the presence or in the absence of DYRK1B. 
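The limiting dilution analysis described above rests on the single-hit Poisson model: if a fraction f of cells can form a sphere, the expected fraction of wells with no sphere at d cells per well is exp(−f·d), which equals about 37% (1/e) when d = 1/f. The sketch below fits that model by least squares through the origin on the log scale. This is one common way to do it and is an assumption; the authors used Prism 6.0 and their exact fitting options are not given. The function names and the synthetic data are hypothetical.

```python
import math


def stem_cell_frequency(cells_per_well, fraction_negative):
    """Estimate f from ln(fraction of negative wells) = -f * cells per well.

    Uses a least-squares line forced through the origin. Doses with no
    negative wells are skipped because the logarithm is undefined there.
    """
    xs, ys = [], []
    for dose, fraction in zip(cells_per_well, fraction_negative):
        if 0.0 < fraction <= 1.0:
            xs.append(float(dose))
            ys.append(math.log(fraction))
    sum_xx = sum(x * x for x in xs)
    if sum_xx == 0.0:
        raise ValueError("no usable data points")
    slope = sum(x * y for x, y in zip(xs, ys)) / sum_xx
    if slope >= 0.0:
        raise ValueError("negative wells do not decrease with cell dose")
    return -slope


def cells_at_37_percent(frequency: float) -> float:
    """Cell dose at which 1/e (about 37%) of wells are expected to be negative."""
    return 1.0 / frequency


# Synthetic, noise-free example: a true frequency of 1 sphere-forming cell in 25.
doses = [1, 5, 10, 25, 50, 100]
negatives = [math.exp(-0.04 * d) for d in doses]

f_hat = stem_cell_frequency(doses, negatives)
print(f"Estimated frequency: 1 in {cells_at_37_percent(f_hat):.1f} cells")
```

With real plate counts the negative fractions are noisy, so a maximum-likelihood fit with confidence intervals is usually preferable to a plain regression; the sketch only illustrates where the 37% intersection comes from.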
LDA for the undegradable HIF2α rescue experiment was performed by using three cultures transduced independently on the same day. To identify the sites of ID2 phosphorylation from IMR32 human neuroblastoma cells, the immunoprecipitated ID2 protein was excised, digested with trypsin, chymotrypsin and Lys-C and the peptides extracted from the polyacrylamide in two 30 μl aliquots of 50% acetonitrile/5% formic acid. These extracts were combined and evaporated to 25 μl for MS analysis. The LC–MS system consisted of a state-of-the-art Finnigan LTQ-FT mass spectrometer system with a Protana nanospray ion source interfaced to a self-packed 8 cm × 75 μm id Phenomenex Jupiter 10 μm C18 reversed-phase capillary column. 0.5–5 μl volumes of the extract were injected and the peptides eluted from the column by an acetonitrile/0.1 M acetic acid gradient at a flow rate of 0.25 μl min−1. The nanospray ion source was operated at 2.8 kV. The digest was analysed using the double play capability of the instrument acquiring full scan mass spectra to determine peptide molecular weights and product ion spectra to determine amino acid sequence in sequential scans. This mode of analysis produces approximately 1200 CAD spectra of ions ranging in abundance over several orders of magnitude. Tandem MS/MS experiments were performed on each candidate phosphopeptide to verify its sequence and locate the phosphorylation site. A signature of a phosphopeptide is the detection of loss of 98 daltons (the mass of phosphoric acid) in the MS/MS spectrum. With this method, three phosphopeptides were found to carry phosphorylations at residues Ser5, Ser14 and Thr27 of the ID2 protein. The anti-phospho-T27-ID2 antibody was generated by immunizing rabbits with a short synthetic peptide containing the phosphorylated T27 (CGISRSK-pT-PVDDPMS) (Yenzym Antibodies, LLC). A two-step purification process was applied. 
First, antiserum was cross-absorbed against the phospho-peptide matrix to purify antibodies that recognize the phosphorylated peptide. Then, the anti-serum was purified against the un-phosphorylated peptide matrix to remove non-specific antibodies. Cells were lysed in NP40 lysis buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 1 mM EDTA, 1% NP40, 1.5 mM Na₃VO₄, 50 mM sodium fluoride, 10 mM sodium pyrophosphate, 10 mM β-glycerophosphate and EDTA-free protease inhibitor cocktail (Roche)) or RIPA buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 1 mM EDTA, 1% NP40, 0.5% sodium deoxycholate, 0.1% sodium dodecyl sulphate, 1.5 mM Na₃VO₄, 50 mM sodium fluoride, 10 mM sodium pyrophosphate, 10 mM β-glycerophosphate and EDTA-free protease inhibitor cocktail (Roche)). Lysates were cleared by centrifugation at 15,000 r.p.m. for 15 min at 4 °C. For immunoprecipitation, cell lysates were incubated with primary antibody (hydroxyproline, Abcam, ab37067; VHL, BD, 556347; DYRK1A, Cell Signaling Technology, 2771; DYRK1B, Cell Signaling Technology, 5672) and protein G/A beads (Santa Cruz, sc-2003) or phospho-Tyrosine (P-Tyr-100) Sepharose beads (Cell Signaling Technology, 9419), HA affinity matrix (Roche, 11815016001), Flag M2 affinity gel (Sigma, F2426) at 4 °C overnight. Beads were washed with lysis buffer four times and eluted in 2× SDS sample buffer. Protein samples were separated by SDS–PAGE and transferred to polyvinylidene difluoride (PVDF) or nitrocellulose (NC) membrane. Membranes were blocked in TBS with 5% non-fat milk and 0.1% Tween20, and probed with primary antibodies. 
Antibodies and working concentrations are: ID2 1:500 (C-20, sc-489), GFP 1:1,000 (B-2, sc-9996), HIF2α/EPAS-1 1:250 (190b, sc-13596), c-MYC (9E10, sc-40) and Elongin B 1:1,000 (FL-118, sc-11447), obtained from Santa Cruz Biotechnology; phospho-Tyrosine 1:1,000 (P-Tyr-100, 9411), HA 1:1,000 (C29F4, 3724), VHL 1:500 (2738), DYRK1A 1:1,000 (2771), DYRK1B 1:1,000 (5672) and RBX1 1:2,000 (D3J5I, 11922), obtained from Cell Signaling Technology; VHL 1:500 (GeneTex, GTX101087); β-actin 1:8,000 (A5441), α-tubulin 1:8,000 (T5168) and Flag M2 1:500 (F1804), obtained from Sigma; HIF1α 1:500 (H1alpha67, NB100-105) and Elongin C 1:1,000 (NB100-78353), obtained from Novus Biologicals; HA 1:1,000 (3F10, 12158167001), obtained from Roche. Horseradish-peroxidase-conjugated secondary antibodies were purchased from Pierce and ECL solution (Amersham) was used for detection.

For in vitro binding assays, HA-tagged RBX1, Elongin B, Elongin C and VHL were in vitro translated using the TNT quick coupled transcription/translation system (Promega). Active VHL protein complex was purchased from EMD Millipore. Purified His-VHL protein was purchased from ProteinOne (Rockville, MD). GST, GST–ID2 and Flag–ID2 proteins were bacterially expressed and purified using glutathione sepharose beads (GE Healthcare Life Science). Active DYRK1B (Invitrogen) was used for in vitro phosphorylation of Flag–ID2 proteins. Biotinylated wild-type and modified (pT27 and T27W) ID2 peptides (amino acids 14–34) were synthesized by LifeTein (Somerset, NJ). In vitro binding experiments between ID2 and VCB–Cul2 were performed using 500 ng of Flag–ID2 and 500 ng of VCB–Cul2 complex or 500 ng VHL protein in binding buffer (50 mM Tris-Cl, pH 7.5, 100 mM NaCl, 1 mM EDTA, 10 mM β-glycerophosphate, 10 mM sodium pyrophosphate, 50 mM sodium fluoride, 1.5 mM Na₃VO₄, 0.2% NP40, 10% glycerol, 0.1 mg ml−1 BSA and EDTA-free protease inhibitor cocktail (Roche)) at 4 °C for 3 h.
In vitro binding between ID2 peptides and purified proteins was performed using 2 μg of ID2 peptides and 200 ng of recombinant VCB–Cul2 complex or 200 ng recombinant VHL in binding buffer (50 mM Tris-Cl, pH 7.5, 100 mM NaCl, 1 mM EDTA, 10 mM β-glycerophosphate, 10 mM sodium pyrophosphate, 50 mM sodium fluoride, 1.5 mM Na₃VO₄, 0.4% NP40, 10% glycerol, 0.1 mg ml−1 BSA and EDTA-free protease inhibitor cocktail (Roche)) at 4 °C for 3 h or overnight. Protein complexes were pulled down using glutathione sepharose beads (GE Healthcare Life Science) or streptavidin-conjugated beads (Thermo Fisher Scientific) and analysed by immunoblot.

Cdk1, Cdk5, DYRK1A, DYRK1B, ERK, GSK3, PKA, CaMKII, Chk1, Chk2, RSK-1, RSK-2, aurora-A, aurora-B, PLK-1, PLK-2 and NEK2 were all purchased from Life Technologies and ATM from EMD Millipore. The 18 protein kinases tested in the survey were selected because they are proline-directed S/T kinases (Cdk1, Cdk5, DYRK1A, DYRK1B, ERK) and/or because they were considered to be candidate kinases for Thr27, Ser14 or Ser5 from kinase consensus prediction algorithms (NetPhosK1.0, http://www.cbs.dtu.dk/services/NetPhosK/; GPS Version 3.0 http://gps.biocuckoo.org/#) or from visual inspection of the flanking regions and review of the literature for consensus kinase phosphorylation motifs. 1 μg of bacterially purified GST-ID substrates was incubated with 10–20 ng each of the recombinant active kinases. The reaction mixture included 10 μCi of [γ-³²P]ATP (PerkinElmer Life Sciences) in 50 μl of kinase buffer (25 mM Tris-HCl, pH 7.5, 5 mM β-glycerophosphate, 2 mM dithiothreitol (DTT), 0.1 mM Na₃VO₄, 10 mM MgCl₂ and 0.2 mM ATP). Reactions were incubated at 30 °C for 30 min and terminated by addition of Laemmli SDS sample buffer and boiling at 95 °C for 5 min. Proteins were separated on SDS–PAGE gel and phosphorylation of proteins was visualized by autoradiography.
Coomassie staining was used to document the amount of substrates included in the kinase reaction. In vitro phosphorylation of Flag–ID2 proteins by DYRK1B (Invitrogen) was performed using 500 ng of GST–DYRK1B and 200 ng of bacterially expressed purified Flag–ID2 protein.

In vivo kinase assay in GSCs and glioma cells was performed using endogenous or exogenously expressed DYRK1A and DYRK1B. Cell lysates were prepared in lysis buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 1 mM EDTA, 1% NP40, 1.5 mM Na₃VO₄, 50 mM sodium fluoride, 10 mM sodium pyrophosphate, 10 mM β-glycerophosphate and EDTA-free protease inhibitor cocktail (Roche)). DYRK1 kinases were immunoprecipitated using DYRK1A and DYRK1B antibodies (for endogenous DYRK1 proteins) or GFP antibody (for exogenous GFP–DYRK1 proteins) from 1 mg cellular lysates at 4 °C. Immunoprecipitates were washed with lysis buffer four times followed by two washes in kinase buffer as described above and incubated with 200 ng purified Flag–ID2 protein in kinase buffer for 30 min at 30 °C. Kinase reactions were separated by SDS–PAGE and analysed by western blot using p-T27-ID2 antibody.

HIF2α half-life was quantified using ImageJ processing software (NIH). Densitometry values were analysed by Prism 6.0 using the linear regression function.

Stoichiometric quantification of ID2 and VHL in U87 cells was obtained using recombinant Flag–ID2 and His-tagged-VHL as references. The chemiluminescent signal of serial dilutions of the recombinant proteins was quantified using ImageJ and plotted to generate a linear standard curve, against which the densitometric signal generated by serial dilutions of cellular lysates (1 × 10⁶ U87 cells) was calculated. Triplicate values ± s.e.m. were used to estimate the ID2:VHL ratio per cell.

The stoichiometry of pT27-ID2 phosphorylation was determined as described46. Briefly, SK-N-SH cells were plated at a density of 1 × 10⁶ in 100 mm dishes.
Forty-eight hours later, 1.5 mg of cellular lysates from cells untreated or treated with CoCl₂ during the previous 24 h were prepared in RIPA buffer and immunoprecipitated using 4 μg of pT27-ID2 antibody or rabbit IgG overnight at 4 °C. Immune complexes were collected with TrueBlot anti-rabbit IgG beads (Rockland), washed 5 times in lysis buffer, and eluted in SDS sample buffer. Serial dilutions of cellular lysates, IgG and pT27-ID2 immunoprecipitates were loaded as duplicate series for SDS–PAGE and western blot analysis using ID2 or p-T27-ID2 antibodies. Densitometry quantification of the chemiluminescent signals was used to determine (1) the efficiency of the immunoprecipitation using the antibody against p-T27-ID2 and (2) the ratio between efficiency of the immunoprecipitation evaluated by western blot for p-T27-ID2 and total ID2 antibodies. This represents the percent of phosphorylated Thr27 of ID2 present in the cell preparation.

Cellular ID2 complexes were purified from the cell line NCI-H1299 stably engineered to express Flag-HA–ID2. Cellular lysates were prepared in 50 mM Tris-HCl, 250 mM NaCl, 0.2% NP40, 1 mM EDTA, 10% glycerol, protease and phosphatase inhibitors. Flag-HA–ID2 immunoprecipitates were recovered first with anti-Flag antibody-conjugated M2 agarose (Sigma) and washed with lysis buffer containing 300 mM NaCl and 0.3% NP40. Bound polypeptides were eluted with Flag peptide and further affinity purified by anti-HA antibody-conjugated agarose (Roche). The eluates from the HA beads were analysed directly on long gradient reverse phase LC–MS/MS. A specificity score of proteins interacting with ID2 was computed for each polypeptide by comparing the number of peptides identified from mass spectrometry analysis to those reported in the CRAPome database, which includes a list of potential contaminants from affinity purification-mass spectrometry experiments (http://www.crapome.org).
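As a rough illustration of this specificity score, the following Python sketch implements the formula stated in the text that follows. The function name, argument names and example numbers are hypothetical. Raising an error for a zero denominator is an assumption, because the source does not say how proteins absent from CRAPome were handled.

```python
def specificity_score(n_peptides, xcorr, ave_sc, max_sc, n_expt):
    """Score = (#peptide * #xcorr) / (AveSC * MaxSC * # of Expt.).

    n_peptides: identified peptide count for the protein
    xcorr:      cross-correlation score for its candidate peptides
    ave_sc:     averaged spectral counts reported in CRAPome
    max_sc:     maximal spectral counts reported in CRAPome
    n_expt:     number of CRAPome experiments in which the protein was found
    """
    denominator = ave_sc * max_sc * n_expt
    if denominator == 0:
        # Assumption: the source does not define the score in this case.
        raise ValueError("protein has no CRAPome counts; score is undefined")
    return (n_peptides * xcorr) / denominator


# Hypothetical values: the same pull-down evidence scores far lower for a
# protein that is a frequent contaminant in CRAPome.
rare_in_crapome = specificity_score(12, 3.5, ave_sc=1.0, max_sc=2, n_expt=3)
common_in_crapome = specificity_score(12, 3.5, ave_sc=10.0, max_sc=40, n_expt=300)
```

With identical peptide evidence, the score falls as a protein appears more often and more abundantly among CRAPome contaminants, which is the behaviour the ranking relies on.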
The specificity score is computed as (#peptide × #xcorr)/(AveSC × MaxSC × # of Expt.), where #peptide is the identified peptide count; #xcorr, the cross-correlation score for all candidate peptides queried from the database; AveSC, the averaged spectral counts from CRAPome; MaxSC, the maximal spectral counts from CRAPome; and # of Expt., the total number of experiments found in CRAPome.

U87 cells were transfected with pcDNA3-HA-HIFα (HIF1α or HIF2α), pcDNA3-Flag–ID2 (WT or T27A), pEGFP-DYRK1B and pcDNA3-Myc-Ubiquitin. 36 h after transfection, cells were treated with 20 μM MG132 (EMD Millipore) for 6 h. After washing with ice-cold PBS twice, cells were lysed in 100 μl of 50 mM Tris-HCl pH 8.0, 150 mM NaCl (TBS) containing 2% SDS and boiled at 100 °C for 10 min. Lysates were diluted with 900 μl of TBS containing 1% NP40. Immunoprecipitation was performed using 1 mg of cellular lysates. Ubiquitylated proteins were immunoprecipitated using anti-Myc antibody and analysed by western blot using HA antibody.

A previously described47, highly accurate flexible peptide docking method implemented in ICM software (Molsoft LLC, La Jolla CA) was used to dock ID2 peptides to VCB or components thereof. A series of overlapping peptides of varying lengths were docked to the complex of VHL and Elongin C (EloC), or VHL or EloC alone, from the recent crystallographic structure22 of the VHL-CRL ligase. Briefly, an all-atom model of the peptide was docked into grid potentials derived from the X-ray structure using a stochastic global optimization in internal coordinates with pseudo-Brownian and collective ‘probability-biased’ random moves as implemented in the ICM program.
Five types of potentials for the peptide-receptor interaction energy (hydrogen van der Waals, non-hydrogen van der Waals, hydrogen bonding, hydrophobicity and electrostatics) were precomputed on a rectilinear grid with 0.5 Å spacing that fills a 34 Å × 34 Å × 25 Å box containing the VHL-EloC (V-C) complex, to which the peptide was docked by searching its full conformational space within the space of the grid potentials. The preferred docking conformation was identified by the lowest energy conformation in the search. The preferred peptide was identified by its maximal contact surface area with the respective receptor. Ab initio folding and analysis of the peptides was performed as previously described48, 49. Ab initio folding of the ID2 peptide and its phospho-T27 mutant showed that both strongly prefer an α-helical conformation when free (unbound) in solution, with the phospho-T27 mutant having a calculated free energy almost 50 kcal-equivalent units lower than the unmodified peptide.

Total RNA was prepared with Trizol reagent (Invitrogen) and cDNA was synthesized using SuperScript II Reverse Transcriptase (Invitrogen) as described42, 50. Semi-quantitative RT–PCR was performed using AccuPrime Taq DNA polymerase (Invitrogen) and the following primers: for HIF2A Fw 5′_GTGCTCCCACGGCCTGTA_3′ and Rv 5′_TTGTCACACCTATGGCATATCACA_3′; GAPDH Fw 5′_AGAAGGCTGGGGCTCATTTG_3′ and Rv 5′_AGGGGCCATCCACAGTCTTC_3′. The quantitative RT–PCR was performed with a Roche480 thermal cycler, using SYBR Green PCR Master Mix from Applied Biosystems.
Primers used in qRT–PCR are: SOX2 Fw 5′_TTGCTGCCTCTTTAAGACTAGGA_3′ and Rv 5′_CTGGGGCTCAAACTTCTCTC_3′; NANOG Fw 5′_ATGCCTCACACGGAGACTGT_3′ and Rv 5′_AAGTGGGTTGTTTGCCTTTG_3′; POU5F1 Fw 5′_GTGGAGGAAGCTGACAACAA_3′ and Rv 5′_ATTCTCCAGGTTGCCTCTCA_3′; FLT1 Fw 5′_AGCCCATAAATGGTCTTTGC_3′ and Rv 5′_GTGGTTTGCTTGAGCTGTGT_3′; PIK3CA Fw 5′_TGCAAAGAATCAGAACAATGCC_3′ and Rv 5′_CACGGAGGCATTCTAAAGTCA_3′; BMI1 Fw 5′_AATCCCCACCTGATGTGTGT_3′ and Rv 5′_GCTGGTCTCCAGGTAACGAA_3′; GAPDH Fw 5′_GAAGGTGAAGGTCGGAGTCAAC_3′ and Rv 5′_CAGAGTTAAAAGCAGCCCTGGT_3′; 18S Fw 5′_CGCCGCTAGAGGTGAAATTC_3′ and Rv 5′_CTTTCGCTCTGGTCCGTCTT_3′. The relative amount of specific mRNA was normalized to 18S or GAPDH. Results are presented as the mean ± s.d. of three independent experiments each performed in triplicate (n = 9). Statistical significance was determined by Student’s t-test (two-tailed) using GraphPad Prism 6.0 software.

Mice were housed in a pathogen-free animal facility. All animal studies were approved by the IACUC at Columbia University (numbers AAAE9252; AAAE9956). Mice were 4–6-week-old male athymic nude (Nu/Nu, Charles River Laboratories). No statistical method was used to pre-determine sample size. No method of randomization was used to allocate animals to experimental groups. Mice in the same cage were generally part of the same treatment. The investigators were not blinded during outcome assessment. In none of the experiments did tumours exceed the maximum size allowed by our IACUC protocol, specifically 20 mm in maximum diameter. 2 × 10⁵ U87 cells stably expressing a doxycycline-inducible lentiviral vector coding for DYRK1B or the empty vector were injected subcutaneously in the right flank in 100 μl volume of saline solution (7 mice per group).
Mice carrying 150–220 mm³ subcutaneous tumours (21 days after injection) generated by cells transduced with DYRK1B were treated with vehicle or doxycycline by oral gavage (Vibramycin, Pfizer Labs; 8 mg ml−1, 0.2 ml per day)51; mice carrying tumours generated by cells transduced with the empty vector were also fed with doxycycline. Tumour diameters were measured daily with a caliper and tumour volumes estimated using the formula: V (mm³) = width² × length/2. Mice were euthanized after 5 days of doxycycline treatment. Tumours were dissected and fixed in formalin for immunohistochemical analysis. Data are means ± s.d. of 7 mice in each group. Statistical significance was determined by ANCOVA using GraphPad Prism 6.0 software package (GraphPad).

Orthotopic implantation of glioma cells was performed as described previously using 5 × 10⁴ U87 cells transduced with pLOC-vector, pLOC-DYRK1B (WT) or pLOC-DYRK1B-K140R mutant in 2 μl phosphate buffer42. In brief, 5 days after lentiviral infection, cells were injected 2 mm lateral and 0.5 mm anterior to the bregma, 2.5 mm below the skull of 4–6-week-old athymic nude (Nu/Nu, Charles River Laboratories) mice. Mice were monitored daily for ill effects according to AAALAS guidelines and euthanized when neurological symptoms were observed. Tumours were dissected and fixed in formalin for immunohistochemical analysis and immunofluorescence using V5 antibody (Life Technologies, 46-0705) to identify exogenous DYRK1B and an antibody against human vimentin (Sigma, V6630) to identify human glioma cells. A Kaplan–Meier survival curve was generated using the GraphPad Prism 6.0 software package (GraphPad). Points on the curves indicate glioma-related deaths (n = 7 animals for each group; P was determined by log rank analysis). We did not observe non-glioma-related deaths.
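The caliper-based volume estimate used above for the subcutaneous tumours, width² × length/2, can be written as a short Python helper. This is only a sketch: the function name and the example measurements are hypothetical, and width is assumed to be the shorter of the two perpendicular diameters, which the text does not state explicitly.

```python
def tumour_volume_mm3(width_mm, length_mm):
    """Estimate tumour volume (mm^3) as width^2 * length / 2."""
    if width_mm <= 0 or length_mm <= 0:
        raise ValueError("diameters must be positive")
    return width_mm ** 2 * length_mm / 2


# Hypothetical caliper reading: 6 mm wide, 10 mm long.
example_volume = tumour_volume_mm3(6, 10)  # 180.0 mm^3
```

A tumour with these hypothetical dimensions would fall inside the 150–220 mm³ window at which treatment was started.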
Mice injected with U87 cells transduced with pLOC-DYRK1B(WT) that did not show neurological signs on day 70 were euthanized for histological evaluation and shown as tumour-free mice in Fig. 5g.

Intracranial injection of H-Ras-V12-IRES-Cre-ER-shp53 lentivirus was performed in 4-week-old Id1Flox/Flox, Id2Flox/Flox, Id3−/− mice (C57Bl6/SV129). Briefly, 1.3 µl of purified lentiviral particles in PBS were injected 1.45 mm lateral and 1.6 mm anterior to the bregma and 2.3 mm below the skull using a stereotaxic frame. Tamoxifen was administered for 5 days at 9 mg per 40 g of mouse weight by oral gavage starting 30 days after surgery. Mice were killed 2 days later and brains dissected and fixed for histological analysis.

Tissue preparation and immunohistochemistry on tumour xenografts were performed as previously described42, 50, 52. Antibodies used in immunostaining are: HIF2α, mouse monoclonal, 1:200 (Novus Biologicals, NB100-132); Olig2, rabbit polyclonal, 1:200 (IBL International, JP18953); human Vimentin, 1:50 (Sigma, V6630); Bromodeoxyuridine, mouse monoclonal, 1:500 (Roche, 11170376001); V5, 1:500 (Life Technologies, 46-0705). Sections were permeabilized in 0.2% Triton X-100 for 10 min and blocked with 1% BSA-5% goat serum in PBS for 1 h. Primary antibodies were incubated at 4 °C overnight. Biotinylated secondary antibodies (Vector Laboratories) or secondary antibodies conjugated with Alexa594 (1:500, Molecular Probes) were used. Slides were counterstained with haematoxylin for immunohistochemistry and DNA was counterstained with DAPI (Sigma) for immunofluorescence. Images were acquired using an Olympus IX70 microscope equipped with a digital camera and processed using Adobe Photoshop CS6 software. BrdU-positive cells were quantified by scoring the number of positive cells in five 4 × 10⁻³ mm² images from 5 different mice from each group. Blinding was applied during histological analysis. Data are presented as means of five different mice ± standard deviation (s.d.)
(two-tailed Student’s t-test, unequal variance).

To infer whether ID2 modulates the interactions between HIF2α and its transcriptional targets, we used a modified version of the MINDy53 algorithm, called CINDy25. CINDy uses an adaptive partitioning method to accurately estimate the full conditional mutual information between a transcription factor and a target gene given the expression or activity of a signalling protein. Briefly, for every pair of transcription factor and target gene of interest, it estimates the mutual information, that is, how much information can be inferred about the target gene when the expression of the transcription factor is known, conditioned on the expression/activity of the signalling protein. It estimates this conditional mutual information by estimating the multi-dimensional probability densities after partitioning the sample distribution using the adaptive partitioning method. We applied the CINDy algorithm to gene expression data for 548 samples obtained from The Cancer Genome Atlas (TCGA). Since the activity level of ID2, and not its gene expression, is the determinant of its modulatory function, that is, the extent to which it modulates the transcriptional network of HIF2α, we used an algorithm called Virtual Inference of Protein-activity by Enriched Regulon analysis (VIPER) to infer the activity of ID2 protein from its gene expression profile26. The VIPER method allows the computational inference of protein activity, on an individual sample basis, from gene expression profile data. It uses the expression of genes that are most directly regulated by a given protein, such as the targets of a transcription factor (TF), as an accurate reporter of its activity. We defined the targets of ID2 by running the ARACNe algorithm on 548 gene expression profiles and used the inferred 106 targets to determine its activity (Supplementary Table 3).
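For reference, the quantity that CINDy is described as estimating, the conditional mutual information, has the following standard textbook form for discretized variables. The symbols are chosen here for illustration and are not taken from the source: x is the expression of the transcription factor (HIF2α), y the expression of a target gene, and m the activity of the modulator (ID2). The adaptive partitioning estimator itself is not reproduced.

```latex
I(X; Y \mid M) \;=\; \sum_{m} \sum_{x} \sum_{y}
  p(x, y, m)\,
  \log \frac{p(x, y \mid m)}{p(x \mid m)\, p(y \mid m)}
```

The quantity is zero when, at every level of the modulator, the target carries no information about the transcription factor, and it grows as the dependence between the two becomes stronger within levels of m.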
We applied CINDy on 277 targets of HIF2α represented in Ingenuity pathway analysis (IPA) and for which gene expression data were available (Supplementary Table 4). Of these 277 targets, 77 are significantly modulated by ID2 activity (P value ≤ 0.05). Among the set of target genes whose expression was significantly positively correlated (P value ≤ 0.05) with the expression of HIF2α irrespective of the activity of ID2, that is, for which the correlation was significant in samples with both high and low activity of ID2, the average expression of target genes for a given expression of HIF2α was higher when the activity of ID2 was high. The same set of target genes was more correlated in high ID2 activity samples than any set of random genes of the same size (Fig. 5a), whereas it was not in low ID2 activity samples (Fig. 5b). We selected the 25% of all samples with the highest and the 25% with the lowest ID2 activity to calculate the correlation between HIF2α and its targets.

To determine whether regulation of ID2 by hypoxia might impact the correlation between high ID2 activity and HIF2α shown in Fig. 5a, b, we compared the effects of ID2 activity versus ID2 expression for the transcriptional connection between HIF2α and its targets. We selected 25% of all patients (n = 548) in TCGA with high ID2 activity and 25% of patients with low ID2 activity and tested the enrichment of significantly positively correlated targets of HIF2α in each of the groups. This resulted in significant enrichment (P value < 0.001) in high ID2 activity samples but no significant enrichment (P value = 0.093) in low ID2 activity samples. Moreover, the difference in the enrichment score (∆ES) between these two groups was statistically significant (P value < 0.05). This significance is calculated by randomly selecting the same number of genes as the positively correlated targets of HIF2α, and calculating the ∆ES for these randomly selected genes, giving ∆ES_random.
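The resampling test that this passage describes (random gene sets of matching size, repeated 1,000 times, with the P value taken as the fraction of random ∆ES values exceeding the observed one) can be sketched in Python as follows. All names are hypothetical. In particular, `toy_delta_es` is a stand-in statistic invented for the example, because the enrichment-score computation is not specified here.

```python
import random


def delta_es_null(all_genes, n_targets, delta_es, n_perm=1000, seed=0):
    """Null distribution: the statistic evaluated on random gene sets
    that have the same size as the real target set."""
    rng = random.Random(seed)
    return [delta_es(rng.sample(all_genes, n_targets)) for _ in range(n_perm)]


def empirical_p_value(observed, null_values):
    """(number of null values > observed) / number of permutations."""
    return sum(1 for v in null_values if v > observed) / len(null_values)


# Toy data (hypothetical): each gene carries a made-up per-gene score, and
# the stand-in statistic is the mean score of the gene set.
toy_score = {f"g{i}": i / 200 for i in range(200)}


def toy_delta_es(gene_set):
    return sum(toy_score[g] for g in gene_set) / len(gene_set)


null = delta_es_null(list(toy_score), n_targets=20, delta_es=toy_delta_es)
p_value = empirical_p_value(0.9, null)
```

Fixing the seed makes the null distribution reproducible. A real analysis would pass the actual ∆ES function in place of the toy statistic.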
We repeated this step 1,000 times to obtain 1,000 ∆ES values from random gene sets (∆ES_random), which were used to build the null distribution (Extended Data Fig. 9b). We used the null distribution to estimate the P value, calculated as (number of ∆ES_random > ∆ES_observed)/1,000. Enrichment was observed only when ID2 activity was high but not when ID2 activity was low, thus suggesting that ID2 activity directionally impacts the regulation of targets of HIF2α by HIF2α. Consistently, the significant ∆ES obtained using ID2 activity suggests that ID2 activity is a determinant of the correlation between HIF2α and its targets. Conversely, when we performed a similar analysis using ID2 expression instead of ID2 activity, we found significant enrichment of positively correlated targets of HIF2α both in samples with high expression (P value = 0.025) and in samples with low expression of ID2 (P value = 0.048). Given the significant enrichment in both groups, we did not observe any significant difference in the enrichment score between the two groups (P value of ∆ES = 0.338). Thus, while the determination of ID2 activity and its effects upon the HIF2α-targets connection by VIPER and CINDy allowed us to determine the unidirectional positive link between high ID2 activity and HIF2α transcription, a similar analysis performed using ID2 expression contemplates the dual connection between ID2 and HIF2α.

To test if expression of DYRK1A and DYRK1B is a predictor of prognosis, we divided the patients into two cohorts based on their relative expression compared to the mean expression of all patients in GBM. The first cohort contained the patients with high expression of both DYRK1A and DYRK1B (n = 101) and the other cohort contained patients with low expression (n = 128). We used average expression for both DYRK1A and DYRK1B, which individually divide the patient cohort into half and half.
However, when we apply the condition that patients should display higher or lower than average expression of both these genes, we select approximately 19% for high expression and 24% for low expression. Selection of these patients was entirely dependent on the overall expression of these genes in the entire cohort rather than a predefined cutoff. Kaplan–Meier survival analysis showed a significant survival benefit for the patients with high expression of both DYRK1A and DYRK1B (P value = 0.004) compared to the patients with low expression. When a similar analysis was performed using the expression of DYRK1A or DYRK1B alone, the prediction was either non-significant (DYRK1A) or less significant (DYRK1B, P value = 0.008) when compared to the predictions using the expression of both genes.

Results in graphs are expressed as means ± s.d. or means ± s.e.m., as indicated in figure legends, for the indicated number of observations. Statistical significance was determined by the Student’s t-test (two-tailed, unequal variance). P value < 0.05 is considered significant and is indicated in figure legends.
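One reading of the cohort definition used for the DYRK1A/DYRK1B survival analysis above is sketched below in Python: a patient is assigned to the high cohort only if both genes lie above their cohort-wide means, to the low cohort only if both lie below, and is otherwise excluded. The function name, patient identifiers and expression values are hypothetical, and the strict inequalities are an assumption.

```python
from statistics import mean


def split_cohorts(expr_a, expr_b):
    """Return (high, low) patient lists, thresholding each gene at its
    mean across all patients. Patients with discordant genes are dropped."""
    mean_a = mean(expr_a.values())
    mean_b = mean(expr_b.values())
    high = [p for p in expr_a if expr_a[p] > mean_a and expr_b[p] > mean_b]
    low = [p for p in expr_a if expr_a[p] < mean_a and expr_b[p] < mean_b]
    return high, low


# Hypothetical expression values for five patients.
dyrk1a = {"p1": 1, "p2": 5, "p3": 9, "p4": 2, "p5": 8}
dyrk1b = {"p1": 2, "p2": 9, "p3": 8, "p4": 7, "p5": 1}
high_cohort, low_cohort = split_cohorts(dyrk1a, dyrk1b)
```

Because the thresholds come from the cohort itself, each gene alone splits patients roughly in half, while requiring agreement of both genes leaves smaller high and low groups, consistent with the proportions reported in the text.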