Genome Analysis Center

Norwich, United Kingdom


An S.-Q.,University of Dundee | Chin K.-H.,National Chung Hsing University | Febrer M.,University of Dundee | McCarthy Y.,University College Cork | And 7 more authors.
EMBO Journal | Year: 2013

Cyclic guanosine 3',5'-monophosphate (cyclic GMP) is a second messenger whose role in bacterial signalling is poorly understood. A genetic screen in the plant pathogen Xanthomonas campestris (Xcc) identified that XC-0250, which encodes a protein with a class III nucleotidyl cyclase domain, is required for cyclic GMP synthesis. Purified XC-0250 was active in cyclic GMP synthesis in vitro. The linked gene XC-0249 encodes a protein with a cyclic mononucleotide-binding (cNMP) domain and a GGDEF diguanylate cyclase domain. The activity of XC-0249 in cyclic di-GMP synthesis was enhanced by addition of cyclic GMP. The isolated cNMP domain of XC-0249 bound cyclic GMP and a structure-function analysis, directed by determination of the crystal structure of the holo-complex, demonstrated the site of cyclic GMP binding that modulates cyclic di-GMP synthesis. Mutation of either XC-0250 or XC-0249 led to a reduced virulence to plants and reduced biofilm formation in vitro. These findings describe a regulatory pathway in which cyclic GMP regulates virulence and biofilm formation through interaction with a novel effector that directly links cyclic GMP and cyclic di-GMP signalling. © 2013 European Molecular Biology Organization.

Borrill P.,John Innes Center | Ramirez-Gonzalez R.,Genome Analysis Center | Uauy C.,John Innes Center
Plant Physiology | Year: 2016

The majority of transcriptome sequencing (RNA-seq) expression studies in plants remain underutilized and inaccessible due to the use of disparate transcriptome references and the lack of skills and resources to analyze and visualize these data. We have developed expVIP, an expression visualization and integration platform, which allows easy analysis of RNA-seq data combined with an intuitive and interactive interface. Users can analyze public and user-specified data sets with minimal bioinformatics knowledge using the expVIP virtual machine. This generates a custom Web browser to visualize, sort, and filter the RNA-seq data and provides outputs for differential gene expression analysis. We demonstrate expVIP’s suitability for polyploid crops and evaluate its performance across a range of biologically relevant scenarios. To exemplify its use in crop research, we developed a flexible wheat (Triticum aestivum) expression browser that can be expanded with user-generated data in a local virtual machine environment. The open-access expVIP platform will facilitate the analysis of gene expression data from a wide variety of species by enabling the easy integration, visualization, and comparison of RNA-seq data across experiments. © 2016 American Society of Plant Biologists. All rights reserved.

Ramirez-Gonzalez R.H.,Genome Analysis Center | Uauy C.,John Innes Center | Uauy C.,UK National Institute of Agricultural Botany | Caccamo M.,Genome Analysis Center
Bioinformatics | Year: 2015

Summary: The design of genetic markers is of particular relevance in crop breeding programs. Despite many economically important crops being polyploid organisms, the current primer design tools are tailored for diploid species. Bread wheat, for instance, is a hexaploid comprising three related genomes, and the performance of genetic markers is diminished if the primers are not genome specific. PolyMarker is a pipeline that generates SNP markers by selecting candidate primers for a specified genome using local alignments and standard primer design tools to test the viability of the primers. A command line tool and a web interface are available to the community. © The Author 2015. Published by Oxford University Press.

Leggett R.M.,Genome Analysis Center | Clavijo B.J.,Genome Analysis Center | Clissold L.,Genome Analysis Center | Clark M.D.,Genome Analysis Center | Caccamo M.,Genome Analysis Center
Bioinformatics | Year: 2014

Summary: Illumina's recently released Nextera Long Mate Pair (LMP) kit enables production of jumping libraries of up to 12 kb. The LMP libraries are an invaluable resource for carrying out complex assemblies and other downstream bioinformatics analyses such as the characterization of structural variants. However, LMP libraries are intrinsically noisy and to maximize their value, post-sequencing data analysis is required. Standardizing laboratory protocols and the selection of sequenced reads for downstream analysis are non-trivial tasks. NextClip is a tool for analyzing reads from LMP libraries, generating a comprehensive quality report and extracting good quality trimmed and deduplicated reads. Availability and implementation: Source code, user guide and example data are available from Contact: Supplementary information: Supplementary data are available at Bioinformatics online. © 2013 The Author.

West C.,UK Institute of Food Research | James S.A.,UK Institute of Food Research | Davey R.P.,UK Institute of Food Research | Davey R.P.,Genome Analysis Center | And 3 more authors.
Systematic Biology | Year: 2014

The ribosomal RNA encapsulates a wealth of evolutionary information, including genetic variation that can be used to discriminate between organisms at a wide range of taxonomic levels. For example, the prokaryotic 16S rDNA sequence is very widely used both in phylogenetic studies and as a marker in metagenomic surveys and the internal transcribed spacer region, frequently used in plant phylogenetics, is now recognized as a fungal DNA barcode. However, this widespread use does not escape criticism, principally due to issues such as difficulties in classification of paralogous versus orthologous rDNA units and intragenomic variation, both of which may be significant barriers to accurate phylogenetic inference. We recently analyzed data sets from the Saccharomyces Genome Resequencing Project, characterizing rDNA sequence variation within multiple strains of the baker's yeast Saccharomyces cerevisiae and its nearest wild relative Saccharomyces paradoxus in unprecedented detail. Notably, both species possess single locus rDNA systems. Here, we use these new variation datasets to assess whether a more detailed characterization of the rDNA locus can alleviate the second of these phylogenetic issues, sequence heterogeneity, while controlling for the first. We demonstrate that a strong phylogenetic signal exists within both datasets and illustrate how they can be used, with existing methodology, to estimate intraspecies phylogenies of yeast strains consistent with those derived from whole-genome approaches. We also describe the use of partial Single Nucleotide Polymorphisms, a type of sequence variation found only in repetitive genomic regions, in identifying key evolutionary features such as genome hybridization events and show their consistency with whole-genome Structure analyses. 
We conclude that our approach can transform rDNA sequence heterogeneity from a problem to a useful source of evolutionary information, enabling the estimation of highly accurate phylogenies of closely related organisms, and discuss how it could be extended to future studies of multilocus rDNA systems. [concerted evolution; genome hybridisation; phylogenetic analysis; ribosomal DNA; whole genome sequencing; yeast] © 2014 The Author(s) 2014.

Harper A.L.,John Innes Center | Trick M.,John Innes Center | Higgins J.,John Innes Center | Higgins J.,Genome Analysis Center | And 7 more authors.
Nature Biotechnology | Year: 2012

Association genetics can quickly and efficiently delineate regions of the genome that control traits and provide markers to accelerate breeding by marker-assisted selection. But most crops are polyploid, making it difficult to identify the required markers and to assemble a genome sequence to order those markers. To circumvent this difficulty, we developed associative transcriptomics, which uses transcriptome sequencing to identify and score molecular markers representing variation in both gene sequences and gene expression, and correlate this with trait variation. Applying the method in the recently formed tetraploid crop Brassica napus, we identified genomic deletions that underlie two quantitative trait loci for glucosinolate content of seeds. The deleted regions contained orthologs of the transcription factor HAG1 (At5g61420), which controls aliphatic glucosinolate biosynthesis in Arabidopsis thaliana. This approach facilitates the application of association genetics in a broad range of crops, even those with complex genomes. © 2012 Nature America, Inc. All rights reserved.

Veres D.V.,Semmelweis University | Gyurko D.M.,Semmelweis University | Thaler B.,Semmelweis University | Thaler B.,Budapest University of Technology and Economics | And 6 more authors.
Nucleic Acids Research | Year: 2015

Here we present ComPPI, a cellular compartment-specific database of proteins and their interactions enabling an extensive, compartmentalized protein-protein interaction network analysis. ComPPI enables the user to filter biologically unlikely interactions, where the two interacting proteins have no common subcellular localizations, and to predict novel properties, such as compartment-specific biological functions. ComPPI is an integrated database covering four species (S. cerevisiae, C. elegans, D. melanogaster and H. sapiens). The compilation of nine protein-protein interaction and eight subcellular localization data sets had four curation steps including a manually built, comprehensive hierarchical structure of >1600 subcellular localizations. ComPPI provides confidence scores for protein subcellular localizations and protein-protein interactions. ComPPI has user-friendly search options for individual proteins giving their subcellular localization, their interactions and the likelihood of their interactions considering the subcellular localization of their interacting partners. Download options of search results, whole proteomes, organelle-specific interactomes and subcellular localization data are available on its website. Due to its novel features, ComPPI is useful for the analysis of experimental results in biochemistry and molecular biology, as well as for proteome-wide studies in bioinformatics and network science helping cellular biology, medicine and drug design. © The Author(s) 2014.
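The localization-based filtering described in this abstract can be illustrated with a minimal sketch. It is not ComPPI's code or API: the function name, the protein identifiers and the data layout are all hypothetical, and the confidence scores mentioned in the abstract are ignored. It simply keeps an interaction only when both partners share at least one annotated subcellular localization.

```python
from typing import Dict, List, Set, Tuple

def filter_by_shared_localization(
    interactions: List[Tuple[str, str]],
    localizations: Dict[str, Set[str]],
) -> List[Tuple[str, str, Set[str]]]:
    """Keep only interactions whose two partners share a localization.

    Proteins with no localization annotation are treated as having an
    empty set, so their interactions are dropped (an assumption of this
    sketch, not a documented ComPPI rule).
    """
    kept = []
    for protein_a, protein_b in interactions:
        shared = localizations.get(protein_a, set()) & localizations.get(protein_b, set())
        if shared:
            kept.append((protein_a, protein_b, shared))
    return kept

# Hypothetical toy data: P4 has no localization annotation at all.
example_localizations = {
    "P1": {"nucleus", "cytosol"},
    "P2": {"nucleus"},
    "P3": {"mitochondrion"},
}
example_interactions = [("P1", "P2"), ("P1", "P3"), ("P2", "P4")]
example_kept = filter_by_shared_localization(example_interactions, example_localizations)
```

On the toy data only the P1-P2 pair survives, because both proteins are annotated to the nucleus.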

Iqbal Z.,University of Oxford | Iqbal Z.,European Bioinformatics Institute | Caccamo M.,Genome Analysis Center | Turner I.,University of Oxford | And 2 more authors.
Nature Genetics | Year: 2012

Detecting genetic variants that are highly divergent from a reference sequence remains a major challenge in genome sequencing. We introduce de novo assembly algorithms using colored de Bruijn graphs for detecting and genotyping simple and complex genetic variants in an individual or population. We provide an efficient software implementation, Cortex, the first de novo assembler capable of assembling multiple eukaryotic genomes simultaneously. Four applications of Cortex are presented. First, we detect and validate both simple and complex structural variations in a high-coverage human genome. Second, we identify more than 3 Mb of sequence absent from the human reference genome, in pooled low-coverage population sequence data from the 1000 Genomes Project. Third, we show how population information from ten chimpanzees enables accurate variant calls without a reference sequence. Last, we estimate classical human leukocyte antigen (HLA) genotypes at HLA-B, the most variable gene in the human genome. © 2012 Nature America, Inc. All rights reserved.
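As a rough illustration of the colored de Bruijn graph idea named in this abstract, the sketch below labels every k-mer with the set of samples ("colors") in which it occurs. It is a toy under stated assumptions, not Cortex's implementation: the function names and sample names are hypothetical, edges are left implicit in the k-mer overlaps, and reverse complements and sequencing errors are ignored.

```python
from collections import defaultdict
from typing import Dict, List, Set

def build_colored_kmers(samples: Dict[str, List[str]], k: int) -> Dict[str, Set[str]]:
    """Map each k-mer to the set of sample names (colors) containing it."""
    kmer_colors: Dict[str, Set[str]] = defaultdict(set)
    for color, reads in samples.items():
        for read in reads:
            # Slide a window of length k along the read.
            for start in range(len(read) - k + 1):
                kmer_colors[read[start:start + k]].add(color)
    return dict(kmer_colors)

def private_kmers(kmer_colors: Dict[str, Set[str]], color: str) -> Set[str]:
    """Return k-mers seen only in the given color, i.e. candidate variant sequence."""
    return {kmer for kmer, colors in kmer_colors.items() if colors == {color}}

# Hypothetical toy input: the two sequences differ by a single base.
toy_samples = {"ref": ["ACGTAC"], "sample": ["ACGGAC"]}
toy_graph = build_colored_kmers(toy_samples, k=3)
toy_private = private_kmers(toy_graph, "sample")
```

Here the k-mer ACG carries both colors, while CGG, GGA and GAC are private to the second sample, which is where the two sequences diverge.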

Turner T.R.,John Innes Center | Turner T.R.,University of East Anglia | Ramakrishnan K.,John Innes Center | Walshaw J.,UK Institute of Food Research | And 7 more authors.
ISME Journal | Year: 2013

Plant-microbe interactions in the rhizosphere have important roles in biogeochemical cycling, and maintenance of plant health and productivity, yet remain poorly understood. Using RNA-based metatranscriptomics, the global active microbiomes were analysed in soil and rhizospheres of wheat, oat, pea and an oat mutant (sad1) deficient in production of anti-fungal avenacins. Rhizosphere microbiomes differed from bulk soil and between plant species. Pea (a legume) had a much stronger effect on the rhizosphere than wheat and oat (cereals), resulting in a dramatically different rhizosphere community. The relative abundance of eukaryotes in the oat and pea rhizospheres was more than fivefold higher than in the wheat rhizosphere or bulk soil. Nematodes and bacterivorous protozoa were enriched in all rhizospheres, whereas the pea rhizosphere was highly enriched for fungi. Metabolic capabilities for rhizosphere colonisation were selected, including cellulose degradation (cereals), H2 oxidation (pea) and methylotrophy (all plants). Avenacins had little effect on the prokaryotic community of oat, but the eukaryotic community was strongly altered in the sad1 mutant, suggesting that avenacins have a broader role than protecting from fungal pathogens. Profiling microbial communities with metatranscriptomics allows comparison of relative abundance, from multiple samples, across all domains of life, without polymerase chain reaction bias. This revealed profound differences in the rhizosphere microbiome, particularly at the kingdom level between plants. © 2013 International Society for Microbial Ecology. All rights reserved.

Greenman C.D.,University of East Anglia | Greenman C.D.,Genome Analysis Center | Chou T.,University of California at Los Angeles
Physical Review E - Statistical, Nonlinear, and Soft Matter Physics | Year: 2016

Classical age-structured mass-action models such as the McKendrick-von Foerster equation have been extensively studied but are unable to describe stochastic fluctuations or population-size-dependent birth and death rates. Stochastic theories that treat semi-Markov age-dependent processes using, e.g., the Bellman-Harris equation do not resolve a population's age structure and are unable to quantify population-size dependencies. Conversely, current theories that include size-dependent population dynamics (e.g., mathematical models that include carrying capacity such as the logistic equation) cannot be easily extended to take into account age-dependent birth and death rates. In this paper, we present a systematic derivation of a new, fully stochastic kinetic theory for interacting age-structured populations. By defining multiparticle probability density functions, we derive a hierarchy of kinetic equations for the stochastic evolution of an aging population undergoing birth and death. We show that the fully stochastic age-dependent birth-death process precludes factorization of the corresponding probability densities, which then must be solved by using a Bogoliubov-Born-Green-Kirkwood-Yvon-like hierarchy. Explicit solutions are derived in three limits: no birth, no death, and steady state. These are then compared with their corresponding mean-field results. Our results generalize both deterministic models and existing master equation approaches by providing an intuitive and efficient way to simultaneously model age- and population-dependent stochastic dynamics applicable to the study of demography, stem cell dynamics, and disease evolution. © 2016 American Physical Society.
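For orientation, the classical McKendrick-von Foerster model mentioned at the start of this abstract is usually written as below. The notation is an assumption of this note rather than taken from the paper: n(a,t) is the density of individuals of age a at time t, μ(a) the age-dependent death rate and β(a) the age-dependent birth rate.

```latex
\frac{\partial n(a,t)}{\partial t} + \frac{\partial n(a,t)}{\partial a}
  = -\mu(a)\, n(a,t),
\qquad
n(0,t) = \int_0^{\infty} \beta(a)\, n(a,t)\, \mathrm{d}a .
```

The first equation ages and removes individuals deterministically, and the boundary condition feeds newborns in at age zero. Because both rates depend only on age, the model cannot express the stochastic fluctuations or population-size dependence that the abstract sets out to capture.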
