Strasser K., Center for Structural and Functional Genomics | McDonnell E., Center for Structural and Functional Genomics | Nyaga C., Center for Structural and Functional Genomics | Wu M., Center for Structural and Functional Genomics | And 6 more authors.
Database | Year: 2015

Enzymes active on components of lignocellulosic biomass are used for industrial applications ranging from food processing to biofuels production. These include a diverse array of glycoside hydrolases, carbohydrate esterases, polysaccharide lyases and oxidoreductases. Fungi are prolific producers of these enzymes, spurring fungal genome sequencing efforts to identify and catalogue the genes that encode them. To facilitate the functional annotation of these genes, biochemical data on over 800 fungal lignocellulose-degrading enzymes have been collected from the literature and organized into the searchable database, mycoCLAP (http://mycoclap.fungalgenomics.ca). First implemented in 2011, and updated as described here, mycoCLAP is capable of ranking search results according to closest biochemically characterized homologues: this improves the quality of the annotation, and significantly decreases the time required to annotate novel sequences. The database is freely available to the scientific community, as are the open source applications based on natural language processing developed to support the manual curation of mycoCLAP. © The Author(s) 2015. Published by Oxford University Press.

Murphy C., Center for Structural and Functional Genomics | Morgenstern I., Center for Structural and Functional Genomics | Cantu C., Center for Structural and Functional Genomics | Semarjit S., Center for Structural and Functional Genomics | And 5 more authors.
CEUR Workshop Proceedings | Year: 2011

We present our ongoing development of a semantic infrastructure supporting biofuel research. Part of this effort is the automatic curation of knowledge from the massive amount of information on fungal enzymes that is available in genomics. Working closely with biologists who manually curate the existing literature, we developed ontological NLP pipelines, integrated through Web-based interfaces, to help them in two main tasks: spending less time to mine the literature for facts, while also being provided with richer and semantically linked information. An ongoing challenge is to measure precisely how much the developed semantic technologies benefit the end users and what their overall impact on the quality of the curated data is. We present preliminary evaluation results that show a significant reduction in manual curation time.

Triplet T., Concordia University at Montréal and Center for Structural and Functional Genomics | Butler G., Concordia University at Montréal and Center for Structural and Functional Genomics
BMC Bioinformatics | Year: 2012

Background: In many laboratories, researchers store experimental data on their own workstations using spreadsheets. However, this approach poses a number of problems, ranging from sharing issues to inefficient data mining. Standard spreadsheets are also error-prone, as data do not undergo any validation process. To overcome the inherent limitations of spreadsheets, a number of proprietary systems have been developed, for which laboratories must pay expensive license fees. Those costs are prohibitive for most laboratories and prevent scientists from benefiting from more sophisticated data management systems. Results: In this paper, we propose the EnzymeTracker, a web-based laboratory information management system for sample tracking, as an open-source and flexible alternative that aims at facilitating entry, mining and sharing of experimental biological data. The EnzymeTracker features online spreadsheets and tools for monitoring numerous experiments conducted by several collaborators to identify and characterize samples. It also provides libraries of shared data such as protocols, and administration tools for data access control using OpenID and user/team management. Our system relies on a database management system for efficient data indexing and management and a user-friendly AJAX interface that can be accessed over the Internet. The EnzymeTracker facilitates data entry by dynamically suggesting entries and provides smart data-mining tools to retrieve data effectively. Our system features a number of tools to visualize and annotate experimental data, and to export highly customizable reports. It also supports QR matrix barcoding to facilitate sample tracking. Conclusions: The EnzymeTracker was designed to be easy to use and offers many benefits over spreadsheets, thus presenting the characteristics required to facilitate acceptance by the scientific community. It has been successfully used for 20 months on a daily basis by over 50 scientists. The EnzymeTracker is freely available online at http://cubique.fungalgenomics.ca/enzymedb/index.html under the GNU GPLv3 license. © 2012 Triplet and Butler; licensee BioMed Central Ltd.

Murphy C., Center for Structural and Functional Genomics | Powlowski J., Center for Structural and Functional Genomics | Wu M., Center for Structural and Functional Genomics | Butler G., Center for Structural and Functional Genomics | And 2 more authors.
Database | Year: 2011

Fungi produce a wide range of extracellular enzymes to break down plant cell walls, which are composed mainly of cellulose, lignin and hemicellulose. Among them are the glycoside hydrolases (GH), the largest and most diverse family of enzymes active on these substrates. To facilitate research and development of enzymes for the conversion of cell-wall polysaccharides into fermentable sugars, we have manually curated a comprehensive set of characterized fungal glycoside hydrolases. Characterized glycoside hydrolases were retrieved from protein and enzyme databases, as well as literature repositories. A total of 453 characterized glycoside hydrolases have been cataloged. They come from 131 different fungal species, most of which belong to the phylum Ascomycota. These enzymes represent 46 different GH activities and cover 44 of the 115 CAZy GH families. In addition to enzyme source and enzyme family, available biochemical properties such as temperature and pH optima, specific activity, kinetic parameters and substrate specificities were recorded. To simplify comparative studies, enzyme and species abbreviations have been standardized, Gene Ontology terms assigned and reference to supporting evidence provided. The annotated genes have been organized in a searchable, online database called mycoCLAP (Characterized Lignocellulose-Active Proteins of fungal origin). It is anticipated that this manually curated collection of biochemically characterized fungal proteins will be used to enhance functional annotation of novel GH genes. © The Author(s) 2011.
