Boston, MA, United States

Gene expression data are noisy due to technical and biological variability. Consequently, analysis of gene expression data is complex. Different statistical methods produce distinct sets of genes. In addition, selection of the expression p-value (EPv) threshold is somewhat arbitrary. In this study, we aimed to develop novel literature-based approaches to integrate functional information into the analysis of gene expression data. Functional relationships between genes were derived by Latent Semantic Indexing (LSI) of Medline abstracts and used to calculate the functional cohesion of gene sets. Literature cohesion was applied in two ways. First, the Literature-Based Functional Significance (LBFS) method was developed to calculate a p-value for the cohesion of differentially expressed genes (DEGs) in order to objectively evaluate the overall biological significance of gene expression experiments. Second, the Literature Aided Statistical Significance Threshold (LASST) method was developed to determine the appropriate expression p-value threshold for a given experiment. We tested our methods on three different publicly available datasets. LBFS analysis demonstrated that only two experiments were significantly cohesive. For each experiment, we also compared the LBFS values of DEGs generated by four different statistical methods. We found that some statistical tests produced more functionally cohesive gene sets than others. However, no statistical test was consistently better for all experiments. This reemphasizes that a statistical test must be carefully selected for each expression study. Moreover, LASST analysis demonstrated that the expression p-value thresholds for some experiments were considerably lower (p < 0.02 and 0.01), suggesting that the arbitrary p-values and false discovery rate thresholds that are commonly used in expression studies may not be biologically sound.
We have developed robust and objective literature-based methods to evaluate the biological support for gene expression experiments and to determine the appropriate statistical significance threshold. These methods will help investigators extract biologically meaningful insights from high-throughput gene expression experiments more efficiently.
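
The idea of assigning a p-value to the functional cohesion of a gene set can be illustrated with a minimal sketch. The abstract does not describe how LBFS actually computes its p-value, so the permutation test below is an assumption chosen for illustration, not the authors' method. The function names (`cohesion`, `cohesion_p_value`), the toy similarity scores, and the gene labels are all hypothetical; in the study, pairwise gene relationships come from LSI of Medline abstracts.

```python
import random
from itertools import combinations


def cohesion(genes, sim):
    """Mean pairwise similarity of a gene set (hypothetical cohesion score)."""
    pairs = list(combinations(genes, 2))
    return sum(sim[frozenset(p)] for p in pairs) / len(pairs)


def cohesion_p_value(gene_set, universe, sim, n_perm=1000, seed=0):
    """Permutation p-value: how often does a random gene set of the same
    size reach at least the observed cohesion? (Assumed test, not LBFS itself.)"""
    rng = random.Random(seed)
    observed = cohesion(gene_set, sim)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(universe, len(gene_set))
        if cohesion(sample, sim) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy data: 20 hypothetical genes; the first four are strongly related to
# each other (similarity 0.9), every other pair is weakly related (0.1).
universe = [f"g{i}" for i in range(20)]
related = set(universe[:4])
sim = {
    frozenset((a, b)): 0.9 if a in related and b in related else 0.1
    for a, b in combinations(universe, 2)
}

degs = universe[:4]  # stand-in for a list of differentially expressed genes
p = cohesion_p_value(degs, universe, sim)
print(f"cohesion={cohesion(degs, sim):.2f}, p={p:.4f}")
```

A small p-value here means that random gene sets of the same size are rarely as cohesive as the observed set, which is the sense in which a gene list can be called significantly cohesive.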


Chan W.H., Bioinformatics Program | Ebner J., Golisano Institute for Sustainability | Ramchandra R., Golisano Institute for Sustainability | Ramchandra R., New York State Pollution Prevention Institute | Trabold T., Golisano Institute for Sustainability
ASME 2013 7th Int. Conf. on Energy Sustainability Collocated with the ASME 2013 Heat Transfer Summer Conf. and the ASME 2013 11th Int. Conf. on Fuel Cell Science, Engineering and Technology, ES 2013 | Year: 2013

Prior research conducted by our Institute has revealed the large quantities of food waste available in New York State, particularly in the Upstate corridor extending from Buffalo to Syracuse. The Finger Lakes region is heavily populated with agricultural operations, dairy farms and food processing plants, including those producing milk, yogurt, wine, and canned fruits and vegetables. The diverse supply of organic waste generated by these facilities offers the opportunity for sustainable energy production through one of three primary pathways: anaerobic digestion to produce methane, fermentation to produce alcohols, and transesterification to produce biodiesel. Generally speaking, food wastes are better suited to biochemical conversion than to thermo-chemical conversion (combustion, gasification, pyrolysis) due to their relatively high moisture content. The current paper provides an initial assessment of food wastes within the 9-County Finger Lakes region around Rochester, New York. Available databases were first used to identify all the relevant companies operating in one of four broad industry sectors: agriculture, food processing, food distribution and food services (including restaurants). Our analysis has demonstrated that anaerobic digestion can be a viable method for sustainable energy production from food waste in the Finger Lakes region, due to the dual economic benefits of reduced disposal costs and production of methane-rich biogas. Copyright © 2013 by ASME.


Eurich C.,Elizabethtown Area High School | Fields P.A.,Franklin And Marshall College | Rice E.,Bioinformatics Program
American Biology Teacher | Year: 2012

Proteomics is an emerging area of systems biology that allows simultaneous study of thousands of proteins expressed in cells, tissues or whole organisms. We have developed this activity to enable high school or college students to explore proteomic databases using mass spectrometry data files generated from yeast proteins in a college laboratory course. Students upload files of "unknown" proteins from our public website, enter them into a proteomics search engine (Mascot), identify the proteins, and use additional proteomics websites to learn about their functions and three-dimensional structures. This activity is suitable for use in units exploring protein structure and function, metabolism, or bioinformatics. © 2012 by National Association of Biology Teachers. All rights reserved.


Friese R.S.,University of California at San Diego | Ye C.,Bioinformatics Program | Schork A.J.,University of California at San Diego | Mahapatra N.R.,Indian Institute of Technology Madras | And 7 more authors.
Circulation: Cardiovascular Genetics | Year: 2012

Background: Essential hypertension, a common complex disease, displays substantial genetic influence. Contemporary methods to dissect the genetic basis of complex diseases, such as the genomewide association study, are powerful, yet a large gap exists between the fraction of population trait variance explained by such associations and total disease heritability. Methods and Results: We developed a novel, integrative method (combining animal models, transcriptomics, bioinformatics, molecular biology, and trait-extreme phenotypes) to identify candidate genes for essential hypertension and the metabolic syndrome. We first undertook transcriptome profiling on adrenal glands from blood pressure extreme mouse strains: the hypertensive BPH (blood pressure high) and hypotensive BPL (blood pressure low). Microarray data clustering revealed a striking pattern of global underexpression of intermediary metabolism transcripts in BPH. The MITRA algorithm identified a conserved motif in the transcriptional regulatory regions of the underexpressed metabolic genes, and we then hypothesized that regulation through this motif contributed to the global underexpression. Luciferase reporter assays demonstrated transcriptional activity of the motif through transcription factors HOXA3, SRY, and YY1. We finally hypothesized that genetic variation at HOXA3, SRY, and YY1 might predict blood pressure and other metabolic syndrome traits in humans. Tagging variants for each locus were associated with blood pressure in a human population blood pressure extreme sample, with the most extensive associations for YY1 tagging single nucleotide polymorphism rs11625658 on systolic blood pressure, diastolic blood pressure, body mass index, and fasting glucose. Meta-analysis extended the YY1 results into 2 additional large population samples with significant effects preserved on diastolic blood pressure, body mass index, and fasting glucose.
Conclusions: The results outline an innovative, systematic approach to the genetic pathogenesis of complex cardiovascular disease traits and point to transcription factor YY1 as a potential candidate gene involved in essential hypertension and the cardiometabolic syndrome. © 2012 American Heart Association, Inc.


Wu H.,Bioinformatics Program | Palani A.,Bioinformatics Program
Proceedings - Frontiers in Education Conference, FIE | Year: 2015

The rapid advancement in biological data acquisition technologies has led to massive biological datasets, which require the development and application of computational methods to analyze and interpret the information. Bioinformatics is the confluence of biology, computer science, and information technology. Bioinformatics programs are offered by more than 100 universities in the United States, and many more worldwide, including degree programs (BS, MS, and PhD) and certificate programs. The current bioinformatics programs in the US have been studied with regard to their curricula, program competencies, faculty sizes, and student enrollments. The job market is also explored to inform professional training and career planning in bioinformatics, and the skill requirements for bioinformatics positions are analyzed. A systematic analysis is carried out by integrating the core competencies and curriculum improvements in bioinformatics. The potential employers of bioinformatics professionals are analyzed according to company properties such as size, focus area, location, and skill requirements. The results provide guidance for bioinformatics curriculum development, such as the minimal set of courses needed to cover the basic skills required for a bioinformatics student to become a successful bioinformatician. In addition, the analytical results are applied to the redesign of the curriculum in our bioinformatics program, which offers MS, PhD, and PhD Minor degrees. In summary, the systematic study of the existing bioinformatics programs in the US and the current market needs for bioinformatics professionals provides insight for education in bioinformatics. It supports curriculum development and reexamination, and it helps students acquire the knowledge required for their future careers. © 2015 IEEE.
