PubMed | Reykjavik University, University of Helsinki, Genome Scale Biology Research Program and Helsinki Institute for Information Technology HIIT
Type: | Journal: Bioinformatics (Oxford, England) | Year: 2016
While the position weight matrix (PWM) is the most popular model for sequence motifs, there is growing evidence of the usefulness of more advanced models such as first-order Markov representations, and such models are also becoming available in well-known motif databases. There has been much research on how to learn these models from training data, but the problem of predicting putative sites of the learned motifs by matching the model against new sequences has been given less attention. Moreover, motif site analysis is often concerned with how different variants in the sequence affect the sites. So far, though, the corresponding efficient software tools for motif matching have been lacking. We develop fast motif matching algorithms for the aforementioned tasks. First, we formalize a framework based on high-order position weight matrices for generic representation of motif models with dinucleotide or general q-mer dependencies, and adapt fast PWM matching algorithms to the high-order PWM framework. Second, we show how to incorporate different types of sequence variants, such as SNPs and indels, and their combined effects into efficient PWM matching workflows. Benchmark results show that our algorithms perform well in practice on genome-sized sequence sets and, for multiple motif search, are much faster than the basic sliding window algorithm. Implementations are available as a part of the MOODS software package under the GNU General Public License v3.0 and the Biopython license (http://www.cs.helsinki.fi/group/pssmfind).
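For orientation, the basic sliding window baseline that the abstract compares against can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the MOODS implementation or its API: the function names `make_index` and `scan`, the toy score matrix, and the threshold are all hypothetical. A high-order model is assumed here to be a table with one row per q-mer and one column per motif position, so a dinucleotide (q = 2) motif covering 3 nucleotides has 16 rows and 2 columns.

```python
from itertools import product


def make_index(q):
    """Map every q-mer over ACGT to a row index (hypothetical layout)."""
    return {"".join(p): i for i, p in enumerate(product("ACGT", repeat=q))}


def scan(seq, matrix, q, threshold):
    """Naive sliding-window scan with a q-mer weight matrix.

    matrix[r][j] is the score of q-mer r starting at motif position j.
    Returns (start, score) for every window scoring >= threshold.
    """
    index = make_index(q)
    cols = len(matrix[0])
    m = cols + q - 1  # motif length in nucleotides
    hits = []
    for start in range(len(seq) - m + 1):
        score = 0.0
        for j in range(cols):
            row = index.get(seq[start + j:start + j + q])
            if row is None:  # window contains a non-ACGT character
                break
            score += matrix[row][j]
        else:  # runs only if every q-mer in the window was scored
            if score >= threshold:
                hits.append((start, score))
    return hits


# Toy dinucleotide model (illustrative values only): rewards "AC" then "CG".
idx = make_index(2)
toy = [[-1.0, -1.0] for _ in range(16)]
toy[idx["AC"]][0] = 2.0
toy[idx["CG"]][1] = 2.0

ref_hits = scan("TTACGTT", toy, q=2, threshold=3.0)  # reference sequence
alt_hits = scan("TTATGTT", toy, q=2, threshold=3.0)  # same sequence with a C>T SNP
```

In this toy case the reference sequence yields one hit at position 2 and the variant sequence yields none, which is the simplest form of the variant-effect question the abstract raises. With q = 1 the same function reduces to ordinary PWM scanning; the fast algorithms described in the abstract aim to avoid scoring every window in full, which this sketch does not attempt.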
PubMed | Karolinska Institutet, University of Helsinki, The Broad Institute of MIT and Harvard, University of Cardiff and 18 more.
Type: Journal Article | Journal: Human molecular genetics | Year: 2016
To identify new risk loci for colorectal cancer (CRC), we conducted a meta-analysis of seven genome-wide association studies (GWAS) with independent replication, totalling 13 656 CRC cases and 21 667 controls of European ancestry. The combined analysis identified a new risk association for CRC at 2q35 marked by rs992157 (P = 3.15 10
Bonke M., Genome Scale Biology Research Program |
Turunen M., Genome Scale Biology Research Program |
Sokolova M., Genome Scale Biology Research Program |
Vaharautio A., Genome Scale Biology Research Program |
And 11 more authors.
G3: Genes, Genomes, Genetics | Year: 2013
In this work, we map the transcriptional targets of 107 previously identified Drosophila genes whose loss caused the strongest cell-cycle phenotypes in a genome-wide RNA interference screen and mine the resulting data computationally. Besides confirming existing knowledge, the analysis revealed several regulatory systems, among which were two highly specific and interconnected feedback circuits, one between the ribosome and the proteasome that controls overall protein homeostasis, and the other between the ribosome and Myc/Max that regulates the protein synthesis capacity of cells. We also identified a set of genes that alter the timing of mitosis without affecting gene expression, indicating that the cyclic transcriptional program that produces the components required for cell division can be partially uncoupled from the cell division process itself. These genes all have a function in a pathway that regulates the phosphorylation state of Cdk1. We provide evidence showing that this pathway is involved in regulation of cell size, indicating that a Cdk1-regulated cell size checkpoint exists in metazoans. © 2013 Bonke et al.
Ngeow J., Cleveland Clinic |
Heald B., Cleveland Clinic |
Rybicki L.A., Cleveland Clinic |
Orloff M.S., Cleveland Clinic |
And 13 more authors.
Gastroenterology | Year: 2013
Background & Aims: Gastrointestinal polyposis is a common clinical problem, yet there is no consensus on how to best manage patients with moderate-load polyposis. Identifying genetic features of this disorder could improve management and especially surveillance of these patients. We sought to determine the prevalence of hamartomatous polyposis-associated mutations in the susceptibility genes PTEN, BMPR1A, SMAD4, ENG, and STK11 in individuals with ≥5 gastrointestinal polyps, including at least 1 hamartomatous or hyperplastic/serrated polyp. Methods: We performed a prospective, referral-based study of 603 patients (median age: 51 years; range, 2-89 years) enrolled from June 2006 through January 2012. Genomic DNA was extracted from peripheral lymphocytes and analyzed for specific mutations and large rearrangements in PTEN, BMPR1A, SMAD4, and STK11, as well as mutations in ENG. Recursive partitioning analysis was used to determine cutoffs for continuous variables. The prevalence of mutations was compared using Fisher's exact test. Logistic regression analyses were used to determine univariate and multivariate risk factors. Results: Of 603 patients, 119 (20%) had a personal history of colorectal cancer and most (n = 461 [76%]) had <30 polyps. Seventy-seven patients (13%) were found to have polyposis-associated mutations, including 11 in ENG (1.8%), 13 in PTEN (2.2%), 13 in STK11 (2.2%), 20 in BMPR1A (3.3%), and 21 in SMAD4 (3.5%). Univariate clinical predictors for risk of having these mutations included age at presentation younger than 40 years (19% vs 10%; P = .008), a polyp burden of ≥30 (19% vs 11%; P = .014), and male sex (16% vs 10%; P = .03). Patients who had ≥1 ganglioneuroma (29% vs 2%; P < .001) or presented with polyps of ≥3 histologic types (20% vs 2%; P = .003) were more likely to have germline mutations in PTEN.
Conclusions: Age younger than 40 years, male sex, and specific polyp histologies are significantly associated with risk of germline mutations in hamartomatous polyposis-associated genes. These associations could guide clinical decision making and further investigations. © 2013 AGA Institute.
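The prevalence figures in the Results can be reproduced from the reported counts and the cohort size of 603. The short sketch below only redoes that arithmetic; the variable names are illustrative, and the assumption is that each percentage uses the full cohort as its denominator, which the abstract implies but does not state outright.

```python
# Counts as reported in the abstract; 603 is the full cohort size.
total = 603
counts = {"ENG": 11, "PTEN": 13, "STK11": 13, "BMPR1A": 20, "SMAD4": 21}

# Per-gene prevalence, rounded to one decimal place as in the abstract.
percent = {gene: round(100 * n / total, 1) for gene, n in counts.items()}

# Cohort-level figures, rounded to whole percentages as in the abstract.
any_mutation = round(100 * 77 / total)    # patients with a polyposis-associated mutation
crc_history = round(100 * 119 / total)    # personal history of colorectal cancer
under_30_polyps = round(100 * 461 / total)

for gene, value in percent.items():
    print(f"{gene}: {value}%")
```

Under that assumption, every computed value matches the percentage printed in the abstract.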