Stanberry L.,Seattle Childrens Research Institute SCRI | Higdon R.,Seattle Childrens Research Institute SCRI | Haynes W.,Seattle Childrens Research Institute SCRI | Kolker N.,DELSA | And 11 more authors.
ECMLS 2012 - 3rd International Emerging Computational Methods for the Life Sciences Workshop | Year: 2012

Modern biology is experiencing a rapid increase in data volumes that challenges our analytical skills and existing cyberinfrastructure. Exponential expansion of the Protein Sequence Universe (PSU), the protein sequence space, together with the costs and complexities of manual curation creates a major bottleneck in life sciences research. Existing resources lack scalable visualization tools that are instrumental for functional annotation. Here, we describe a multi-dimensional scaling (MDS) implementation to create a 3D embedding of the PSU that allows visualizing the relationships between large numbers of proteins. To demonstrate the method, we use sequence similarity scores as a measure of proximity. An example of the prokaryotic PSU shows that the low-dimensional representation preserves important grouping features such as relative proximity of functionally similar clusters and clear structural separation between clusters with specific and general functions. The advantages of the method and its implementation include the ability to scale to large numbers of sequences, integrate different similarity measures with other functional and experimental data, and facilitate protein annotation. Transdisciplinary approaches akin to the one described in this paper are urgently needed to quickly and efficiently translate the influx of new data into tangible innovations and groundbreaking discoveries. © 2012 ACM.
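As a rough illustration of the embedding step described above, the sketch below places a handful of items in 3D from pairwise dissimilarities using SMACOF-style iterations (repeated Guttman transforms). The abstract does not say which MDS variant or which similarity-to-dissimilarity conversion was used, so both are assumptions here, and the six-protein similarity matrix is invented purely for the example.

```python
import math
import random


def pairwise_distances(points):
    """Euclidean distance between every pair of embedded points."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def stress(points, target):
    """Raw stress: squared mismatch between embedded and target distances."""
    current = pairwise_distances(points)
    n = len(points)
    return sum((current[i][j] - target[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))


def smacof_step(points, target):
    """One Guttman transform with unit weights; it never increases raw stress."""
    n, dim = len(points), len(points[0])
    current = pairwise_distances(points)
    updated = []
    for i in range(n):
        row = [0.0] * dim
        for j in range(n):
            if i == j or current[i][j] == 0.0:
                continue
            ratio = target[i][j] / current[i][j]
            for k in range(dim):
                row[k] += ratio * (points[i][k] - points[j][k])
        updated.append([value / n for value in row])
    return updated


def embed(dissimilarity, dim=3, iterations=200, seed=0):
    """Start from a seeded random layout and refine it with Guttman transforms."""
    rng = random.Random(seed)
    points = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in dissimilarity]
    initial = stress(points, dissimilarity)
    for _ in range(iterations):
        points = smacof_step(points, dissimilarity)
    return points, initial, stress(points, dissimilarity)


# Made-up similarity scores in [0, 1] for six hypothetical proteins in two families.
similarity = [[1.0 if i == j else (0.9 if i // 3 == j // 3 else 0.1)
               for j in range(6)] for i in range(6)]
# Assumed conversion: dissimilarity = 1 - similarity.
dissimilarity = [[1.0 - s for s in row] for row in similarity]

coords, stress_before, stress_after = embed(dissimilarity)
print(f"stress before: {stress_before:.3f}, after: {stress_after:.3f}")
```

Each Guttman transform costs quadratic time in the number of points, so a plain loop like this only suits small inputs; scaling to large numbers of sequences, as the abstract describes, would need a parallel implementation.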


Webb S.J.,University of Washington | Neuhaus E.,Seattle Childrens Research Institute SCRI | Faja S.,Harvard University
Quarterly Journal of Experimental Psychology | Year: 2016

Autism Spectrum Disorder (ASD) is characterized by impairment in social communication and restricted and repetitive interests. While not included in the diagnostic characterization, aspects of face processing and learning have shown disruptions at all stages of development in ASD, although the exact nature and extent of the impairment vary by age and level of functioning of the ASD sample as well as by task demands. In this review, we examine the nature of face attention, perception, and learning in individuals with ASD, focusing on three broad age ranges (early development, middle childhood, and adolescence/adulthood). We propose that early delays in basic face processing contribute to the atypical trajectory of social communicative skills in individuals with ASD and contribute to poor social learning throughout development. Face learning is a life-long necessity, as the social world of an individual only broadens with age, and thus addressing both the source of the impairment in ASD and the trajectory of ability throughout the lifespan, through targeted treatments, may serve to positively impact the lives of individuals who struggle with social information and understanding. © 2016 The Experimental Psychology Society


Stanberry L.,Seattle Childrens Research Institute SCRI | Rekepalli B.,Oak Ridge National Laboratory | Liu Y.,Oak Ridge National Laboratory | Giblock P.,Cisco Systems | And 5 more authors.
Concurrency Computation Practice and Experience | Year: 2014

Functional annotation of newly sequenced genomes is one of the major challenges in modern biology. With modern sequencing technologies, the protein sequence universe is rapidly expanding. Newly sequenced bacterial genomes alone contain over 7.5 million proteins. The rate of data generation has far surpassed that of protein annotation. The volume of protein data makes manual curation infeasible, whereas a high compute cost limits the utility of existing automated approaches. In this work, we present an improved and optimized automated workflow to enable large-scale protein annotation. The workflow uses high performance computing architectures and a low complexity classification algorithm to assign proteins into existing clusters of orthologous groups of proteins. On the basis of the Position-Specific Iterative Basic Local Alignment Search Tool, the algorithm ensures at least 80% specificity and sensitivity of the resulting classifications. The workflow utilizes highly scalable parallel applications for classification and sequence alignment. Using Extreme Science and Engineering Discovery Environment supercomputers, the workflow processed 1,200,000 newly sequenced bacterial proteins. With the rapid expansion of the protein sequence universe, the proposed workflow will enable scientists to annotate big genome data. Copyright © 2014 John Wiley & Sons, Ltd.


Stanberry L.,Seattle Childrens Research Institute SCRI | Rekepalli B.,Oak Ridge National Laboratory | Giblock P.,Oak Ridge National Laboratory | Liu Y.,Oak Ridge National Laboratory | And 2 more authors.
ACM International Conference Proceeding Series | Year: 2013

Functional annotation of newly sequenced genomes is one of the major challenges in modern biology. With modern sequencing technologies, the PSU (Protein Sequence Universe) expands exponentially. Newly sequenced bacterial genomes alone contain over 7.5 million proteins. The rate of data generation has far surpassed that of protein annotation. The volume of protein data makes manual curation infeasible whereas a high compute cost limits the utility of existing automated approaches. In this study, we built an automated workflow to enable large-scale protein annotation into existing orthologous groups using HPC (High Performance Computing) architectures. We developed a low complexity classification algorithm to assign proteins into bacterial COGs (Clusters of Orthologous Groups of proteins). Based on the PSI-BLAST (Position-Specific Iterative Basic Local Alignment Search Tool), the algorithm was validated on simulated and archaeal data to ensure at least 80% specificity and sensitivity. The workflow with highly scalable parallel applications for classification and sequence alignment was developed on XSEDE (Extreme Science and Engineering Discovery Environment) supercomputers. Using the workflow, we have classified one million newly sequenced bacterial proteins. With the rapid expansion of the PSU, the proposed workflow will enable scientists to annotate big genome data. © 2013 by the Association for Computing Machinery, Inc.
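To make the 80% specificity and sensitivity threshold concrete, here is a minimal sketch of how the two metrics could be computed for a single orthologous group from labelled validation data. The abstract does not describe the evaluation procedure, so the one-vs-rest scoring is an assumption, and the group names `COG_A` and `COG_B` and the label lists are hypothetical.

```python
def sensitivity_specificity(true_labels, predicted_labels, group):
    """One-vs-rest sensitivity and specificity for a single group label."""
    tp = fn = fp = tn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == group and predicted == group:
            tp += 1  # group member assigned to the group
        elif truth == group:
            fn += 1  # group member assigned elsewhere
        elif predicted == group:
            fp += 1  # non-member wrongly assigned to the group
        else:
            tn += 1  # non-member correctly kept out
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity


# Hypothetical validation set: five proteins truly in each of two groups.
truth = ["COG_A"] * 5 + ["COG_B"] * 5
predicted = ["COG_A", "COG_A", "COG_A", "COG_A", "COG_B",
             "COG_B", "COG_B", "COG_B", "COG_B", "COG_A"]

sens, spec = sensitivity_specificity(truth, predicted, "COG_A")
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy case four of the five true `COG_A` proteins are recovered and four of the five others are correctly excluded, so both metrics sit exactly at the 80% mark.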
