Chen Y.,Hunan Agricultural University | Zhou W.,Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests | Wang H.,Kansas State University | Yuan Z.,Hunan Agricultural University
Medical and Biological Engineering and Computing | Year: 2015

Protein glycosylation is one of the most important and complex post-translational modifications, and it provides greater proteomic diversity than any other post-translational modification. Fast and reliable computational methods to identify glycosylation sites are in great demand. Two key issues, feature encoding and feature selection, can critically affect the accuracy of a computational method. We present a new method for predicting O-glycosylation sites using only amino acid sequence information. The method includes the following components: (1) on the basis of multi-scale theory, features based on the multi-scale composition of amino acids are extracted from training sequences with identified glycosylation sites; (2) a two-stage feature selection removes features that have adverse effects on the prediction, consisting of a preliminary first-stage filtering with Student’s t test and a second-stage screening through iterative elimination using novel pairwise comparisons conducted in random subspaces with a support vector machine. The important features retained are used to build the prediction model. The method is evaluated with sequence-based tenfold cross-validation tests on balanced datasets. The results of our experiments show that our method significantly outperforms those reported in the literature in terms of sensitivity, specificity, accuracy, and Matthews correlation coefficient. The prediction accuracy for serine (S) and threonine (T) sites reached 95.7% and 92.7%, respectively. The Matthews correlation coefficient of our method for S and T sites is 0.914 and 0.873, respectively. The method evaluates each feature together with its interactions with the remaining features still included in the model, and has the advantage of high efficiency. © 2015, International Federation for Medical and Biological Engineering.
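The four evaluation measures named in this abstract follow from the counts of a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, accuracy and MCC from confusion counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # The MCC denominator is zero when any row or column of the matrix is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts on a balanced set of 100 positive and 100 negative sites.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
```

On a balanced dataset, accuracy is the mean of sensitivity and specificity, while MCC also penalizes an uneven split between false positives and false negatives.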


Qian G.,Hunan Provincial Key Laboratory of Crop Germplasm Innovation and Utilization | Qian G.,Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests | Wang H.-Y.,Kansas State University | Yuan Z.-M.,Hunan Provincial Key Laboratory of Crop Germplasm Innovation and Utilization | Yuan Z.-M.,Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests
Progress in Biochemistry and Biophysics | Year: 2012

The β-turn is a type of protein secondary structure that is important in protein folding, protein stability and molecular recognition processes. To date, various methods have been put forward to predict β-turns, but none of them has tried to map the structures of pre-existing homologues from structural databases such as RCSB PDB directly onto the protein to be predicted. Given the large size of PDB (>70 000 structures), it is quite likely that a structural homologue can be found for a newly identified sequence. In this work, we present a new method that predicts β-turns by combining homology information extracted from PDB with the results predicted by NetTurnP. Two datasets, the golden set BT426 and the self-constructed dataset EVA937, are used to assess our method. For each sequence in both datasets, only homologues deposited in PDB earlier than the sequence itself are employed. We achieved Matthews correlation coefficients (MCCs) of 0.56 and 0.52, respectively, which are higher than the 0.50 and 0.46 obtained by NetTurnP alone. The prediction accuracies (Q total) obtained using our method are 81.4% and 80.4%, respectively, while NetTurnP alone achieves 78.2% and 77.3%. The results confirm that combining homology information with state-of-the-art β-turn predictors such as NetTurnP can significantly improve the prediction accuracy. A Java program called BTMapping has been written to implement our method; it is freely available at http://www.bio530.weebly.com together with the related datasets.


Wang X.,Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests | Wang X.,Hunan Agricultural University | Wang X.,Kyushu University | Wang M.,South China Agricultural University | And 6 more authors.
Zootaxa | Year: 2015

Seventy-seven species of the family Bombycidae s. lat., belonging to 25 genera in three subfamilies, that have been recorded from China are listed and described, with illustrations of the adults, preimaginal stages (if available), and their genitalia. Keys to subfamilies and genera are provided. Two new genera and four new species are described, two subgenera are raised to generic status, seven new combinations are made, and one genus and six species are newly recorded from China. The new taxa are as follows: Rotunda Wang, X. & Zolotuhin, gen. nov., Comparmustilia Wang, X. & Zolotuhin, gen. nov., Triuncina daii Wang, X. & Zolotuhin, sp. nov., Triuncina xiongi Wang, X. & Zolotuhin, sp. nov., Gnathocinara boi Wang, X. & Zolotuhin, sp. nov. and Promustilia yajiangensis Wang, X. & Zolotuhin, sp. nov. The taxa newly recorded for China are: Sesquiluna Forbes, 1955; Trilocha friedeli Dierl, 1978; Bivincula kalikotei Dierl, 1978; Sesquiluna forbesi Zolotuhin & Witt, 2009; Mustilizans lepusa Zolotuhin, 2007; Smerkata brechlini (Zolotuhin, 2007) and Mustilia castanea Moore, 1879. The seven new combinations are: Rotunda rotundapex (Miyata & Kishida, 1990), comb. nov., Triuncina nitida (Chu & Wang, L.Y., 1993), comb. nov., Gunda sesostris (Vuillot, 1893), comb. nov., Smerkata fusca (Kishida, 1993), comb. nov., Comparmustilia sphingiformis (Moore, 1879), comb. nov., Comparmustilia semiravida (Yang, 1995), comb. nov., Comparmustilia gerontica (West, 1932), comb. nov. The two subgenera raised to generic level are: Promustilia Zolotuhin, 2007, stat. nov. and Smerkata Zolotuhin, 2007, stat. nov. The distributions of the species in China were determined and distributional maps are provided.
All type specimens of the new species described here are deposited in the College of Plant Protection, Hunan Agricultural University, China (HUNAU); Department of Entomology, South China Agricultural University, China (SCAU); Kyushu University Museum, Kyushu University, Japan (KUM), and Entomological Museum Thomas J. Witt, Munich, Germany (MWM). © 2015 Magnolia Press.


Xie Y.G.,Hunan Provincial Key Laboratory of Crop Germplasm Innovation and Utilization | Xie Y.G.,Hunan Agricultural University | Zhang H.Y.,Hunan Provincial Key Laboratory of Crop Germplasm Innovation and Utilization | Zhang H.Y.,Hunan Agricultural University | And 6 more authors.
Bulgarian Journal of Agricultural Science | Year: 2013

This paper proposes a method that applies a geostatistics tool (GS) for fast and adequate order determination and introduces a novel algorithm, named Reasonable Sample Rejection (RSR), for rational sample selection. Combined with Support Vector Machine Regression (SVR), these yield a high-precision non-linear prediction method for multidimensional time series, named GS-RSR-SVR. The main steps of the method are: 1) determine the order for the dependent variable of the training samples based on the one-dimensional GS aftereffect duration (range); 2) screen the independent variables by Leave-One-Out Cross Validation (LOOCV), based on the minimum Mean Squared Error (MSE); 3) reject some of the oldest training samples based on the minimum correlation coefficient of the absolute relative fitting error of training sets of different rejected sizes and the sample number. Three real-world datasets were used to test the effectiveness of GS-RSR-SVR. The results show that GS-RSR-SVR has higher prediction precision and more stable prediction ability than MLR, ARIMA, CAR, BPNN, SVR and SVR-CAR.
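The LOOCV screening of independent variables by minimum MSE can be sketched as follows. This is a simplified illustration only: a one-variable least-squares line stands in for the SVR used in the paper, only single candidate variables are compared, and all names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (a stand-in model, not SVR)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return mean_y - slope * mean_x, slope


def loocv_mse(xs, ys):
    """Mean squared error over leave-one-out folds."""
    squared_errors = []
    for i in range(len(xs)):
        # Hold out sample i, fit on the rest, then predict the held-out value.
        intercept, slope = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        squared_errors.append((ys[i] - (intercept + slope * xs[i])) ** 2)
    return sum(squared_errors) / len(squared_errors)


def pick_best_variable(candidates, ys):
    """Return the candidate with the smallest LOOCV MSE, plus all scores."""
    scores = {name: loocv_mse(xs, ys) for name, xs in candidates.items()}
    return min(scores, key=scores.get), scores


# Toy data: one candidate tracks y closely, the other does not.
y = [2.1, 3.9, 6.2, 8.0, 9.9, 12.1]
candidates = {
    "x_informative": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "x_noise": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0],
}
best, scores = pick_best_variable(candidates, y)
```

A fuller version would score subsets of variables rather than single ones and would refit the actual regression model in every fold.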


Zhang H.,Hunan Provincial Key Laboratory of Crop Germplasm Innovation and Utilization | Zhang H.,Hunan Agricultural University | Zhang H.,Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests | Li L.,Hunan Provincial Key Laboratory of Crop Germplasm Innovation and Utilization | And 10 more authors.
BioMed Research International | Year: 2014

In efforts to discover disease mechanisms and improve clinical diagnosis of tumors, it is useful to mine profiles for informative genes with definite biological meanings and to build robust classifiers with high precision. In this study, we developed a new method for tumor-gene selection, the Chi-square test-based integrated rank gene and direct classifier (χ²-IRG-DC). First, we obtained the weighted integrated rank of gene importance from chi-square tests of single and pairwise gene interactions. Then, we sequentially introduced the ranked genes and removed redundant genes by using leave-one-out cross-validation of the chi-square test-based Direct Classifier (χ²-DC) within the training set to obtain informative genes. Finally, we determined the accuracy of independent test data by utilizing the genes obtained above with χ²-DC. Furthermore, we analyzed the robustness of χ²-IRG-DC by comparing the generalization performance of different models, the efficiency of different feature-selection methods, and the accuracy of different classifiers. An independent test of ten multiclass tumor gene-expression datasets showed that χ²-IRG-DC could efficiently control overfitting and had higher generalization performance. The informative genes selected by χ²-IRG-DC could dramatically improve the independent test precision of other classifiers; meanwhile, the informative genes selected by other feature selection methods also had good performance in χ²-DC. © 2014 Hongyan Zhang et al.
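As a rough illustration of ranking genes by a chi-square test, the sketch below computes the Pearson chi-square statistic between a discretized expression level and the class label, then sorts genes by that score. It covers only the single-gene part of the ranking; the pairwise-interaction terms, the weighting, and the χ²-DC classifier are not reproduced. The discretization into "hi"/"lo" and all names and data are assumptions made for the example.

```python
from collections import Counter


def chi_square(feature, labels):
    """Pearson chi-square statistic for two categorical sequences."""
    n = len(labels)
    row_totals = Counter(feature)
    col_totals = Counter(labels)
    observed = Counter(zip(feature, labels))  # missing cells count as 0
    statistic = 0.0
    for level in row_totals:
        for cls in col_totals:
            expected = row_totals[level] * col_totals[cls] / n
            statistic += (observed[(level, cls)] - expected) ** 2 / expected
    return statistic


def rank_genes(genes, labels):
    """Sort gene names by decreasing chi-square score against the labels."""
    scores = {name: chi_square(values, labels) for name, values in genes.items()}
    return sorted(scores, key=scores.get, reverse=True), scores


# Hypothetical discretized expression levels for six samples in two classes.
labels = ["A", "A", "A", "B", "B", "B"]
genes = {
    "gene1": ["hi", "hi", "hi", "lo", "lo", "lo"],
    "gene2": ["hi", "lo", "hi", "lo", "hi", "lo"],
}
ranking, scores = rank_genes(genes, labels)
```

A gene whose levels separate the classes perfectly reaches the maximum score for a 2×2 table, which equals the number of samples.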
