Source Information
Abstract
English
Cell membrane proteins play a vital role in influencing the behavior of cells.
Knowing the type of a membrane protein facilitates the determination of its function, which has
implications for numerous applications, including drug design. Because of the growing
number of uncharacterized proteins in databanks such as NCBI’s RefSeq, there is strong
demand to replace time-consuming and costly experimental methods for membrane protein type
classification with computational methods. This paper introduces a new computational
method that accurately predicts the type of unclassified membrane proteins based on their
sequence. Our method is based on a novel representation of protein sequences that
incorporates seven different feature sets. An empirical comparison with twelve
competing methods shows that, compared with the best existing computational method, the
presented method reduces the error rate by 8% on the jackknife test and by 28% on an
independent dataset. We also
show that the most influential sources of information for making the predictions include the
composition of 2-gram exchange groups and the amino acid composition of the underlying
sequence.
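
As an illustration of the two feature sets named above, the following is a minimal sketch of how amino acid composition and 2-gram exchange group composition might be computed from a sequence. The abstract does not define the exchange groups, so the six-group partition used here (a commonly used Dayhoff-style grouping) is an assumption, as are the function names and the labels `e1` to `e6`; the paper's actual definitions and normalization may differ.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# ASSUMPTION: a commonly used six-group partition of the 20 amino acids.
# The abstract does not state which exchange groups the paper uses.
EXCHANGE_GROUPS = {
    "e1": "HRK",
    "e2": "DENQ",
    "e3": "C",
    "e4": "STPAG",
    "e5": "MILV",
    "e6": "FYW",
}
RESIDUE_TO_GROUP = {
    aa: group for group, members in EXCHANGE_GROUPS.items() for aa in members
}


def amino_acid_composition(seq):
    """Return the fraction of each of the 20 standard amino acids in seq."""
    residues = [aa for aa in seq.upper() if aa in RESIDUE_TO_GROUP]
    counts = Counter(residues)
    total = len(residues)
    return {aa: (counts[aa] / total if total else 0.0) for aa in AMINO_ACIDS}


def exchange_group_2gram_composition(seq):
    """Return the fraction of each ordered pair of adjacent exchange groups."""
    # Map residues to group labels, skipping non-standard symbols.
    groups = [RESIDUE_TO_GROUP[aa] for aa in seq.upper() if aa in RESIDUE_TO_GROUP]
    pair_counts = Counter(zip(groups, groups[1:]))
    total = max(len(groups) - 1, 0)
    return {
        a + b: (pair_counts[(a, b)] / total if total else 0.0)
        for a, b in product(EXCHANGE_GROUPS, repeat=2)
    }


if __name__ == "__main__":
    example = "MKKC"  # hypothetical toy sequence, not taken from the paper
    print(amino_acid_composition(example)["K"])
    print(exchange_group_2gram_composition(example)["e1e1"])
```

Under this assumed grouping, a sequence yields a 20-dimensional composition vector and a 36-dimensional 2-gram vector, which could be concatenated with other feature sets before classification.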
Table of Contents
1. Introduction
2. Methodology
2.1 Data
2.2 Feature-based Sequence Representation
2.3 Design of the Proposed Prediction Method
2.4 K* Classifier
3. Results and Discussion
3.1 Experimental Setup
3.2 Experimental Evaluation of the Proposed Method
3.3 Comparison with Competing Methods
3.4 Evaluation of Feature Sets
4. Conclusions
References