The spectrum kernel: A string kernel for SVM protein
classification
Christina Leslie, Eleazar Eskin and William Stafford Noble
Proceedings of the Pacific Symposium on Biocomputing, 2002.
pp. 564-575.
Abstract
We introduce a new sequence-similarity kernel, the spectrum kernel,
for use with support vector machines (SVMs) in a discriminative
approach to the protein classification problem. Our kernel is
conceptually simple and efficient to compute and, in experiments on
the SCOP database, performs well in comparison with state-of-the-art
methods for homology detection. Moreover, our method produces an SVM
classifier that allows linear-time classification of test sequences.
Our experiments provide evidence that string-based kernels, in
conjunction with SVMs, could offer a viable and computationally
efficient alternative to other methods of protein classification and
homology detection.
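
The abstract does not spell out the kernel's definition. As a minimal sketch, assuming the usual formulation in which the k-spectrum of a sequence is the multiset of its contiguous length-k substrings (k-mers) and the kernel value is the inner product of the two k-mer count vectors, it can be written in a few lines of Python. The function names `spectrum` and `spectrum_kernel` and the default `k=3` are illustrative choices, not taken from the paper, and this dictionary-based version is not the authors' implementation.

```python
from collections import Counter


def spectrum(seq, k):
    """Count every contiguous length-k substring (k-mer) of seq.

    Sequences shorter than k yield an empty Counter.
    """
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(x, y, k=3):
    """Inner product of the k-mer count vectors of x and y.

    Assumed definition: K(x, y) = sum over k-mers a of
    count_x(a) * count_y(a). k=3 is an arbitrary illustrative default.
    """
    sx, sy = spectrum(x, k), spectrum(y, k)
    # Iterate over the smaller spectrum; k-mers absent from sy count as 0.
    if len(sx) > len(sy):
        sx, sy = sy, sx
    return sum(count * sy[kmer] for kmer, count in sx.items())
```

For example, `spectrum_kernel("ABAB", "BABA", k=2)` compares the counts {AB: 2, BA: 1} with {BA: 2, AB: 1}. A matrix of such values over a training set could be supplied to any SVM implementation that accepts a precomputed kernel.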