Meta-MEME: Motif-based Hidden Markov Models of Protein Families
William N. Grundy
Timothy L. Bailey
Charles P. Elkan
Michael E. Baker
Computer Applications in the Biosciences (CABIOS), 13(4):397-406, 1997.
Abstract
Modeling families of related biological sequences using hidden Markov
models (HMMs), although increasingly widespread, faces at least one
major problem: because of the complexity of these mathematical models,
they require a relatively large training set in order to accurately
recognize a given family. For families in which there are few known
sequences, a standard linear HMM contains too many parameters to be
trained adequately. This work attempts to solve that problem by
generating smaller HMMs which precisely model only the conserved
regions of the family. These HMMs are constructed from motif models
generated by the EM algorithm using the MEME software. Because
motif-based HMMs have relatively few parameters, they can be trained
using smaller data sets. Studies of short-chain alcohol
dehydrogenases and 4Fe-4S ferredoxins support the claim that
motif-based HMMs exhibit increased sensitivity and selectivity in
database searches, especially when training sets contain few
sequences.
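
The parameter-count argument in the abstract can be illustrated with a rough calculation. The sketch below is not taken from the paper: the state layouts, the per-position transition count, the sequence length of 250, and the motif widths are all hypothetical assumptions chosen only to show the order-of-magnitude difference between a linear HMM spanning a whole sequence and an HMM that models only a few conserved motifs.

```python
from typing import Sequence

ALPHABET = 20  # number of amino acids


def linear_hmm_params(length: int) -> int:
    """Approximate free parameters of a profile-style linear HMM.

    Assumption (illustrative only): every position has a match state and
    an insert state, each with 19 free emission parameters, plus 6 free
    transition parameters per position.
    """
    emissions = length * 2 * (ALPHABET - 1)
    transitions = length * 6
    return emissions + transitions


def motif_hmm_params(motif_widths: Sequence[int]) -> int:
    """Approximate free parameters of a motif-based HMM.

    Assumption (illustrative only): one match state per motif column with
    19 free emission parameters, and one spacer state before, between,
    and after the motifs, each with a single free self-loop probability
    and emissions fixed to a background distribution.
    """
    emissions = sum(motif_widths) * (ALPHABET - 1)
    spacers = len(motif_widths) + 1
    return emissions + spacers


# Hypothetical family: 250-residue proteins with three conserved motifs.
full = linear_hmm_params(250)
motif = motif_hmm_params([20, 15, 12])
print(full, motif)  # 11000 897
```

Under these assumptions the motif-based model has roughly a twelfth as many free parameters, which is the intuition behind training it on a smaller set of sequences.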