HMMER
User's Guide
hmmsearch [options] hmmfile seqfile

hmmsearch reads an HMM from hmmfile and searches seqfile for significantly
similar sequence matches.
seqfile will be looked for first in the current working directory,
then in a directory named by the environment variable BLASTDB. This lets
users use existing BLAST databases, if BLAST has been configured for the
site.
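
The lookup order described above can be sketched in a few lines of Python. This is only an illustration of the search order, not HMMER's own code; the function name resolve_seqfile and its env parameter are hypothetical.

```python
import os

def resolve_seqfile(seqfile, env=None):
    """Return the first existing path for seqfile, or None if not found.

    Hypothetical helper: tries the current working directory first,
    then the directory named by the BLASTDB environment variable.
    """
    env = os.environ if env is None else env
    candidates = [seqfile]  # a relative path is resolved against the cwd
    blastdb = env.get("BLASTDB")
    if blastdb:
        candidates.append(os.path.join(blastdb, seqfile))
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```

A file present in the working directory therefore shadows a file of the same name in the BLASTDB directory.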
hmmsearch may take minutes or even hours to run, depending on the
size of the sequence database. It is a good idea to redirect the output
to a file.
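
For scripted runs, one way to capture the output in a file is to launch hmmsearch from Python and point its standard output at an open file. The sketch below assumes hmmsearch is on the PATH; the helper names build_command and run_to_file are hypothetical, as are any file names passed to them.

```python
import subprocess

def build_command(hmmfile, seqfile, options=()):
    """Assemble the argument list: hmmsearch [options] hmmfile seqfile."""
    return ["hmmsearch", *options, hmmfile, seqfile]

def run_to_file(hmmfile, seqfile, outfile, options=()):
    """Run hmmsearch and write its standard output to outfile.

    Raises subprocess.CalledProcessError if hmmsearch exits non-zero.
    """
    with open(outfile, "w") as out:
        return subprocess.run(
            build_command(hmmfile, seqfile, options),
            stdout=out,
            check=True,
        )
```

For example, run_to_file("my.hmm", "my_db.fa", "search.out", ["-E", "0.1"]) would have the same effect as redirecting the output with the shell.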
The output consists of four sections: a ranked list of the
best scoring sequences, a ranked list of the best scoring domains, alignments
for all the best scoring domains, and a histogram of the scores. A sequence
score may be higher than a domain score for the same sequence if there
is more than one domain in the sequence; the sequence score takes into
account all the domains. All sequences that pass the -E and -T cutoffs
are shown in the first list; every domain found in those sequences is then
shown in the second list of domain hits. If desired, E-value and bit score
thresholds may also be applied to the domain list using the --domE and --domT
options.
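
The two-stage reporting logic can be sketched as follows. This is a simplified illustration, not HMMER's implementation: the SeqHit and Domain classes and the report function are hypothetical, and treating a value exactly equal to a cutoff as passing is an assumption. The default arguments mirror the documented defaults of the per-sequence and per-domain cutoffs.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Domain:
    score: float   # bit score of this domain
    evalue: float  # E-value of this domain

@dataclass
class SeqHit:
    name: str
    score: float   # per-sequence bit score (accounts for all domains)
    evalue: float  # per-sequence E-value
    domains: list = field(default_factory=list)

def report(hits, E=10.0, T=-math.inf, domE=math.inf, domT=-math.inf):
    """Return (sequence list, domain list), each ranked by bit score."""
    # Stage 1: per-sequence cutoffs decide which sequences are listed.
    seq_list = sorted(
        (h for h in hits if h.evalue <= E and h.score >= T),
        key=lambda h: h.score,
        reverse=True,
    )
    # Stage 2: only domains of listed sequences are candidates; the
    # per-domain cutoffs are then applied to them.
    dom_list = sorted(
        ((h.name, d) for h in seq_list for d in h.domains
         if d.evalue <= domE and d.score >= domT),
        key=lambda pair: pair[1].score,
        reverse=True,
    )
    return seq_list, dom_list
```

With the default per-domain cutoffs, every domain of a listed sequence appears in the second list, which keeps the two lists consistent with each other.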
- [-h ] Print brief help; includes version number and summary
of all options, including expert options.
- [-A <n> ] Limits the alignment output
to the <n> best scoring domains. -A0 shuts off the alignment output and can
be used to reduce the size of output files.
- [-E <x> ] Set the E-value cutoff
for the per-sequence ranked hit list to <x>, where <x> is a positive real
number. The default is 10.0. Hits with E-values better than (less than) this
threshold will be shown.
- [-T <x> ] Set the bit score cutoff for the per-sequence
ranked hit list to <x>, where <x> is a real number. The default is negative
infinity; by default, the threshold is controlled by E-value and not by
bit score. Hits with bit scores better than (greater than) this threshold
will be shown.
- [-Z <n> ] Calculate the E-value scores as if we had seen a sequence
database of <n> sequences. The default is the number of sequences seen in
your database file <seqfile>.
- [--cpu <n> ] Sets the maximum
number of CPUs that the program will run on. The default is to use all
CPUs in the machine. Overrides the HMMER_NCPU environment variable. Only
affects threaded versions of HMMER (the default on most systems).
- [--domE <x> ] Set the E-value cutoff for the per-domain ranked hit list to
<x>, where <x> is a positive real number. The default is infinity; by default, all
domains in the sequences that passed the first threshold will be reported
in the second list, so that the number of domains reported in the per-sequence
list is consistent with the number that appear in the per-domain list.
- [--domT <x> ] Set the bit score cutoff for the per-domain ranked hit list to
<x>, where <x> is a real number. The default is negative infinity; by default,
all domains in the sequences that passed the first threshold will be reported
in the second list, so that the number of domains reported in the per-sequence
list is consistent with the number that appear in the per-domain list. Important
note: only one domain in a sequence is absolutely controlled by this parameter,
or by --domE. The second and subsequent domains in a sequence have a de
facto bit score threshold of 0 because of the details of how HMMER works.
HMMER requires at least one pass through the main model per sequence;
to do more than one pass (more than one domain) the multidomain alignment
must have a better score than the single domain alignment, and hence the
extra domains must contribute positive score. See the User's Guide for more
detail.
- [--forward ] Use the Forward algorithm instead of the Viterbi algorithm
to determine the per-sequence scores. Per-domain scores are still determined
by the Viterbi algorithm. Some have argued that Forward is a more sensitive
algorithm for detecting remote sequence homologues; my experiments with
HMMER have not confirmed this, however.
- [--null2 ] Turn off the post hoc second
null model. By default, each alignment is rescored by a postprocessing
step that takes into account possible biased composition in either the
HMM or the target sequence. This is almost essential in database searches,
especially with local alignment models. There is a very small chance that
this postprocessing might remove real matches, and in these cases --null2
may improve sensitivity at the expense of reducing specificity by letting
biased composition hits through.
- [--pvm ] Run on a Parallel Virtual Machine
(PVM). The PVM must already be running. The client program hmmsearch-pvm
must be installed on all the PVM nodes. Optional PVM support must have
been compiled into HMMER.
- [--xnu ] Turn on XNU filtering of target protein
sequences. Has no effect on nucleic acid sequences. In trial experiments,
--xnu appears to perform less well than the default post hoc null2 model.
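
As an illustration of the -Z option described above, the sketch below rescales an E-value from one database size to another. It assumes that an E-value is the per-sequence P-value multiplied by the number of sequences, so that it grows linearly with database size; the function name rescale_evalue is hypothetical and is not part of HMMER.

```python
def rescale_evalue(evalue, z_searched, z_assumed):
    """Rescale an E-value from z_searched sequences to z_assumed sequences.

    Assumes E = P * Z, so the per-sequence P-value is evalue / z_searched
    and the rescaled E-value is that P-value times z_assumed.
    """
    if z_searched <= 0 or z_assumed <= 0:
        raise ValueError("database sizes must be positive")
    return evalue * z_assumed / z_searched
```

Under this assumption, a hit with E-value 0.5 in a search of 1,000 sequences would have E-value 50 if the search were scored as though 100,000 sequences had been seen, which is why -Z is useful for comparing searches against databases of different sizes.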
Direct comments and questions to <eddy@genetics.wustl.edu>