Our approach

Classic motif finding algorithms only consider sequences that are known (or believed) to contain the common patterns (call it the set T). This may lead to error in finding patterns that appear repeatedly in all the sequences, therefore leading to poor or inaccurate results. Our approach overcomes this problem by introducing an extra input to the problem  the background sequences (call it the set set F). Our algorithm search for patterns that frequently appears in the set T but rarely appear in the set F.
 Sometimes, the pattern that we are searching for may appear more than once in a DNA sequence. Our algorithm consider that the sequences with repeated patterns.
Problem
Definition
Given two sets of DNA sequences T and F, where T contain sequences that
are believed to contain the motifs and F as a set of background
sequences. Find those short patterns (motif)
occur frequently in set T but
rarely occur in set F.
To
be precise, given T and F as defined previously,
we want to find a probability matrix M (presenting a motif
pattern) and a threshold α, such that the pvalue
is minimized.
The EMlike Approach
ALSE finds
out those probability matrices and thresholds
which result in the lowest pvalue
by applying an EMlike iterations:
 Find a set of patterns (represented as
a probability matrix) with the lowest pvalues,
known as the set Seeds (say
100).
 For each M_{0} (with
corresponding threshold α_{0}) in
the set Seeds,
refine it to a matrix M_{1} (with α_{1})
such that M_{1} yieids
a lower probabilty pvalue(M_{1},α_{1})
 Continue Step 2 until there is no improvement
in the pvalue, or the maximum number of iterations
is reached.
 Output the motifs with smallest pvalues
(say 50).
