ALSE - Motif Finding Tool
Department of Computer Science, The University of Hong Kong

ALSE - An Introduction


Command Line Version of ALSE


Downloading and Compiling the Source Code

ALSE is written in C++ and can be compiled with the GCC C++ compiler, version 3.00 or above. You can get the source code from the ALSE webpage. The file is named ALSE_v????.tar.gz, where ???? stands for the version number.
In the commands below, the Unix prompt is shown as "$". Un-tar the file and compile ALSE as follows:

$ tar zxvf ALSE_v???.tar.gz
$ cd ALSE
$ make clean
$ make

If everything goes smoothly, you should get an executable file named "find_motif".

Running ALSE


    ./find_motif [OPTIONS] -t <true seq file> -f <false seq file>


  1.  <true seq file> is the full path name of the sequence file that is known to contain the motif(s). It is in FASTA format.
  2.  <false seq file> is the sequence file that consists of the background sequences, also in FASTA format.
  3.  [OPTIONS] specifies the parameters used for running ALSE. The options can be any of the following, given in any order. When an option is not specified on the command line, ALSE uses its default value.
         -m, --motif
               Length of the motif to be discovered
               (default 6)

         -r, --no_reverse
               Do not test the reversed sequence

         -O, --output
               Output HTML filename
               (default ./output.html)

         -b, --binding
               Maximum number of motifs in a sequence
               (default 2)

         -s, --seeds
               Number of seeds used for s tuning iteration
               (default 100)

         -h, --help
               Display the command and options information
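
As an illustration of how the options combine, the following Python sketch assembles a find_motif command line from the short flags listed above. The helper name build_alse_command and the file paths are hypothetical and not part of ALSE. The sketch also assumes that each option value is passed as a separate argument after its flag (for example "-m 8"), which the option list does not state explicitly.

```python
import shlex


def build_alse_command(true_fasta, false_fasta, motif_len=6, max_motifs=2,
                       seeds=100, output="./output.html", no_reverse=False,
                       binary="./find_motif"):
    """Assemble an argument list for find_motif (hypothetical helper).

    Defaults mirror the documented defaults of -m, -b, -s and -O.
    Assumption: each value follows its flag as a separate argument.
    """
    argv = [
        binary,
        "-m", str(motif_len),   # length of the motif to be discovered
        "-b", str(max_motifs),  # maximum number of motifs in a sequence
        "-s", str(seeds),       # number of seeds
        "-O", output,           # output HTML filename
    ]
    if no_reverse:
        argv.append("-r")       # do not test the reversed sequence
    argv += ["-t", true_fasta, "-f", false_fasta]
    return argv


# Placeholder paths; replace them with real FASTA files.
cmd = build_alse_command("/data/true.fa", "/data/background.fa",
                         motif_len=8, no_reverse=True)
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) once find_motif has been compiled; the sketch only prints the command so that it can be inspected first.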

Web Version of ALSE


FASTA Sequence Data Format

The FASTA file format holds multiple DNA (or protein) sequences. Each sequence begins with the character > followed by the name of the sequence; the sequence data follow on the next line. Further sequences are listed in the file in the same way.
[SEQUENCE-NAME] := A|B|...|Z | a|b|...|z | 0|1|...|9 | _ | - 
More information can be found here
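
The format described above can be checked before running ALSE with a short Python sketch like the one below. It is only an illustration, not part of ALSE: the function name parse_fasta and the example sequences are made up, the grammar line is read as the set of characters allowed in a sequence name, and sequence data spread over several lines is accepted and joined, which is an assumption based on common FASTA practice.

```python
import re

# Assumption: a name is one or more of the characters listed in the
# [SEQUENCE-NAME] rule (letters, digits, underscore, hyphen).
NAME_RE = re.compile(r"[A-Za-z0-9_-]+")


def parse_fasta(text):
    """Return a list of (name, sequence) pairs read from FASTA text."""
    records = []
    name, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
            if not NAME_RE.fullmatch(name):
                raise ValueError(f"invalid sequence name: {name!r}")
        elif name is None:
            raise ValueError("sequence data found before the first '>' line")
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


# Made-up example input with two sequences.
example = """>seq_1
ACGTTGCA
>seq-2
TTGACGTA
"""

for seq_name, seq in parse_fasta(example):
    print(seq_name, len(seq))
```

A name containing any other character, such as a space, raises ValueError in this sketch.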

Output Format

The output of ALSE is a summary, in HTML format, of the motifs discovered. A sample of the output can be found here. The page has three main sections.

  1. Program parameters
    This section shows information about the input data, including the options used, the names of the input sequences, and the total execution time.

  2. List of motifs
    This section lists the motifs found, in order of their p-values. Each motif's p-value and alpha are also displayed.

  3. Detail for each motif

