Gardenia

This page is the user manual for the program Gardenia.

Input of Gardenia

Gardenia takes as input a set of RNA sequences with secondary structure. Each sequence should be specified in bracket format.

The bracket format consists of three lines. The first line contains a FASTA-like header. The second line contains the nucleotide sequence. The last line contains the set of associated pairings, encoded by brackets and dots. A base pair between bases i and j is represented by a ( at position i and a ) at position j. Unpaired bases are represented by dots. Because the secondary structure contains no pseudoknots, this notation defines a unique folding.

     >trna E. coli
     ggggcuauagcucagcugggagagcgccugcuuugcacgcaggaggucugcgguucgaucccgcauagcuccacca 
     (((((((..((((........)))).(((((.......))))).....(((((........))))))))))))...

Edit operations and Scoring system

Edit operations may be divided into two groups: those concerning free bases and those concerning arcs between bases, that is, hydrogen bonds.

Edit operations for free bases are the usual sequence-alignment operations: deleting a base (base-deletion), substituting it (base-mismatch), or leaving it unchanged (base-match).

U A U C C        U A U C C        U A U C C
U A U C C        U A G C C        U A - C C
base-match       base-mismatch    base-deletion
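The three operations on free bases can be read off an alignment column by column. The following Python sketch shows one way to do so. The function names and the use of "-" as the gap symbol are assumptions for this example, and every base in the two rows is assumed to be unpaired.

```python
GAP = "-"  # assumed gap symbol, as in the panels above


def classify_base_column(a, b):
    """Name the edit operation for one column of two aligned free bases."""
    if a == GAP and b == GAP:
        raise ValueError("a column cannot contain two gaps")
    if a == GAP or b == GAP:
        return "base-deletion"
    # Comparison ignores case so that 'u' and 'U' count as the same base.
    return "base-match" if a.upper() == b.upper() else "base-mismatch"


def classify_free_bases(row_a, row_b):
    """Classify every column of two aligned rows of equal length."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    return [classify_base_column(a, b) for a, b in zip(row_a, row_b)]


# The three panels above, written without the spacing between bases.
for other in ("UAUCC", "UAGCC", "UA-CC"):
    print(other, classify_free_bases("UAUCC", other)[2])
```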

There are five main kinds of edit operations involving base-pairings. Let i,j be two positions in a sequence forming a base pair.

Arc-match: i,j is left untouched.

  (           )
A U C G G U A A C G
A U C G C A - A A G
  (           )
arc-match

Arc-mismatch: i,j is aligned with another base-pairing that is not identical. The cost of the edit operation depends on the number of mutations within the pairing: If only one base changes, then the cost is arc-mismatch (1). If both bases change, then the cost is arc-mismatch (2).

  (           )           (           )
A U C G G U A A C G     A U C G G U A A C G
A U C G G U A G C G     C U C G G U A G C G
  (           )           (           )
arc-mismatch 1          arc-mismatch 2

Arc-removing: i,j has no counterpart in the other sequence.

  (           )
A U C G G U A A C G
A - C G C A - - A G
arc-removing

Arc-altering: i,j is aligned with a single free base. The cost of the operation depends on the conservation of the free base: arc-altering (1) if the free base is conserved, arc-altering (2) if it is modified.

  (           )           (           )
A U C G G U A A C G     A U C G G U A A C G
A U C G G U A - C G     C C C G G U A - C G
arc-altering 1          arc-altering 2

Arc-breaking: the base pair i,j is aligned with two free bases. This operation breaks the pairing between i and j and leaves both bases free. There are three possible weights: arc-breaking (1) is for two identical bases, arc-breaking (2) is for one identical base and one modified base, and arc-breaking (3) is for two modified bases.

  (           )           (           )           (           )
A U C G G U A A C G     A U C G G U A A C G     A U C G G U A A C G
A U C G G U A A C G     C C C G G U A A C G     C C C G G U A G C G
arc-breaking 1          arc-breaking 2          arc-breaking 3
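The five arc operations can be summarised as a decision on what the two ends of a base pair are aligned with. The Python sketch below is an illustration of the definitions in this section, not Gardenia's implementation. The function classify_arc and its arguments are hypothetical. It assumes that the two aligned symbols are either paired with each other or free, and it reads arc-altering (1) as "free base conserved", following the panels above.

```python
GAP = "-"  # assumed gap symbol


def classify_arc(pair, aligned, aligned_are_paired):
    """Name the edit operation applied to one base pair.

    pair               -- the two bases (left, right) forming the arc
    aligned            -- the two symbols they are aligned with (base or gap)
    aligned_are_paired -- True if the two aligned bases form an arc themselves
    """
    left, right = pair
    other_left, other_right = aligned
    gaps = (other_left == GAP) + (other_right == GAP)

    if aligned_are_paired:
        if gaps:
            raise ValueError("a gap cannot take part in a base pair")
        changed = (left != other_left) + (right != other_right)
        return "arc-match" if changed == 0 else f"arc-mismatch ({changed})"

    if gaps == 2:
        # The arc has no counterpart at all in the other sequence.
        return "arc-removing"
    if gaps == 1:
        # Exactly one end survives as a free base; check whether it changed.
        if other_right == GAP:
            conserved = left == other_left
        else:
            conserved = right == other_right
        return "arc-altering (1)" if conserved else "arc-altering (2)"

    # Both ends are aligned with free bases: the pairing is broken.
    changed = (left != other_left) + (right != other_right)
    return f"arc-breaking ({changed + 1})"


# The U-A pair from the panels above against some of its counterparts.
print(classify_arc(("U", "A"), ("U", "G"), True))
print(classify_arc(("U", "A"), ("C", "-"), False))
print(classify_arc(("U", "A"), ("C", "G"), False))
```

Cases where an end of the arc is aligned with a base that is paired to some third position are not described in this section and are left out of the sketch.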

You can adjust the values of the weight of each edit operation on the web server.

Command-line version

Gardenia is written in C. The source code is freely available under the GPL license: gardenia.zip and read me.
The command-line version offers more options than the web server.

Output of Gardenia

The result is a multiple sequence alignment.

    trna coli        (((((((     ..((((........)))).(((((.......)) ))).....(((((.
    trna coli        ggggcua-----uagcucagcugggagagcgccugcuuugcacgc-aggaggucugcggu
    trna2            uccucgguaguauaguggug-aguauccgcgucugu--cacaugcgaga----cccgggu
    trna2            (((((((.......((((.. .....))))((((((  .....)).)))    )(((((.
                         *       ***    *  *     *** ***     ** ** **     *   ***

    trna coli        .......))))))))))))...
    trna coli        ucgaucccgcauagcuccacca
    trna2            ucaau-ucccggccgggga--g
    trna2            ..... .))))))))))))  .
                     ** **  * *        *   

Each helix that appears in both sequences is assigned a colour. Asterisks (*) indicate conserved positions.
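The line of asterisks can be reproduced from two aligned rows. The Python sketch below assumes that "conserved" means the same nucleotide in both rows of a column, which is consistent with the sample output above. The function name conservation_line is hypothetical, and the helix colouring is not covered.

```python
GAP = "-"  # gap symbol used in the alignment rows


def conservation_line(row_a, row_b):
    """Return a string with '*' under identical bases and ' ' elsewhere."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    return "".join(
        "*" if a == b and a != GAP else " "
        for a, b in zip(row_a, row_b)
    )


# Second block of the sample alignment above.
coli = "ucgaucccgcauagcuccacca"
trna2 = "ucaau-ucccggccgggga--g"
line = conservation_line(coli, trna2)
print(coli)
print(trna2)
print(line)
```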