This page is the user manual for the `regliss` web application.

## Submission form

**Enter the RNA sequence.**
The sequence should be in **FASTA** format.
A sequence in FASTA format consists of a
single description line followed by one or more lines of sequence
data. The description line starts with a
greater-than symbol (">") in the first
column.

Example of FASTA format:

    > Name of the sequence
    ctgcgagcgcgcgatgatagcgcggcgagcatgtagcatgctagctgtcgcgagcact
    cggccgagatcaggcgatgcatgcgcagggagcagcgagcgacgagcacagcatgcta
    gctagatgcatgctgtaggcagcgccgagagacgatggagctgc

Lower-case and upper-case letters are both accepted.
The full standard IUPAC nucleic acid code is not supported:
only the symbols `A`, `C`, `G`, `T` and `U`
are recognized.
Digits (`0` to `9`), dashes (`-`) and
dots (`.`) are not accepted.
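
The input rules above can be checked before submitting a sequence. The following is a minimal sketch, not the validation code used by the server: the function name `parse_fasta` is hypothetical, and converting the sequence to upper case is a convenience assumed here.

```python
ALLOWED = set("ACGTU")  # the only symbols the manual lists as recognized


def parse_fasta(text):
    """Return (description, sequence) for a single-record FASTA string.

    Hypothetical helper: raises ValueError when the record breaks the
    rules described in this manual.
    """
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("the description line must start with '>'")
    description = lines[0][1:].strip()
    # Sequence data may span several lines; case is ignored.
    sequence = "".join(lines[1:]).upper()
    if not sequence:
        raise ValueError("no sequence data found")
    rejected = sorted(set(sequence) - ALLOWED)
    if rejected:
        raise ValueError("unsupported symbols: " + " ".join(rejected))
    return description, sequence


name, seq = parse_fasta("> Name of the sequence\nctgcgagc\nGCGCGAUG\n")
print(name, seq)
```

Ambiguity codes such as `N`, digits, dashes and dots all end up in the `rejected` list, so the error message names the offending symbols.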

**Choose your set of helices.** `regliss` takes as input a
set of putative helices. A helix is a set of nested
base pairs. It can contain bulges or internal loops.
The set of helices can be specified explicitly by the user, or
computed automatically from the input sequence.

*Paste your own set of helices.* In this case, helices are given by the user. The description format is as follows. Each line contains a single helix, described in bracket-dot format. Positions in the 5' part of a base pair are marked with an opening bracket, and positions in the 3' part of a base pair are marked with a closing bracket, so the helix contains as many base pairs as there are pairs of matching brackets. Unpaired bases, such as those in internal loops or bulges, are marked with a dot. The location of the helix on the sequence is given by two numbers at the beginning of the line: the first is the position of the first base of the helix on the sequence, counted from 5' to 3' starting at position 1; the second is the position of the base paired with it.

For example, the following line specifies a helix starting at position 2 and ending at position 25, with 5 base pairs and one bulge at the third position:

    2 25 ((.((( )))))

This corresponds to the following helix on the sample sequence:

    ctgcgagcgcgcgatgatagctccaggcgagcatgtagcatgctagctgtcgcgagcact
     ((.(((             )))))
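
The helix description format can be turned into an explicit list of base pairs. The sketch below assumes that each line has exactly four whitespace-separated fields (two positions, the 5' part and the 3' part) and that the 3' part ends at the second position; the function name `parse_helix` is hypothetical.

```python
def parse_helix(line):
    """Return the base pairs (i, j) of one helix line, outermost pair first.

    Positions are 1-based, as in the manual.
    """
    fields = line.split()
    if len(fields) != 4:
        raise ValueError("expected: start end 5'-part 3'-part")
    start, end = int(fields[0]), int(fields[1])
    five, three = fields[2], fields[3]
    if set(five) - set("(.") or set(three) - set(")."):
        raise ValueError("unexpected character in the bracket-dot description")
    # Positions of the opening brackets, counted from the start position.
    opens = [start + k for k, char in enumerate(five) if char == "("]
    # The 3' part is anchored on its last character, which sits at `end`.
    first_three = end - len(three) + 1
    closes = [first_three + k for k, char in enumerate(three) if char == ")"]
    if not opens or len(opens) != len(closes):
        raise ValueError("opening and closing brackets do not match")
    # The outermost opening bracket pairs with the last closing bracket.
    return list(zip(opens, reversed(closes)))


print(parse_helix("2 25 ((.((( )))))"))
# [(2, 25), (3, 24), (5, 23), (6, 22), (7, 21)]
```

The dot in the 5' part leaves position 4 unpaired, which is the bulge at the third position of the helix.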

The set can contain any number of helices. Helices can be embedded or overlapping.

Example of embedded helices: the last three helices are embedded in the first one.

    3 67 ((((((( )))..))))
    4 66 (((((( )))..)))
    3 67 (((( ))))
    7 61 ((( )))
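
One way to read "embedded" is that every base pair of one helix also belongs to the other. This reading is an interpretation that happens to agree with the example, not a definition taken from `regliss`. The sketch below checks it on the four helices of the example; `helix_pairs` and `is_embedded` are hypothetical names.

```python
def helix_pairs(line):
    """Set of base pairs (i, j) described by one helix line (1-based)."""
    start, end, five, three = line.split()
    start, end = int(start), int(end)
    opens = [start + k for k, char in enumerate(five) if char == "("]
    offset = end - len(three) + 1  # position of the first 3' character
    closes = [offset + k for k, char in enumerate(three) if char == ")"]
    return set(zip(opens, reversed(closes)))


def is_embedded(inner, outer):
    """True when every base pair of `inner` is also a base pair of `outer`."""
    return helix_pairs(inner) <= helix_pairs(outer)


helices = [
    "3 67 ((((((( )))..))))",
    "4 66 (((((( )))..)))",
    "3 67 (((( ))))",
    "7 61 ((( )))",
]
print([is_embedded(h, helices[0]) for h in helices[1:]])
# [True, True, True]
```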

*Upload a file.* You can upload a text file containing a set of helices. The description format is the same as above.

*Compute helices with Unafold (Mfold).* With this option, helices are built automatically for the input sequence using the Unafold software (ref). Unafold is used to compute all suboptimal secondary structures for the input sequence, with a percent suboptimality of 100%. We then proceed as follows. We first select all non-redundant suboptimal structures from Unafold's output. A structure is considered redundant if it contains another suboptimal structure output by Unafold. We then extract all putative helices from this set of structures. Given a structure, a helix is a maximal set of nested base pairs such that any other base pair of the structure is either nested in all base pairs of the helix, or juxtaposed with all base pairs of the helix, or encloses all base pairs of the helix.
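
The two steps described above (dropping redundant structures, then cutting each structure into helices) can be sketched as follows. This is an illustration written from the definitions in this manual, not the code run by the server. Structures are assumed to be given as bracket-dot strings without pseudoknots, all function names are hypothetical, and structures without any base pair are ignored in the containment test because every structure would otherwise contain them.

```python
def base_pairs(structure):
    """Base pairs (i, j) of a bracket-dot string, with 1-based positions."""
    stack, pairs = [], []
    for pos, char in enumerate(structure, start=1):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            if not stack:
                raise ValueError("unbalanced closing bracket")
            pairs.append((stack.pop(), pos))
    if stack:
        raise ValueError("unbalanced opening bracket")
    return pairs


def non_redundant(structures):
    """Keep the structures that do not contain another structure of the list."""
    pair_sets = {s: frozenset(base_pairs(s)) for s in structures}
    return [
        s for s, pairs in pair_sets.items()
        # `other < pairs` is a strict subset test on sets of base pairs.
        if not any(other and other < pairs for other in pair_sets.values())
    ]


def extract_helices(structure):
    """Split a structure into helices: maximal chains of nested base pairs
    in which each pair directly encloses exactly one other pair."""
    pairs = sorted(base_pairs(structure))  # enclosing pairs come first
    children = {p: [] for p in pairs}
    parent, open_pairs = {}, []
    for p in pairs:
        while open_pairs and open_pairs[-1][1] < p[0]:
            open_pairs.pop()  # that pair closed before p opens
        if open_pairs:
            parent[p] = open_pairs[-1]
            children[open_pairs[-1]].append(p)
        open_pairs.append(p)
    helices = []
    for p in pairs:
        if p in parent and len(children[parent[p]]) == 1:
            continue  # p extends the helix of its enclosing pair
        helix = [p]
        while len(children[helix[-1]]) == 1:
            helix.append(children[helix[-1]][0])
        helices.append(helix)
    return helices


print(extract_helices("((..((...))..((...))..))"))
# [[(1, 24), (2, 23)], [(5, 11), (6, 10)], [(14, 20), (15, 19)]]
```

A pair that directly encloses two or more pairs ends its helix, since those inner pairs are juxtaposed with each other. Bulges and internal loops do not break a helix, because they involve only unpaired bases.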

*Compute helices with a nearest neighbor model.* Helices are computed with a custom program that finds all helices without bulges or internal loops under the nearest neighbor energy model.
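
To give an idea of what a helix without bulges or internal loops looks like computationally, the sketch below lists all maximal runs of stacked canonical base pairs. It is a deliberate simplification and not the custom program mentioned above: no nearest neighbor energies are evaluated, and the pairing rules (Watson-Crick and G-U pairs), the minimum helix length `min_len` and the minimum loop size `min_loop` are assumptions made for illustration.

```python
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(seq, i, j, min_loop):
    """True when 0-based positions i < j can pair and leave room for a loop."""
    return j - i - 1 >= min_loop and (seq[i], seq[j]) in CANONICAL


def perfect_helices(sequence, min_len=3, min_loop=3):
    """Return (start, end, length) for every maximal stack of canonical pairs.

    start and end are the 1-based positions of the outermost base pair.
    """
    seq = sequence.upper().replace("T", "U")
    n = len(seq)
    found = []
    for i in range(n):
        for j in range(i + 1, n):
            if not can_pair(seq, i, j, min_loop):
                continue
            # Skip pairs that merely continue a longer stack outwards.
            if i > 0 and j + 1 < n and can_pair(seq, i - 1, j + 1, min_loop):
                continue
            length = 0
            while can_pair(seq, i + length, j - length, min_loop):
                length += 1
            if length >= min_len:
                found.append((i + 1, j + 1, length))
    return found


print(perfect_helices("GGGAAAACCC"))
# [(1, 10, 3)]
```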

**Paste the helices that must be in all structures.** It is
possible to force some base pairs to occur in all structures computed
by `regliss`.
This field is optional.
By default, `regliss` outputs all locally optimal secondary structures.

**Select the output size.**
`regliss` computes all locally optimal secondary structures of the input sequence.
This usually includes a very large number of structures, which are
sorted according to the free energy value given by
the `RNAeval` program from the Vienna RNA
package.
It is possible to limit the amount of output data either by giving a
percent suboptimality (from 1 to 100) or by giving the maximal
number of structures to return. By default, `regliss` outputs
all structures within a 20%
suboptimality range.
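
The effect of the two limits can be illustrated with a short sketch. The manual does not give the exact formula behind "percent suboptimality"; the code assumes the usual Mfold-style reading, in which a structure is kept when its free energy lies within the given percentage of the lowest free energy. The function name `select_structures` is hypothetical, and the energies are those of the example in the Results section.

```python
def select_structures(structures, percent=None, max_count=None):
    """Filter (name, energy) tuples, lowest free energy first.

    percent:   keep structures within this percentage of the best energy
               (assumed definition, see the text above).
    max_count: keep at most this many structures.
    """
    ranked = sorted(structures, key=lambda item: item[1])
    if percent is not None and ranked:
        best = ranked[0][1]
        threshold = best + abs(best) * percent / 100.0
        ranked = [item for item in ranked if item[1] <= threshold]
    if max_count is not None:
        ranked = ranked[:max_count]
    return ranked


energies = [-30.0, -27.7, -27.15, -26.8, -25.7, -25.53, -24.85, -24.6, -23.23, -22.34]
structures = [("structure%d" % (k + 1), e) for k, e in enumerate(energies)]

# With the default 20% range, the threshold is -24.0 under this assumption.
print(len(select_structures(structures, percent=20)))
# 8
```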

## Results

All structures found by `regliss` are displayed in a table,
with one color for each helix.
Each line contains a structure in bracket-dot format.
At the end of the line, the free energy of the structure computed by
RNAeval is given. Structures are sorted by increasing free energy.

                 GGUCCCGUAGCUCAGUUGGUUAGAGCGUUGGUCUUAUGAGCCGAAGGUCGCGGGUUCGAGCCCCGCCGGGACCA
    structure1   (((((((..((((.........)))).(((((.......))))).....(((((.......)))))))))))). (-30.0)
    structure2   (((((((..((((.........)))).(((((.......))))).......((((....))))...))))))). (-27.7)
    structure3   (((((((..(((((......................)))))........(((((.......)))))))))))). (-27.15)
    structure4   (((((((..((((.........)))).............(((...))).(((((.......)))))))))))). (-26.8)
    structure5   (((((((..((((.........))))..........((((((..........))))))........))))))). (-25.7)
    structure6   (((((((........((((((.................)))))).....(((((.......)))))))))))). (-25.53)
    structure7   (((((((..(((((......................)))))..........((((....))))...))))))). (-24.85)
    structure8   (((((((..((((.........)))).............(((...)))...((((....))))...))))))). (-24.6)
    structure9   (((((((........((((((.................)))))).......((((....))))...))))))). (-23.23)
    structure10  (((((((..........(((((((((.(((((.......))))).........)))))))))....))))))). (-22.34)

**Output files.**
For each structure the result is stored in four formats: CT, JPEG, PS, and bracket notation.
The JPEG and PS files contain 2D drawings of the structure, both produced automatically from the CT file
using NAview.

*CT (Connectivity Table) format:*
This is a text file which contains the nucleic acid sequence
and base pairing information, such as produced by Mfold.

*Bracket-dot notation:* This is a FASTA-like format which contains
both the primary and the secondary structure.
The first line contains the heading. The second line contains
the sequence. The third line contains the set of base pairs encoded by
brackets and dots. A base pair between *i* and *j* is
represented by a *(* at position *i* and a *)* at
position *j*. Unpaired bases are represented by dots.
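
A bracket-dot file can be read back into a list of base pairs with a simple stack. The sketch below assumes the three-line layout described above, with a heading that may start with ">", and a structure without pseudoknots; the function name `read_bracket_dot` is hypothetical.

```python
def read_bracket_dot(text):
    """Return (heading, sequence, pairs) from a three-line bracket-dot record.

    pairs is a sorted list of (i, j) tuples with 1-based positions.
    """
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if len(lines) != 3:
        raise ValueError("expected a heading, a sequence and a structure line")
    heading, sequence, structure = lines
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    stack, pairs = [], []
    for pos, char in enumerate(structure, start=1):
        if char == "(":
            stack.append(pos)  # remember the 5' partner
        elif char == ")":
            if not stack:
                raise ValueError("unbalanced closing bracket")
            pairs.append((stack.pop(), pos))
        elif char != ".":
            raise ValueError("unexpected character: " + char)
    if stack:
        raise ValueError("unbalanced opening bracket")
    return heading.lstrip(">").strip(), sequence, sorted(pairs)


print(read_bracket_dot("> demo\nGGGAAACCC\n(((...)))\n"))
# ('demo', 'GGGAAACCC', [(1, 9), (2, 8), (3, 7)])
```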

**Set of input helices.** This section summarizes the set of input
helices that have been used to construct all locally optimal secondary structures.

It is also possible to download a zip archive storing all result files (ct, ps, jpeg, bracket notation and helix file).

## Energy landscape graph

The *energy landscape graph* gives an insight into the full energy
landscape of the RNA sequence.
In this graph, each vertex represents a locally optimal structure.
Two structures are in the same neighborhood if they differ by at most
two stems. In this case, an edge is put between the two structures.
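
The neighborhood relation can be sketched in a few lines. The code assumes that each structure is described by the set of helices (stems) it contains and that "differ by at most two stems" means that at most two stems belong to one structure but not to the other. Both points are interpretations, and the function name and helix labels are hypothetical.

```python
from itertools import combinations


def landscape_edges(structures, max_diff=2):
    """Edges of the energy landscape graph.

    structures: dict mapping a structure name to its set of helix labels.
    Two structures are linked when their helix sets differ by at most
    `max_diff` elements (size of the symmetric difference).
    """
    edges = []
    for a, b in combinations(sorted(structures), 2):
        if len(structures[a] ^ structures[b]) <= max_diff:
            edges.append((a, b))
    return edges


example = {
    "s1": {"h1", "h2", "h3"},
    "s2": {"h1", "h2", "h4"},  # swaps h3 for h4: two stems differ from s1
    "s3": {"h1", "h5", "h6"},  # four stems differ from s1 and from s2
}
print(landscape_edges(example))
# [('s1', 's2')]
```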

You can explore the graph and visualize structures by clicking on the vertices. You can also zoom in and out with the + and - buttons. When you click on a vertex, the 2D representation of the structure appears in a new window, and the bracket notation of the structure is displayed at the top of the graph. The structure window can be dragged to a new position, and it can be enlarged or reduced.

The graph viewer is based on the **SVG** format. Therefore, to visualize the graph, you need to use a
web browser which is compatible with SVG.

## Retrieve result with an ID

Each job is assigned an identifier that can be used to retrieve the folding results. Files are stored for 24 hours after job submission.