Combinatorial permutation based algorithm for representation of closed RNA secondary structures

A permutation-based algorithm is introduced for the representation of closed RNA secondary structures. It is an efficient ‘loopless’ algorithm, which generates the permutations on base-pairs of ‘k-noncrossing’ setting partitions. The proposed algorithm reduces the computational complexity of known similar techniques in O(n), using minimal change ordering and transposing of not adjacent elements.

evolutionary algorithms which attempt to mimic a natural folding pathway by using a populations based approach [5]. Nowadays, an increasing number of researchers have released novel RNA structure analysis and prediction algorithms for comparative approaches to structure prediction. Their approaches are based on the fact that closed RNA structures can be viewed as mathematical objects obtained by abstracting topologically non-relevant properties of planar folding of single-stranded nucleic acids. These algorithms require significant computational resources and thus are impractical for sequences of even modest length.
From the biological view, the RNA's structure is dominated by base-pairing interactions, most of which are Watson-Crick pairs between complementary bases. The base-paired structure of RNA is called, its secondary structure. Due to the fact that Watson-Crick pairs are such a stereotyped and relatively simple interaction, accurate RNA secondary structure prediction appears to be an achievable goal. RNA secondary structures (Figure 1) folding cooperatively allow the creations of pseudoknot free secondary structures, where no base pairs overlap, that is there are no pair of bases (i, j) and (i', j') with i < i' < j' < j. In literature [6] except hairpin and interior loops we can also find definitions for bans, multiloops, external loops,

Methodology:
In this case consideration will be given to the surveys of Trotter for the generation of specific permutations by transposing pairs of elements, using a recursive procedure.

K-noncrossing closed RNA structures:
Closed RNA secondary structure is represented as knoncrossing set of partitions, which corresponds to the basepairs and no base-pairs respectively. A (set) partition of [2n] is a collection of disjoint subsets on [2n], representing a 2n union (Figure 1a). Each element of a partition is called a block. A (complete) matching on [2n] = {1, 2, 2n} can be represented by listing its 2n blocks, as {(i1, j1), (i2, j2),…, (i2n, j2n)}, where ir < jr for 1 ≤ r ≤ n. Two blocks (also called arcs) (i, j) and (i', j') form a crossing if i < i' < j < j', and a nesting if i < i' < j' < j. It is wellknown that the number of matching's on [2n] with no crossings (or with no nestings) is given by the n-th Catalan number. Let π2n denote the set of partitions of [2n] and a diagram π ε π2n. A k-distant (k is a nonnegative integer) crossing of π is a pair of edges (i, j) and (i', j') of π satisfying i < i' < j < j' and j < i' ≥ k. A k-distant nesting of π is a set of two edges (i, j) and (i', j') of π satisfying i < i' ≤ j' < j and j < i' ≥ k. A partition or matching π is k-distant noncrossing if π has no k-distant crossing and k-distant non-nesting if π has no k-distant nesting.

Generating Permutations:
Our case study includes all the numbers that begin with 1 and have unique alternate even-odd digits. The problem mainly concerns the quick development of a special set of permutations G2n, rather than the common n! permutations of the first n components. These alternative permutations can be defined in the form as shown in supplementary material.

Discussion:
RNA pseudoknot structures can be categorized in terms of the maximal size of sets of mutually crossing bonds. A knoncrossing RNA structure has at most k-1 mutually crossing bonds and a minimum bond-length of 2, i.e., for any i, the nucleotides i and i+1 cannot form a bond. According to this formulation, a k-noncrossing RNA structure can be represented as a digraph in which all vertices have degree, that does not contain a k-set of mutually intersecting arcs and 1-arcs, i.e. arcs of the form (i, i+1), respectively [9]. Furthermore, RNA secondary structure is often assumed to be sufficient for being able to predict the RNA function. This assumption can be justified by observations of well conserved secondary structures and the fact that secondary structures fold fast, while tertiary interactions need much more time to form [10]. The fact that it is possible to predict secondary structures using nearestneighbour parameters [11] also suggests that secondary structure contributes much more to the stability of the RNA structure, than the tertiary interactions.
Moreover, an imperative algorithm for generating combinatorial objects is called loopless, if for every set of n elements the number of steps needed to generate the first object is less than O(n), the decision whether an object is the last is obtained within O(1) steps and every transition between successive objects requires at most O(1) steps. Generally, an algorithm is loopless if the objects are represented in a simple form and can be read directly without requiring any additional steps. The proposed model-algorithm (Figure 2) includes the following procedure: Given an integer array of certain length (L), the algorithm generates the permutation of digits {1, 2,…, 2n} in the integer array M[i, j], i, j = 1, 2,…, 2n. The number of the iterations T is calculated proportionally with the total number of elements required for the permutations of number n. (e.g. for 3 one element, for 4→2, for 5→4 etc). A permutation is created by swapping the newly added digit n with an existing digit in the array. If n is odd, the permutation should occur only if the corresponding swapped number is odd and vice versa. Thus, only digits at positions n-3, n-5... should be considered. Note that the first digit (1) of the array is not swappable. For the n+1 element that is added after the last position on each of the previous permutations every permutation of previous mark is recalled. Since the recursive detection of the transposing, through the minimal change of permutations, can be performed at the same time, the running time of the algorithm will be proportional to the size of the computation tree (the number of recursive calls). Furthermore, in this tree, each node has exactly T-1 children and each leaf corresponds to a unique permutation.

Conclusion:
From the set of canonical pairs, it is clear that a given RNA sequence has many potential structures. In fact, the number of possible structures grows exponentially with the length of the RNA sequence. The challenge is to identify whether structure plays a functional role for a given RNA sequence and, if yes, to predict this functional RNA structure. In medical applications, accurate structural knowledge will be the starting point to create new lead compounds which would eventually be applied into more effective drugs. Therefore, the accurate prediction of RNA structure could simultaneously provide clues for curing an assortment of diseases, especially those that are based on RNA viruses. Since the conception of permutation to the individual representation of RNA secondary structure in genetic algorithms has been introduced, the problem can be essentially represented as a neural network in future work, which can be optimized through genetic algorithms techniques.