

Title

Composition, physicochemical property and base periodicity for discriminating lncRNA and mRNA

 

Authors

Rajesh Prasad & Annangarachari Krishnamachari*

 

Affiliation

School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, 110067, India

 

Email

Rajesh Prasad - E-mail: rajes55sit@jnu.ac.in

Annangarachari Krishnamachari - E-mail: chari@jnu.ac.in

 

Article Type

Research Article

 

Date

Received December 1, 2023; Revised December 31, 2023; Accepted December 31, 2023; Published December 31, 2023

 

Abstract

Annotating genome data with biological features is a challenging problem; one instance is distinguishing long non-coding RNA (lncRNA) from mRNA. In this study, three groups of classification features were considered: base periodicity, physicochemical properties and nucleotide composition. We propose a simple neural network model that combines these sequence features judiciously to obtain better results. Our approach uses a balanced dataset, a simple prediction model and a limited set of features. The features are (a) two base periodicity properties: the peak of the power spectrum of the signal and the signal-to-noise ratio (SNR) of that peak; (b) three physicochemical properties: solvation, stacking and hydrogen-bonding energy; and (c) the compositions of all dinucleotides and trinucleotides. Classification was performed first with each feature group independently and then with combinations of groups to improve performance. Classification metrics were used to compare the results across seven eukaryotic organisms for the various feature combinations. Nucleotide composition combined with either the physicochemical or the base periodicity feature group yields a strong classifier with more than 99 percent accuracy. Base periodicity analysis with SNR can be used as a feature for discriminating lncRNA from mRNA.
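
As an illustration of two of the feature groups named in the abstract, the following minimal Python sketch computes dinucleotide and trinucleotide compositions and a base periodicity peak with its SNR. It is not the authors' implementation. Several choices are assumptions made here for illustration only: the sequence is mapped to four binary indicator sequences (one per base), the peak is read at the frequency closest to period 3, and SNR is taken as the peak power divided by the mean power of the spectrum. The function names `kmer_composition`, `power_spectrum` and `period3_peak_and_snr` are hypothetical.

```python
import cmath
from itertools import product

BASES = "ACGT"


def kmer_composition(seq, k):
    """Relative frequency of every overlapping k-mer over the ACGT alphabet."""
    seq = seq.upper().replace("U", "T")
    counts = {"".join(p): 0 for p in product(BASES, repeat=k)}
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip k-mers containing ambiguous bases such as N
            counts[kmer] += 1
            total += 1
    return {kmer: (c / total if total else 0.0) for kmer, c in counts.items()}


def power_spectrum(seq):
    """Total DFT power over four binary indicator sequences, for f = 1..N//2.

    Assumption: binary indicator mapping; the paper may use another encoding.
    A direct O(N^2) DFT is used to keep the sketch dependency-free.
    """
    seq = seq.upper().replace("U", "T")
    n = len(seq)
    spectrum = []
    for f in range(1, n // 2 + 1):
        power = 0.0
        for base in BASES:
            coeff = sum(
                cmath.exp(-2j * cmath.pi * f * t / n)
                for t, b in enumerate(seq)
                if b == base
            )
            power += abs(coeff) ** 2
        spectrum.append(power)
    return spectrum


def period3_peak_and_snr(seq):
    """Return (peak power near f = N/3, peak power / mean spectral power).

    Assumption: the peak is taken at period 3 and SNR is peak over mean power.
    """
    n = len(seq)
    if n < 3:
        raise ValueError("sequence must contain at least 3 bases")
    spectrum = power_spectrum(seq)
    f = round(n / 3)                 # frequency index closest to period 3
    peak = spectrum[f - 1]           # spectrum[0] corresponds to f = 1
    mean_power = sum(spectrum) / len(spectrum)
    return peak, (peak / mean_power if mean_power else 0.0)


def feature_vector(seq):
    """Concatenate 16 dinucleotide, 64 trinucleotide and 2 periodicity features."""
    di = kmer_composition(seq, 2)
    tri = kmer_composition(seq, 3)
    peak, snr = period3_peak_and_snr(seq)
    return [di[k] for k in sorted(di)] + [tri[k] for k in sorted(tri)] + [peak, snr]
```

A vector produced by `feature_vector` could serve as input to a small neural network classifier. The physicochemical properties (solvation, stacking and hydrogen-bonding energy) are omitted from this sketch because the abstract does not give the energy values they would require.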

 

Keywords

lncRNA, mRNA, Bioinformatics, physicochemical feature, machine learning, computational biology

 

Citation

Prasad & Krishnamachari, Bioinformation 19(12): 1145-1152 (2023)

 

Edited by

P Kangueane

 

ISSN

0973-2063

 

Publisher

Biomedical Informatics

 

License

This is an Open Access article which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. This is distributed under the terms of the Creative Commons Attribution License.