Chemistry Faculty Publications

Document Type

Article

Abstract

Analyzing a single data set with multiple RNA informatics programs often requires converting file formats between each pair of programs, which significantly hampers productivity. To facilitate the interoperation of these programs, we propose a syntax to exchange basic RNA molecular information. This RNAML syntax allows for the storage and exchange of information about RNA sequence and secondary and tertiary structures. The syntax permits the description of higher-level information about the data including, but not restricted to, base pairs, base triples, and pseudoknots. A class-oriented approach allows us to represent data common to a given set of RNA molecules, such as a sequence alignment and a consensus secondary structure. The syntax also includes documentation about experiments and computations, as well as references to journals and external databases. The chief challenge in creating such a syntax was to determine the appropriate scope of usage and to ensure extensibility as new needs arise. The syntax complies with the eXtensible Markup Language (XML) recommendations, a widely accepted standard for syntax specifications. In addition to the various generic packages that exist to read and interpret XML formats, an XML processor was developed and included in the open-source MC-Core library for the computational manipulation of nucleic acid and protein structures.
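As a rough illustration of the idea described in the abstract, the sketch below shows how a generic XML package might read sequence and base-pair information from an RNAML-style document. The element and attribute names (`molecule`, `sequence`, `structure`, `base-pair`, `five-prime`, `three-prime`) and the helper functions are hypothetical simplifications chosen for this example; they are not taken from the published RNAML specification, and the code does not use the MC-Core processor.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified RNAML-style document. Element and attribute names
# are illustrative assumptions, not copied from the published RNAML syntax.
DOC = """
<rnaml>
  <molecule id="hairpin1">
    <sequence>GGGAAACCC</sequence>
    <structure>
      <base-pair five-prime="1" three-prime="9"/>
      <base-pair five-prime="2" three-prime="8"/>
      <base-pair five-prime="3" three-prime="7"/>
    </structure>
  </molecule>
</rnaml>
"""


def read_molecules(xml_text):
    """Return {molecule id: (sequence, [(i, j), ...])} with 1-based positions."""
    root = ET.fromstring(xml_text)
    molecules = {}
    for mol in root.iter("molecule"):
        seq = mol.findtext("sequence", default="").strip()
        pairs = [
            (int(bp.get("five-prime")), int(bp.get("three-prime")))
            for bp in mol.iter("base-pair")
        ]
        molecules[mol.get("id")] = (seq, pairs)
    return molecules


def to_dot_bracket(seq, pairs):
    """Render nested base pairs as a dot-bracket string (pseudoknots not handled)."""
    chars = ["."] * len(seq)
    for i, j in pairs:
        chars[i - 1], chars[j - 1] = "(", ")"
    return "".join(chars)


molecules = read_molecules(DOC)
seq, pairs = molecules["hairpin1"]
print(seq)
print(to_dot_bracket(seq, pairs))
```

Because the data are plain XML, any program with an XML parser could read the same file without a dedicated converter, which is the interoperability benefit the abstract argues for.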

Publication Date

5-2002

Publication Title

RNA: A Publication of the RNA Society

Publisher

Cambridge University Press

DOI

https://doi.org/10.1017/S1355838202028017

Start Page No.

707

End Page No.

717

Included in

Chemistry Commons
