Bujnicki Group

The Bujnicki group is based at the International Institute of Molecular and Cell Biology in Warsaw, Poland.

Prediction Approaches:

Strategy I: Typically, we use a hybrid modeling strategy based on the approach used earlier for protein structure prediction. First, the RNA target sequence is analyzed to identify homologous sequences and to generate a multiple sequence alignment. At this stage we also try to determine similarity to any other RNA with an experimentally determined structure. If such an RNA is identified, it is used as a template to build a model of the target molecule (or its parts) using the template-based (comparative) modeling method ModeRNA. If the molecule lacks a template, or if certain regions of the molecule cannot be modeled by template-based modeling, they are subjected to template-free folding simulations using the coarse-grained method SimRNA. Template-free folding can be aided by spatial restraints obtained from computational predictions (e.g., information on the secondary structure predicted from the sequence alignment) and from experimental analyses (e.g., on contacts between residues that interact in 3D). Finally, models are refined using the QRNAS method, which extends the AMBER force field with energy terms that explicitly model hydrogen bonds, idealize base pair planarity, and regularize the backbone conformation. A tutorial that presents this strategy with a detailed example is available: Piatkowski P, Kasprzak JM, Kumar D, Magnus M, Chojnowski G, Bujnicki JM. RNA 3D Structure Modeling by Combination of Template-Based Method ModeRNA, Template-Free Folding with SimRNA, and Refinement with QRNAS. Methods Mol Biol. 2016;1490:217-35.
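The split between template-based and template-free modeling in Strategy I can be illustrated with a minimal sketch. It assumes that template coverage is already known as half-open residue intervals; the function name `uncovered_regions` and the interval representation are hypothetical and are not part of ModeRNA or SimRNA.

```python
def uncovered_regions(length, covered):
    """Return half-open (start, end) residue ranges with no template coverage.

    `covered` is a list of half-open (start, end) intervals, 0-based, that a
    template-based method could model. Everything else would be handed to
    template-free folding. Illustrative only.
    """
    regions, pos = [], 0
    for start, end in sorted(covered):
        if start > pos:
            regions.append((pos, start))  # gap before this covered interval
        pos = max(pos, end)               # merge overlapping intervals
    if pos < length:
        regions.append((pos, length))     # uncovered tail
    return regions


# A 100-nt target with two overlapping template hits.
print(uncovered_regions(100, [(10, 40), (35, 60)]))
```

In this example the two ends of the molecule would go to template-free folding, while residues 10 to 59 would be built from the templates.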

Strategy II: Recently, in collaboration with the Das group, we have started exploring another strategy, also inspired by successful protein structure prediction. This approach requires a sequence alignment of the target RNA with several homologs. Initially, models are built using the template-free approach for several different sequences. Structural fragments corresponding to the evolutionarily conserved regions (in particular helices), as determined from the alignment, are extracted from all models and clustered to identify the most common structural arrangement. The model of the target RNA can then be subjected to refinement, e.g., as in Strategy I. This approach is currently under development.
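The clustering step of Strategy II can be sketched as follows. The source does not specify the clustering algorithm, so this example assumes a simple neighbor-count scheme: the model with the most neighbors within a cutoff is taken as the center of the most common arrangement. The scalar descriptor (a hypothetical inter-helix angle per model) and all names are illustrative assumptions.

```python
def largest_cluster(names, dist, cutoff):
    """Return (center, members) of the most populated neighborhood.

    For each candidate center, collect all models within `cutoff` of it and
    keep the largest such group. Ties go to the first candidate seen.
    """
    best_center, best_members = None, []
    for center in names:
        members = [m for m in names if dist(center, m) <= cutoff]
        if len(members) > len(best_members):
            best_center, best_members = center, members
    return best_center, best_members


# Hypothetical descriptor: angle (degrees) between two conserved helices
# measured in models built for five homologous sequences.
angles = {"m1": 40, "m2": 42, "m3": 45, "m4": 90, "m5": 95}
center, members = largest_cluster(
    list(angles), lambda a, b: abs(angles[a] - angles[b]), cutoff=5
)
print(center, members)
```

In practice the distance would more likely be an RMSD between superimposed fragments; the scalar descriptor is used here only to keep the sketch short.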

A real case:

Prokaryotic ribosomal protein genes are typically grouped within highly conserved operons. In many cases, one or more of the encoded proteins bind not only to a specific site in the ribosomal RNA, but also to a motif localized within their own mRNA, and thereby regulate expression of the operon. We used computational methods to predict an RNA motif present in many bacterial phyla within the 5′ untranslated region of operons encoding ribosomal proteins S6 and S18. We demonstrated experimentally that this region of the RNA indeed functions as the S6:S18 complex-binding motif (S6S18CBM). This motif contains a conserved CCG sequence presented in a bulge flanked by a stem and a hairpin structure. A similar structure containing a CCG trinucleotide forms the S6:S18 complex binding site in 16S ribosomal RNA. We constructed a 3D structural model of an S6:S18 complex with S6S18CBM, using a combination of template-based and template-free modeling. The model explained how the CCG trinucleotide in a specific structural context could be specifically recognized by the S18 protein. Our prediction was supported by site-directed mutagenesis of both RNA and protein components. Overall, this study, which combined RNA sequence analysis, computational structure modeling, and biochemical analyses, provided a molecular basis for understanding protein-RNA recognition in the regulation of S6 and S18 protein expression. Reference: Matelska D, Purta E, Panek S, Boniecki MJ, Bujnicki JM, Dunin-Horkawicz S. S6:S18 ribosomal protein complex interacts with a structural motif present in its own mRNA. RNA. 2013 Oct;19(10):1341-8.

Publications:

TBA

Chen Group

The Chen group at the University of Missouri investigates the physical mechanisms of RNA folding and develops predictive models for RNA structure and function.

  • Vfold2D: physics-based 2D structure prediction.
  • VfoldCPX: RNA/RNA complexes 2D structure prediction.
  • VfoldMTF: motif database with various topologies.
  • Vfold3D: motif-based 3D structure prediction.

Prediction Approaches:

Generally, RNA structure prediction is based on a hierarchical strategy. First, for a given RNA sequence, the minimum free energy and suboptimal 2D structures are predicted by Vfold2D, which uses Vfold-derived motif-based loop free energies. At this stage, if homologous sequence alignment data are available from Rfam or BLAST, we can also incorporate this information as a constraint in the Vfold2D algorithm when predicting the 2D structures. Second, from the given sequence and 2D structure, Vfold3D predicts the 3D structures by assembling A-form helices and motif templates from known RNA structures. We use sequence similarity scores to search for the optimal motif templates in the VfoldMTF database. To effectively enlarge the search space for the fragments, we allow zipping/unzipping of the terminal base pairs of the helix stems in order to relax the restrictions on the lengths of the loop branches. After that, slight adjustments to the spatial arrangement of motifs may be necessary to eliminate inter-motif clashes. Finally, the 3D structures are refined by all-atom energy minimization with AMBER or NAMD. A tutorial that illustrates this strategy with detailed examples can be found in the following reference: Xu X, Chen SJ. A Method to Predict the 3D Structure of an RNA Scaffold. Methods Mol Biol. 2015;1316:1-11.
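The template search by sequence similarity can be illustrated with a minimal sketch. The actual scoring used by Vfold3D is not described here, so this example assumes plain position-wise sequence identity between loops of equal length; the dictionary `templates` and its entries are hypothetical and do not reflect the real VfoldMTF contents or interface.

```python
def identity_score(a, b):
    """Fraction of identical positions; 0.0 if lengths differ or are empty."""
    if not a or len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def best_template(loop_seq, templates):
    """Pick the template whose sequence is most similar to `loop_seq`.

    `templates` maps a template name to its loop sequence. Returns
    (name, score), or (None, 0.0) when nothing matches at all.
    """
    scored = [(identity_score(loop_seq, seq), name)
              for name, seq in templates.items()]
    if not scored:
        return None, 0.0
    score, name = max(scored)
    return (name, score) if score > 0 else (None, 0.0)


# Hypothetical mini-library of tetraloop templates.
templates = {"hp_A": "GAAA", "hp_B": "UUCG"}
print(best_template("GCAA", templates))
```

A `None` result corresponds to the situation where no suitable template exists in the library and another route to a 3D fragment is needed.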

In the above approach, it is possible that no template can be found in the VfoldMTF database. Moreover, the current workflow of the Vfold software for 3D structure prediction is only semi-automated, because it may require some manual effort, for example to remove steric clashes. A fully automated software package/server is currently under development.

A real case:

RNA aptamers are short single-stranded oligonucleotide ligands that can bind with high affinity and specificity to target molecules. Aptamers represent an emerging class of therapeutics that can be easily adapted for personalized and precision medicine. Knowledge of the RNA aptamer’s structure would greatly facilitate and expedite the post-selection optimization steps required for translation, including truncation, chemical modification, and chemical conjugation. Here, we use the Vfold2D/Vfold3D computational models to predict the 2D and 3D structures of a 70-nucleotide RNA aptamer (A9) for prostate specific membrane antigen (PSMA) and to identify the key sequence/structure motifs (Reference: Rockey, et al. Zou, X., Chen, SJ., Giangrande, P.H. (2011) Rational truncation of an RNA aptamer to prostate specific membrane antigen using computational structural modeling. Nucleic Acid Therapeutics, 21: 299-314). Based on the predicted 2D structure of A9, a series of systematic changes, including nucleotide deletions, insertions, and mutations, were introduced to alter the aptamer’s 2D/3D structure (as predicted by Vfold2D/Vfold3D). The altered RNA sequences were then transcribed in vitro and tested experimentally to assess activity in an established functional assay. From the structure-activity relationship for the different sequences and structures, the sequence and structural motifs that are essential for aptamer activity were identified. The stem-loop motif of A9g was found to be essential for function. Overall, a structure modeling methodology, in combination with a standard functional assay, was used to determine key sequence and structural motifs of an RNA aptamer, and this methodology can be easily applied to optimize other aptamers with therapeutic potential.

Publications:

Xu et al., Structural computational modeling of RNA aptamers, Methods 103, 2016, 175-179.

Szachniuk Group (formerly: Adamiak Group)

The Szachniuk group brings together researchers from the Institute of Bioorganic Chemistry, Polish Academy of Sciences, and the European Centre for Bioinformatics and Genomics, Faculty of Computing and Telecommunications, Poznan University of Technology. The group develops publicly available computational methods for RNA structural bioinformatics within the RNApolis project (Szachniuk 2019), including tools dedicated to fully automated RNA 3D structure prediction and quality assessment.

  • RNAComposer: a fully automated, fragment assembly method and webserver for RNA 3D structure prediction.
  • RNA FRABASE: a database that allows users to search three-dimensional fragments within experimentally determined RNA 3D structures.
  • RNApdbee: a webserver to derive secondary structures from 3D structures of knotted and unknotted RNAs.
  • RNAssess: a webserver for quality assessment of RNA 3D structures within the context of the reference structure.

Prediction Approaches:

RNAComposer enables fully automated prediction of RNA 3D structures (Popenda et al. 2012). It is a knowledge-based method that employs automated fragment assembly based on a tree graph representation of the secondary structure and on the homology of structural elements. The workflow rapidly translates an RNA secondary structure into the corresponding 3D structure. A crucial component of this translation process is a dedicated dictionary of 3D structure elements built from the RNA FRABASE database (Popenda et al. 2008, 2010), which relates RNA secondary and tertiary structure elements. Since our initial report, the dictionary has been considerably enlarged, leading to a substantial increase in the accuracy of predicted 3D structures. The algorithms incorporated in the RNAComposer engine predict RNA 3D models automatically in the following steps:

  • RNA secondary structure fragmentation. The RNA secondary structure is divided into fragments according to its tree graph representation. The fragmentation algorithm yields secondary structure elements, namely stems, loops (i.e., apical, bulge, internal, and n-way junctions), and single strands.
  • 3D structure elements search. An automated dictionary search for the related 3D structure elements is performed for each secondary structure element resulting from fragmentation.
  • 3D structure elements preparation. The most suitable 3D structure elements are selected from the dictionary and prepared for further processing.
  • Initial RNA 3D structure building. The building process is based on the tree graph representation of the input secondary structure. The 3D structure elements are superimposed on their common canonical base pairs and assembled to give an initial, already well-shaped RNA 3D structure. Up to this step, RNAComposer is very fast: it usually takes several seconds on a single-processor architecture.
  • 3D structure refinement. Energy minimization in torsion angle space (Guntert et al. 1997) and, subsequently, in atom coordinate space (Schwieters et al. 2003) is performed, yielding the final, high-quality RNA 3D model.
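The first step, secondary structure fragmentation, can be sketched in a few lines. This is not the RNAComposer implementation: it is a minimal illustration that assumes a pseudoknot-free structure in dot-bracket notation, ignores single strands outside base pairs, and uses hypothetical function names.

```python
def pair_table(dot_bracket):
    """Map each paired position to its partner (0-based indices)."""
    stack, pairs = [], {}
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            j = stack.pop()
            pairs[i], pairs[j] = j, i
        elif ch != ".":
            raise ValueError("unsupported symbol %r" % ch)
    if stack:
        raise ValueError("unbalanced '('")
    return pairs


def fragment(dot_bracket):
    """Split a pseudoknot-free structure into stems and loops."""
    pairs = pair_table(dot_bracket)
    stems = []
    for i in sorted(k for k, v in pairs.items() if k < v):
        j = pairs[i]
        if stems and stems[-1][-1] == (i - 1, j + 1):
            stems[-1].append((i, j))      # stacks on the previous pair
        else:
            stems.append([(i, j)])        # starts a new stem
    loops = []
    for stem in stems:
        i, j = stem[-1]                   # innermost pair closes a loop
        children, k = [], i + 1
        while k < j:
            if k in pairs:                # a stem branching off this loop
                children.append((k, pairs[k]))
                k = pairs[k] + 1
            else:
                k += 1
        if not children:
            kind = "apical"
        elif len(children) == 1:
            p, q = children[0]
            # unpaired residues on both sides: internal; on one side: bulge
            kind = "internal" if (p - i - 1) and (j - q - 1) else "bulge"
        else:
            kind = "%d-way junction" % (len(children) + 1)
        loops.append((kind, (i, j)))
    return stems, loops


stems, loops = fragment("((..(...)..(...)..))")
print(stems)
print(loops)
```

Each stem is a list of stacked base pairs, and each loop is labeled by its type and its closing base pair. These are the kinds of elements that would then be looked up in a dictionary of 3D fragments.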

The RNAComposer system offers a user-friendly interface that allows users to predict large RNA 3D structures automatically. Although the main engine operates on RNA secondary structure, the webserver also enables users to predict the RNA 3D structure from sequence. Users can input an RNA sequence and select one of the incorporated tools to run RNA secondary structure prediction: CentroidFold (Sato et al. 2009), ContextFold (Zakov et al. 2011), CONTRAfold (Do et al. 2006), IPknot (Sato et al. 2011), RNAfold (Lorenz et al. 2011), or RNAstructure (Reuter & Mathews 2010). Computation then proceeds according to the standard RNAComposer workflow.

Since its first release, RNAComposer has attracted considerable interest in the scientific community. According to usage statistics, it is used over 1 million times every year. It is efficient enough to support 3D modeling in an interactive mode. It also provides a batch mode for large-scale modeling of RNA 3D structures based on up to 10 user-defined RNA secondary structures; a set of up to 10 RNA sequences can be entered as input. Up to ten 3D models can be generated for every pair of sequence and secondary structure. In the batch mode, users can significantly improve the reliability of predicted RNA 3D models by supplying their own 3D structure elements, influencing the search within the database of available RNA 3D structure elements, or incorporating their own restraints on interatomic distances and torsion angles (Antczak et al. 2016).
Numerous applications of RNAComposer have been reported in the literature across all fields of molecular and structural biology of RNA (e.g., NMR, SAXS, cryo-microscopy) and in RNA nanotechnology.

A real case:

Prediction of the cyclic di-GMP-II riboswitches from different bacteria (Purzycka et al. 2015) is presented as an example. This riboswitch controls carbohydrate processing. The 3D structure of its aptamer domain from Clostridium acetobutylicum was solved at 2.5 Å resolution (PDB ID: 3Q3Z; Smith et al. 2011). This RNA adopts a compact structure that contains a second-order pseudoknot, a triple helix within the pseudoknot major groove, and an unusual U-turn/S-turn motif not found in any other PDB-deposited 3D structure.

Analysis of this riboswitch allowed us to predict the as yet unknown 3D structures of related riboswitches from Clostridium difficile 4, Bacillus halodurans 1, and Thermus aquaticus Y5.1. We performed our predictions on the RNA family stored in the Rfam database (ID: RF01786), which comprises 237 members, including a seed subgroup of 54 entries that differ in sequence length and identity. We generated secondary structures for all RNAs included in that seed, following the alignment of the consensus secondary structures within this subgroup and the crystal structure topology (PDB ID: 3Q3Z). These secondary structures were used as input for automated 3D structure prediction with RNAComposer. One 3D model was generated for each of the 54 RNA secondary structures and then fitted to the crystal structure (PDB ID: 3Q3Z) based on the alignment of tertiary structures generated using ARTS (Dror et al. 2006). Next, we selected three subgroup members, i.e., the riboswitches from Clostridium difficile 4, Bacillus halodurans 1, and Thermus aquaticus Y5.1, for which the predicted 3D structures fitted best to the crystal structure of the c-di-GMP-II riboswitch (PDB ID: 3Q3Z). For each selected member, ten 3D models were generated and the best ones were output. Within the core fragment comprising the potential ligand-binding site, all three predicted riboswitch 3D structures very closely resemble that from C. acetobutylicum (RMSD < 2 Å).
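For reference, the RMSD used to compare a model with a crystal structure can be computed as in the minimal sketch below. It assumes that the two coordinate sets contain the same atoms in the same order and have already been superimposed (for example by a structure alignment tool), so no fitting is performed here; the function name and the toy coordinates are illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of
    (x, y, z) tuples that are assumed to be already superimposed."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy example: every atom of the model is shifted by 2 Å along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
print(rmsd(reference, model))
```

Restricting the atom lists to the core fragment around the ligand-binding site gives a local RMSD of the kind quoted above.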

Publications:

Popenda et al., Automated 3D structure composition for large RNAs, Nucleic Acids Res 40(14), 2012, e112.

Biesiada et al., Automated RNA 3D structure prediction with RNAComposer, in Doug H. Turner, David H. Mathews (eds.) RNA Structure Determination: Methods and Protocols (Methods in Molecular Biology 1490), Springer, Humana Press, 2016, 199-215.

Purzycka et al., Automated 3D RNA structure prediction using the RNAComposer method for riboswitches, in Shi-Jie Chen, Donald H. Burke-Aguero (eds.) Methods in Enzymology: Computational Methods for Understanding Riboswitches 553, Elsevier, 2014, 3-34.

Popenda et al., RNA FRABASE version 1.0: an engine with a database to search for the three-dimensional fragments within RNA structures, Nucleic Acids Res 36(1), 2008, D386-D391.

Popenda et al., RNA FRABASE 2.0: an advanced web-accessible database with the capacity to search the three-dimensional fragments within RNA structures, BMC Bioinformatics 11, 2010, 231.

Antczak et al., RNApdbee - a webserver to derive secondary structures from pdb files of knotted and unknotted RNAs, Nucleic Acids Res 42(W1), 2014, W368-W372.

Zok et al., MCQ4Structures to compute similarity of molecule structures, Cent Europ J Oper Res 22(3), 2014, 457-474.

Wiedemann et al., LCS-TA to identify similar fragments in RNA 3D structures, BMC Bioinformatics 18, 2017, 456.

Zok et al., RNApdbee 2.0: multifunctional tool for RNA structure annotation, Nucleic Acids Res 46(W1), 2018, W30-W35.

Lukasiak et al., RNAssess - a webserver for quality assessment of RNA 3D structures, Nucleic Acids Res 43(W1), 2015, W502-W506.

Szachniuk, RNApolis: computational platform for RNA structure analysis, Found Comput Decis Sci 44(2), 2019, 241-257.

Xiao Group

The Xiao group is based at the School of Physics, Huazhong University of Science and Technology, in Wuhan, China.

Prediction Approaches:

3dRNA is a fast, automated method for building ncRNA 3D structures from sequence and 2D structure. It builds the 3D RNA structure from the smallest secondary elements (SSEs), which are defined as helix, hairpin loop, internal loop, bulge loop, and junction loop. 3dRNA works as follows. In the first step, it uses a secondary structure tree (SST) to represent the RNA secondary structure; each node in an SST corresponds to an SSE. In the second step, 3dRNA traverses the tree and matches one or more suitable templates to each node from the 3D structure template library. In the third step, 3dRNA traverses the tree again and, each time it visits a node, assembles the matched templates of that node and its parent node, so that a complete tertiary structure is generated when the traversal finishes. Since each SSE may have multiple templates, many tertiary structures can be obtained by using different templates for each SSE. When there is no appropriate template for a certain SSE, 3dRNA uses a distance geometry (DG) method to build a raw template for it. The last step is to cluster all the structures and then use 3dRNAscore to score the cluster centers so that the user can choose an appropriate structure.
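The way multiple templates per SSE multiply into many candidate structures can be shown with a minimal sketch. It is not the 3dRNA code: the `SSENode` class, the template names, and the placeholder used when no template is available are all hypothetical, and no 3D coordinates are assembled.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class SSENode:
    kind: str                      # e.g. "helix", "hairpin loop"
    templates: list                # names of matched 3D templates
    children: list = field(default_factory=list)


def preorder(node):
    """Visit a node, then its subtrees (parents always before children)."""
    yield node
    for child in node.children:
        yield from preorder(child)


def enumerate_assemblies(root):
    """Yield one template choice per SSE for every possible combination.

    An SSE without templates gets a placeholder standing in for a raw
    template built by some other means (hypothetical name).
    """
    nodes = list(preorder(root))
    options = [n.templates or ["raw_DG_template"] for n in nodes]
    for choice in product(*options):
        yield [(n.kind, t) for n, t in zip(nodes, choice)]


# Toy SST: helix -> internal loop -> helix -> hairpin loop.
hairpin = SSENode("hairpin loop", ["hp1", "hp2", "hp3"])
inner = SSENode("helix", ["A_form"], [hairpin])
iloop = SSENode("internal loop", [], [inner])
root = SSENode("helix", ["A_form", "A_form_alt"], [iloop])

assemblies = list(enumerate_assemblies(root))
print(len(assemblies))
print(assemblies[0])
```

With 2, 1, 1, and 3 options per node, this toy tree gives six candidate assemblies, which illustrates why a clustering and scoring step is useful afterwards.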

A real case:

3NDB is the crystal structure of a signal sequence bound to the signal recognition particle in Methanocaldococcus jannaschii. We predicted the RNA chain (136 nt) of 3NDB. The secondary structure of 3NDB consists of 9 loops: 2 hairpin loops (HL), 5 internal loops (IL), 1 bulge loop (BL), and 1 multi-branch loop (ML). A 3D template for each of these loops can be found in the 3D structure template library. The table shows the sources of the templates of all these loops. The 3D structures of the remaining helices are built according to standard helix parameters. We then assembled the 3D structures of all these loops and helices into a complete 3D structure. The RMSD of the predicted structure is 4.26 Å.

Publications:

Zhao et al., Automated and fast building of three-dimensional RNA structures, Scientific Reports 2, 2012, 734.

Wang et al., 3dRNAscore: a distance and torsion angle dependent evaluation function of 3D RNA structures, Nucleic Acids Res 43(10), 2015, e63.