pyRBDome: a comprehensive computational platform for enhancing RNA-binding proteome data

July 2024, Granneman Lab, Life Science Alliance

Authors

Chu, L., Christopoulou, N., McCaughan, H., Winterbourne, S., Cazzola, D., Wang, S., Litvin, U., Brunon, S., Harker, P.J.B., McNae, I., and Granneman, S.

Summary

By Natalia Kochanova, Earnshaw Lab

A number of high-throughput methods exist for identifying RNA-binding proteins (RBPs), RNA-binding domains and the RNA sequences that bind to them. However, an unknown fraction of the interactions identified by these methods are false positives, which could, until recently, be detected only with low-throughput methods. The Granneman lab created a Python computational pipeline called pyRBDome, which uses five distinct machine learning and deep learning algorithms to predict RNA-binding sites (RBS) and their RNA-interacting peptides. Additionally, they created an ensemble model that learns from the successes and failures of the individual algorithms to significantly improve prediction results.
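To illustrate the general idea of an ensemble that learns from the individual predictors, here is a minimal sketch of stacking: a small logistic-regression meta-model is fitted on per-residue scores from five base predictors. This is not the pyRBDome implementation; the function names, the choice of logistic regression and all scores and labels below are invented for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_meta_model(scores, labels, epochs=2000, lr=0.5):
    """Fit logistic-regression weights by full-batch gradient descent.

    scores: one row per residue, one column per base predictor (hypothetical).
    labels: 1 if the residue contacts RNA in a reference structure, else 0.
    """
    n_features = len(scores[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(scores, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log-loss with respect to the logit
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict(weights, bias, x):
    """Return the ensemble probability that a residue is RNA-binding."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

# Invented training data: five base-predictor scores per residue.
# The fifth predictor is deliberately unreliable on this toy set.
train_scores = [
    [0.9, 0.8, 0.7, 0.6, 0.9],
    [0.8, 0.9, 0.6, 0.7, 0.8],
    [0.7, 0.6, 0.9, 0.8, 0.2],
    [0.2, 0.1, 0.3, 0.2, 0.8],
    [0.1, 0.2, 0.2, 0.1, 0.1],
    [0.3, 0.2, 0.1, 0.3, 0.9],
]
train_labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_meta_model(train_scores, train_labels)

# A residue that four predictors support and one does not.
print(predict(weights, bias, [0.85, 0.8, 0.75, 0.7, 0.3]))
# A residue that only the unreliable predictor supports.
print(predict(weights, bias, [0.15, 0.2, 0.1, 0.2, 0.9]))
```

In this toy setting the meta-model is fitted to labelled residues, so it can learn how much to trust each base predictor rather than weighting them equally.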

The predictions are made by programs trained on data from known human RBPs, sequences, structural features and sequence-based prediction algorithms, as well as by programs that rely solely on electrostatic features of the structures. This allows the detection of RBS in less well-characterized RBPs or in proteins with domains not previously associated with RNA. As structural input, the tool uses both experimentally derived structures and AlphaFold models. Because the pipeline can predict RNA-interacting motifs, the authors gained insights into the motif composition of a typical RNA-binding domain and the location of these motifs in its structure.

Using multiple prediction tools and a more extensive collection of data increased the precision of RBS detection. Using pyRBDome, the authors correctly predicted the structure of the RNA-protein complex of the polynucleotide phosphorylase 3’–5’ exonuclease from Staphylococcus aureus. For the prediction of RBS, the pipeline significantly outperformed the crosslinking data in its agreement with structural data, and highlighted some flaws in existing experimental UV cross-linking approaches. pyRBDome is therefore a strong tool for predicting RBS in proteins, and is particularly valuable for investigating less studied organisms and proteins.
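As a rough illustration of what "agreement with structural data" can mean, the sketch below computes precision and recall for two sets of candidate binding residues against residues that contact RNA in a reference structure. The residue numbers are invented and the metric choice is an assumption; the paper's own evaluation may differ.

```python
def precision_recall(predicted, reference):
    """Compare a set of candidate residues with reference RNA-contacting residues."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall

# Hypothetical residue positions, for illustration only.
structure_contacts = {12, 13, 14, 45, 46, 78}   # residues near RNA in a structure
model_predicted = {12, 13, 14, 45, 80}          # residue-level predictions
crosslinked_peptide = set(range(40, 60))        # every residue of one cross-linked peptide

model_precision, model_recall = precision_recall(model_predicted, structure_contacts)
xl_precision, xl_recall = precision_recall(crosslinked_peptide, structure_contacts)

print(f"prediction: precision={model_precision:.2f}, recall={model_recall:.2f}")
print(f"cross-link: precision={xl_precision:.2f}, recall={xl_recall:.2f}")
```

In this made-up example, treating a whole cross-linked peptide as the binding site lowers precision, because most residues in the peptide do not contact RNA.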

 

 

Figure: The Nrd1-Nab3-Sen1 transcriptional termination complex attenuates and homogenises the expression of the mitochondrial carrier PIC2, improving cell fitness and adaptability. Preventing the binding of Nab3 to PIC2 enhances Pic2 levels, causes a growth defect, increases cell size and intracellular stress, and prolongs the cell cycle. Creating an imbalance in the binding of Nab3 and Nrd1 to PIC2 mRNA disturbs the homeostasis of co-regulated transcripts and aggravates the defects resulting from suboptimal PIC2 expression.
