Higher-order modular regulation of the human proteome

Rappsilber Lab – Molecular Systems Biology.

Authors

Kustatscher, G., Hödl, M., Rullmann, E., Grabowski, P., Fiagbedzi, E., Groth, A., and Rappsilber, J.

Image
Clustering is combined with machine learning to identify large modules of co-regulated proteins in the human proteome. A simple web-based implementation of the workflow allows users to search for additional co-regulation modules.

Summary of Paper by Lori Koch

Proteins are the building blocks of cells and organisms. How do cells know which proteins to make, and how much of each to make at a given time? Protein levels can be controlled at many steps of the ‘central dogma’ of molecular biology, in which DNA is used as the template to make RNA, which in turn is used as the template to make protein. In their recent study published in Molecular Systems Biology, scientists led by Juri Rappsilber sought to identify groups of co-regulated proteins whose abundances change together in response to specific conditions.

Previously, they had measured the human cell “proteome”, meaning all of the proteins in the cell, under 294 different conditions (Kustatscher et al, Nature Biotechnol, 2019). In this study, they applied machine learning approaches to identify groups of co-regulated proteins, which they called progulons. Their approach was two-pronged: first, they used clustering to detect small groups of co-regulated proteins; second, they used these groups as training sets, or ‘seeds’, for a random forest algorithm to detect larger co-regulated modules.

They used this approach to identify new proteins important for DNA replication: when primed with 41 known components of the replisome, the algorithm predicted 212 additional candidate proteins with potential replication-related functions. The scientists tested 20 of these candidates by knocking down their expression in cells and screening for DNA damage and cell cycle defects. Of the tested candidates, 75% were validated with high or medium confidence, while the remaining 25% may have functions other than in DNA replication. The authors have designed a web tool for identifying progulons from a given set of seed proteins, freely available at the link below.
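To make the idea of growing a module from seed proteins more concrete, here is a minimal Python sketch. It is not the authors' method: the study trains a random forest on the seeds, whereas this toy version swaps in a much simpler score, the mean Pearson correlation between a candidate's abundance profile and the seed profiles. The protein names, the abundance values, the function names and the 0.8 threshold are all hypothetical and chosen only for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length abundance profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no co-regulation signal
    return cov / (sd_x * sd_y)


def expand_seed(profiles, seeds, threshold=0.8):
    """Return non-seed proteins whose mean correlation with the seeds
    is at least `threshold`, best-scoring first.

    Simplified stand-in for the random forest step (assumption):
    a candidate joins the module if it tracks the seeds across conditions.
    """
    scores = {}
    for name, profile in profiles.items():
        if name in seeds:
            continue
        scores[name] = sum(pearson(profile, profiles[s]) for s in seeds) / len(seeds)
    hits = [name for name, score in scores.items() if score >= threshold]
    return sorted(hits, key=lambda name: -scores[name])


# Hypothetical abundance changes of five proteins across six conditions.
base = [0.0, 1.0, 2.0, 1.0, 0.0, -1.0]
profiles = {
    "SEED_A": base,
    "SEED_B": [2 * v + 0.1 for v in base],      # same pattern, different scale
    "CAND_1": [v + 0.5 for v in base],          # follows the seeds
    "CAND_2": [-v for v in base],               # moves opposite to the seeds
    "CAND_3": [1.0, 0.0, 1.0, 0.0, 1.0, 0.0],   # unrelated pattern
}
seeds = {"SEED_A", "SEED_B"}

module = expand_seed(profiles, seeds)
print(module)
```

In this toy data set only CAND_1 rises and falls with the seeds, so it is the only protein added to the module. A random forest, as used in the study, can learn more flexible decision rules from the seed set than a single correlation cut-off.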

Progulon finder

Related links

Journal link

Rappsilber Lab website

DOI