New machine learning model provides high-quality images of the genome structure in single cells

Oct, 2021 - by CMI


The Higashi algorithm is built on hypergraph representation learning, a type of machine learning also used for tasks such as 3D object detection and suggesting songs in an app.

The complex folds and configurations of protein and DNA bundles inside the microscopic boundaries of a single human cell determine a person's fate: which genes are expressed, which are repressed, and, most crucially, whether the person remains healthy or develops an illness. Despite the importance of these bundles to overall health, scientists know little about how genome folding happens in the cell nucleus and how it affects gene expression. A novel algorithm provides a powerful tool for revealing this mechanism at remarkable resolution.

The Higashi algorithm performs a high-definition study of genome organization in single cells by applying advanced neural networks to hypergraphs. In an ordinary graph, each edge joins exactly two vertices; in a hypergraph, a single edge can join any number of vertices. Chromosomes contain chromatin, a DNA-RNA-protein complex that folds and organizes itself to fit inside the cell nucleus. By bringing the functional elements of each component closer together, this folding influences how genes are expressed, allowing those elements to activate or suppress a particular genetic trait.
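The difference between a graph and a hypergraph can be made concrete with a small sketch. The example below is only an illustration of the data structure, not Higashi's code; the vertex names and the helper clique_expansion are hypothetical. It shows that flattening a three-way hyperedge into pairwise edges gives the same result as three separate pairwise links, so the information that all three vertices belonged to one relationship is lost.

```python
from itertools import combinations

# Ordinary graph: every edge joins exactly two vertices.
graph_edges = {("a", "b"), ("a", "c"), ("b", "c")}

# Hypergraph: a single hyperedge may join any number of vertices.
hyperedges = [frozenset({"a", "b", "c"}), frozenset({"c", "d"})]


def clique_expansion(edges):
    """Flatten each hyperedge into all of its pairwise edges (hypothetical helper)."""
    pairs = set()
    for edge in edges:
        pairs.update(combinations(sorted(edge), 2))
    return pairs


# The three-way hyperedge {a, b, c} flattens to exactly the pairwise graph above,
# so the two cases can no longer be told apart once flattened.
flattened = clique_expansion([frozenset({"a", "b", "c"})])
same_after_flattening = flattened == graph_edges

# Sizes of the hyperedges: one joins three vertices, the other two.
hyperedge_sizes = sorted(len(edge) for edge in hyperedges)
```

Keeping the hyperedge intact is what lets a model reason about multi-way relationships directly rather than as a collection of unrelated pairs.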

The Higashi algorithm works with a new technique called single-cell Hi-C, which captures snapshots of the chromatin interactions occurring simultaneously in a single cell. Higashi gives a more detailed view of how chromatin is organized in the single cells of complex tissues and biological processes, and of how its interactions differ from cell to cell. The analysis enables scientists to observe differences in the folding and structure of chromatin from cell to cell, including those that are subtle yet crucial in determining health consequences.
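As a rough illustration of what per-cell interaction snapshots look like as data, the sketch below stores toy contacts as (cell, bin, bin) records and compares two cells. Everything here is an assumption made for the example: the cell names, the genomic bin indices, and the use of Jaccard similarity are hypothetical, and the article does not describe how Higashi itself encodes or compares cells. Each record ties together three things (a cell and two genomic regions), which is the kind of multi-way relationship a hyperedge can hold.

```python
from collections import defaultdict

# Hypothetical toy data: (cell_id, bin_i, bin_j) means that, in this cell,
# genomic regions bin_i and bin_j were observed in contact.
contacts = [
    ("cell_1", 0, 5),
    ("cell_1", 2, 3),
    ("cell_2", 5, 0),
    ("cell_2", 1, 4),
]


def contacts_by_cell(records):
    """Group contacts per cell, storing each pair with the smaller bin first."""
    per_cell = defaultdict(set)
    for cell, i, j in records:
        per_cell[cell].add((min(i, j), max(i, j)))
    return dict(per_cell)


def jaccard(a, b):
    """Shared contacts divided by all contacts seen in either cell."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


per_cell = contacts_by_cell(contacts)
similarity = jaccard(per_cell["cell_1"], per_cell["cell_2"])
```

In this toy case the two cells share one of three distinct contacts. Real single-cell Hi-C data is far sparser and noisier, which is why a simple overlap score like this is only a starting point for thinking about cell-to-cell variability.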

The algorithm also enables scientists to evaluate additional genomic signals that have been profiled together with single-cell Hi-C. This feature should eventually allow Higashi's capabilities to be improved, which is timely given the growth in single-cell data expected in the coming years through various initiatives. Such data will open up new prospects for developing models that enhance knowledge of the structure of the human genome within the cell and how it functions in health and illness.