A new machine learning model has been developed for analyzing complex protein data

Nov, 2021 - by CMI


A machine learning method has been developed to better interpret data from a powerful scientific tool: nuclear magnetic resonance (NMR). One use of NMR data is understanding proteins and chemical processes in the human body.

Scientists use NMR spectrometers to determine the structures of molecules such as proteins, but it can take highly experienced human specialists a long time to examine the data. The new machine learning technology can evaluate the data considerably faster and with the same accuracy. The researchers described their method, which teaches computers to read the images produced by NMR spectrometers: the computer learns to decipher complicated data on the atomic-scale features of proteins and parse it into individual, readable images.

Spectra are images made up of hundreds or even thousands of peaks and valleys that depict, at the atomic scale, changes in proteins or the mix of metabolites in a biological sample such as urine or blood. NMR data provide critical information on the function of a protein as well as important insights into what is going on in a person's body. However, because the peaks frequently overlap, separating a spectrum into legible peaks can be challenging. The effect is similar to that of a mountain range, where taller, nearer peaks obscure smaller ones that may also contain vital information.
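The mountain-range effect can be illustrated with a small synthetic example. The sketch below is not the researchers' method: it assumes idealized Lorentzian line shapes, uses made-up peak positions, widths, and heights, and applies a deliberately naive peak picker that counts local maxima. All function names are hypothetical.

```python
def lorentzian(x, center, half_width, amplitude):
    """Idealized Lorentzian line shape; `amplitude` is the height at the center."""
    return amplitude * half_width**2 / ((x - center) ** 2 + half_width**2)


def spectrum(xs, peaks):
    """Sum the contributions of all (center, half_width, amplitude) peaks at each x."""
    return [sum(lorentzian(x, c, w, a) for c, w, a in peaks) for x in xs]


def count_local_maxima(ys):
    """Naive peak picking: count points strictly higher than both neighbours."""
    return sum(1 for i in range(1, len(ys) - 1) if ys[i - 1] < ys[i] > ys[i + 1])


# Hypothetical frequency axis from -10 to 10 in steps of 0.05 (arbitrary units).
xs = [-10 + 0.05 * i for i in range(401)]

# A tall peak and a small peak, first far apart, then close together.
separated = [(0.0, 1.0, 1.0), (5.0, 1.0, 0.3)]
overlapping = [(0.0, 1.0, 1.0), (1.2, 1.0, 0.3)]

print(count_local_maxima(spectrum(xs, separated)))
print(count_local_maxima(spectrum(xs, overlapping)))
```

With these illustrative values, the naive count finds two maxima for the separated pair but only one for the overlapping pair: the small peak survives only as a shoulder on the side of the tall one, which is the kind of case that calls for an expert or a trained model.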

The method relies on an artificial deep neural network, a multi-layered network of nodes that the computer uses to sort and evaluate data. The researchers built the network and then taught it to evaluate NMR spectra by giving it spectra that had already been analyzed by a human, along with the previously identified correct answers. They began by training the computer on extremely simple spectra, much as one would teach a child to read. Once the machine had mastered those, they moved on to more complicated sets. Eventually they fed the computer very complicated spectra of various proteins, as well as of a mouse urine sample.
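The simple-to-complex training order described above resembles what is often called curriculum learning. The sketch below shows only that scheduling idea, not the researchers' network or code. It assumes each training example pairs a spectrum with the peak positions marked by a human expert, and it approximates "complexity" by the number of labelled peaks. Both are assumptions made for illustration, and `build_curriculum`, `train_with_curriculum`, and `train_step` are hypothetical names.

```python
from typing import Callable, List, Sequence, Tuple

# One training example: (spectrum intensities, peak positions marked by an expert).
LabeledSpectrum = Tuple[List[float], List[float]]


def build_curriculum(
    examples: Sequence[LabeledSpectrum], n_stages: int
) -> List[List[LabeledSpectrum]]:
    """Order examples from simple to complex and return cumulative stages.

    Complexity is approximated by the number of expert-labelled peaks
    (an illustrative assumption, not taken from the source).
    """
    ordered = sorted(examples, key=lambda ex: len(ex[1]))
    stages = []
    for s in range(1, n_stages + 1):
        # Each stage keeps the easier examples and adds harder ones.
        cutoff = max(1, round(len(ordered) * s / n_stages))
        stages.append(ordered[:cutoff])
    return stages


def train_with_curriculum(
    stages: Sequence[Sequence[LabeledSpectrum]],
    train_step: Callable[[LabeledSpectrum], object],
    epochs_per_stage: int = 1,
) -> int:
    """Call `train_step` on every example, stage by stage; return the update count.

    `train_step` stands in for one supervised update of whatever model is
    being trained (for example, one gradient step of a neural network).
    """
    updates = 0
    for stage in stages:
        for _ in range(epochs_per_stage):
            for example in stage:
                train_step(example)
                updates += 1
    return updates
```

In practice `train_step` would compare the network's output for a spectrum with the expert's answer and adjust the network accordingly; here it is left as a callback so the ordering logic stays independent of any particular model.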

According to the researchers, the computer, using a deep neural network that had been trained to interpret spectra, was able to pick out the peaks in the very complicated sample with the same precision as a human expert. It also completed the task faster and with a high degree of consistency.
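The article does not say how agreement with the human expert was measured. One common way to compare two peak lists, sketched here purely as an assumption, is to match peak positions that fall within a small tolerance of each other and report the fraction matched. The function names, example positions, and tolerance below are hypothetical.

```python
def match_peaks(predicted, reference, tolerance):
    """Greedy one-to-one matching of peak positions that lie within `tolerance`."""
    pred = sorted(predicted)
    ref = sorted(reference)
    i = j = matched = 0
    while i < len(pred) and j < len(ref):
        if abs(pred[i] - ref[j]) <= tolerance:
            matched += 1
            i += 1
            j += 1
        elif pred[i] < ref[j]:
            i += 1  # predicted peak has no expert counterpart
        else:
            j += 1  # expert peak was missed by the model
    return matched


def precision_recall(predicted, reference, tolerance):
    """Fraction of predicted peaks that are matched, and of expert peaks that are found."""
    matched = match_peaks(predicted, reference, tolerance)
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(reference) if reference else 0.0
    return precision, recall


# Made-up positions: three peaks from a model, four marked by an expert.
model_peaks = [1.02, 3.5, 7.98]
expert_peaks = [1.0, 3.0, 8.0, 9.0]
print(precision_recall(model_peaks, expert_peaks, tolerance=0.05))
```

In this made-up case two of the three model peaks line up with expert peaks, so two thirds of the model's picks are confirmed and half of the expert's peaks are recovered.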