Research in the field of machine learning and AI, now a key technology in practically every industry and company, is far too voluminous for anyone to read it all. This column, Perceptron, aims to collect some of the most relevant recent discoveries and papers, particularly in (but not limited to) artificial intelligence, and explain why they matter.
This month, Meta engineers shared two new pieces of work from the depths of the company’s research labs: an AI system that compresses audio files and an algorithm that can speed up protein-folding AI by 60x. Elsewhere, scientists at MIT revealed that they are using spatial acoustic information to help machines better perceive their surroundings, simulating how a listener would hear a sound from any point in a room.
Meta’s compression work isn’t exactly treading untested ground. Last year, Google announced Lyra, a neural audio codec trained to compress low-bitrate speech. But Meta says its system is the first to work on CD-quality, stereo audio, making it useful for commercial applications such as voice calls.
Architecture diagram of Meta’s AI audio compression model. Photo credits: Meta
Using AI, Meta’s compression system, called Encodec, can compress and decompress audio in real time on a single CPU core at rates of approximately 1.5 kbps to 12 kbps. Compared to MP3, Encodec can achieve a compression ratio of roughly 10x at 64 kbps with no perceptible loss of quality.
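To put those figures in perspective, here is a back-of-the-envelope calculation based on the numbers quoted above. The helper function is purely illustrative and has nothing to do with Encodec’s actual implementation:

```python
# Bitrate arithmetic for the compression figures cited in the article.
# Illustrative only; not related to how Encodec itself works.

def compressed_bitrate_kbps(reference_kbps: float, ratio: float) -> float:
    """Bitrate after applying a given compression ratio to a reference codec."""
    return reference_kbps / ratio

mp3_kbps = 64.0   # reference MP3 bitrate cited in the article
ratio = 10.0      # ~10x compression claimed for Encodec

encodec_kbps = compressed_bitrate_kbps(mp3_kbps, ratio)
print(f"Equivalent Encodec bitrate: {encodec_kbps:.1f} kbps")  # 6.4 kbps

# One minute of audio at each bitrate, in kilobytes (kbps = kilobits/second):
seconds = 60
mp3_kb = mp3_kbps * seconds / 8
encodec_kb = encodec_kbps * seconds / 8
print(f"One minute of audio: MP3 ~{mp3_kb:.0f} KB vs. Encodec ~{encodec_kb:.0f} KB")
```

Note that the resulting 6.4 kbps sits comfortably inside the 1.5 kbps to 12 kbps operating range quoted above.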
The researchers behind Encodec say that human testers prefer the quality of audio processed by Encodec compared to audio processed by Lyra, suggesting that Encodec could eventually be used to deliver better quality audio in situations where bandwidth is constrained or at a premium.
As for Meta’s protein-folding work, it has less immediate commercial potential. But it could lay the groundwork for important scientific research in the biological sciences.
Protein structures predicted by the Meta system. Photo credits: Meta
Meta says its AI system, ESMFold, predicted the structures of around 600 million proteins from bacteria, viruses and other organisms that have yet to be characterized. That’s nearly three times the 220 million structures that Alphabet-backed DeepMind was able to predict earlier this year, a release that covered nearly every protein from known organisms in DNA databases.
Meta’s system isn’t as accurate as DeepMind’s. Of the roughly 600 million proteins it generated, only a third were of “high quality.” But it is 60 times faster at predicting structures, which lets it scale structure prediction to much larger databases of proteins.
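Those throughput and quality figures are easy to sanity-check with a bit of arithmetic. A quick sketch, using only the numbers reported in the article (the code is not from either project):

```python
# Rough comparison of ESMFold's output against DeepMind's AlphaFold release,
# using the figures cited in the article. Illustrative arithmetic only.

esmfold_total = 600_000_000      # structures predicted by ESMFold
alphafold_total = 220_000_000    # structures in DeepMind's release
high_quality = esmfold_total // 3  # "only a third were high quality"

print(f"High-quality ESMFold structures: ~{high_quality:,}")
print(f"Scale vs. AlphaFold release: {esmfold_total / alphafold_total:.1f}x")
```

The trade-off in plain numbers: roughly 200 million high-quality structures, still close to the size of DeepMind’s entire release, produced at 60 times the speed.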
Not to be outdone, Meta’s AI division also this month detailed a system designed for mathematical reasoning. Researchers at the company say their “neural problem solver” learned from a dataset of successful mathematical proofs to generalize to new, different types of problems.
Meta isn’t the first to build such a system. OpenAI developed its own, a theorem prover announced in February that works with the Lean proof assistant. Separately, DeepMind has experimented with systems that can solve challenging mathematical problems in the subjects of symmetries and knots. But Meta says its neural problem solver was able to solve five times more International Math Olympiad problems than any previous AI system, and that it outperformed other systems on widely used math benchmarks.
Meta notes that math-solving AI could benefit fields such as software verification, cryptography and even aerospace.
We turn our attention to MIT, where researchers are developing a machine learning model that can capture how sounds in a room propagate through space. By modeling the acoustics, the system can learn a room’s geometry from sound recordings, which can then be used to build visual renderings of the room.
Researchers say the technology could be used in virtual and augmented reality software or robots that have to navigate complex environments. In the future, they plan to develop the system so that it can adapt to new and larger scenes, such as entire buildings or towns and cities.
Over at Berkeley’s robotics department, two separate teams are accelerating the pace at which a quadrupedal robot can learn to walk and perform other tricks. One team combined best-of-breed work from numerous other advances in reinforcement learning to let a robot go from a blank slate to robust walking on uncertain terrain in just 20 minutes of real time.
“Perhaps surprisingly, we find that with several careful design decisions in terms of the task setup and algorithm implementation, it is possible for a quadrupedal robot to learn to walk from scratch with deep RL in under 20 minutes, across a range of different environments and surface types. Crucially, this does not require novel algorithmic components or any other unexpected innovation,” the researchers wrote.
Instead, they chose and combined some state-of-the-art methods and got remarkable results. You can read the paper here.
A robotic dog demo from EECS professor Pieter Abbeel’s lab in Berkeley, California in 2022. (Photo courtesy of Philipp Wu/Berkeley Engineering)
Another walk-learning project, from (friend of TechCrunch) Pieter Abbeel’s lab, was described as “training an imagination.” The team gave a robot the ability to predict how its actions will play out, and though it starts out fairly helpless, it quickly gains knowledge about the world and how it works. This leads to a better prediction process, which leads to better knowledge, and so on in a feedback loop until the robot is walking steadily in under an hour. It also quickly learns to recover from being pushed or otherwise “perturbed,” as the lingo has it. Their work is documented here.
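The feedback loop described above is the core idea of model-based reinforcement learning: learn a predictive model of the world, then improve the policy on imagined rollouts instead of real trials. A heavily simplified toy sketch of that cycle follows; every name here is a hypothetical placeholder, not the lab’s actual code:

```python
# Toy sketch of the "imagination" loop in model-based RL: a learned world
# model predicts outcomes, and the agent rehearses on imagined rollouts.
# A 1D state standing in for the robot; all names are illustrative.

import random

random.seed(0)  # deterministic for the sake of the example

class ToyWorldModel:
    """Predicts the next state given an action; here a noisy 1D stand-in."""
    def predict(self, state: float, action: float) -> float:
        return state + action + random.uniform(-0.1, 0.1)

def imagine_rollout(model, policy, state: float, horizon: int) -> list[float]:
    """Roll the learned model forward without touching the real robot."""
    states = [state]
    for _ in range(horizon):
        state = model.predict(state, policy(state))
        states.append(state)
    return states

# A trivial policy: push the state back toward zero (e.g., "stay upright").
policy = lambda s: -0.5 * s

model = ToyWorldModel()
trajectory = imagine_rollout(model, policy, state=4.0, horizon=20)
print(f"start={trajectory[0]:.2f}, end={trajectory[-1]:.2f}")
```

In a real system the model itself is also retrained on new experience after each batch of real-world steps, which is what closes the prediction-to-knowledge loop the researchers describe.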
A potentially farther-off application came out of Los Alamos National Laboratory earlier this month, where researchers developed a machine learning technique to predict the friction that occurs during earthquakes, offering a possible way to forecast them. Using a language model, the team says it was able to analyze the statistical features of seismic signals emitted from a fault in a laboratory earthquake machine to project the timing of the next quake.
“The model is not constrained by physics, but it predicts the physics, the actual behavior of the system,” said Chris Johnson, one of the research leads on the project. “Now we are making a future prediction from past data, which goes beyond describing the instantaneous state of the system.”
Photo credits: Dreamstime
Applying the technique in the real world is challenging, the researchers say, because it isn’t clear whether there’s enough data to train a forecasting system. Still, they are optimistic about the applications, which could include anticipating damage to bridges and other structures.
Finally this week, a note of caution from MIT researchers, who warn that neural networks used to simulate biological neural networks should be carefully examined for training bias.
Neural networks are, of course, based on the way our own brains process and signal information, reinforcing certain connections and combinations of nodes. But that doesn’t mean the synthetic and the real work the same way. In fact, the MIT team found, neural network-based simulations of grid cells (part of the nervous system) only produced similar activity when they were carefully constrained to do so by their creators. When allowed to regulate themselves, as the actual cells do, they did not produce the desired behavior.
That’s not to say deep learning models are useless in this domain; far from it, they are very valuable. But, as professor Ila Fiete said on the school’s news site, “they can be a powerful tool, but one must be very circumspect in interpreting them and in determining whether they are truly making de novo predictions, or even shedding light on what it is that the brain is optimizing.”