This AI can harness sound to reveal the structure of unseen spaces

Imagine that you are walking through a series of rooms, circling closer to a source of sound. It could be music playing from a speaker, or a person speaking. The sound you hear will change and distort as you move. A team of researchers from Carnegie Mellon University and MIT has been working on a model that accurately depicts how the sound around a listener changes as they move through a space. Their work was published in a new preprint last Wednesday.

The sounds we hear in the world can differ based on many factors, including the kinds of spaces the sound waves bounce around in, the distances they travel, and the materials they hit. These factors affect how sound scatters and decays. Researchers can also reverse engineer this process: they can take a sound sample and use it to deduce what the environment is like (in some ways, it’s like how animals use echolocation to “see”).

“We’re mostly modeling spatial acoustics, so the [focus] is on reverberations,” said Yilun Du, a graduate student at MIT and an author of the paper. “Maybe there are lots of reverberations in a concert hall, or maybe in a cathedral there are many echoes. But if you are in a small space, there isn’t really any echo.”

Their model, known as a neural acoustic field (NAF), accounts for the positions of both the sound source and the listener, as well as the geometry of the space through which the sound travels.
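
To make the idea concrete, here is a minimal, hypothetical sketch of a neural-field-style acoustic model in PyTorch. This is not the authors’ architecture: the 2D coordinates, layer sizes, and log-magnitude output are illustrative assumptions standing in for the design described in the paper.

```python
# Illustrative sketch only: a tiny neural field mapping (source position,
# listener position, time-frequency query) to a predicted log-magnitude.
import torch
import torch.nn as nn

class ToyAcousticField(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        # Input: 2D source position + 2D listener position + (time, freq) query.
        self.net = nn.Sequential(
            nn.Linear(6, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # log-magnitude at the queried (time, freq) bin
        )

    def forward(self, src_xy, lis_xy, tf_query):
        return self.net(torch.cat([src_xy, lis_xy, tf_query], dim=-1))

# Query the field for one source/listener pair at one time-frequency bin.
field = ToyAcousticField()
log_mag = field(torch.tensor([[1.0, 2.0]]),   # source position (made up)
                torch.tensor([[0.5, 3.0]]),   # listener position (made up)
                torch.tensor([[0.1, 0.25]]))  # normalized (time, freq) query
```

Because a field like this is queried point by point, it can in principle be evaluated at any continuous listener position, which is what makes virtual walk-throughs possible once the model is fit.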

To train the NAF, the researchers fed it visual information about the scene along with a number of spectrograms (visual representations that capture the amplitude, frequency, and duration of sounds) of audio gathered from what a listener would hear at different vantage points and positions.
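
For readers unfamiliar with spectrograms, the sketch below shows one common way to compute one with SciPy. The sample rate, window length, and the sine wave standing in for recorded room audio are assumptions made for illustration, not details from the paper.

```python
# Hypothetical example: computing a log-magnitude spectrogram with SciPy.
import numpy as np
from scipy.signal import stft

fs = 22050                              # sample rate in Hz (assumed)
t = np.arange(fs) / fs                  # one second of samples
audio = np.sin(2 * np.pi * 440 * t)     # stand-in for recorded room audio

# Short-time Fourier transform -> complex time-frequency grid.
freqs, times, Z = stft(audio, fs=fs, nperseg=512)
log_mag = np.log1p(np.abs(Z))           # log-magnitude spectrogram
print(log_mag.shape)                    # (frequency bins, time frames)
```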

“We have a limited number of data points. From this, we can create a model that can accurately simulate what sound would sound like from any new position in the room,” Du said. “Once we fit this model, you can simulate all sorts of virtual walk-throughs.”

The team used audio data from virtual rooms. Du points out that there are some results from real scenes as well, but gathering that kind of data in real life takes much more time.

Using this data, the model can predict how the sound a listener hears will change as they move to a different position. If music is coming from a speaker in the middle of a room, it would get louder as the listener moved closer, and become muffled if they walked into another room. The NAF can also use this information to predict the structure of the space around it.
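
As a rough intuition for just one of these effects, distance, the toy calculation below applies the inverse-square law to show how sound level drops as the listener moves away from the source. The positions are made up, and a learned model like the NAF captures far richer behavior (occlusion, reverberation, materials) than this simple formula.

```python
# Toy inverse-square falloff: level relative to 1 m from the source.
import numpy as np

source = np.array([2.0, 2.0])                      # speaker position (made up)
for listener in (np.array([2.5, 2.0]),             # close to the speaker
                 np.array([6.0, 2.0])):            # across the room
    r = np.linalg.norm(listener - source)
    print(f"distance {r:.1f} m -> {10 * np.log10(1 / r**2):+.1f} dB")
```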

This model has a big application in virtual reality, where sound could be accurately generated for a listener moving through a VR space. Du also sees a big application in artificial intelligence.

“We have many models for vision, but perception is not limited to vision; sound is also important,” he says. “We can imagine that this is an attempt at perception using sound.”

Researchers are also exploring AI in sound alongside vision. Machine learning technology today can take 2D images and use them to generate a 3D model of an object, offering different perspectives and new views. This is especially useful in virtual reality settings, where engineers and artists must create realistic scenes on screen.

Additionally, models like this sound-focused one could complement current sensors and devices in low-light or underwater conditions. “Sound also allows you to see around corners,” Du says. Depending on the lighting conditions, objects can look very different, he notes, “but sound bounces the same most of the time. It’s a different sensory modality.”

For now, the main obstacle to further development of the model is a lack of data. Du says data was difficult to get because people haven’t really explored the problem. “When you try to synthesize new views in virtual reality, there are tons of datasets available, all of real images. It would be interesting to explore more of these methods, especially in real scenes.”

Watch and listen to a walkthrough of a virtual space in the video below.

[YouTube video]
