Meta (formerly Facebook) has built three new artificial intelligence (AI) models to make sound more realistic in mixed and virtual reality experiences.
Designed to push “us toward a more immersive reality at a faster rate, the three AI models — Visual-Acoustic Matching, Visually-Informed Dereverberation and VisualVoice — focus on human speech and sounds in video,” the company said in a statement.
“Acoustics play a role in how sound will be experienced in the metaverse, and we believe AI will be core to delivering realistic sound quality,” said Meta’s AI researchers and audio specialists from its Reality Labs team.
The team built the models in collaboration with researchers from the University of Texas at Austin, and Meta is making these models for audio-visual understanding open to developers.
The self-supervised Visual-Acoustic Matching model, called AViTAR, adjusts audio to match the space of a target image.
The self-supervised training objective learns acoustic matching from in-the-wild web videos, even though those videos are unlabelled and contain no acoustically mismatched audio, Meta said.
To achieve audio-visual speech separation, VisualVoice learns in a way that’s similar to how people master new skills, by learning visual and auditory cues from unlabelled videos.
For example, imagine attending a group meeting in the metaverse with colleagues from around the world. Instead of people having fewer conversations and talking over one another, the reverberation and acoustics would adjust as they moved around the virtual space and joined smaller groups.
“VisualVoice generalises well to challenging real-world videos of diverse scenarios,” said Meta AI researchers.