Meta's Audiobox: AI Sound with Voice and Text

By Rahul Bhagat

Meta has launched Audiobox, an AI sound generation model that accepts simultaneous voice and text input.

Built on Meta's Voicebox AI model, Audiobox can create a variety of environmental sounds as well as natural conversational speech.

The model integrates audio generation and editing features, allowing users to generate customized audio quickly.

Meta aims to lower the barrier to sound generation by offering a publicly accessible tool for producing audio for videos, games, and more.

Audiobox uses Voicebox's "guided sound" mechanism together with the "flow-matching" diffusion model for multi-layered audio generation.

In tests, Audiobox outperformed AudioLDM2, VoiceLDM, and TANGO in sound quality and the "accuracy of generated content," according to Meta.
