On Monday, Nvidia unveiled the Foundational Generative Audio Transformer Opus 1 (Fugatto), an artificial intelligence (AI) model capable of generating music and audio, modifying voices, and creating original sounds with precision.
With this release, Nvidia joins other leading players in the nascent field of generative AI, including established companies like Meta and startups like Runway, which are known for their audio- and video-generation capabilities.
What Sets Fugatto Apart?
One thing that sets Fugatto apart from its competitors is its ability to take in existing audio and modify it. This capability could open new avenues of expression in music composition, video game audio design, and content creation.
“If we think about synthetic audio over the past 50 years, music sounds different now because of computers, because of synthesizers. I think that generative AI is going to bring new capabilities to music, to video games, and to ordinary folks that want to create things,” said Bryan Catanzaro, VP of research for Nvidia’s applied deep learning program.
Why Does Fugatto’s Training Make It a Leader in AI Audio?
The model’s performance and adaptability stem from its rigorous training on open-source data. Nvidia has not said when it will be publicly available, though.
“Any generative technology always carries some risks, because people might use that to generate things that we would prefer they don’t. We need to be careful about that, which is why we don’t have immediate plans to release this,” Catanzaro noted.
The creators of Fugatto are presently exploring solutions to the problems caused by generative AI models, such as the spread of false information and infringements on intellectual property rights.