On Thursday, Google Cloud announced a number of improvements coming soon to the platform’s AI-powered speech tools.
Google Cloud has expanded its Text-to-Speech product with additional voices and languages. Seven new languages or variants are now supported in beta: Danish, Norwegian Bokmål, Polish, Portuguese (Portugal), Russian, Slovak and Ukrainian. That brings the product to a total of 21 supported languages.
The product now offers 106 voices in total, following the addition of 31 new WaveNet voices and 24 new standard voices. Its primary competition is Amazon Web Services' Polly, which supports 58 voices.
"Thanks to unique access to WaveNet technology powered by Google Cloud TPUs, we can build new voices and languages faster and easier than is typical in the industry," said Dan Aharon, a Google product manager.
To help users optimize audio playback for different hardware, such as headphones for podcasts, the update also brings the Text-to-Speech Device Profiles feature to general availability.
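As a rough illustration of how a device profile might be requested, the sketch below builds the JSON body for a synthesis call. It assumes the public REST field names (`audioConfig.effectsProfileId`) and the `headphone-class-device` profile ID; the helper `build_synthesize_request` is hypothetical, and nothing is sent over the network. Check the field names against the current API reference before relying on them.

```python
import json


def build_synthesize_request(text, language_code="en-US",
                             profile="headphone-class-device"):
    """Build a Text-to-Speech request body that asks for a device profile.

    Hypothetical helper: the field names follow the public REST reference
    as understood here and are an assumption, not taken from the article.
    """
    return {
        "input": {"text": text},
        "voice": {"languageCode": language_code},
        "audioConfig": {
            "audioEncoding": "MP3",
            # Profiles are passed as a list of profile IDs.
            "effectsProfileId": [profile],
        },
    }


# Example: a podcast intro tuned for headphone playback.
body = build_synthesize_request("Welcome back to the show.")
payload = json.dumps(body, indent=2)
```

The resulting `payload` string is what a client would post to the synthesis endpoint along with its credentials.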
Google Cloud also expanded the availability and improved the quality of its Speech-to-Text transcription tools.
It announced the general availability of multi-channel recognition, which lets the Speech-to-Text API distinguish between multiple audio channels. This comes in handy in situations involving multiple people, for example when each speaker is recorded on a separate channel.
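A minimal sketch of what multi-channel recognition looks like from the developer's side follows. It assumes the REST field names `audioChannelCount`, `enableSeparateRecognitionPerChannel` and `channelTag`; both helper functions are hypothetical, and the sample response is hand-written for illustration, not real API output.

```python
from collections import defaultdict


def build_multichannel_config(channel_count, language_code="en-US"):
    """Recognition config asking for one transcript per audio channel.

    Field names are assumed from the public REST reference.
    """
    return {
        "languageCode": language_code,
        "audioChannelCount": channel_count,
        "enableSeparateRecognitionPerChannel": True,
    }


def transcripts_by_channel(response):
    """Group the top transcript of each result by its channel tag."""
    grouped = defaultdict(list)
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue  # skip results that carry no transcript
        grouped[result.get("channelTag", 0)].append(
            alternatives[0]["transcript"])
    return {tag: " ".join(parts) for tag, parts in grouped.items()}


# Hand-written response in the assumed shape (illustrative only).
sample_response = {
    "results": [
        {"channelTag": 1, "alternatives": [{"transcript": "Hello, how can I help?"}]},
        {"channelTag": 2, "alternatives": [{"transcript": "My order is late."}]},
        {"channelTag": 1, "alternatives": [{"transcript": "Let me check."}]},
    ]
}
by_channel = transcripts_by_channel(sample_response)
```

In a two-party call recorded in stereo, channel 1 and channel 2 would then map to the agent and the caller respectively.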
Last year, Google introduced premium models for video and enhanced phone transcription in beta; both are now generally available. To improve the video and phone models, Google used usage data shared by premium customers who had opted in to data logging.
Google says the improved video model makes 64 percent fewer transcription errors and the phone model makes 62 percent fewer.
Prices for the premium phone and video models have also been cut. The upgraded models can be used without opting in to data logging, but customers who do opt in pay less.
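For developers, choosing one of the premium models is a matter of request configuration. The sketch below assumes the REST fields `model` and `useEnhanced` and the model names `phone_call` and `video`; the helper `build_enhanced_config` is hypothetical. Data logging is not shown, since it is assumed to be configured outside the request itself.

```python
# Model names assumed from the public API reference, not from the article.
PREMIUM_MODELS = ("phone_call", "video")


def build_enhanced_config(model, language_code="en-US"):
    """Recognition config that selects a premium (enhanced) model."""
    if model not in PREMIUM_MODELS:
        raise ValueError(f"unknown premium model: {model!r}")
    return {
        "languageCode": language_code,
        "model": model,
        # Ask for the enhanced variant of the chosen model.
        "useEnhanced": True,
    }


# Example: configuration for transcribing a recorded support call.
phone_config = build_enhanced_config("phone_call")
```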
With these updates, developers can build intelligent voice applications that reach a wider audience and offer greater efficiency and functionality.