TensorFlow Team of Google Open-Sources Dataset for DIY Makers Interested in Artificial Intelligence

Aniruddha Paul
Writer, passionate about content development on the latest technology updates. Loves to follow social media, business, games, cultural references, and everything that symbolizes technological progress. He is fond of philosophy, creation, life, and freedom.

Today, Google researchers open-sourced a speech recognition dataset to support DIY AI projects. The release gives AI-minded DIY makers more tools for building basic voice commands for various smart devices.

The Speech Commands dataset is a collection of 65,000 utterances of 30 words, intended for training AI models. It was created by Google's AIY and TensorFlow teams.

Google's AIY Projects initiative launched in May 2017 with the aim of supporting DIY makers who are interested in AI. It began with a line of reference designs, the first being a smart speaker with speech recognition housed in a cardboard box.

In a related blog post, Pete Warden, a software engineer at Google Brain, revealed that the team has also open-sourced the infrastructure used to create the data. The intent is for communities to use that infrastructure to build their own versions of the dataset, covering underserved languages and applications.

According to Warden, more accents and variations are continually being contributed to the project. This broadens the DIY AI dataset beyond the contributions already made by thousands of people.

Unlike other datasets, Speech Commands lets you add your own voice. All you need to do is visit the AIY Projects website and go to the speech section, where you will be invited to make short recordings of 135 simple words along with a series of numbers and names.

Exclusions in the DIY AI dataset

The project's voice samples do not yet represent all communities, so some models may not understand every user's voice. Similarly, certain local dialects and slang used by some groups remain excluded, which matters when giving voice commands to a device.

On a related note, Stanford AI researchers reported an interesting finding. Equilid, a natural language processing (NLP) language identifier, is trained on Urban Dictionary and Twitter. They observed that Equilid is more accurate than identifiers trained on text that excludes users based on race, age, and way of speaking.

Equilid was even found to be more accurate than Google's CLD2. Further academic tests of speech recognition tools concluded that the most widely used NLP tools still struggle to understand African American users.
