We’ve all done it: used the little microphone button on our smartphone to dictate a text message rather than type it, or said “Ok Google” to ask a question aloud and receive an instant answer. This convenience is possible because of speech recognition programs that have improved immensely since their invention (although we may still feel there is room to grow). With so many applications, speech recognition will soon find its way into many new uses and technologies!
Speech recognition software helps computers understand our spoken languages. In its early days, this type of technology was limited to single-speaker systems with vocabularies of around ten words. Nowadays, advanced systems such as Google Voice, Amazon Alexa, and others can accept and understand natural speech.
Speech recognition is not just one branch of computer science; it falls under the interdisciplinary subfield of computational linguistics, a field concerned with the rule-based modeling of natural languages from a computational perspective.
There are two types of speech recognition (SR) software: “speaker-dependent” and “speaker-independent”. Speaker-dependent systems analyze a person’s specific voice and use it to fine-tune recognition of that person’s speech, resulting in increased accuracy. Alexa, the SR AI from Amazon that lives inside the Echo speaker, comes with a voice-training app users can access to improve speech recognition on their device.
Demand for SR is growing. As stated by Shawn Spartz, Graydient Creative’s Director of Creative & Development, “UI isn’t a touchscreen. It’s the world around us.” Voice is the most natural form of I/O there is. We interact with and request things from each other through speech and learn by echoing the words spoken to us until they’ve been etched into our minds.
Here at Graydient, we believe this technology can bridge a long-standing gap in accessibility. Even though techniques and best practices have come a long way in including accessibility features for blind and color-deficient users, there is still a sense when developing or designing a website or app that these users are an afterthought and are being left out. With the rise of SR, the window has been blown wide open: developing for this technology not only helps these demographics but also includes them right from the beginning.
Whether as an interactive component of a website or a custom Alexa skill that lets your customers make purchases on the go, the process of creating such a component can be relatively simple. The technology behind Alexa and other speech recognition AIs has evolved to the point where developers are already creating user-friendly interfaces for custom interactions. For instance, Sayspring is prototyping software that allows developers to quickly demonstrate a new skill without putting hours of effort into the code. With it, one can mock up and deploy a prototype straight to Alexa and pitch the idea to potential clients or investors. According to their website, “The value of wireframes … has been established. You want to get the user experience right before you start development. Sayspring brings that product design approach to the world of voice.”
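To give a feel for how simple the code side of a custom Alexa skill can be, here is a minimal sketch of a skill backend written as an AWS Lambda handler that works directly with the Alexa request/response JSON envelope. The intent name `OrderIntent`, the `Item` slot, and all reply text are hypothetical examples, not part of any real skill.

```python
# Minimal sketch of a custom Alexa skill backend as an AWS Lambda function.
# It handles the raw Alexa Skills Kit JSON envelope; the "OrderIntent" intent
# and "Item" slot below are invented for illustration.

def build_response(speech_text, end_session=True):
    """Wrap plain-text speech in the Alexa response envelope."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech_text},
            "shouldEndSession": end_session,
        },
    }

def lambda_handler(event, context):
    """Route an incoming Alexa request to an appropriate spoken reply."""
    request = event["request"]

    # Fired when the user opens the skill without a specific request.
    if request["type"] == "LaunchRequest":
        return build_response("Welcome! What would you like to order?",
                              end_session=False)

    # Fired when Alexa matches the user's utterance to a defined intent.
    if request["type"] == "IntentRequest":
        intent = request["intent"]
        if intent["name"] == "OrderIntent":
            item = intent["slots"]["Item"]["value"]
            return build_response(f"Okay, ordering {item} now.")

    return build_response("Sorry, I didn't catch that.")
```

The interaction model (intents, slots, sample utterances) is configured separately in the Alexa developer console; the handler only decides what to say back for each parsed request.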
With the ever-growing support for SR, the time to start developing for this technology is now. Speech is what makes us human, and as technology advances toward more human ways of interacting, voice will inevitably become part of our lives with machines. Unlike augmented reality, SR will become a necessity, and those who don’t adapt to our outspoken world will only get left behind.