Hey, Siri. What is Speech Recognition?

October 11, 2020

by Teva

Speech recognition is a concept far predating the 21st century’s reach. Some speech recognition breakthroughs even go as far back as the 18th century.

speech recognition microphone — Speech recognition microphone

You could say, though, that those early efforts weren’t as noticeable as today’s.

It wasn’t until the 20th century, 1952 to be precise, that Bell Laboratories presented the first official speech recognition system. It was known as Audrey.

Audrey was famous for being able to recognize digits spoken by a person. From there, other companies followed suit and began building speech recognition systems of their own.

But enough of the past glimpsing. What is speech recognition, now?

Well, if you told Apple’s Siri to “call Stella” or “text Ken”, chances are that you would find yourself staring at a dialing screen, or an open conversation in your messages, in less than 10 seconds.

On that note, speech recognition is the capability that lets a machine or program understand words and phrases spoken by humans in a spoken language.

This is achieved through the process of converting those words to a machine-readable format.
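To make “machine-readable” a little more concrete, here is a minimal sketch in Python. It assumes the audio has already been transcribed to text, and it only shows the last step: turning a phrase like “call Stella” into structured data a program can act on. The pattern table and the function name `to_machine_readable` are hypothetical, made up for illustration. Real assistants rely on far richer models than two regular expressions.

```python
import re

# Hypothetical command patterns, invented for this illustration.
COMMAND_PATTERNS = {
    "call": re.compile(r"^call\s+(?P<target>.+)$", re.IGNORECASE),
    "text": re.compile(r"^text\s+(?P<target>.+)$", re.IGNORECASE),
}


def to_machine_readable(transcript):
    """Turn an already-transcribed phrase into a small structured command."""
    # Trim whitespace and trailing sentence punctuation before matching.
    cleaned = transcript.strip().rstrip(".!?")
    for action, pattern in COMMAND_PATTERNS.items():
        match = pattern.match(cleaned)
        if match:
            return {"action": action, "target": match.group("target")}
    # Anything that matches no pattern is reported as unknown.
    return {"action": "unknown", "target": None}


print(to_machine_readable("Call Stella"))
# {'action': 'call', 'target': 'Stella'}
```

The dictionary that comes out is something a program can branch on, which a raw string of sounds is not.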

The groundwork is all algorithms and language modeling, which together establish a relationship between audio signals and the linguistic units of speech.

Language modeling matches sequences of words, which helps the machine tell the difference between words that may sound similar.
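Here is a toy sketch of that idea in Python. It counts which word pairs appear in a tiny, made-up training text, then uses those counts to choose between two words that sound alike (“right” and “write”) based on the word that came before. The corpus and the function names are assumptions for illustration only. Production systems use much larger models, but the principle of scoring word sequences is the same.

```python
from collections import Counter

# Tiny made-up training text; an assumption for illustration only.
CORPUS = (
    "please write a letter . turn right at the corner . "
    "write a note . the right answer"
)


def train_bigrams(text):
    """Count how often each pair of neighbouring words occurs."""
    words = text.split()
    return Counter(zip(words, words[1:]))


def pick_word(previous_word, candidates, bigrams):
    """Choose the candidate seen most often right after previous_word."""
    # Counter returns 0 for pairs it has never seen.
    return max(candidates, key=lambda word: bigrams[(previous_word, word)])


bigrams = train_bigrams(CORPUS)
print(pick_word("please", ["right", "write"], bigrams))  # write
print(pick_word("turn", ["right", "write"], bigrams))    # right
```

Both candidates sound the same to a microphone, so the surrounding words are what tip the balance.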

Where can speech recognition be used?

girl using voice to text command for speech recognition on her phone — Voice to text command

You can often catch speech recognition performing voice-to-text processing, or even voice dialing and voice search, proving that your voice does matter.

Now, you can use it to “call Stella” to grab you some toilet paper from the supermarket on her way back from town, without so much as moving a finger.

You can even search the internet through spoken commands rather than typing. One popular way to do this is Google’s very own voice search feature.

Google voice search app icon on Android — Google voice search feature

So, any voice search operation can be easily done with a voice input. A voice input is any spoken data item or command that you can use to enter data into a system.

Take, for instance, Siri Data. Siri Data includes transcriptions of all the requests you give Siri, which help her interpret data and execute commands on your iPhone, or even your Apple Watch.

So, how then can you activate speech recognition?

How can you activate speech recognition?

Well, no matter the device being used, there is one sure way to activate a speech recognition system, or rather, to “wake up” one. You can do it by tapping the voice command icon or by calling out the name of your virtual assistant.

The wake-up commands vary, though. Here are the ones that apply to most devices from four of the top tech companies:

Microsoft: Hey, Cortana
Apple: Hey, Siri
Google: OK, Google
Amazon: Hey, Alexa
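As a rough sketch of what “waking up” means in code, the Python snippet below checks whether a piece of transcribed text starts with one of the phrases listed above, and if so, splits off the request that follows. It assumes the speech has already been converted to text. The names `WAKE_PHRASES` and `detect_wake_phrase` are hypothetical. Real devices listen for the wake phrase in the audio itself, which is a much harder problem.

```python
# Wake phrases as listed in this article, mapped to the company behind them.
WAKE_PHRASES = {
    "hey cortana": "Microsoft",
    "hey siri": "Apple",
    "ok google": "Google",
    "hey alexa": "Amazon",
}


def normalize(text):
    """Lowercase the text, drop punctuation, and collapse extra spaces."""
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def detect_wake_phrase(transcript):
    """Return (company, remaining request) if the text starts with a wake phrase."""
    words = normalize(transcript).split()
    for phrase, company in WAKE_PHRASES.items():
        phrase_words = phrase.split()
        # Compare whole words so "hey sirius" does not count as "hey siri".
        if words[:len(phrase_words)] == phrase_words:
            return company, " ".join(words[len(phrase_words):])
    return None


print(detect_wake_phrase("Hey, Siri. Call Stella"))
# ('Apple', 'call stella')
print(detect_wake_phrase("Call Stella"))
# None
```

Without the wake phrase, the request is simply ignored, which is why you have to say the assistant’s name first.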

Why is speech recognition important?

Well…

1. It is definitely faster than typing, for one.

Every advancement in technology aims to make life easier for all of us. And speech recognition definitely does this.

You can get that long blog post written without having to tap excessively at your keyboard. All you have to do is just speak the words and it’s done!

2. It assists the physically disabled.

One other major thing that makes speech recognition quite special is that all it requires from you is that you speak to the system, so that it can transcribe what you say.

It is an especially useful feature for people with physical impairments that make typing on a keyboard difficult.

3. It helps with bad spelling.

Sometimes, speling can be quite the task for some people. Wait, is that correct? Speling? Speiling? Spelling.

Yes. Spelling.

spelling error on paper with flawer instead of flower — Spelling error

The sad truth is that not everyone can type with dictionary-perfect spelling, or even manageable spelling. So, this is somewhat of a barrier between what we write and how people receive it.

Speech recognition can prevent you from typing like a five-year-old, especially when writing a professional or formal document.

What are the problems associated with speech recognition?

We lazy lots dabble with speech recognition because we simply prefer talking to our devices over typing, even though it might make us look a little crazy.

But there are still some other people who definitely don’t want anything to do with it.

Why?

1. Because they believe it’s a violation of their privacy.

Most smartphones today, not to mention other types of devices, are already suspected of spying. So, joining the speech recognition technology cult doesn’t make it any better for these people.

They strongly believe it’s going to be a threat to their privacy.

plenty cameras aimed at two women — Privacy violation

Their concern is that the companies responsible for the creation of speech recognition tech have the power to listen in on all their private conversations, which gives them a major cause to worry.

This raises the question of whether such companies are ever going to work towards offering better privacy controls for users all over the world.

2. Because there is the possibility of misinterpretation.

As fascinating as speech recognition is, the fact remains that it isn’t all that perfect. At least, not yet.

speech recognition misinterpreting voice command — Speech recognition misinterpretation

Not all the words we say can be precisely and easily interpreted by speech recognition, the way humans process words and turn them into meaning.

Its inability to understand words completely and in context can get in the way of the commands and tasks you give it later.

3. Talking as a human is very different from instructing software with commands.

Voice recognition assists most machines and technical tasks with voice-to-text translation. It isn’t as easy as picking up a pen and writing down what you hear on paper.

The program actually has to understand what you say, and how you say it matters a lot.

But the process isn’t so straightforward. Talking requires some level of understanding and comprehension in return, which speech recognition tech is yet to grasp.

That is why it needs to learn your voice and speaking pattern, which means it takes more than just an interest in using voice recognition. You would also need the patience to work with such software, because the program isn’t human.

It can’t easily know that you mean “Stella” or “Ken” as names of people, or “bread” and “toilet paper” as objects. It just perceives them as what they are to it in that moment: mere instructions.

Voice recognition has come a long way, up to the point where there’s even a debate about whether virtual assistants are better than real-life assistants.

Perhaps, with its gradual evolution, and the looming possibility of an automated future, who’s to say they wouldn’t end up being better?

Teva

Teva is echvantage's mascot.
