The Role of Artificial Intelligence in Transcription Technology

Jul 11, 2019 by

Transcription can be as important to a person as the speech itself. It falls to the transcription to record accurately what a person says and, more importantly, how they say it. By looking at a transcript, we can tell a great deal about the characteristics of a person’s speech. Transcription was the domain of human labour for a very long time but, like many other tasks, it is increasingly being taken over by AI. Let’s take a closer look at the role of artificial intelligence in transcription technology.

Microsoft’s Findings

Tech companies have been tackling transcription for years. In particular, Microsoft carried out research in 2016 to find out who would do better: a human or a computer?

They contacted a third-party transcription company and had them transcribe an audio clip. The process was a standard one used by many in the industry. One person would transcribe the audio clip, and then a second would listen to it and correct any mistakes they heard. This produced error rates of 5.9% and 11.5%.

Microsoft then tested their own AI-based system on the same audio clip and recorded error rates of 5.9% and 11.3%. This led them to claim that their software could match or beat transcription professionals, and they later rolled out the AI into products like Cortana.
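To make the error figures above more concrete, here is a minimal sketch of how such a figure can be calculated. It assumes the numbers are word error rates, the usual measure in speech recognition: the number of word substitutions, deletions and insertions needed to turn the transcript into the correct text, divided by the number of words in the correct text. The function name and the sample sentences are made up for illustration and do not come from Microsoft’s study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from transcript
            insertion = curr[j - 1] + 1  # extra word added in transcript
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one wrong word out of six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
print(f"{wer:.1%}")  # 16.7%
```

On this measure, lower is better, and a difference of a fraction of a percentage point means only a handful of words in a long recording.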

Working Together

Despite Microsoft’s claims that their software is better, what is actually needed is a meeting point between the two. Humans are still much better at recognising the emotion behind words or when a pause naturally takes place in speech. This is something AI systems struggle with.

However, AI transcription software is much faster than a human could ever be. What’s more, an AI will never tire out. Humans will inevitably get tired as their workday continues, and the more tired we are, the higher the chance of error.

It is therefore often proposed that transcription become a two-stage process. The AI does the majority of the initial transcription labour, while the human acts as an editor afterwards. They read over what the AI produced and, while also listening to the audio, make corrections as needed. The hope is that this produces high-quality transcription which can be used across multiple industries. By using both sources, any information behind the words, such as emotion, can also be recorded so that this important context is not lost.
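The two-stage process described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Segment` class, the `human_review` function and the sample sentences are hypothetical, and the AI draft is simply typed in rather than produced by a real speech recognition system.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Segment:
    ai_text: str          # stage 1: the machine's draft
    final_text: str = ""  # stage 2: the text approved by the human editor
    note: str = ""        # context the words miss, e.g. emotion or a pause


def human_review(
    draft: list[Segment],
    corrections: dict[int, str],
    notes: dict[int, str],
) -> list[Segment]:
    """Apply the editor's corrections and notes, keyed by segment index."""
    reviewed = []
    for index, segment in enumerate(draft):
        reviewed.append(
            replace(
                segment,
                # Keep the AI draft wherever the editor made no correction.
                final_text=corrections.get(index, segment.ai_text),
                note=notes.get(index, ""),
            )
        )
    return reviewed


# Hypothetical AI draft with one mistake in the second segment.
draft = [Segment("I can't believe it"), Segment("their going home")]
reviewed = human_review(
    draft,
    corrections={1: "they're going home"},
    notes={0: "spoken angrily"},
)
for segment in reviewed:
    print(segment.final_text, f"[{segment.note}]" if segment.note else "")
```

The editor only touches the segments that need it, which is where the time saving comes from, while the note field keeps the emotional context next to the words.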

A Tool for the Future

It is clear that AI in the transcription industry should be kept as a tool and not as a replacement. Until further advances are made in artificial intelligence, the error rates of AI and human transcribers are too close to pit one against the other. In the future this might change, but for now, AI is another tool we can use to accurately transcribe the words we say for records and other purposes.

Image: Pixabay


Bernadine Racoma is a senior content writer at Day Translations, a human translation services company. After her long stint as an international civil servant and traveling the world for 22 years, she has aggressively pursued her interest in writing and research. Like her poetry, she writes everything from the heart, and she treats each written piece as a work of art. She loves dogs! You can find Bernadine Racoma on Google Plus, Facebook and Twitter.
