NEWS 21 August 2017

Microsoft speech recognition achieves new accuracy levels


US – Microsoft has announced that its conversational speech recognition tool has reached its best accuracy levels yet with a 5.1% error rate.

In a research blog post, Microsoft technical fellow Xuedong Huang said this beats the company’s previous 5.9% error rate and called the result “a new industry milestone”.

Microsoft’s recent studies have used Switchboard, a collection of recorded telephone conversations that speech recognition researchers use to benchmark systems. The new study by researchers at Microsoft Artificial Intelligence and Research involved transcribing conversations between strangers discussing subjects such as sports and politics.

“We reduced our error rate by about 12% compared to last year’s accuracy level, using a series of improvements to our neural net-based acoustic and language models,” wrote Huang. He added that the model was improved by using the whole conversation to predict what was likely to come next “effectively allowing the model to adapt to the topic and local context of the conversation”.

Huang said challenges remain in this area, especially reaching human-level recognition in noisy environments and recognising a range of accents and speaking styles.

“Moreover, we have much work to do in teaching computers not just to transcribe the words spoken, but also to understand their meaning and intent. Moving from recognising to understanding speech is the next major frontier for speech technology,” he added.