How voice-search technology listens to speech and spits out answers is a complex endeavour, but Google has attempted to explain the mechanisms behind its voice-search app in a new research paper.
Google's search app for iOS.
(Screenshot by Jason Parker/CNET)
Basically, it boils down to data, and lots of it.
According to Google, more data improves all web services. This may seem obvious, but for better speech recognition, it isn't just the sheer amount of data that matters; it's also how that data is organised. Google's voice-search technology mainly draws on data from anonymised queries on Google.com to get the information it needs.
"The language model is the component of a speech recogniser that assigns a probability to the next word in a sentence given the previous ones," Google research scientist Ciprian Chelba wrote in a blog post about the research. "As an example, if the previous words are 'New York', the model would assign a higher probability to 'pizza' than, say, 'granola'."
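Chelba's "New York" example can be made concrete with a toy bigram model — a deliberately minimal sketch, not Google's actual system (which is trained on billions of query words), but it illustrates the same idea of assigning a probability to the next word based on observed counts. The corpus and function names here are invented for illustration.

```python
from collections import Counter, defaultdict

# Tiny invented corpus standing in for query data.
corpus = [
    "new york pizza is great",
    "new york pizza near me",
    "new york granola recipe",
]

# Count how often each word follows each preceding word.
bigram_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        bigram_counts[prev][nxt] += 1

def next_word_prob(prev, nxt):
    """Probability of `nxt` following `prev`, by relative frequency."""
    counts = bigram_counts[prev]
    total = sum(counts.values())
    return counts[nxt] / total if total else 0.0

# "pizza" follows "york" twice in this corpus and "granola" once,
# so the model assigns "pizza" the higher probability.
print(next_word_prob("york", "pizza"))    # 2/3
print(next_word_prob("york", "granola"))  # 1/3
```

A production recogniser uses far longer contexts and smoothing over unseen word sequences, but the underlying estimate — next-word probability from counts — is the same.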
In conducting its voice-search evaluations, Google's scientists used up to 230 billion words from "a random sample of anonymised queries from Google.com that did not trigger spelling correction".
Chelba concluded that with such a large dataset, the word-error rate can be cut by 6 per cent to 10 per cent; for systems spanning an even wider range of operating points, the reduction can be between 17 per cent and 52 per cent.
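Word-error rate, the metric behind those percentages, is the standard yardstick for speech recognisers: the word-level edit distance between the recogniser's output and the reference transcript, divided by the number of reference words. A minimal sketch (the sentences are invented for illustration):

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Classic dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[len(ref)][len(hyp)] / len(ref)

# One wrong word out of five reference words.
print(word_error_rate("new york pizza near me",
                      "new york pita near me"))  # 0.2
```

A "6 to 10 per cent reduction" in the paper's terms is relative: a system at 20 per cent WER improving by 10 per cent relative would land at 18 per cent.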
Google's new voice-search app, which looks to be competing directly with Apple's Siri, was launched yesterday. The software, which went out as part of an update to Google's search application for iOS, provides contextual, spoken results for voice queries and serves up web searches for everything else. According to a CNET review, the app gives Siri a run for her money: it is lightning fast, has a clean layout and gives highly accurate results.