This is How Apple’s Siri Learns a New Language

BY Rajesh Pandey

Published 9 Mar 2017

Apple’s virtual assistant, Siri, is widely regarded as inferior to Google Now and Google Assistant in terms of the functionality it offers. However, a report from Reuters highlights that Siri has a key advantage over other virtual assistants on the market: the number of languages it supports. The report also details the process behind adding a new language to Siri.

Siri supports 21 languages in total, localised for 36 countries. In comparison, Google Assistant supports only four languages, while Amazon’s Alexa supports just two.

However, the report cites Oren Etzioni, chief executive officer of the Allen Institute for Artificial Intelligence in Seattle, who says that Apple has squandered its lead in the virtual assistant space. Siri is the oldest of all the virtual assistants on the market, and by now it should have better speech recognition and question answering, but that is not the case.

The report comes just days after Google opened up its Pixel-exclusive Google Assistant to all Android devices running Marshmallow or Nougat.

The report also reveals that when Apple starts adding a new language to Siri, it first brings in human speakers to read passages in a range of accents and dialects. These passages are then transcribed by hand so that the computer has an exact representation of the spoken text. The company also captures sounds in a variety of voices, after which it builds a language model that predicts how words follow one another. Only then does Apple move on to “dictation mode,” where user audio recordings are captured, anonymised, and transcribed by humans to reduce speech recognition errors.
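To make the idea of a language model concrete, here is a minimal sketch of the kind of statistical model such hand transcriptions could feed. It is an illustrative bigram counter, not Apple’s actual pipeline; the tiny corpus and the function names are invented for the example.

```python
from collections import Counter, defaultdict

def train_bigram_model(transcriptions):
    """Count word-pair frequencies from hand-made transcriptions."""
    counts = defaultdict(Counter)
    for sentence in transcriptions:
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word_probability(model, prev, candidate):
    """P(candidate | prev): how likely a word is to follow the previous one."""
    total = sum(model[prev].values())
    return model[prev][candidate] / total if total else 0.0

# Toy usage: the model learns which word sequences are plausible in the new language.
model = train_bigram_model(["set a timer", "set an alarm", "call my mother"])
print(next_word_probability(model, "set", "a"))   # 0.5
print(next_word_probability(model, "set", "an"))  # 0.5
```

A recogniser can use probabilities like these to prefer word sequences that actually occur in the language when the audio alone is ambiguous.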

Then Apple deploys “dictation mode,” its speech-to-text translator, in the new language, according to Alex Acero, head of Apple’s speech team. When customers use dictation mode, Apple captures a small percentage of the audio recordings and anonymises them. The recordings, complete with background noise and mumbled words, are transcribed by humans, a process that helps cut the speech recognition error rate in half.
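Below is a rough sketch of that sampling-and-anonymising step. The class name, the 1% sampling rate, and the hashing scheme are all assumptions made for illustration; the report only says that a small percentage of recordings is captured and stripped of identifying information before humans transcribe it.

```python
import hashlib
import random
from dataclasses import dataclass

@dataclass
class DictationClip:
    user_id: str
    audio: bytes  # raw recording, background noise and all

def sample_for_transcription(clips, rate=0.01, seed=None):
    """Keep roughly `rate` of the clips and replace user IDs with opaque hashes."""
    rng = random.Random(seed)
    sampled = []
    for clip in clips:
        if rng.random() < rate:
            anonymous_id = hashlib.sha256(clip.user_id.encode()).hexdigest()[:12]
            sampled.append(DictationClip(user_id=anonymous_id, audio=clip.audio))
    return sampled
```

The sampled clips would then go to human transcribers, whose corrected text can be compared against the recogniser’s output to measure, and gradually reduce, the error rate.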

Apple then hires a voice actor to record responses in the new language, and once it has enough data to answer the most commonly asked questions, it releases the new language to the public. As the company collects more data from users, it rolls out tweaks to the language every couple of weeks.


However, according to Charles Jolley, creator of a virtual assistant called Ozlo, this is not an optimal way to scale a virtual assistant to more languages. He says there is a limit to the number of script writers you can hire for every language. The solution is to “synthesize the answers,” but that is still some years away.

Incidentally, Viv, a virtual assistant developed by the original creators of Siri, is designed to solve exactly this problem. The startup was acquired by Samsung last year, and its technology will likely power Bixby, Samsung’s answer to Google Assistant and Siri, on the Galaxy S8 this year.

Building on this strength, Siri will soon add support for Shanghainese, a dialect spoken only in and around Shanghai, China.

Do you think Siri’s strong localisation support gives it an edge over other virtual assistants despite its limited functionality?

[Via Reuters]