Siri Speaker Wishlist: Top 5 Features We Would Love to See

By Rohan Naravane

Published 31 May 2017

Amazon deserves a pat on the back for bringing a speaker with a built-in digital assistant to mass acceptance. The Amazon Echo, a cylindrical device launched in 2014, featured Alexa, a voice-activated and voice-controlled virtual assistant that could do a variety of things, including ordering things from Amazon, of course. Eventually, the Echo spawned several variants, including the Amazon Tap portable speaker, the tiny Echo Dot, and more recently the camera-packing Echo Look and screen-laden Echo Show.

As of early this year, 11 million Echo products had reportedly been sold, and Alexa has over 10,000 third-party service integrations (known as Alexa Skills). Google followed Amazon’s trail with the introduction of Google Home last year, while Microsoft’s Cortana virtual assistant has also found a home in speakers built by its hardware partners.

But lest we forget, the movement towards a smart, conversational digital assistant was kickstarted back in 2011 by Apple with Siri. Unfortunately, while she charmed users at first, over time the general consensus seemed to be that Siri often just couldn’t get what people were saying, with some spectacularly poor performances.

The rumours of Apple building a standalone speaker powered by Siri contradict what Phil Schiller, the company’s VP of Worldwide Marketing, thinks about such devices. But Ming-Chi Kuo, an analyst with a reputable track record, speculates there’s a 50 percent chance of a “Siri Speaker” being shown at this year’s Worldwide Developers Conference (WWDC).

Assuming such a device is unveiled on 5 June 2017, what five things does it need to give the Amazon Echo, Google Home and others a tough fight? Here goes…

1. Improved Siri Voice Recognition and Response

In the introduction, we talked about Siri’s inability to understand our speech at times. Although the digital assistant has gotten better over the years, plenty of anecdotal evidence suggests that its voice recognition still tends to mess up. If Siri is going to power a device whose primary objective is to understand voice queries and respond to them, Apple had better leave no stone unturned.

Although she may be better at understanding American speakers, there’s no doubt that Siri struggles to understand local words in countries like India. For example, speak the name of a lesser-known Indian actor, and watch voice recognition hopelessly fail at converting those words to text. This makes the Apple TV 4th Gen, a product that has been on sale for well over a year in India, still feel crippled. Although the Echo is not available in India, the recently launched Amazon Fire TV Stick can detect words of Indian origin quite well. And ironically Google Home, which isn’t officially sold in India, is able to recognize India-centric words a lot better than Siri on the Apple TV.

It’s not just going to be about converting speech to text, though, but also about how well the device can cater to what you need. Google Home can be used for things like getting fairly accurate answers to any question you have, translating sentences into other languages, and even getting proactive alerts based on what Google finds in your Gmail and everything else it knows about you. Alexa may not have the power of Google’s Knowledge Graph, but it has a lead in the number of third-party service integrations, which make it a fun and useful device to own. To expand the Siri Speaker’s capabilities, Apple will need to expand SiriKit (more on that at the bottom).

2. More Than Satisfactory Sound Quality

I’ve said this in the past: the combination of a digital assistant built into a nice-sounding speaker is a really good idea. Asking your speaker to play any song on your mind, and having it start playing just seconds later, is frankly a magical experience.

If you were to listen to music on an Amazon Echo or Google Home, it’s not that they sound bad; they just don’t sound really good either. Fortunately, three reasons suggest that the Siri Speaker could sound better than the lot. First is the rumour from the same analyst who predicted the chances of the Siri Speaker happening: he also suggested that Apple’s smart speaker “will offer great acoustic performance as it will feature one woofer and seven tweeters”. Next, Apple has been the owner of Beats for three years now, which gives it a strategic advantage over Google or Amazon when it comes to making audio products.

And finally, this wouldn’t be the first time Apple made a speaker: the short-lived iPod Hi-Fi speaker from 2006 did have great sound quality, despite its fair share of flaws.

3. Seamless Routing of Phone Calls

Google leveraged the VoIP calling infrastructure from its Hangouts app to give Google Home hands-free calling to phone numbers in the US and Canada (for now). This is over and above the ability of both Google Home and Amazon Echo to pair with a phone over Bluetooth, so that they can be used as really good speakerphones. Think about it: the microphones on those devices are designed to pick up your voice from a distance, so one can only imagine how much better an experience it will be to take a call using these things than your phone’s loudspeaker.

Apple is in an interesting position here because it already can route phone calls coming to an iPhone to devices that aren’t iPhones — including iPads, Macs, and Apple Watches. So, it wouldn’t be too much to expect the Siri Speaker to seamlessly take and make phone calls.

4. Multi-User Support

At Google I/O 2017, the company paved the way for solving the problem of multiple people using Google Home, by identifying who’s speaking. Multi-user support is absolutely critical for devices like smart speakers, because there is a good chance that more than one person in a household will want to use them. This way, when Person A adds something to their calendar, it goes to that person’s associated Google account. Or if Person B asks for the time it’ll take to drive to work, the system is smart enough to look up that person’s office address.

It’s actually a shame that outside of the Mac, Apple devices don’t really support multiple users yet. So, the Apple TV 4th Gen that runs tvOS has apps that are signed into one account, which can completely screw up personalization for that user. Imagine people in the house subscribing to YouTube channels that the person logged in has no interest in.

It’s fair to assume that the Siri Speaker, and even the Apple TV and iPad, are products that are very likely to be used by multiple people at home. Therefore, we really hope at least the Siri Speaker has awareness of the person using it.

5. A Screen

A screen to accompany the smart speaker would perfectly counter Phil Schiller’s argument about a voice-only assistant. We’ve already seen devices like the Amazon Echo Show or the just-announced Essential Home, which work using a combination of hands-free voice input and a touchscreen interface.

Apple is no stranger to making custom interfaces for special use-cases either: last year it added the Touch Bar to MacBook Pros to complement input from a physical keyboard and trackpad. Apple Watch runs watchOS, a specialised UI built for a tiny screen. And speaking of the Watch, you’ll see Apple didn’t hesitate to add a physical rotating dial (the Digital Crown) to complement the touch interface, instead of going for an all-touch experience.

All this goes to show that Apple is comfortable introducing more than one interaction method. On the Siri Speaker, a touchscreen that works in tandem with voice input would make for a great experience. Imagine it being that digital photo frame you always wanted to buy.

One more thing: More Domains and Intents for SiriKit

SiriKit was introduced with iOS 10 as a way for developers to enable voice interactions for their apps with Siri. As of today, it only supports the following domains on the iPhone: ride booking, messaging, photo search, payments, VoIP calling, and workouts. This means Siri is capable of booking an Uber, sending a WhatsApp message, or transferring money to a friend using an app that supports personal payments. But it’s going to need to support a lot more domains and intents for the Siri Speaker to be as functional as the competitors.

For example, people have come to expect the other assistant devices to handle things like ordering from their favourite restaurants or playing music using their preferred music streaming app (not just Apple Music). Hence, SiriKit will need to expand and accommodate more use cases than it currently supports.

That’s it. I sincerely hope all the things in our wishlist become a part of the Siri Speaker, should it be unveiled any time in the future. What do you think the Siri Speaker should have? Do you think it should have a touchscreen or a camera like those newer Amazon Echo products? Sound off in the comments below.

Image credit: MacRumors.com