A few weeks ago I spent several days marching around CES in Las Vegas (along with close to 200,000 other people), and as in previous years I saw ‘smart’ versions of just about anything you can imagine and many you can’t. I also heard just about any thesis you can imagine, from ‘this is all nonsense’ to ‘this is the next platform and voice-based AI will transform our homes and replace the smartphone.’
I’m not quite sure what my grand unified thesis on ‘smart home’ is, but I think there are some building blocks to try to get closer to one:
- Will people buy ‘smart’ anything at all? Will people buy a whole lot of smart things, or just one or two (for example, a door lock, a thermostat and nothing else)? Why?
- If they do buy more than a handful of things, will they all be connected into one system, with a voice front end?
- Finally, if lots of people do have three dozen smart things all connected to Alexa (or Siri, or Google), does that change the broader tech environment? Does it result in massive company creation? Does it give, say, Amazon a major platform advantage – is the end result anything more than the sale of a bunch of ultra-low-margin generic Shenzhen boxes and a small reduction in the number of people cancelling Amazon Prime?
My answers: yes, maybe and no.
Let’s start with ‘why?’
My grandparents could probably have told you how many electric motors they owned. There were one or two in the car, one in the fridge, one in the vacuum cleaner and so on, and they owned maybe a dozen in total. Today, we have no idea how many motors we have (or even how many are in a car), but we probably know how many things we own with a network connection or some kind of digital intelligence. There’s a phone, and a tablet, and a laptop, and the TV, and… but again, our children will have no idea. It won’t be an interesting question. “How many smart devices do you have?” will be like asking how many incandescent lightbulbs you have.
Many of the things that get a connection or become ‘smart’ in some way will seem silly to us, just as many things that got ‘electrified’ would seem silly to our grandparents – tell them that you have a button to adjust the mirrors on your car, or a machine to chop vegetables, and they’d think you were soft in the head, but that’s how the deployment of the technology happened, and how it will happen again. The technology will be there, and will become very very cheap, so it will slide unnoticed into our lives. On the other hand, many things that people did think might get electrified did not, and many of the ideas that did work were not adopted in a uniform way. Most people in the UK have an electric kettle, but that’s not true in the USA, and most people in Japan have a rice cooker, but this in turn isn’t true in the UK. Anyone who’s baked a few times has bought an electric whisk for $20, but not many people use electric carving knives.
The smart home, or connected home, or internet of things (choose your term) will probably look much the same. Electrical components became cheap commodities that let people experiment with all sorts of ideas – today, the smartphone supply chain is a firehose of cheap commodity components that, again, let people experiment with all sorts of ideas for smart things. Some will work, some won’t, but our children will take the ones that do work for granted.
Though this determinist model of deployment will be much the same for smart things as for electricity, there is a difference in the character of what might get created. Washing machines and vacuum cleaners saved huge amounts of time and effort – they replaced entire jobs and liberated people from drudgery. Televisions take over hours of your time, for better or worse. In post-war Japan a television, refrigerator and washing machine were sometimes half-jokingly called ‘the three sacred treasures’. No-one would really call a smart light switch or a digital thermostat a treasure. Many smart home devices do not look as though they’re solving the same magnitude of problem (which is one reason people can get quite upset looking at some of these experiments).
But if a connected light switch isn’t a treasure, neither is an electric kettle. You can put a kettle on a stove, turn on the heat, wait for it to boil, turn off the heat and pour your tea. You could even use a saucepan. But a cheap electric kettle is much faster and turns off automatically when it’s ready. So, pretty much everyone who drinks tea owns one. Taking the analogy further, you could say the same about a simple vegetable peeler. Of course you could peel fruit and vegetables with a kitchen knife – you idiot! – but the tenth time that half of the apple and a small part of your thumb end up in the sink, you pick one up for $3 at the supermarket. An electric kettle or a vegetable peeler don’t save hours of your day or free you from drudgery – they just remove a tiny piece of friction a few times every single day for the rest of your life.
Today, the world of smart devices is trying to discover quite what other pieces of friction it might be able to address. By their nature, these often don’t look like a problem at all until you automate them away – any more than adjusting your wing mirror by hand did. Some of them don’t even look like a problem when you point them out – my grandmother could not understand why anyone would buy a dishwasher. Lots of pieces of friction are going to go away.
How might that discovery work? If we’re looking for things that take not hours of people’s time but little pieces of unnoticed friction, where do you start? One useful model, perhaps, is to look for questions. There’s an old line that a computer should never ask a question if it should be able to work out the answer, so what are the questions in our home? Well, when I go into my bathroom, do I want the light turned on? The answer is always yes, so why do I have to press the light switch? When I walk up to my front door, do I want it to be locked? The answer is always no, so why do I need to take a bunch of little pieces of carved metal out of my pocket, pick the right one and put it into a slot? When the kettle is boiling, do I want it to continue boiling until it’s dry? No, so turn off the heat. I’m baking something, and I want the oven pre-heated, do I want to fiddle with buttons, or just tell it to turn on and heat to 350 degrees? If I run out of pods for my automatic coffee machine, do I want to order more? Yes, so why ask?
These kinds of questions can start to sound a little like fairy stories. There’s an old joke that you can drive a car perfectly well (at least for a while) while believing that it is powered by little magic horses hidden somewhere inside. A smart door lock that watches for your phone or your watch or your car (or perhaps your face), and unlocks for you, but only you, is another kind of magic – there is a little genie living inside that will only let you in.
This takes me to the second question – if you do have half a dozen or a dozen or two dozen ‘smart’ things, will they all be part of a unified system in your home? Will they all talk to Alexa, or be controlled from a single smartphone screen, or do what I want without any controls at all? Do I want one genie or lots of genies, and which is less friction?
One system or many?
Should everything ‘smart’ in my home talk to everything else, and perhaps be controlled through one unified UI? The obvious answer is ‘of course it will all be one system’ but really, it depends what they are, and on what the right way to interact with that device itself might be. Some things would ideally need no interaction at all, some need to be interacted with directly, some can be controlled remotely, and some might get some value from talking to other devices but others might not. And many might fit into several of these.
Hence, the front door locks by itself, after all, so it should perhaps unlock by itself as I walk up the path, and there should really be no UI at all to that. A lot of smart home stuff should be invisible – you should never see it or talk to it. But then, the door might tell the alarm that you’re home so you don’t need to disarm it yourself. If you do need to interact deliberately, is voice or a screen the right model – and does that mean a screen on the device itself or just your phone? An oven that lets you tell it what you’re cooking might want a screen on the device, but also be accessed from your phone to check progress, and also talk to Alexa: ‘pre-heat the oven to 350 degrees please, and turn it off 30 minutes after I put the dish in’. Conversely, a connected camera clearly doesn’t need a screen on the device itself, but also doesn’t work well with an Echo unless the Echo has a screen, in which case why not use a phone (or use the Google Assistant app on your phone)? Then, there are also lots of use cases where talking might be less friction than anything else – it might be nice to say ‘Alexa, turn on the lights’ or, again, ‘Alexa, pre-heat the oven to 350 degrees’. But is it better to say ‘turn on the bathroom light’ or to walk into the bathroom and have a dumb IR sensor turn it on automatically? To have a phone that senses movement and location and tells the garage door to open, or to say ‘open the door’ – and would it be Siri or Google in the car and Alexa in the kitchen? Will there be lots of Venn diagrams, or one unified system, or many disconnected appliances?
That is, the obvious answer to all of this complexity is to say ‘let’s just make it all Alexa or Siri or Google’, but that might actually be more complex, and more friction, and might make little sense for some use cases. We don’t worry about our fridge, door lock and light coming from different companies with different kinds of switch today, after all. We’ll see.
Part of the challenge is that very few people will convert their entire existing home to ‘smart’ all in one go, even if all of the possible products were available. You might buy a smart door lock or camera, or thermostat, but you probably won’t replace all the light switches, plug sockets, locks, blinds and appliances at the same time. Many of those other things are on long replacement cycles – we buy new smartphones every two to three years, but fridges and water heaters last for a decade or two. If you want people to replace a ‘dumb’ thing with a ‘smart’ thing, then either you must fit into the existing replacement cycle for that thing, or that thing must be cheap enough to be replaced off-cycle. You can keep a garage door opener for 20 years or buy a new smart one now, but no-one will replace a two-year-old fridge just to get a smart one.
This means adoption overall will take a long time no matter how much sense you think it makes, but it also means that most smart things have to make sense as a single thing by themselves without being part of a larger system. ‘Would it be good if I could have one voice control for all my lights, the curtains, blinds, doors, heating, oven and music system?’ is a different question to ‘do I want that light, and the washing machine (but not the dryer) to be controlled by Alexa?’ This makes some use cases more difficult, but it’s also why so many of these things tend to have their own app, or (on the larger devices) their own screen and user interface. The theoretical end-state might be no UI except a unified voice system, but you can’t sell an oven with no controls on the front today.
You can see this challenge in the way that the industry (or rather industries) are trying to implement it: if the consumer model is pretty unclear, there is an awful lot of industry push, but that comes with lots of bases being covered at once. Google, Apple and Amazon would obviously like there to be one UI, controlled by them, for reasons I’ll return to later. The motivations of Samsung and LG, the Silicon Valley company making a door camera, and the hundreds of Shenzhen companies each churning out 50 different things, are a little more mixed.
Samsung Group’s strategy is very clear – it wants the fridge, the cooker, the AC unit and the dishwasher all to use the Samsung voice assistant, so that people who bought the Samsung fridge have a reason to buy the Samsung cooker. The trouble is that the Samsung dishwasher team wants to sell to people who bought an LG fridge or a Subzero fridge as well, so they also want to support Alexa, Google Assistant and so on. They might also of course think that supporting any of these is an annoying waste of time that’s forced on them by Group. A nice side-effect of this is that the world could well fill up with appliances that notionally have Alexa or Google Assistant embedded (and probably both) but whose owners don’t actually know or ever use this, just as for years most ‘smart TVs’ never connected to the internet.
Meanwhile, Shenzhen and the ‘smart home startup’ have almost exactly opposite motivations to each other. The smartphone supply chain means that there is an enormous range of very sophisticated, very small, low power and cheap commodity components available off the shelf for anyone else to pick up and turn into products. That, plus the contract manufacturers that are also part of the smartphone supply chain, is behind a lot of the current Cambrian explosion in smart device creation, but it also means that hardware differentiation is extremely hard. Many of these device categories (smart light switches, say) will be commodity products using commodity components – some categories will have 50 companies making near-identical devices. These companies will embrace Alexa/Google Assistant/Homekit because it gives them a commodity front-end as well, just as Android did for phones.
Conversely, a Silicon Valley startup trying to make a device in this world has to find a way to make something that cannot easily be copied, and since it mostly uses the same components as everyone else that generally means something to do with the software. So, is there a network effect? A cloud service? Something with the use of aggregated data across all the devices? Or, do you have a route-to-market advantage? If not, then your whole category will probably go to the incumbents – generic ‘consumer electronics’ devices (baby cameras, say) will go to Shenzhen and washing machines will go to the washing machine companies, where smart becomes just another high-end feature. The challenge for the startup is that if I can control your device entirely with Alexa or Siri, you don’t have much of a moat left, but if you don’t support them, won’t people just buy a generic Chinese one that does? How do you square the circle?
You can see a fascinating case study of this question in connected door locks. Is it harder for the incumbent lock companies, with all their manufacturing scale and route-to-market advantages, to learn how to add ‘smart’, than for software companies to learn how to make a good lock at scale and get it into the channel? Is there enough work to the user experience of a lock that it’s harder for Yale than for a startup? Is there a network effect?
That is, is a connected lock really a piece of software wrapped in metal and plastic, or is it just a better lock?
So far, it’s an open question. Again, though, if this does become an Alexa use case, that’s good for Yale – they can go back to worrying about competing with Schlage (and the Chinese entrants they’ve been thinking about for a decade or more) and let Amazon and Google worry about the network and the UX.
That takes me to the third question – if everything is Alexa, or Google Assistant, or Siri, so what? If everyone does buy lots of these devices, and they are all connected into a central assistant UI of some kind, so what? How much leverage would that give – how much ecosystem power?
Smart speakers and ecosystem value
Self-evidently, Amazon and Google make little to no money from selling cheap smart speakers per se, nor from the sale of smart devices with their tech embedded (Amazon of course makes some money as a retailer, but that’s the same regardless of which system the devices use). Apple or Google will make a real margin from a $350 speaker but not much relative to the money coming from iPhones and Adwords, and there are lots of other things they could probably sell at a good margin but choose not to.
By extension, it’s also not terribly clear that they get much value from being the hub to a smart home of itself. There’s a lot of handwaving about knowing more about you (‘now Google knows when you do the dishes!’), but that’s part of a mosaic with much more, much more specific information – after all, an Android phone already knows everywhere you go. Then, there’s also a story about opening the door for an Amazon delivery when you’re not home, but that could be done with Google Assistant or Homekit – Amazon doesn’t need Alexa for this. Rather, controlling the smart home is a use-case to get you to buy the device, and making the device into the hub of a smart home makes it sticky, but the value of the device to Google or Apple is something else. The point is not really sales of the device, nor the smart home, but the leverage to their ecosystems, in some way, that it provides.
What leverage? Well, there are maybe three levels worth thinking about.
First, these devices help with retention – they keep you in the broader ecosystem. The HomePod, Watch, Apple TV and AirPods are all accessories to your iPhone – they have some reason of their own to exist, and some margin of their own, but their main benefit to Apple (like Apple Music) is to make you more likely to replace your iPhone with another iPhone and not an Android. Equally, almost everything that Amazon is doing right now trickles back in some way to making Prime more attractive to join and less attractive to cancel, and having an Echo binds you that much more into Amazon. On this level, Google might have a weaker strategic benefit – it has no directly equivalent subscription to stop you cancelling (and the iPhone is in effect a subscription) – you can use a Chromecast or Google Home pretty happily with an iPhone.
Second, like Apple TV or Google Maps, these devices also extend the number of touch points to those ecosystems in new ways, sometimes in ways that let them do quite new things. An enormous amount has been written about what it would mean if you said ‘hey Alexa, I need more soap’ instead of choosing Tide on a screen or in a shop – Amazon would take an even stronger and more embedded role in people’s buying, and gain even more control over its suppliers – what do you have to pay to be the default choice for ‘Alexa, order more toothpaste’? (Of course, people also said that the Kindle Fire would be a buying platform, which never happened, and this presupposes a major shift in consumer behaviour, but that’s far from impossible.) Equally, a large part of Google’s strategy for years has been to move beyond ‘ten blue links’ to a fundamental understanding of what you might be searching for, and perhaps to anticipate that search – machine learning is part of this, but having touch points in mail, maps, messaging, mobile and now perhaps in the home are ways to shift the character of its reach and its ability to answer or anticipate questions from a search bar in a browser to much more of your daily internet life.
Third, however, the huge amount of excitement around Alexa clearly doesn’t actually come from the prospect of decreased churn in Amazon Prime accounts or even seamless toothpaste ordering. Rather, there is now an idea that voice is a new fundamental platform, with possibilities for search, discovery and application development that might be as significant as smartphones or social. Voice will be as important as multitouch, apparently.
I am extremely skeptical about this, as I explained in detail here. Essentially, I think this mischaracterises the nature of the breakthroughs that machine learning has given us in voice recognition: we can now transcribe audio to text, and we can turn text into a structured query, but have no scalable way to be able to answer more and more kinds of such structured queries. ML means we can use voice to fill in dialogue boxes, but the dialogue boxes still need to be created, one at a time, by a programmer in a cubicle somewhere. That is, voice is an IVR – a tree. We can now match a spoken, natural language request to the right branch on the tree perfectly, but we have no way to add more branches except by writing them one at a time by hand. If Alexa or Siri or Google Assistant can give you cricket scores but not rugby, it’s because someone wrote the cricket score module, by hand, but hasn’t written a rugby score module yet.
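To make the ‘tree’ point concrete, here is a minimal sketch of what such a system looks like underneath. Everything in it is hypothetical and for illustration only: the handler names, the keyword table and the `route` function are assumptions, and the naive keyword match is just a stand-in for the ML step that maps a spoken request to a branch. It is not how Alexa, Siri or Google Assistant are actually built.

```python
from typing import Callable, Dict, Optional


# Hypothetical hand-written "branches": each one is a module that
# a programmer had to write before the assistant could answer.
def cricket_score(query: str) -> str:
    return "cricket: (score lookup would go here)"


def set_timer(query: str) -> str:
    return "timer: set"


# The "tree": keyword -> handler. Adding a branch means editing
# this table (and writing a new handler) by hand.
BRANCHES: Dict[str, Callable[[str], str]] = {
    "cricket": cricket_score,
    "timer": set_timer,
}


def route(utterance: str) -> Optional[str]:
    """Stand-in for the ML step: match a transcribed request to a branch."""
    words = utterance.lower().split()
    for keyword, handler in BRANCHES.items():
        if keyword in words:
            return handler(utterance)
    # The request was understood as text, but nobody wrote this branch yet.
    return None


if __name__ == "__main__":
    print(route("what is the cricket score"))  # handled by cricket_score
    print(route("what is the rugby score"))    # None: no rugby branch exists
```

However good the matching in `route` becomes, a rugby question goes unanswered until someone adds a rugby handler to the table.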
Worse, even if you do create hundreds or thousands of such queries (which Amazon is trying to do with Alexa Skills), you haven’t solved the problem, since there is no way for the user to know what they can ask, nor remember what skills Alexa does and does not have. The ideal number of skills for such a system is either 3 or infinity, but not 50 or 5000.
This means voice can work very well in narrow domains where you know what people might ask and, crucially, where the user knows what they can and cannot ask, but it does not work if you place it in a general context. That, in turn, means I see these devices as, well, accessories. They cannot replace a smartphone, tablet or PC as your primary device.
Of course, I might be completely wrong here, and the limitations of voice recognition as we have it now might be expanded, or not matter. There is, though, another consideration to think about as we try to assess the broader impacts of these devices – there appears to be little or no network effect, and so little or no winner-take-all effect. Even if voice and smart speakers are very, very important, that doesn’t necessarily mean that Alexa or anyone else will run away with the space.
There is word-of-mouth – your friends will visit your home, see your set-up and want it (hopefully). There is notionally a data network effect in that the largest platform will collect the most ML training data in the form of voice commands – but Apple and Google already collect vast quantities of voice commands through their other end-points (Apple has said it gets 2bn Siri requests every week). There may be a network effect inside your home – you may want everything that is centrally controlled to be on the same system, and not have to talk to Google for the lights and Alexa for the appliances (though this might not be the case, as I suggested above).
However, there is almost no network effect between homes. You don’t have to buy one product over another because your friends have it, nor will one product get irresistibly better if it has more market share. Most things you can buy will support all of the interconnection systems, for the reasons outlined above. There may be some network effects around developers, but as above I highly doubt many people will use large numbers of third party apps, and even if they do these are not of the same scale, cost and opportunity cost as smartphone apps. In turn, if there are no network effects, there won’t be a winner-takes-all effect. Windows squashed the Mac, and then iOS and Android squashed Windows Phone, because once a product’s market share fell below a certain level it stopped being as useful, since developers did not support it. PCs and smartphones were winner(s)-take-all markets. But a Google Home can answer the same questions and control much the same devices whether it has 15% or 85% market share. Google can mess it up, but Alexa’s success can’t kill it, nor vice versa.
This takes me back to accessories. Accessories can add incremental revenue and margin, and they can make an ecosystem more sticky – they can make it harder to churn. But they don’t change the market dynamics. Apple TV or Chromecast or Daydream make an ecosystem more attractive and harder to leave, but they don’t change the market. Neither, I suspect, do smart home, Alexa, or smart speakers.