I was reading yesterday about the demise of TickTock, a Silicon Valley start-up focussed on robotics in the home. Their vision was a home where many of the chores would be done by several robots, each with its own narrow task. Rather than building a very complex robot with a broad set of capabilities, TickTock imagined the transformation of the home in steps, each step seeing a low-cost robot take up a specific and relatively simple task. And here is the problem.
What we take as a simple task (think of looking for your smartphone in the living room and taking it to the kitchen, or rinsing cutlery) is actually a pretty complex endeavour for a robot.
The first stumbling block for a robot is to get the lay of the land. This means understanding the layout of the home, what objects are inside it and what each object means. Now these are all very simple things, aren’t they? You just take a look around and you see the table, the chairs, the couch, that toy left on the carpet by the toddler…
Actually this survey and recognition is pretty complex. Autonomous cars use Lidar (a laser-based ranging sensor) to detect objects and place them at a specific position, but you cannot equip a domestic robot with Lidars, unless you are prepared to shell out tens of thousands of dollars for it, which you aren’t. TickTock chose to use low-cost hardware, video cameras, and to develop advanced image recognition software to make sense of the images and create a model of the robot’s surroundings.
Still, assessing what is around the robot remained challenging (not because the image recognition software was not good, but because our homes differ so much in their details from one another and, as we say, the devil is in the details). TickTock engineers decided to engage the home owner, or even better the kid at home, to explain the home and its various components to the robot. They developed augmented reality apps (watch the clip) that could be used to train the robot’s brain to recognise objects and understand them (a porcelain cat may look like a real cat but behaves quite differently…).
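To make this teaching loop concrete, here is a minimal sketch in Python of the kind of object memory such a robot might keep. All names and the interface are hypothetical (TickTock’s actual software is not public): the vision system proposes a label, and a human correction from the AR app overrides it, adding behavioural facts such as “this cat is not animate” for the porcelain cat.

```python
from dataclasses import dataclass, field


@dataclass
class HomeObject:
    name: str          # label proposed by the vision system
    location: str      # room where the object was last seen
    traits: dict = field(default_factory=dict)  # taught facts, e.g. {"animate": False}


class ObjectMemory:
    """Hypothetical store mixing visual observations with human teaching."""

    def __init__(self):
        self._objects = {}

    def observe(self, obj_id, name, location):
        # Record what the vision system believes it saw.
        self._objects[obj_id] = HomeObject(name, location)

    def teach(self, obj_id, name=None, **traits):
        # Apply a human correction (e.g. sent from the AR app):
        # the taught label and traits override the visual guess.
        obj = self._objects[obj_id]
        if name:
            obj.name = name
        obj.traits.update(traits)

    def describe(self, obj_id):
        obj = self._objects[obj_id]
        return f"{obj.name} in {obj.location}, traits: {obj.traits}"


memory = ObjectMemory()
memory.observe("obj1", "cat", "living room")               # vision sees a cat...
memory.teach("obj1", name="porcelain cat", animate=False)  # ...the kid corrects it
print(memory.describe("obj1"))
```

The point of the sketch is the division of labour: cheap cameras plus recognition software make a first guess, and the household fills in the details the software cannot know.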
What made me think is the carelessness with which we assess our everyday activities. Most of them go unnoticed; they are so trivial that you really don’t give them a second thought. Of course it took us months, years, as we grew up, to understand the world and our connection with it. We just forgot everything about those many days spent picking up objects, rotating them in our hands and throwing them on the floor, to the frustration of our parents. We should be giving a robot the same time span and opportunity to learn, but we don’t want to. We want a robot that, as if by magic, is as good as we are at doing those very trivial things. Come on: it is not asking much!
I remember, quite a few years ago, Marvin Minsky, co-founder of the MIT Artificial Intelligence Lab, advocating the need to teach machines “common sense” as a first step towards machines that can interact with us. A machine does not need rocket science to engage in a conversation with us (that would actually be quite easy…); it needs a general understanding of the context and of what living is about 99% of the time. So far we have been pretty good at creating specialised Artificial Intelligence that can address 1% of life’s problems, the ones that baffle us.
Having machines address the remaining 99% using Artificial General Intelligence (AGI) remains a challenge, as TickTock’s failure reminds us.