Although we may seldom realise it, whenever we mull over a decision we are actually imagining the consequences of our actions, and sometimes we do this visually: seeing with our mind’s eye the movement of an object and the path that leads it to strike another object, an outcome that may or may not be desirable.
This “visual foresight” helps us make decisions.
Researchers at UC Berkeley have found a way to give robots this kind of visual foresight, and they presented their results at the NIPS 2017 conference.
They have developed a learning technique based on deep learning and convolutional recurrent video prediction. The robot learns entirely on its own, from its own experience: it imagines what its video camera (the robot’s eye) will likely see if it performs a given movement.
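For readers who like to see the idea in code, here is a minimal sketch of an action-conditioned video predictor in that spirit: a convolutional encoder compresses the current camera frame, a recurrent cell carries context forward in time, and a decoder produces the imagined next frame. Every name, layer size, and the rollout loop below is an illustrative assumption, not the Berkeley team’s published architecture.

```python
import torch
import torch.nn as nn

class FramePredictor(nn.Module):
    """Predicts the next camera frame from the current frame and an action.
    A toy stand-in for a convolutional recurrent video-prediction model."""
    def __init__(self, action_dim=4, hidden_dim=64):
        super().__init__()
        # Convolutional encoder: compress the 64x64 frame to a feature map.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=4, stride=2, padding=1),          # 64 -> 32
            nn.ReLU(),
            nn.Conv2d(32, hidden_dim, kernel_size=4, stride=2, padding=1), # 32 -> 16
            nn.ReLU(),
        )
        # Recurrent core: carry temporal context across prediction steps,
        # conditioned on the robot's intended action.
        self.rnn = nn.LSTMCell(hidden_dim * 16 * 16 + action_dim, 512)
        # Decoder: map the recurrent state back to a predicted frame.
        self.decoder = nn.Sequential(
            nn.Linear(512, hidden_dim * 16 * 16),
            nn.Unflatten(1, (hidden_dim, 16, 16)),
            nn.ConvTranspose2d(hidden_dim, 32, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(32, 3, kernel_size=4, stride=2, padding=1),
            nn.Sigmoid(),  # pixel values in [0, 1]
        )

    def forward(self, frame, action, state=None):
        feat = self.encoder(frame).flatten(1)
        h, c = self.rnn(torch.cat([feat, action], dim=1), state)
        return self.decoder(h), (h, c)


# "Imagining" several steps ahead: feed each predicted frame back in,
# conditioned on the sequence of movements the robot is considering.
model = FramePredictor()
frame = torch.rand(1, 3, 64, 64)   # current camera image
actions = torch.rand(5, 1, 4)      # a candidate 5-step motion
state = None
with torch.no_grad():
    for a in actions:
        frame, state = model(frame, a, state)  # imagined future frames
```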
So far this visual foresight extends only a few seconds ahead, and it is constrained to a limited context. As an example, a robot would learn that by turning right it will likely see a pot of flowers, if its arm has pushed to the right the box on which the pot was left; or, more difficult, if it has pushed a button that sets in motion the conveyor belt on which the pot is resting. It will actually see in its mind’s eye the new, virtual position of the pot of flowers. It will not, however, predict that a bee may be buzzing around the flowers; that would require a broader grasp of the context.
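How might such imagined frames actually help in taking a decision? One plausible loop, reusing the hypothetical FramePredictor sketched above, is to sample candidate motions, imagine each outcome, and keep the motion whose final imagined frame looks most like a picture of the desired result. The goal-image scoring and random sampling here are my assumptions, not the published planning procedure.

```python
import torch

def plan(model, frame, goal, n_candidates=32, horizon=5, action_dim=4):
    """Pick the candidate motion whose imagined outcome best matches `goal`."""
    best_score, best_actions = float("inf"), None
    with torch.no_grad():
        for _ in range(n_candidates):
            actions = torch.rand(horizon, 1, action_dim)  # candidate motion
            f, state = frame, None
            for a in actions:
                f, state = model(f, a, state)             # imagined rollout
            score = torch.mean((f - goal) ** 2).item()    # pixel distance to goal
            if score < best_score:
                best_score, best_actions = score, actions
    return best_actions
```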
What is really interesting in this news is that we are starting to see the first, tiny steps on the path leading to intelligent machines. Intelligence requires the capability to look ahead, and acquiring visual foresight is a step in that direction. A very limited one, so far, but as they say, even the longest journey begins with a single step.