Why teaching robots is more like raising toddlers than programming computers

CSIRO

As a non-technical user of AI, it’s easy to assume that all AI works the same way – learning fast, absorbing huge volumes of information and improving rapidly.

A silver robot with a fluorescent blue circle face, black elbow joints and black feet, sits on a black office chair in the middle of a science laboratory. The robot has CSIRO printed across its chest.

Teaching a robot is often compared to teaching a toddler, rather than simply programming a machine.

But the kind of learning that powers large language models like ChatGPT doesn’t translate neatly into the physical world. Robots, for example, don’t learn by reading; they learn by interacting with their environment – just like toddlers.

That means moving through a space, responding to changing conditions and gradually improving through experience.

From code to crawling

CSIRO robotics researcher Brendan Tidd explained that this distinction helps us understand why robotics is progressing more cautiously than its digital AI counterparts – and why researchers often liken teaching a robot to teaching a child, rather than programming a machine.

“Toddlers don’t just stand up and walk perfectly one day. First, they learn to crawl, then to pull themselves up and finally they take those first teetering steps. More than likely, those first steps end in tears because they lack the coordination or experience to walk steadily. It’s similar with robots – without the tears of course – as trial and error becomes valuable learning experience,” said Dr Tidd.

The real world sets the pace

At the heart of the challenge is the environment itself.

“The real world is much more diverse and complex than language; there is much more structure in language. In text-based systems, errors are relatively low-stakes. A model can generate an incorrect response, adjust and try again almost instantly. For robots, every action has a tangible outcome,” said Dr Tidd.

“Unlike text, robots don’t have the opportunity to hallucinate and get part of the answer correct. You can’t immediately recover from an incorrect action the robot takes.”

When something goes wrong, the system has to work through the consequences – whether that’s reattempting a movement, resetting the task or incorporating that outcome into future learning. As a result, progress tends to be incremental and closely managed.

[Music plays as an image appears of a human-like robot falling over beside a dog-like robot, and then the image changes to show a black screen and text appears: “The gang trains robots to play soccer”]

[Image changes to show a male looking into a human robot’s back as it moves about a bit while the male works, and the image changes to show the robot hit by a baseball as colleagues watch it stumble to the left]

Voiceover: Training a robot is more like teaching a child than programming a machine.

[Images move through to show a human robot walking through a doorway as the camera follows it, a female playing Jenga with a human robot, and a male follows behind a human robot walking outside]

In the same way a parent would teach their toddler, researchers need to create situations where robots learn through trial and error.

[Images move through to show a dog robot walking through a tunnel, a human robot mimicking a male wearing headgear picking up a Jenga tile, and a human robot legs stand by a ball facing a dog robot]

This can be navigating an obstacle course, picking up objects or playing soccer.

[Music plays as images move through to show AI stadium posters for the Keepers with a female kneeling in flames beside a dog robot, then for the Strikers with a male standing in flames beside a human robot]

[Music continues to play as images move through to show a male dressing a human robot in sports uniform, a split screen of a human robot over a dog robot as camera pans in, and a female stretching with a dog robot]

[Music continues to play as images move through to show a soccer ball as the camera zooms in on the ball and a whistle blows, and a male using a remote behind a human robot walking over the ball]

[Music continues to play as images move through to show a range of attempts from different views of the human robot kicking the ball at a goal defended by robot dog operated by a female, Xs appear at the bottom]

[Sad music starts to play with views of a female and robot dog moving side to side, a female laughing slowly, and the human robot standing motionless as, inset top right, humans score goals over and over]

[Music continues to play as images move through to show the human robot kicking the ball, the camera following the ball as it passes between the robot dog’s legs and scores a goal, and views of the camera zooming in on the goal]

[Image changes to show the robot human’s legs kicking the ball to the goal defended by the robot dog and scoring]

Robot progress happens slowly.

[Images move through to show the robot human’s legs kicking the ball to the goal defended by the robot dog and scoring again, and then fast-forwarded views of humans and robots working together to play soccer]

That’s because real world experience takes real time.

[Images flash through to show humans and robots working together playing soccer, and then operating in the outside world through a tunnel]

Our research is focused on tasks that are dangerous, dirty or dull.

[Image changes to show a robot dog walking through a tunnel away from the camera, and then the image changes to show the robot dog walking on a pathway from left to right]

Rather than sending humans into a collapsed mine, we can send Spot.

[Image changes to show the human robot dancing in front of the robot dog]

And clearly they are not replacing the Socceroos any time soon.

[Music plays]

We taught our robots to play soccer. Spoiler alert: the Socceroos have nothing to worry about.
