Sony announced today that its researchers have developed an AI driver named GT Sophy that is ‘reliably superhuman’: able to beat top human drivers in Gran Turismo Sport in back-to-back laps. You might think this an easy challenge. After all, isn’t racing simply a matter of speed and reaction time, and therefore simple for a machine to master? But experts in both video game racing and artificial intelligence say GT Sophy’s success is a significant breakthrough, with the agent showing mastery of tactics and strategy.
GT Sophy was trained using a method known as reinforcement learning: essentially a form of trial and error in which an AI agent is thrown into an environment with no instructions and rewarded for hitting certain goals. In the case of GT Sophy, Sony’s researchers say they had to craft this “reward function” extremely carefully: the agent needed to be aggressive enough to win, but not so aggressive that it simply bullied other racers off the road.
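To make that shaping concrete, here is a minimal sketch in Python of the kind of reward function the researchers describe: one that pays the agent for progress around the track while penalizing corner-cutting and at-fault contact. The field names and penalty weights are illustrative assumptions, not values from Sony’s paper.

```python
from dataclasses import dataclass

@dataclass
class StepState:
    course_progress: float   # distance covered along the track centerline (m)
    off_course: bool         # True if the car has left the track boundaries
    caused_collision: bool   # True if the agent was at fault for contact

def reward(state: StepState, prev_state: StepState) -> float:
    # Base reward: forward progress since the last step, so faster laps
    # accumulate more reward overall.
    r = state.course_progress - prev_state.course_progress
    # Penalize leaving the track, so speed can't come from corner-cutting.
    if state.off_course:
        r -= 1.0   # illustrative weight
    # Penalize at-fault contact, discouraging the agent from simply
    # shoving other cars off the racing line to pass them.
    if state.caused_collision:
        r -= 5.0   # illustrative weight
    return r
```

Tuning those penalty weights is the delicate part the researchers allude to: make the collision penalty too small and the agent learns that ramming is a viable way to pass; make it too large and the agent drives timidly and loses races.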
“Outracing human drivers so skilfully in a head-to-head competition represents a landmark achievement for AI,” writes Stanford automotive professor J. Christian Gerdes in an editorial in the scientific journal Nature that accompanies a paper describing the work. “GT Sophy’s success on the track suggests that neural networks might one day have a larger role in the software of automated vehicles than they do today.”
Using reinforcement learning, GT Sophy was able to navigate around a racetrack with just a few hours of training and “within a day or two” was faster than 95 percent of drivers in its training dataset. After some 45,000 total hours of training, GT Sophy was able to achieve superhuman performance on three tracks. (For Gran Turismo Sport players, the tracks in question were Dragon Trail Seaside, Lago Maggiore GP, and Circuit de la Sarthe.)
A common concern when testing AI agents against humans is that machines have a number of innate advantages, like perfect recall and fast reaction times. Sony’s researchers note that GT Sophy does have some advantages compared to human players, like a precise map of the course with coordinates of track boundaries and “precise information about the load on each tire, slip angle of each tire, and other vehicle state.” But, they say, they accounted for two particularly important factors: action frequency and reaction time.
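For a rough sense of what those inputs might look like, here is a hypothetical per-step observation structure assembled from the features mentioned above; the exact fields, shapes, and units are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Observation:
    # Global car state on the precise course map the agent is given.
    position: Tuple[float, float, float]        # (x, y, z) on the track (m)
    velocity: Tuple[float, float, float]        # linear velocity (m/s)
    # Sampled track-boundary coordinates ahead of the car.
    left_boundary: List[Tuple[float, float]]
    right_boundary: List[Tuple[float, float]]
    # Per-tire physics the researchers say the agent receives directly.
    tire_load: Tuple[float, float, float, float]        # load per tire (N)
    tire_slip_angle: Tuple[float, float, float, float]  # slip angle per tire (rad)
    # Relative positions of nearby opponents.
    nearby_cars: List[Tuple[float, float]]
```

A human player has to infer most of this, tire loads and slip angles especially, from what they see and feel through the controller, which is why the researchers treat these direct readouts as an advantage worth acknowledging.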
GT Sophy was tested against a trio of top e-sports drivers: Emily Jones, Valerio Gallo, and Igor Fraga. Although none of the humans were able to beat the AI in time trials, the match-ups led them to discover new tactics.
“It was really interesting seeing the lines where the AI would go, there were certain corners where I was going out wide and then cutting back in, and the AI was going in all the way around, so I learned a lot about the lines,” e-sports driver Emily Jones said in a testimonial in the Nature paper. “Going into turn 1, for example, I was braking later than the AI, but the AI would get a much better exit than me and beat me to the next corner. I didn’t notice that until I saw the AI and was like, ‘Okay, I should do that instead.’”
GT Sophy’s inputs were capped at 10 Hz, compared to a theoretical maximum human input of 60 Hz. This sometimes led to human drivers displaying “much smoother actions” at high speeds, the researchers write. For reaction times, GT Sophy was able to respond to events in the game environment within 23–30 ms, far faster than the estimated best-case reaction time of 200–250 ms for professional athletes. To compensate, the researchers added artificial delays, training versions of GT Sophy with reaction times of 100 ms, 200 ms, and 250 ms. Even so, they found: “All three of these tests achieved a superhuman lap time.”
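One plausible way to impose both handicaps is to wrap the policy so it can only choose a new action on a fraction of frames and only sees observations that are a fixed number of frames old. The sketch below assumes a 60 Hz game loop; the class, its parameters, and the queue-based delay are illustrative assumptions, not Sony’s implementation.

```python
from collections import deque

class HandicappedAgent:
    """Wraps a policy so it acts at a capped rate (e.g., 10 Hz) and only
    sees observations after an artificial reaction delay."""

    def __init__(self, policy, game_hz=60, action_hz=10, delay_ms=200):
        self.policy = policy
        self.act_every = game_hz // action_hz              # e.g., every 6th frame
        self.delay_frames = int(delay_ms / 1000 * game_hz) # e.g., 12 frames at 200 ms
        self.obs_buffer = deque()   # observations waiting to become "visible"
        self.last_action = None     # held between decision frames
        self.frame = 0

    def step(self, observation):
        self.obs_buffer.append(observation)
        # Only observations older than the reaction delay are visible.
        if len(self.obs_buffer) > self.delay_frames:
            visible = self.obs_buffer.popleft()
            # The action-rate cap: decide only on every act_every-th frame.
            if self.frame % self.act_every == 0:
                self.last_action = self.policy(visible)
        self.frame += 1
        return self.last_action
```

With delay_ms=200, the freshest observation the policy ever sees is 12 frames (200 ms) old, and the controls can change at most 10 times per second, mirroring the combined handicap the researchers describe.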