How the GT Sophy AI actually works


rlx

Now that a first iteration of the GT Sophy AI has been released to the public, it may be interesting to know how it actually works. I've seen quite a few misconceptions, so I've read the paper (1), and the following is a short summary that I hope is easy to understand for anyone, with or without a background in machine learning.

Basically, GT Sophy is a program that, 10 times per second, gets all relevant information from a Gran Turismo race (coordinates of the center line and boundaries of the track, plus the coordinates of all other cars), and replies with appropriate controller inputs (throttle/brake, left/right steering).
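As a rough sketch in code (all function names here are placeholders of my own, not Sony's actual interfaces), the loop looks like this:

```python
import time

TICK = 0.1  # one decision every 100 ms, i.e. 10 per second

def get_race_state():
    # Placeholder for the game's telemetry: track centerline and boundary
    # points, plus the positions of all other cars.
    return {"centerline": [], "boundaries": [], "opponents": []}

def policy(state):
    # Placeholder for the trained neural network; returns controller
    # inputs such as throttle/brake and steering in [-1, 1].
    return {"throttle_brake": 1.0, "steering": 0.0}

def send_inputs(action):
    # Placeholder for feeding the chosen inputs back into the game.
    pass

for _ in range(600):  # one simulated minute of racing
    t0 = time.monotonic()
    send_inputs(policy(get_race_state()))
    time.sleep(max(0.0, TICK - (time.monotonic() - t0)))
```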

It was trained using Reinforcement Learning (the same approach behind DeepMind's AlphaZero chess engine), meaning that it has no hardcoded knowledge about racing, like where to brake before a corner or how to maximize exit speed. Rather, it has learned to drive on its own, being given rewards (for fast lap times or overtakes) and penalties (for track excursions or collisions), and optimizing its behavior accordingly. After a while, it would manage to complete a lap, and after tens of thousands of hours of simulated racing, it became strong enough to beat the best human sim racers.
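As a toy illustration of that trial-and-error principle, here is the same idea boiled down to a single decision: where to brake for one corner. All numbers are made up, and the real system learns a full neural network policy, not a lookup table:

```python
import random

# Ten candidate braking points for one corner: braking too late ends in
# the gravel (big penalty), braking too early just loses time.
BRAKE_POINTS = list(range(10))

def lap_reward(brake_point, rng):
    if brake_point >= 8:                                # too late: crash
        return -100.0
    return -abs(6.5 - brake_point) + rng.gauss(0, 0.1)  # fastest around 6-7

rng = random.Random(0)
value = {a: 0.0 for a in BRAKE_POINTS}  # running estimate of each action's reward
count = {a: 0 for a in BRAKE_POINTS}

for episode in range(5000):
    if rng.random() < 0.1:              # explore: try a random braking point
        a = rng.choice(BRAKE_POINTS)
    else:                               # exploit: use the current best estimate
        a = max(value, key=value.get)
    r = lap_reward(a, rng)
    count[a] += 1
    value[a] += (r - value[a]) / count[a]  # incremental mean update

print("learned braking point:", max(value, key=value.get))
```

After a few thousand simulated attempts, the highest-value action converges on the late-but-safe braking point, without anyone ever telling the program where to brake.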

During training, Sophy is penalized for any contact between cars, regardless of who is at fault (which is often hard to determine objectively). As a result, it obeys basic racing etiquette and tends to leave room for opponents in tight situations rather than cutting them off.
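In reward-shaping terms, that might look like the following (the weights are my invention; the paper describes the actual reward components):

```python
def step_reward(progress_m, overtakes, off_course, any_contact):
    # Illustrative weights only, not the paper's actual numbers.
    r = 1.0 * progress_m   # meters gained along the track this step
    r += 5.0 * overtakes   # positions gained this step
    if off_course:
        r -= 10.0          # track-limits excursion
    if any_contact:
        r -= 20.0          # car-to-car contact, applied no matter who caused it
    return r
```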

It's interesting to note that it didn't learn exclusively by racing against itself, but has also been given specific scenarios to be able to deal with human imperfection. This includes taking corners in traffic (where the driver in front may brake too early) or crowded grid starts (where humans may drive more erratically).

Still, GT Sophy is different from a human driver in a couple of ways. For example, it won't use throttle and brake at the same time. It can determine the position of cars behind it just as precisely as the position of cars in front of it, but it doesn't know the actual dimensions of the cars. Also, with one distinct action every 100 ms (each taking around 25 ms to compute), it updates its inputs far less often than a human continuously working a wheel, but at the same time it is more precise and doesn't make mistakes.

There are several aspects of the game that Sophy, as of now, ignores, like manual gear changes, traction control or brake balance. And while it has learned advanced racing techniques (like exploiting slipstream for overtakes, or driving defensive lines), it hasn't been trained on overall race strategy (like fuel saving or tire management). It also doesn't seem to have been exposed to wet or mixed conditions. Of course, all of this could be added in the future.

During training, Sophy is a computer program receiving information from, and sending information to, a cluster of PlayStations, continually collecting rewards and penalties and updating its driving model. What has been deployed for the current "Race Together" event is simply a version of this model. It receives the race state 10 times per second and replies with the controller input the model considers optimal. It's a static copy of the model, so regardless of how you drive against it, it does not learn from you.
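In PyTorch terms (the network below is a placeholder; the real architecture and input encoding are described in the paper), "static" simply means running a frozen model in inference mode:

```python
import torch

# Placeholder network standing in for the trained driving model.
policy = torch.nn.Sequential(
    torch.nn.Linear(64, 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, 2),
)
policy.eval()               # frozen weights, inference only

state = torch.randn(1, 64)  # stand-in for the encoded race state
with torch.no_grad():       # no gradients flow, so nothing is ever learned here
    throttle_brake, steering = policy(state).tanh().squeeze(0)
```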

As far as I'm aware, there is no precise information about the different color variations of GT Sophy. If they are different (beyond the performance of the cars), they could represent different stages of learning, or be optimized for slightly different scenarios. It is also unknown whether Sophy has driven all of the available tracks (possible) and all of the available cars (unlikely), so it may not (yet) be an expert at, say, driving a Suzuki Escudo on an off-road course.

(1) Wurman, P. R., et al., "Outracing champion Gran Turismo drivers with deep reinforcement learning", Nature 602, 223–228 (2022)
 
Now that was quite interesting!
I wonder if the Sophy we get now will adapt to the player's driving (to give everyone a challenge, no matter the skill level) or if it's still static.
 
rlx
As far as I'm aware, there is no precise information about the different color variations of GT Sophy. If they are different (beyond the performance of the cars), they could represent different stages of learning, or be optimized for slightly different scenarios. It is also unknown whether Sophy has driven all of the available tracks (possible) and all of the available cars (unlikely), so it may not (yet) be an expert at, say, driving a Suzuki Escudo on an off-road course.
Different stages of learning and perhaps different reward functions? i.e. one could be penalized for approaching/exceeding the traction limit of its tires to make it more "docile" while another is more heavily rewarded for overtakes to make it more aggressive.
 
Now that was quite interesting!
I wonder if the Sophy we get now will adapt to the player's driving (to give everyone a challenge, no matter the skill level) or if it's still static.
Sophy 2.0 is still going to be static. In order to train or finetune the system, you need more infrastructure than just one PS5.

Different stages of learning and perhaps different reward functions? i.e. one could be penalized for approaching/exceeding the traction limit of its tires to make it more "docile" while another is more heavily rewarded for overtakes to make it more aggressive.
Totally possible, even though I'm pretty sure that for now, difficulty levels for Sophy are simply going to be performance penalties, i.e. less competitive car setups.
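If the variants really were tuned differently, one plausible mechanism would be a set of per-variant reward weights, along these lines (entirely speculative, including every number):

```python
from dataclasses import dataclass

@dataclass
class RewardWeights:
    progress: float = 1.0   # reward per meter of track progress
    overtake: float = 5.0   # bonus per position gained
    contact: float = -20.0  # penalty for any car-to-car contact
    tire_slip: float = 0.0  # penalty for exceeding the traction limit

# Same learning algorithm, different incentives per variant.
DOCILE = RewardWeights(overtake=2.0, tire_slip=-3.0)
AGGRESSIVE = RewardWeights(overtake=10.0, contact=-10.0)
```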
 
rlx
Sophy 2.0 is still going to be static. In order to train or finetune the system, you need more infrastructure than just one PS5.
I really don't understand the whole concept of this then. So Sophy is basically just an "AI" that is hard to beat. Something that games already had 30 or more years ago on machines like the C64 or Amiga. You don't need machine learning for this. Why all the effort (and costs)? What's the goal, what's so special about this? I really don't get it.
 
I really don't understand the whole concept of this then. So Sophy is basically just an "AI" that is hard to beat. Something that games already had 30 or more years ago on machines like the C64 or Amiga. You don't need machine learning for this. Why all the effort (and costs)? What's the goal, what's so special about this? I really don't get it.

What's special about Sophy is that, unlike any other AI in any other racing game, it is not following a set of hardcoded rules or "knowledge" about racing. The issue with hardcoded, rule-based systems (stick to the racing line, adjust speed according to corner angle, etc.) is that they tend to get pretty complex when dealing with special cases (take the inside line into a corner when defending against a car behind you, stay in the slipstream long enough before overtaking so you're not re-passed on the same straight, etc.). This complexity often results in unwanted, suboptimal behavior that can be hard or impossible to fix. At the same time, the system will still miss a lot of corner cases that human players will learn to exploit. Making such an AI "hard to beat" usually means giving it advantages behind the scenes (like the rubberbanding in GT7 that lets slower cars catch up with you once you've taken the lead, even if you're in a Bugatti on a straight).
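To make the rule-complexity point concrete, here is a caricature of a hand-coded driving AI (every threshold and name is invented) in which a defending rule and an attacking rule immediately step on each other:

```python
def rule_based_inputs(speed_kph, corner_speed_kph, gap_behind_s,
                      gap_ahead_s, in_slipstream, dist_to_corner_m):
    line = "racing"
    if gap_behind_s < 0.5:
        line = "inside"    # rule 1: defend the inside line into the corner
    if in_slipstream and gap_ahead_s < 0.3 and dist_to_corner_m > 200:
        line = "outside"   # rule 2: pull out to overtake
        # ...but rule 2 just silently overrode rule 1: attacking or defending?
    throttle = min(1.0, corner_speed_kph / max(speed_kph, 1.0))
    return line, throttle

# Defending and attacking at the same time: which rule should win?
print(rule_based_inputs(speed_kph=220, corner_speed_kph=120, gap_behind_s=0.4,
                        gap_ahead_s=0.2, in_slipstream=True, dist_to_corner_m=250))
```

Every patch for a case like this tends to create two new edge cases, which is exactly the complexity spiral described above.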

Sophy is fundamentally different in that it is trained using reinforcement learning. This approach was popularized by the chess engine AlphaZero, which (using Google's enormous computing infrastructure) went from knowing nothing about chess to playing stronger than all humans, and every chess program ever written, in just 24 hours. Sony has fewer resources to throw at the problem, so their progress is going to be a bit slower. But most will agree that the February 2023 Sophy 1.0 demo already showed that this new type of AI is substantially stronger and more situationally aware than anything that came before it. And that is precisely because it has derived everything it "knows" about racing from racing against copies of itself, learning specific behaviour and tactics that are difficult or outright impossible to formulate as concrete "rules" for a driving algorithm.

Why all this effort? There's definitely a marketing benefit, since using machine learning for a racing game AI is probably a "cool" thing to do right now. But beyond that, there is a real scientific interest in using neural networks for a variety of tasks (specifically in gaming, which, as you state, has been dealing with AI for decades, since the C64 or Amiga), and the results not only tend to beat everything that has been tried before, but can also be generalized and applied to other fields of interest.

Finally, regarding the idea of Sophy adapting to an individual player's driving style and skills: even if it were possible to put Sophy in training mode on everyone's PS5 (which it isn't), this is most likely not what you want. Sophy has no idea what the best controller inputs are for "matching a player's skill"; it is only rewarded for fast laps and overtakes. So what would happen is that some players would create situations that Sophy has rarely seen (deliberate ramming or blocking), and Sophy would invent counterstrategies (avoiding the player by aggressively taking the lead, or staying behind waiting for mistakes) that are, overall, relatively uninteresting. Also, even if every player could fine-tune their own version of Sophy 2.0, any improved Sophy 3.0 would reset that personalization.
 
Well, thanks for your thoughts, quite interesting!

But I still don't understand why PD would do this. The fact is that they released GT7 with an extremely poor AI. But instead of just improving the existing AI (which really can't be that hard to do), they make all this effort and spend all this money (you mentioned tens of thousands of hours of training) on a machine-learning-based AI that is still very limited in what it can do. That doesn't make any sense to me.
And another thing I don't get: If you can't adjust the strength and Sophy can't adapt to the player, will it not be almost impossible to beat?

But let's wait until tomorrow and then we will see what Sophy does and can do. Maybe it makes more sense for me then.
 
Well, thanks for your thoughts, quite interesting!

But I still don't understand why PD would do this. The fact is that they released GT7 with an extremely poor AI. But instead of just improving the existing AI (which really can't be that hard to do), they make all this effort and spend all this money (you mentioned tens of thousands of hours of training) on a machine-learning-based AI that is still very limited in what it can do. That doesn't make any sense to me.

But let's wait until tomorrow and then we will see what Sophy does and can do. Maybe it makes more sense for me then.

You're welcome!

I think one has to keep in mind that Sophy is a Sony AI project, not something that some folks at PD are doing in-house. At the same time, Sophy is a tiny effort compared to AlphaZero or ChatGPT, so it's something that Sony can easily afford, and also promises benefits beyond just improving GT7.

But you're totally correct about its limitations. A "proper" version of Sophy would have to include racing strategy, i.e. deal with pit stops, fuel saving, tyre wear, weather conditions, etc. The vanilla AI in GT7 can be pretty stupid (my favorite example would be changing from inters to hards before the last lap of an endurance race on a drying track, trading a 50 second pit stop for 5 seconds in lap time), but at least it makes an effort to simulate strategic behaviour. My guess is that the problem Sophy has learned to solve (here's the track, here are the other cars, what buttons are you going to push on the controller?) is simpler than the problem it still needs to solve (here's your fuel and tyre status, here's the weather radar, here's the number of remaining laps, are you going to pit now or later?).
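The arithmetic behind that pit-stop example makes the point:

```python
# Break-even check for the pit stop described above: a ~50 s stop only
# pays off if the faster tyre gains that time back over the remaining laps.
pit_loss_s = 50.0
gain_per_lap_s = 5.0
laps_remaining = 1

print(gain_per_lap_s * laps_remaining - pit_loss_s)  # -45.0: a clear net loss
print(pit_loss_s / gain_per_lap_s)                   # 10.0 laps needed to break even
```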

And another thing I don't get: If you can't adjust the strength and Sophy can't adapt to the player, will it not be almost impossible to beat?
Sophy is already impossible to beat. It would win all GT7 events against the best players in the world (if there were no pit stops). But I guess that giving Sophy a disadvantage (a less competitive car) feels more realistic than giving the vanilla AI an advantage.
 
Well, thanks for your thoughts, quite interesting!

But I still don't understand why PD would do this. The fact is that they released GT7 with an extremely poor AI. But instead of just improving the existing AI (which really can't be that hard to do), they make all this effort and spend all this money (you mentioned tens of thousands of hours of training) on a machine-learning-based AI that is still very limited in what it can do. That doesn't make any sense to me.
And another thing I don't get: If you can't adjust the strength and Sophy can't adapt to the player, will it not be almost impossible to beat?

But let's wait until tomorrow and then we will see what Sophy does and can do. Maybe it makes more sense for me then.
Sophy is not being created to be a new AI for racing games; GT7 is just a giant demo for it. It's created to be a new AI that Sony can sell for 7, 8, 9 figures to corporate customers. It's like how IBM didn't create Watson to win at Jeopardy; that was just a way of showing it off.
 
Sophy is not being created to be a new AI for racing games; GT7 is just a giant demo for it. It's created to be a new AI that Sony can sell for 7, 8, 9 figures to corporate customers. It's like how IBM didn't create Watson to win at Jeopardy; that was just a way of showing it off.
Well, ok, something like that makes sense then!
 
Sophy is not being created to be a new AI for racing games; GT7 is just a giant demo for it. It's created to be a new AI that Sony can sell for 7, 8, 9 figures to corporate customers. It's like how IBM didn't create Watson to win at Jeopardy; that was just a way of showing it off.
I think that if you take a look at the paper (linked in the first post), everything suggests that Sophy is primarily designed to be a new AI for racing games. It's a great project, everyone involved will be happy to have it in their biography, but I don't think the intent is to build a marketable product. It's very unlikely that any of their racing game competitors would license it ("The new Forza Motorsport - now with Gran Turismo AI!"). And compared to actual AI products that are being sold to actual corporate customers (think self-driving cars), Sophy is just a toy project. If Google or Tesla wanted to, they could make a system that would blow Sophy out of the water, using zero information from the game, just trained on a video stream of what the player sees on the screen.
 
rlx
I think that if you take a look at the paper (linked in the first post), everything suggests that Sophy is primarily designed to be a new AI for racing games. It's a great project, everyone involved will be happy to have it in their biography, but I don't think the intent is to build a marketable product. It's very unlikely that any of their racing game competitors would license it ("The new Forza Motorsport - now with Gran Turismo AI!"). And compared to actual AI products that are being sold to actual corporate customers (think self-driving cars), Sophy is just a toy project. If Google or Tesla wanted to, they could make a system that would blow Sophy out of the water, using zero information from the game, just trained on a video stream of what the player sees on the screen.
I think you're misunderstanding what I'm trying to say. Yes, the current implementation of Sophy is an AI for racing games. But the main goal of Sony AI doing this is not so it can create better AIs for racing games, it's so that they can demonstrate Sony's AI capabilities that could be generalized to a wide variety of fields.
 
I think you're misunderstanding what I'm trying to say. Yes, the current implementation of Sophy is an AI for racing games. But the main goal of Sony AI doing this is not so it can create better AIs for racing games, it's so that they can demonstrate Sony's AI capabilities that could be generalized to a wide variety of fields.
Yes, definitely!

Here is the press release: https://ai.sony/articles/sonyai022/
 
Awesome posts @rlx, thanks a lot! I totally agree that a concerted effort by e.g. Google's DeepMind, which did AlphaGo, would be even more impressive. Compared to finally creating self-driving cars, which have countless inputs to evaluate, creating an AI for a racing game is actually trivial.

@Sir Crashalot I'm no expert on the history of racing games, but I would be very surprised if the unbeatable AIs of past games you mentioned didn't cheat while making it look like they were on a level playing field with the player. That's not meant to be a knock, btw, just to emphasize how much better Sophy is than anything that came before it.

However, the inability to create different difficulty settings is of course a problem. Trying to create a "naturally" slower AI would be even more difficult than making it as good as it can be, because you would have to add many more reward terms, each delicately fine-tuned to balance against the still-existing goal of getting around the track fast.
 
However, the inability to create different difficulty settings is of course a problem. Trying to create a "naturally" slower AI would be even more difficult than making it as good as it can be, because you would have to add many more reward terms, each delicately fine-tuned to balance against the still-existing goal of getting around the track fast.

The press release (linked above) claims: "The GT7 audience voiced strong support for incorporating GT Sophy into the game as a permanent feature, with many remarking how closely GT Sophy mimicked racing another human opponent. Players were also looking forward to a version of GT Sophy that aligns with players of all skill levels. With GT Sophy 2.0, the AI agent showcases a wider range of behaviors to create new racing experiences for all players." But I think that's just marketing speak, and deliberately skirts around the issue of skill levels.

If it were me, one thing I'd try would be to add a tiny amount of noise or delay to Sophy's output (potentially depending on race position, section of the track, etc.) in order to simulate imperfection or driving errors, since what might get annoying in the long run is an AI that makes no mistakes whatsoever. I actually quite appreciate the vanilla AI in heavy rain at Le Mans, when some (but not all) cars slide off the track. Once Sophy can handle weather, I'd hope it's not 100% perfect all the time.
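A minimal sketch of that idea (the noise level and delay are arbitrary, and this is just my speculation, not anything Sony has announced):

```python
import random
from collections import deque

def humanize(actions, noise=0.02, delay_ticks=1, seed=42):
    # Post-process a "perfect" policy's outputs: add a little steering
    # noise and a fixed reaction delay. Both knobs could be scaled by
    # race position, track section, etc., as suggested above.
    rng = random.Random(seed)
    buffer = deque([(0.0, 0.0)] * delay_ticks)  # stale (throttle, steer) pairs
    out = []
    for throttle, steer in actions:
        buffer.append((throttle, steer + rng.gauss(0.0, noise)))
        out.append(buffer.popleft())            # act on a slightly old input
    return out

print(humanize([(1.0, 0.00), (1.0, 0.05), (0.8, 0.10)]))
```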
 