One of the authors here. It's a somewhat nuanced answer. In principle, I think a classical controller would have been fine here, and if you read the paper (it might be in one of the other papers), we do benchmark a bunch of them. But what's really nice about RL is what it does to the workflow. We can add a sensor, drop a sensor, change the dynamics of the system, and have a functional controller the next day. It trades compute for control engineer time.
On a secondary, smaller point, the dynamics of the cruise control cars are an unpleasant switched system, and there's a lot of partial observability: we never fully sense the traffic state, and we didn't even have direct measurements of the distance to the car in front. The individual car control decisions are also coupled to macroscopic effects on the system, i.e., since all the cars have the same policy, their decisions actually affect the traffic flow. So it's not a trivial control design problem at all.
evinitsky|11 months ago