top | item 44626577

ryandamm | 7 months ago

Genuine question for AI engineers or self-driving-car people: is the Tesla approach of only using cameras inherently flawed? I've read that the AI is hooked up directly to the cameras, with no explicit intermediate 3D representation... everything is done in latent space. If true, this seems inherently hard to improve: you can throw more data at it, but you can't necessarily understand how and why it fails when it does. That seems... non-optimal for a safety-critical system like a self-driving car.

maxlin | 7 months ago

There are plenty of visualizations of their intermediate voxel representation. It's hardly worse than LIDAR for the task, but without all of LIDAR's downsides.
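For concreteness, here is a minimal sketch of what an "explicit intermediate 3D representation" of the kind being debated could look like: per-pixel depth estimates from a camera back-projected into a voxel occupancy grid that a downstream planner could consume. This is purely illustrative and not Tesla's actual pipeline; the camera intrinsics, voxel size, and grid dimensions are all assumed.

```python
import numpy as np

def depth_to_occupancy(depth, fx, fy, cx, cy,
                       voxel_size=0.5, grid_shape=(40, 40, 12)):
    """Back-project a depth map into a boolean voxel occupancy grid.

    depth: (H, W) array of metric depths along the camera z-axis
           (e.g. from a learned monocular depth estimator).
    fx, fy, cx, cy: pinhole camera intrinsics (assumed known).
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx          # standard pinhole back-projection
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)

    # Center the grid on the camera in x/y; z starts at the camera plane.
    offset = np.array([grid_shape[0] / 2, grid_shape[1] / 2, 0.0])
    idx = np.floor(pts / voxel_size + offset).astype(int)

    # Mark voxels hit by at least one back-projected point as occupied.
    grid = np.zeros(grid_shape, dtype=bool)
    valid = np.all((idx >= 0) & (idx < np.array(grid_shape)), axis=1)
    grid[tuple(idx[valid].T)] = True
    return grid

# A flat wall 5 m in front of the camera should occupy the z-slice
# at index 5 m / 0.5 m = 10, and nothing nearer.
depth = np.full((120, 160), 5.0)
grid = depth_to_occupancy(depth, fx=100, fy=100, cx=80, cy=60)
print(grid[:, :, 10].any())  # True: the wall shows up at z index 10
```

The point of such an intermediate stage is exactly the one raised upthread: a voxel grid can be inspected and visualized when the system fails, whereas a purely latent representation cannot.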

ryandamm | 7 months ago

Do you know that an actual voxel representation exists upstream of the self-driving AI? I was under the impression (from conversations with engineers who might know, admittedly) that there was no explicit 3D representation feeding into the self-driving module, and that the AI was operating directly on pixels. I would be relieved to hear there's an explicit 3D solve before that step... if accurate. Obviously there are 3D views on the dash, but my understanding is that those are not an input to the full self-driving solve. Again, though, it's hearsay (hence my question).