haditab | 6 years ago

I acknowledge the issues in the dataset, and that it has a lot of stars on GitHub because it's from Udacity. But calling it 'a popular self-driving car dataset' is misleading: it implies the dataset is widely used for self-driving car work, when in fact it is only a small dataset Udacity uses to teach the basics of training neural networks for self-driving cars.

I've been involved in the autonomous vehicle industry for a while and have been focused on perception for most of it. Most research papers test their models on popular self-driving car datasets and report the results as a sort of benchmark. I've never seen this dataset mentioned anywhere. Heck, the size of the dataset is an order of magnitude smaller than most of the popular ones as well.

This is just a GitHub repo. That's it.

thanatropism|6 years ago

Are these larger datasets routinely subjected to the same kind of inspection as this titanic.csv of self-driving car datasets?

yeldarb|6 years ago

I hope so. I've personally tested Scale's labeling service and it was much higher quality than this dataset. But it's a pretty secretive industry so I'd bet some companies' data is better than others.

It'd be interesting if the NHTSA had a held-back "test set" they used to evaluate self-driving cars before letting them on the road.