By analyzing a video or picture (or whatever other sensor data you give it). It is a pretty straightforward classification problem, and image classification has been a major part of ML research for decades.
It is a reference to specific testing of general models. Specific models can probably do this pretty easily, but general vision models acting on video still struggle. Not sure why this is controversial or downvoted. Go and test it for yourself.
lazide|1 year ago
gizmo686|1 year ago
hndamien|1 year ago
KTibow|1 year ago
hndamien|1 year ago