Maybe it's possible to feed everything into a model that can identify the situation or context in audio or video and block out a section because it's an ad. We would not be short of training material.
Latency would have to be low enough to be attractive to users.