I made this tool to prepare for exams more efficiently.
It is written in Rust and uses OpenCV to detect and match features in the slides (provided as a PDF) and the video frames. The matching algorithm is described in [1].
The PDF pages can appear anywhere in the video frames, in any order. Rotation, translation, scaling and obstruction are fully supported because the feature extractor (ORB) is rotation-, scale- and translation-invariant.
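To make the feature matching step more concrete, here is a minimal sketch in plain Rust, not the tool's actual code (see [1] for that). ORB descriptors are 256-bit binary strings that are compared by Hamming distance. The names `Descriptor`, `hamming` and `match_descriptors`, and the nearest/second-nearest ratio test, are assumptions made for illustration.

```rust
/// A 256-bit binary descriptor (32 bytes), the size ORB produces.
type Descriptor = [u8; 32];

/// Number of differing bits between two descriptors.
fn hamming(a: &Descriptor, b: &Descriptor) -> u32 {
    a.iter().zip(b.iter()).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Hypothetical brute-force matcher: for each slide descriptor, find the
/// nearest frame descriptor and keep the pair only if it is clearly closer
/// than the second-nearest one (ratio test). If the frame has a single
/// descriptor there is nothing to compare against, so the match is kept.
/// Returns (slide_index, frame_index) pairs.
fn match_descriptors(
    slide: &[Descriptor],
    frame: &[Descriptor],
    ratio: f32,
) -> Vec<(usize, usize)> {
    let mut matches = Vec::new();
    for (i, d) in slide.iter().enumerate() {
        let mut best: Option<(usize, u32)> = None;
        let mut second: Option<u32> = None;
        for (j, f) in frame.iter().enumerate() {
            let dist = hamming(d, f);
            match best {
                // Not better than the current best: maybe a new runner-up.
                Some((_, b)) if dist >= b => {
                    if second.map_or(true, |s| dist < s) {
                        second = Some(dist);
                    }
                }
                // New best: the old best becomes the runner-up.
                _ => {
                    second = best.map(|(_, b)| b);
                    best = Some((j, dist));
                }
            }
        }
        if let Some((j, b)) = best {
            let unambiguous = match second {
                Some(s) => (b as f32) < ratio * (s as f32),
                None => true,
            };
            if unambiguous {
                matches.push((i, j));
            }
        }
    }
    matches
}
```

A ratio around 0.75 is a common starting point. A slide would then count as "seen" in a frame when enough of its descriptors survive this filter, with the threshold being another tunable assumption.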
There are many old videos on media.ccc.de of very low quality, shot on standard-definition camcorders, with no slide capture or at most grainy VGA-captured slides. Might it be possible to re-render those in FHD or even 4K, with everything upscaled and the projector image replaced? It would need some additional magic to recognize laser pointers, mouse pointers and the presenter walking in front of the screen, but it might be worth it for some of the more timeless talks.
The tool does not work very well on very low-quality videos (maybe ORB is not the best feature extractor for that). In theory, however, it should work: the tool already calculates a projection of the PDF slide into the video frame to compare the similarity, so it should be relatively easy to dump all those updated frames into a new video.
Detecting laser/mouse pointers is a different story though.
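As a rough illustration of what projecting a slide into a frame involves, the sketch below assumes the projection is a 3x3 homography, which is the usual model for a flat slide seen by a camera; the thread does not say which representation the tool uses. The names `Homography`, `project` and `project_slide_corners` are hypothetical.

```rust
/// Row-major 3x3 homography mapping slide coordinates to frame coordinates.
/// (Assumed representation; the tool itself may store this differently.)
type Homography = [[f64; 3]; 3];

/// Apply `h` to a point given in slide coordinates.
/// Returns None if the point maps to infinity (homogeneous w is ~0).
fn project(h: &Homography, (x, y): (f64, f64)) -> Option<(f64, f64)> {
    let w = h[2][0] * x + h[2][1] * y + h[2][2];
    if w.abs() < 1e-12 {
        return None;
    }
    let u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w;
    let v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w;
    Some((u, v))
}

/// Project the four corners of a `width` x `height` slide into the frame,
/// in the order top-left, top-right, bottom-right, bottom-left.
fn project_slide_corners(
    h: &Homography,
    width: f64,
    height: f64,
) -> Option<[(f64, f64); 4]> {
    Some([
        project(h, (0.0, 0.0))?,
        project(h, (width, 0.0))?,
        project(h, (width, height))?,
        project(h, (0.0, height))?,
    ])
}
```

The resulting quadrilateral is the region of the frame that a re-rendering step would overwrite with the clean slide. Pointers or a presenter inside that region would be painted over, which is why detecting them is the harder part.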
Great idea! I have watched some lectures/talks online with terrible camera work (e.g. panning to the speaker when they are asking the audience to read the slide).
Another solution I've seen was to play a simultaneous video feed of the slides only. Not particularly efficient. Would love to see something like this instead for instructional content!
Still, you don't have the reverse mapping. Given a slide, how do you find the commentary?
The primary goal of this tool is to find the reverse mapping and to provide a simple tool to play the video from a given slide. This is very useful for going back to or skipping slides.
If the slide is not visible at all in the video, such mapping cannot be computed automatically.
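The reverse mapping can be pictured with a small sketch. It assumes the matching step has already produced, for each sampled frame, a timestamp and the index of the slide recognized in it (or nothing). The function name `slide_start_times` and the input shape are hypothetical and not taken from the tool.

```rust
use std::collections::BTreeMap;

/// Given per-frame results `(timestamp_seconds, matched_slide)` in
/// chronological order, collect for each slide the timestamps at which it
/// starts being shown. A slide that is never matched gets no entry, so no
/// jump target can be offered for it.
fn slide_start_times(
    frames: &[(f64, Option<usize>)],
) -> BTreeMap<usize, Vec<f64>> {
    let mut starts: BTreeMap<usize, Vec<f64>> = BTreeMap::new();
    let mut current: Option<usize> = None;
    for &(t, slide) in frames {
        // Only record a timestamp when the visible slide changes.
        if slide != current {
            if let Some(s) = slide {
                starts.entry(s).or_default().push(t);
            }
            current = slide;
        }
    }
    starts
}
```

A player could then seek to the first (or nearest) start time of the selected slide. Keeping every start time rather than only the first one handles speakers who jump back to an earlier slide.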
This is awesome! I can imagine something like this in the text-based video editor we are developing. One usage could be automatic frame alignment for moving objects.
Gehinnn | 4 years ago
[1] https://github.com/hediet/slideo/blob/master/BACKGROUND.md
summm | 4 years ago
Gehinnn | 4 years ago
natemo | 4 years ago
Gehinnn | 4 years ago
Michael_Sieb | 4 years ago
Gehinnn | 4 years ago
kburman | 4 years ago
Gehinnn | 4 years ago
https://github.com/hediet/slideo/blob/master/webview/src/vie...