(no title)
HDMI_Cable | 6 months ago
I wonder if the author has ever considered reaching out to the makers of Anki decks used by premeds and medical students, like the AnKing [1]. They create Anki decks for people studying for the MCAT and various med school curricula, so they a) have relatively stable deck content (which is very well annotated and contains lots of keywords that would make semantic grouping quite easy), b) probably have loads of statistics on user reviews (since they have an Anki add-on that sends telemetry to their team to make the decks better, IIRC), and c) cover incredibly disparate information (all the way from high-school physics to neurochemistry).
---
ran3000 | 6 months ago
HDMI_Cable | 6 months ago
Also, having used those decks in the past, downloaded the add-on, and looked at the monetization structure of developers like the AnKing, I would be very surprised if aggregate data on review statistics wasn't collected in some way. I.e., if the AnKing is already collecting this data to design better decks and understand which cards are the hardest (probably to target individual support), then I imagine that collecting some anonymized version of that data wouldn't be too much of a stretch.
Plus, considering that the developers of AnKing-style decks are all doctors, they probably have a pretty good grasp of handling PII and could (hopefully) make pretty sound decisions on whether to give you access :)