ran3000 | 7 months ago
I believe this technical shift in how SRS models the student's memory won't just improve scheduling accuracy but, more critically, will unlock better product UX and new types of SRS.
IncreasePosts | 7 months ago
I have a script for it, but am basically waiting until I can run a powerful enough LLM locally to chug through it with good results.
Basically like the knowledge tree you mention towards the end, but it attempts to create a knowledge DAG by asking an LLM "does card (A) imply knowledge of card (B), or vice versa?". Then it takes that DAG and uses it to schedule the cards in a breadth-first ordering. So, when reviewing a new deck with a lot of new cards, I'll be sure to get questions like "what was the primary cause of the Civil War" before I get questions like "who was the Confederate general who fought at Bull Run".
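Roughly, the shape of it in Python (a simplified sketch, not the actual script; ask_llm() is a stand-in yes/no oracle for whatever local model call you'd use, and note the pairwise querying costs O(n^2) LLM calls for n cards):

    from collections import defaultdict, deque

    def build_knowledge_dag(cards, ask_llm):
        """cards: list of card fronts (strings); ask_llm: yes/no oracle (stand-in)."""
        children = defaultdict(list)       # fundamental card -> cards that build on it
        indegree = {card: 0 for card in cards}
        for i, a in enumerate(cards):
            for j, b in enumerate(cards):
                if i == j:
                    continue
                # If knowing a's answer implies knowing b, then b is the more
                # fundamental card and should be scheduled first.
                if ask_llm(f"Does knowing '{a}' imply knowledge of '{b}'? yes/no"):
                    children[b].append(a)
                    indegree[a] += 1
        return children, indegree

    def breadth_first_order(cards, children, indegree):
        """Kahn-style BFS so broad cards surface before detail cards."""
        queue = deque(c for c in cards if indegree[c] == 0)
        order = []
        while queue:
            card = queue.popleft()
            order.append(card)
            for nxt in children[card]:
                indegree[nxt] -= 1
                if indegree[nxt] == 0:
                    queue.append(nxt)
        return order  # cards caught in cycles never reach indegree 0

Inconsistent LLM answers can create cycles, so anything left out of the BFS order needs a fallback (e.g. append in original deck order).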
ran3000 | 7 months ago
What I like about your approach is that it circumvents the data problem. You don't need a dataset with review histories and flashcard content in order to train a model.
gwd | 6 months ago
I've got a system for learning languages that does some of the things you mention. The goal is to recommend content for a user to read that combines 1) an appropriate level of difficulty with 2) usefulness for learning. The idea is to have the SRS built into the reading experience, so you just sit and read what it gives you, and reviewing old words and learning new words (according to frequency) happens automatically.
Separating the recall model from the teaching model, as you say, opens up loads of possibilities.
Brief introduction:
1. Identify "language building blocks" for a language; this includes not just pure vocabulary, but the grammar concepts, inflected forms of words, and can even include graphemes and what-not.
2. For each building block, assign a value -- normally this is the frequency of the building block within the corpus.
3. Get a corpus of selections to study. Tag them with the language building blocks. This is similar to Math Academy's approach, but while they have hundreds of math concepts, I have tens of thousands of building blocks.
3. Use a model to estimate the current difficulty of each word. (I'm using "difficulty" here as the inverse of "retrievability", for reasons that will be clear later.)
4. Estimate the delta of difficulty of each building block after being viewed. Multiply this delta by the word value to get the study value of that word.
5. For each selection, calculate the total difficulty, average difficulty, and total study value. (This is why I use "difficulty" rather than "retrievability", so that I can calculate total cognitive load of a selection.)
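In sketch form (Python, with invented names -- one plausible shape, not the actual code):

    from dataclasses import dataclass, field

    @dataclass
    class BuildingBlock:
        name: str
        value: float        # step 2: e.g. corpus frequency
        difficulty: float   # step 4: modeled as 1 - retrievability

    def study_value(block, difficulty_after_view):
        # step 5: expected drop in difficulty from one exposure,
        # weighted by how valuable the block is
        delta = block.difficulty - difficulty_after_view
        return delta * block.value

    @dataclass
    class Selection:
        blocks: list = field(default_factory=list)  # step 3: tagged blocks

        def totals(self, predict_after_view):
            # step 6; assumes at least one tagged block per selection.
            # predict_after_view is the step-5 model: block -> new difficulty.
            total_difficulty = sum(b.difficulty for b in self.blocks)
            avg_difficulty = total_difficulty / len(self.blocks)
            total_study_value = sum(
                study_value(b, predict_after_view(b)) for b in self.blocks)
            return total_difficulty, avg_difficulty, total_study_value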
Now the teaching algorithm has a lot of things it can do. It can calculate a selection score which balances study value, difficulty, and repetitiveness. It can take the word with the highest study value and then look for words with that word in it. It can take a specific selection that you want to read or listen to, find the most important word in that selection, and then look for things to study which reinforce that word.
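One plausible scoring function (again invented, not the actual formula): reward study value, penalize distance from a target cognitive load, and penalize recently repeated selections:

    def selection_score(total_study_value, total_difficulty, times_seen_recently,
                        target_load=5.0, load_weight=0.5, repeat_weight=1.0):
        # higher is better; the weights and target load are tuning knobs
        load_penalty = load_weight * abs(total_difficulty - target_load)
        repeat_penalty = repeat_weight * times_seen_recently
        return total_study_value - load_penalty - repeat_penalty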
You mentioned computational complexity -- calculating all this from scratch is certainly expensive, but the key observation is that each time you study something, only a handful of values change. This makes it possible to keep everything up to date very efficiently using incremental computation [1].
But that does make the code quite complicated.
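The core trick, sketched under the assumption of an inverted index from building block to the selections containing it (reusing the BuildingBlock shape from above):

    from collections import defaultdict

    block_to_selections = defaultdict(set)  # built once while tagging the corpus

    def on_study(block, new_difficulty, cached_total_difficulty):
        """Propagate one block's difficulty change only to the selections
        that contain it, instead of recomputing the whole corpus."""
        delta = new_difficulty - block.difficulty
        block.difficulty = new_difficulty
        for sel_id in block_to_selections[block.name]:
            cached_total_difficulty[sel_id] += delta  # O(1) per affected selection
        # cached study values and selection scores update analogously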
[1] https://en.wikipedia.org/wiki/Incremental_computing
ran3000 | 6 months ago
How far along are you in developing the system?