top | item 34606480


bitforger | 3 years ago

Pretty cool.

I once worked on AI Dungeon and we had a similar idea to parse the story so far into a graph, so that we could manage long-term memory outside of the context window (which was only 2048 tokens).

Coreference is hard. ("he took the sword"... who is he?) Updating the graph is also hard. (As the story progresses, new facts contradict old facts. Jenny was dating Tom, but now she's dating Mike.)

And knowing what to do with the knowledge graph is hard too, especially if you don't know the schema up front. The only thing we could think to use it for was... programmatically turning relevant sections back into text and prepending it to the context window. (There were easier ways to get a similar effect.)
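To make the idea concrete, here's a minimal sketch of what we meant by "turning relevant sections back into text": story facts go into a triple store, and the triples mentioning the entities in play are rendered back into plain sentences prepended to the prompt. All names (`StoryGraph`, `render`, etc.) are made up for illustration, not our actual code.

```python
class StoryGraph:
    """Toy knowledge graph: story facts as (subject, relation, object) triples."""

    def __init__(self):
        self.triples = []  # newest facts last

    def add(self, subject, relation, obj):
        self.triples.append((subject, relation, obj))

    def about(self, entity):
        """All facts that mention an entity, as subject or object."""
        return [t for t in self.triples if entity in (t[0], t[2])]

    def render(self, entities):
        """Turn relevant triples back into text to prepend to the context window."""
        seen, lines = set(), []
        for e in entities:
            for s, r, o in self.about(e):
                line = f"{s} {r} {o}."
                if line not in seen:  # dedupe, keep insertion order
                    seen.add(line)
                    lines.append(line)
        return "\n".join(lines)

g = StoryGraph()
g.add("Jenny", "is dating", "Tom")
g.add("Jenny", "lives in", "Riverton")
memory = g.render(["Jenny"])
prompt = memory + "\n\n" + "Jenny walked into the tavern..."
```

The hard parts from above don't show up in a sketch this small: deciding which entities are "relevant", and what to do when a newer `add` contradicts an older one.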



Agentlien | 3 years ago

It's really fascinating hearing about this and what the issues were. I have played a lot of AI Dungeon on and off, and this always felt like part of what was missing: some way for it to keep a structured view of the story to help consistency. The biggest problem has always been that it keeps contradicting itself or losing track of the plot. It's gotten a bit better with the manageable context being fed back each step, but it's still not nearly good enough.

varunshenoy | 3 years ago

Handling state (especially long-term) is really a struggle for LLMs right now. This issue should become easier to work with as context windows scale up in the next couple years (or months, who knows!).

machiaweliczny | 3 years ago

Google scaled context to 40K tokens.

Dwolb | 3 years ago

The new facts contradicting old facts thing is fascinating to me.

Why can’t graphs properly model time or sequences?

yorwba | 3 years ago

It's possible to model time by annotating facts in the database with a timestamp (Wikidata has this, as well as qualifiers for e.g. the source of a statement, or that it applies within a restricted context), but you still need to somehow integrate the information if you want to know the state right now. E.g. if you have (Jenny, date, Tom) from a year ago and (Jenny, date, Mike) from yesterday, does that mean (Jenny, date, Tom) is no longer valid? Or are both simultaneously true? Or is (Jenny, date, Mike) invalid too, because yesterday was like ages ago?

You could have some heuristics to handle this and then you add another relation "has met" and suddenly you need a whole new set of heuristics.
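A toy version of one such heuristic, just to show where it breaks down: tag each fact with a timestamp, declare some relations "functional" (at most one object at a time), and let the newest fact win for those while everything else accumulates. The `Fact` class and `FUNCTIONAL` set are invented for this sketch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fact:
    subject: str
    relation: str
    obj: str
    timestamp: int  # story step at which the fact was asserted

# Assumption: these relations hold for at most one object at a time.
FUNCTIONAL = {"date"}

def current_state(facts):
    """'Most recent wins' for functional relations; other facts never expire."""
    latest = {}   # (subject, relation) -> newest Fact
    extras = []   # non-functional facts, kept as-is
    for f in sorted(facts, key=lambda f: f.timestamp):
        if f.relation in FUNCTIONAL:
            latest[(f.subject, f.relation)] = f  # newer overwrites older
        else:
            extras.append(f)
    return list(latest.values()) + extras

facts = [
    Fact("Jenny", "date", "Tom", timestamp=1),
    Fact("Jenny", "date", "Mike", timestamp=10),
    Fact("Jenny", "has met", "Tom", timestamp=1),
]
resolved = current_state(facts)
```

Here (Jenny, date, Tom) gets retracted while (Jenny, has met, Tom) survives, which is the behavior you want, but only because "has met" was hand-classified as non-functional. Each new relation forces another such judgment call, which is exactly the problem.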

visualphoenix | 3 years ago

Cool story! Feeding context back into the 0 shot is the hotness. I’ve had a lot of success with that.

Curious what other (easier) ways you found to accomplish the same effect?

groestl | 3 years ago

> programmatically turning relevant sections back into text

I can't help but think, is this the voice in our heads?