badrunaway|1 year ago
The process could look like this:
1. Use existing large models to convert the dataset they were trained on into formal logical relationships, letting them generate multiple candidate solutions.
2. Take this enriched dataset and train a new LLM that outputs not only the next token but also the formal relationships between previous knowledge and the newly generated text.
3. The network can optimize its weights until the generated formal code scores high accuracy on a proof checker, alongside the token-generation accuracy objective.
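The combined objective in step 3 could be sketched roughly as below. This is a minimal illustration, not a real training loop: `proof_checker` and the proof representation are hypothetical stand-ins, and `alpha` is an assumed blending weight.

```python
# Sketch of a combined objective: standard next-token loss blended with a
# penalty for formal statements that fail a proof checker. All names here
# (proof_checker, the dict-based proofs) are illustrative placeholders.

def combined_loss(token_loss, proofs, proof_checker, alpha=0.5):
    """Blend language-modeling loss with a proof-verification error term.

    token_loss    -- cross-entropy on next-token prediction (a float here)
    proofs        -- formal statements emitted alongside the generated text
    proof_checker -- callable returning True if a statement verifies
    alpha         -- assumed weight on the formal-verification term
    """
    if not proofs:
        return token_loss
    verified = sum(1 for p in proofs if proof_checker(p))
    proof_error = 1.0 - verified / len(proofs)  # fraction failing to verify
    return (1 - alpha) * token_loss + alpha * proof_error

# Toy checker: accepts statements tagged as valid.
checker = lambda p: p.get("valid", False)
loss = combined_loss(
    token_loss=0.8,
    proofs=[{"valid": True}, {"valid": False}],
    proof_checker=checker,
)
```

In a real system the proof-error term would have to be made differentiable (or optimized via RL), since a proof checker's verdict is discrete.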
In my own mind I feel language is secondary - it's not the base of my intelligence. The base feels more like a dreamy simulation where things are consistent with each other, and language is just what I use to describe it.
randcraw|1 year ago
How do we extend LLMs with mechanisms for reasoning, causality, etc. (Type 2 thinking)? However that is eventually done, the implementation must remain probabilistic, informal, and bottom-up. Manual human curation of logical and semantic relations into knowledge models has proven _not_ to be sufficiently scalable or anti-brittle to do what's needed.
visarga|1 year ago
We could just use RAG to create a new dataset. Take each known concept or named entity, search for it in the training set (1), search for it on the web (2), and generate it with a bunch of models in closed-book mode (3).
Now you have three sets of text; put all of them in a prompt and ask for a Wikipedia-style article. If the topic is controversial, note the controversy and the distribution of opinions. If it is settled, note that too.
By contrasting web search with closed-book material we can detect biases in the model and gaps in its knowledge or skills. If something doesn't appear in the training set, you know what is needed in the next iteration. This approach combines self-testing with topic-focused research to integrate information scattered across many sources.
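The three-source comparison could be sketched as below. Everything here is a placeholder: `search_training_set`, `web_search`, and `closed_book_generate` stand in for real retrieval and generation components, and the toy data is invented for illustration.

```python
# Sketch of the three-source "machine study" flow: gather text about a
# concept from (1) the training corpus, (2) the web, and (3) the model's
# closed-book recall, then contrast the sets to find gaps and blind spots.

def gather_sources(concept, search_training_set, web_search, closed_book_generate):
    """Collect the three text sets for one concept or named entity."""
    return {
        "training_set": search_training_set(concept),  # (1) corpus search
        "web": web_search(concept),                    # (2) web search
        "closed_book": closed_book_generate(concept),  # (3) model recall
    }

def detect_gaps(sources):
    """Flag facts found on the web but absent from the training data or
    from closed-book recall -- candidates for the next training iteration."""
    web = set(sources["web"])
    return {
        "missing_from_training_set": web - set(sources["training_set"]),
        "model_blind_spots": web - set(sources["closed_book"]),
    }

# Toy run with canned results standing in for real retrieval.
sources = gather_sources(
    "transformer",
    search_training_set=lambda c: ["attention", "encoder"],
    web_search=lambda c: ["attention", "encoder", "flash-attention"],
    closed_book_generate=lambda c: ["attention"],
)
gaps = detect_gaps(sources)
```

In practice the "facts" would be passages or claims rather than bare strings, and the contrast step would itself be done by an LLM prompted with all three sets.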
I think of this approach as "machine study", where AI models interact with the text corpus to synthesize new examples, doing a kind of "review paper" or "wiki" reporting. This can be scaled to billions of articles, producing a 1000x larger AI Wikipedia.
Interacting with search engines is just one way to create data with LLMs. Interacting with code execution and with humans are two more. Human-AI interaction alone generates over one billion sessions per month, where LLM outputs meet implicit human feedback. Now that most organic sources of text have been used, LLMs will learn from feedback, task outcomes and corpus study.
badrunaway|1 year ago
verdverm|1 year ago
PaulHoule|1 year ago
biomcgary|1 year ago
badrunaway|1 year ago
yard2010|1 year ago
badrunaway|1 year ago
My take: hallucinations can never be reduced to exactly zero, but they can be reduced to the point where, 99.99% of the time, these systems hallucinate less than humans do, and more often than not their divergences turn out to be creative thought experiments (what I'd call healthy imagination). If it hallucinates less than a top human does, I say we win :)
jonnycat|1 year ago
wizardforhire|1 year ago
qrios|1 year ago
[1] https://cyc.com
[2] https://github.com/asanchez75/opencyc
badrunaway|1 year ago
lossolo|1 year ago
badrunaway|1 year ago
I was not actually aware that building a KG from text is an NP-hard problem. I will check it out. I thought it was a time-consuming problem when done manually without LLMs, but I didn't think it was THAT hard. Hence I was trying to introduce an LLM into the flow. Thanks, I will read more about all this!
lmeyerov|1 year ago
E.g., KGs (RDF, PGs, ...) are logical, but automated construction is not semantic in the sense of the ground domain of NLP, and manual construction yields only a tiny ontology. Conversely, fancy powerful logics like modal ones are even less semantic in NLP domains. Code is more expressive, but brings its own issues.
badrunaway|1 year ago
slashdave|1 year ago
Why should you believe the output of the LLM just because it is formatted a certain way (i.e. "formal logical relationships")?