
Loki: An open-source tool for fact verification

238 points| Xudong | 1 year ago |github.com

68 comments


axegon_|1 year ago

Overall a great idea; I'll definitely be checking back on it in the future. A few things that hit me out of the box:

* The idea behind using Serper is great, but it would be cool if other search engines/data sources could be used instead, e.g. Kagi or some private search engine/data. Reason for the latter: there are tons of people sourcing all sorts of information that will not immediately show up on Google, and some never will. For context: I have roughly 60GB (and growing) of cleaned news articles, each with its source, with a good amount of pre-processing done on the fly (I collect those all the time).

* Relying heavily on OpenAI. Yes, OpenAI is great, but there's always the thing at the back of our minds: "where are all those queries going, and do we trust that shit won't hit the fan some day?" It would be nice to have the ability to use a local LLM, given how many good ones are around.

* The installation can be improved massively: setuptools + entry_points + console_scripts to avoid all the hassle of managing dependencies, where your scripts are located and all that. The cp factcheck/config/secret_dict.template factcheck/config/secret_dict.py step is a bit... uuuugh... pydantic[dotenv] + .env? That would also make containerizing the application so much easier.
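To illustrate the .env-based configuration axegon_ suggests, here is a minimal stdlib-only sketch. In practice pydantic's BaseSettings would add type validation on top of the same idea; the key and file names below are hypothetical, not Loki's actual configuration.

```python
import os

def load_dotenv(path=".env"):
    """Parse simple KEY=VALUE lines into os.environ (no quoting rules)."""
    if not os.path.exists(path):
        return
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            # Skip blanks, comments, and malformed lines.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

def get_secret(name, default=None):
    # Secrets come from the environment instead of a copied secret_dict.py.
    return os.environ.get(name, default)
```

With this pattern, `load_dotenv()` at startup replaces the `cp ... secret_dict.py` step, and the same mechanism works unchanged inside a container via `--env-file`.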

Xudong|1 year ago

Thank you for your suggestions, axegon!!! We will definitely consider them and add these features in an upcoming version.

Regarding the first point, we are currently working on enabling customized evidence retrieval, including local files. Our plan is to integrate existing tools like LlamaIndex. Any suggestion is greatly appreciated!

Regarding the second point, we have found OpenAI's JSON mode to be greatly helpful, and have optimized our prompts to fully utilize these advances. However, we agree that it would be beneficial to enable the use of other models. As promised, we will add this feature soon.
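The benefit of JSON mode mentioned above is that the model's reply can be parsed directly instead of scraped out of free prose. The request shape below follows OpenAI's chat completions API (`response_format={"type": "json_object"}`); the verdict field names are hypothetical, not Loki's actual schema.

```python
import json

def build_request(claim):
    # JSON mode requires that the prompt itself mention JSON output.
    return {
        "model": "gpt-4-turbo",
        "response_format": {"type": "json_object"},
        "messages": [
            {"role": "system",
             "content": "Reply in JSON with keys 'verdict' and 'reason'."},
            {"role": "user", "content": f"Is this claim factual? {claim}"},
        ],
    }

def parse_verdict(reply_text):
    # With JSON mode enabled, this parse won't fail on conversational filler.
    data = json.loads(reply_text)
    return data["verdict"], data["reason"]
```

Supporting another provider then mostly means reproducing this structured-output guarantee, e.g. via constrained decoding on a local model.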

Lastly, we appreciate your suggestion and will work on improving the installation process for the next version.

xyst|1 year ago

I fully expect some sort of enshittification of openai at some point.

swores|1 year ago

Feedback on the example gif: at the moment it's almost comically useless. First you're bored watching the first 90% while commands are slowly typed, and then the bit that's actually interesting and worth reading scrolls by too fast and resets to the beginning of the gif before there's a chance to read it.

Xudong|1 year ago

Thanks for your feedback on the gif figure, swores! We will revise it soon.

eMPee584|1 year ago

mpv ftw: playback speed control even for gifs..

martinbaun|1 year ago

Maybe the name is not so fitting, as Loki is a figure in Norse mythology known for deceiving and lying, which is basically the opposite of what you're trying to do :)

smoyer|1 year ago

It's also the name of a well-known open-source log aggregation system that's part of the LGTM stack (predominantly led by Grafana Labs).

croes|1 year ago

Maybe it's on purpose.

Who could know the patterns of liars better than the god of lying?

vinni2|1 year ago

It’s a bit misleading to call it an open-source tool when it relies on proprietary LLMs for everything.

btbuildem|1 year ago

Presumably the LLMs are swappable -- today the proprietary ones are very powerful and accessible, but the landscape may yet change.
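The swappability btbuildem describes usually comes down to hiding the provider behind one small interface, so OpenAI or a local model can be dropped in interchangeably. A minimal sketch, with hypothetical class and method names:

```python
from abc import ABC, abstractmethod

class LLMBackend(ABC):
    """One narrow seam between the pipeline and whichever model serves it."""
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class EchoBackend(LLMBackend):
    # Stand-in "model" for testing the plumbing without any API key.
    def complete(self, prompt):
        return f"echo: {prompt}"

def check_claim(claim, backend: LLMBackend):
    # The pipeline only ever talks to the interface, never to a vendor SDK.
    return backend.complete(f"Verify: {claim}")
```

An OpenAIBackend or a local-llama backend would then be one small adapter class each, leaving the rest of the pipeline untouched.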

rjb7731|1 year ago

Isn't this similar to the DeepMind paper on long-form factuality posted a few days ago?

https://arxiv.org/abs/2403.18802

https://github.com/google-deepmind/long-form-factuality/tree...

Xudong|1 year ago

Yes, they are similar. Actually, our initial paper was released around five months ago (https://arxiv.org/abs/2311.09000). Unfortunately, our paper isn't cited by the DeepMind paper; you may see this discussion as an example: https://x.com/gregd_nlp/status/1773453723655696431

Compared with our initial version, we have mainly focused on efficiency, achieving a 10x faster checking process without decreasing accuracy.

RcouF1uZ4gsC|1 year ago

> This tool is especially useful for journalists, researchers, and anyone interested in the factuality of information.

Sorry, I think an individual who is not aware of reliable sources to verify information, and who is not familiar enough with LLMs to come up with appropriate prompts and judge the output, should be the last person presenting themselves as a judge of factual information.

Xudong|1 year ago

Thanks for your response. When discussing fact-checking capabilities, the key question is always: Can we guarantee that it will always offer the correct justification? While it's unfortunate, errors can occur. Nonetheless, we prioritize making the checking process both interpretable and transparent, allowing users to understand and trust the rationale behind each assessment.

We present the results at each step to help users understand the decision process, which can be seen from our screenshot at https://raw.githubusercontent.com/Libr-AI/OpenFactVerificati...

We will try our best to ensure this tool makes a positive difference.

chamomeal|1 year ago

Very cool! I’ve toyed with an idea like this for a while. The scraping is a cool extra feature, but tbh just breaking down text into verifiable claims and setting up the logic tokens is way cooler.

I imagine somebody feeding a live presidential debate into this. Could be a great tool for fact checking

Xudong|1 year ago

ahah thanks!

dscottboggs|1 year ago

That seems like something unlikely to be automated well, and not something that at least current-gen AI is capable of.

Does it...work?

Xudong|1 year ago

Hi there, I agree that fact-checking is not something that current generative AI models can directly solve. Therefore, we decompose this complex task into five simpler steps, which current techniques can better handle. Please refer to https://github.com/Libr-AI/OpenFactVerification?tab=readme-o... for more details.

However, errors can always occur. We try to help users in an interpretable and transparent way by showing all retrieved evidence and the rationale behind each assessment. We hope this could at least help people when dealing with such problems.
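The decomposition idea above can be sketched as a toy pipeline: break the hard end-to-end task into smaller steps that each admit a simpler solution. The step names and stand-in heuristics below are illustrative only, not Loki's actual implementation.

```python
def decompose(text):
    # 1. Split a passage into atomic claims (stand-in: sentence split).
    return [s.strip() for s in text.split(".") if s.strip()]

def is_checkworthy(claim):
    # 2. Filter out opinions and non-factual statements (toy heuristic).
    return not claim.lower().startswith("i think")

def retrieve_evidence(claim):
    # 3./4. Generate search queries and crawl evidence (stubbed here).
    return [f"evidence for: {claim}"]

def verify(claim, evidence):
    # 5. Judge the claim against the evidence (stubbed verdict).
    return {"claim": claim, "evidence": evidence, "verdict": "unverified"}

def fact_check(text):
    # Chain the steps; each one is individually simple and inspectable.
    return [verify(c, retrieve_evidence(c))
            for c in decompose(text) if is_checkworthy(c)]
```

Because every intermediate result is an explicit value, each stage's output can be shown to the user, which is what makes the transparency described above possible.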

szszrk|1 year ago

I just tried similar queries as they show on their screenshots with Kagi. Basically asked it the exact same question.

While it answered a general "yes" when the more precise answer was "no", the reasoning in the answer was perfectly on point and covered exactly the same things.

As a general LLM for a regular user, FastGPT (their LLM service) is in my opinion "meh" (it lacks conversations, for instance). But it's really impressive that it draws on VERY recent data (news and articles from the last few days) and always provides great references.

dekervin|1 year ago

I have a project where I take a different approach [0]. I basically extract statements, explicit or implicit, that should be accompanied by a reference to some data but aren't, and I let the user find the most relevant data for those statements.

[0] https://datum.alwaysdata.net/

evolve2k|1 year ago

The main problem/drawback of LLMs is their propensity to hallucinate (or, said plainly, to lie), which is a significant issue even for GPT-4.

Intuitively, just because you put your LLM into a workflow/pipeline this doesn’t really address how to eliminate hallucinations.

For those of us who don’t follow the research closely, can you explain how your findings and approach allow you to utilise LLMs and work around this hard limitation? Said another way, how do you get around the fact that LLMs themselves regularly output lies/false answers?

Der_Einzige|1 year ago

You might want to look into integrating DebateSum or OpenDebateEvidence (OpenCaseList) into this tool as sources of evidence. They are uniquely good for these sorts of tasks:

https://huggingface.co/datasets/Hellisotherpeople/DebateSum

https://huggingface.co/datasets/Yusuf5/OpenCaselist

Xudong|1 year ago

Hi Der_Einzige, thanks for pointing out these two great datasets! We are currently working on including customized evidence sources internally and will definitely consider these two datasets in the future version of this open-source project.

kkfx|1 year ago

IMVHO people do not need "automated fact verification" from a source of trust we can't trust; what they need are summarizers. Most FLOSS users, and not so few computer users in the broadest sense, use feeds, but they get many posts per day; some days they like to read them all, other days they are busy. Tools that skim news and offer a sort of index for deciding what to read, a kind of smart scoring, are much more interesting.

njrc9|1 year ago

Agree. "Fact-checking" can never be more than an assertion of a particular bias. I am surprised that this project has received so few critical comments along these lines here.

The idea that "specificity," such as what scientific research aims for, can be better evaluated for truthfulness or approach what "truly matters," as this project purports, is dubious. E.g., why would a notion that is more limited in scope matter more than something more vast (to use the word that it cites as an example)? In addition to its dystopian idea of a "source of truth," it completely dismisses "vague" language in the name of "science" or "factuality," which is utterly the opposite of science, which I thought was to understand ourselves and nature with as few presuppositions as possible.

siffland|1 year ago

When I saw Loki as the name, I instantly thought of Grafana Loki for logging. I click on the GitHub and get Libr-AI and OpenFactVerification.

I am not commenting on the actual software and I know names are hard and often overlap, but with something as popular as Loki already used for logging I think it might get confusing.

Xudong|1 year ago

Hi siffland! Thank you for your feedback. We understand your concern about the potential confusion given the popularity of Grafana Loki in the logging space. When naming our project, we sought a name that encapsulates our goal of combating misinformation. We chose Loki, inspired by the Norse god often associated with stories and trickery, to symbolize our commitment to unveiling the truth hidden within nonfactual information.

When we named our project, we were unaware of the overlap with Grafana Loki. We appreciate you bringing this to our attention! I will discuss this issue with my team in the next meeting, and figure out if there is a better way of solving this. If you have any suggestions or thoughts on how we can better differentiate our project, we would love to hear them.

Thank you again for your valuable input!

njrc9|1 year ago

How is information qualified as evidence (e.g., the “Evidence Crawler” functionality)?

The best case scenario would seem to be that results are derived from certain biases built into the model, unless it weighs “factuality” by the number of occurrences of certain statements on the internet which is as far from a qualification for truthfulness as the biased model.

dfgdfg34545456|1 year ago

The last time I looked, we couldn't even parse and build a semantic model for anything more than simple sentences to produce a coherent representation of their meaning. Which tells me this is just some glorified fuzzy-matching algorithm.

badrunaway|1 year ago

I found it very interesting. I had this funny thought that, just like CAPTCHA, maybe soon we will have to ask humans to give their input to fact-verification systems at scale.

eeue56|1 year ago

Interesting. In the Nordics, we have a couple of sites dedicated to fact-checking news stories, done by real people. I think these kinds of automated tools can be helpful too, but they need to be tied to reliable sources. This became pretty apparent to me with the tech news coverage of xz, too. Lots of accidental (or sometimes intentional?) misinformation being spread in news articles. I wrote about it a bit [0]; it was pretty sad to see big international publishers publishing an article based entirely on the journalist's misunderstanding of the situation. Facts and truth are important, especially as gen AI increases the amount of legitimate-looking content online that might not actually be true.

[0] - https://open.substack.com/pub/thetechenabler/p/trust-in-brea...

pelasaco|1 year ago

> In the Nordics, we have a couple of sites dedicated to fact checking news stories, done by real people.

We have them everywhere. The problem, however, is well-known: human bias, political engagement from the fact-checkers, etc. AI (without any kind of lock, built-in political bias, etc.) could be the real deal, but because it may not be politically correct, it will never happen.

Xudong|1 year ago

I wholeheartedly agree on the necessity of linking fact-checking tools to credible sources. Currently, our team's expertise lies primarily in AI, and we find ourselves at a disadvantage when it comes to pinpointing authoritative sources. Acknowledging the challenges posed by the rapid spread of misinformation, as highlighted by recent studies, we developed this prototype to assist in information verification. We recognize the value of collaboration in enhancing our tool's effectiveness and invite those experienced in evaluating sources to join our effort. If our project interests you and you're willing to contribute, please don't hesitate to reach out. We're eager to collaborate and make a positive impact together.

t0bia_s|1 year ago

How does AI observe facts in the real world? I find it hilarious to classify something as fact-checking when it's based on data from the internet.

verdverm|1 year ago

How does anyone who was not there? The answer should be similar.

KETpXDDzR|1 year ago

Great idea. However, I wouldn't trust its results, since it relies heavily on LLMs and on crawling the web. That means "facts" are whatever the most popular opinion on the Internet is. In times of ever more enshittification, you'll probably get your "facts" from LLM-generated SEO websites.

I think the only proper way to verify facts is to derive them from "fundamental facts", e.g. that the earth is round (and even for that there are people believing the opposite).

Cartoxy|1 year ago

Garbage. "Open Source" that requires an invitation might as well be non-existent.

1231232131231|1 year ago

Horrible AI-generated imagery, especially with all the AI-garbled words.

redder23|1 year ago

The name Loki is such a great fit! WOW!

This is some giant BS, that is for sure. Some stupid, literally brain-dead AI searching through things created by humans to determine what is a "fact". This is beyond dystopian crap.

We all know all the fact-checker orgs. used by big tech like Facebook and others are filled with hyper biased woke people who do not actually fact-check things but get off on having the power to enforce their beliefs, feelings and biases.

I can already tell this is total BS without even looking into it. What kinds of sources will it use? What ranking will they give them? Snopes? ROFL. It probably just uses some woke-infested, censored and curated language model to determine a fact based on what has the most matches, or what is THE MOST LIKELY, because that's how AI works. That has absolutely nothing to do with facts.

And it's even worse: we are literally in a time when AI hallucinates things that do not exist. I won't use a stupid AI to find me "facts".

featbit|1 year ago

How can I get an invitation code?