top | item 39111898

(no title)

timinou | 2 years ago

I am curious actually! In general about your experiments, but also about integrating this detection algorithm to wider systems. Did you run any autogpt-like experiments with the AI generated text as a critique? My use case is a bit different (decision-making), so I play with relative plausibility instead of writing style. But I haven't found convincing ways of "converging" quite yet, i.e. benchmarks that don't rely solely on LLMs themselves to give their output.

discuss

adaboese|2 years ago

To clarify, the style experiment I've referenced earlier was just that – an experiment. I did not implement those methods into my software. Instead, I focused on how to eliminate things like 'talking with authority without evidence', 'contradictions', 'talking in extremely abstract concepts', 'conclusions without insights', etc.

If you need a dataset to benchmark against, download any articles from pre 2017. There are a few ready-made datasets floating around the Internet.