Show HN: TWFF – A container format for declaring AI use in writing

2 points | normanbell | 11 days ago | github.com

TWFF (Tracked Writing File Format) is my proposal for moving away from so-called AI detection toward verifiable declaration.

Instead of an external model guessing if a text is AI-generated, TWFF is a ZIP-based container (similar to an EPUB) that stores the document alongside a Process Transcript (JSON).
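A minimal sketch of what such a container could look like, using only the Python standard library. The member names `content/document.xhtml` and `meta/process-log.json` and the `{"events": [...]}` wrapper are assumptions made for illustration; the actual layout is whatever the TWFF spec defines.

```python
import io
import json
import zipfile


def build_container(document_xhtml: str, events: list) -> bytes:
    """Pack a document and its process transcript into one ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Hypothetical member names, not taken from the TWFF spec.
        archive.writestr("content/document.xhtml", document_xhtml)
        archive.writestr("meta/process-log.json", json.dumps({"events": events}, indent=2))
    return buffer.getvalue()


def read_transcript(container: bytes) -> list:
    """Read the event list back out of a container built above."""
    with zipfile.ZipFile(io.BytesIO(container)) as archive:
        return json.loads(archive.read("meta/process-log.json"))["events"]
```

Because the result is an ordinary ZIP file, any archive tool can open it and inspect the document and the log side by side.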

How it works: 1) It captures Revision Velocity: the delta between human drafting and AI injections. 2) It intercepts paste and AI-interaction events, wrapping them in deterministic metadata. 3) It’s local-first. The audit trail stays with the author until they choose to export the signed container.
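One way to read "deterministic metadata" in step 2 is that the same inserted text always yields the same record. The sketch below assumes that reading; `wrap_event`, `serialize`, and every field name except the event types `paste` and `ai_interaction` are hypothetical, not part of the published schema.

```python
import hashlib
import json


def wrap_event(kind: str, text: str, position: int, timestamp: float) -> dict:
    """Describe an inserted span using only values derived from its inputs."""
    return {
        "type": kind,  # e.g. "paste" or "ai_interaction"
        "position": position,
        "char_count": len(text),
        # A content hash lets a reader match the event to the text without
        # the log having to store the text itself.
        "sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
        "timestamp": timestamp,
    }


def serialize(event: dict) -> str:
    # Sorted keys and fixed separators give byte-identical JSON for equal events.
    return json.dumps(event, sort_keys=True, separators=(",", ":"))
```

Passing the timestamp in as an argument, rather than reading the clock inside the function, keeps the output reproducible.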

This is a v0.1 reference implementation built in Python/NiceGUI. I’m looking for feedback on:

> The container structure (XHTML vs. Markdown).

> The JSON event schema.

> The Revision Distance logic: can we create a fingerprint for human effort that is as difficult to fake as the writing itself?

MVP Demo: https://demo.firl.nl/

TWFF spec: https://github.com/Functional-Intelligence-Research-Lab/TWFF...

5 comments

ryukoposting|11 days ago

The file format seems reasonable enough; it's the mechanics of actually deciding when to mark an edit as AI-generated that I'm curious about.

You mention Canvas and Moodle integration... do you envision students being required to use a TWFF-native editor embedded inside these platforms? If so, it seems to me like the actual hard part would be recreating gdocs but on an upload-your-homework SaaS budget.

And what about block quotes? If I have a couple sentences to quote from another work, will pasting them into the editor cause them to be marked as "AI generated?"

I think you're on the right track here, but it may be better to focus on logging how edits were made, whether by manual typing or pasting. Add an "undo" button that pops the latest edit off the stack. At that point, AI cheating can still be manually detected by searching the change log for long pastes, then inspecting any long pastes for their content.
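Roughly, the suggestion above could be sketched like this. The `Edit` and `EditLog` names, the append-only document model, and the 200-character threshold are all assumptions for illustration, not anything from TWFF.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Edit:
    source: str  # "typed" or "paste"
    text: str


@dataclass
class EditLog:
    edits: List[Edit] = field(default_factory=list)

    def record(self, source: str, text: str) -> None:
        self.edits.append(Edit(source, text))

    def undo(self) -> Optional[Edit]:
        """Pop the most recent edit off the stack, if there is one."""
        return self.edits.pop() if self.edits else None

    def long_pastes(self, min_chars: int = 200) -> List[Edit]:
        """Return pasted edits long enough to be worth reading by hand."""
        return [e for e in self.edits if e.source == "paste" and len(e.text) >= min_chars]

    def text(self) -> str:
        # Simplification: every edit appends to the end of the document.
        return "".join(e.text for e in self.edits)
```

A block quote would show up in `long_pastes()` like any other paste; the log only records how the text arrived, and a person still has to judge what it is.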

But even that doesn't actually work, because I could just generate some sludge with ChatGPT, then hand-type the output into the editor. At least you've made it less convenient, I guess.

I really like this idea, best of luck.

normanbell|10 days ago

Thanks for the feedback.

> On the hand-typing sludge: At the moment, the idea is that if a student/author is forced to manually re-type AI output, the convenience gap narrows significantly. At that point, they are engaging with the text at a character level. More importantly, hand-typing has a distinct revision velocity (natural pauses, backspaces, typos) that differs from a point-in-time injection. I'm not trying to make cheating impossible; it's more about making it as much work as actually learning.
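A rough sketch of how that difference might show up in a log, assuming each event is a `(timestamp_seconds, chars_added)` pair with negative values for deletions. The event shape, the `summarize` name, and the 20 characters-per-second threshold are illustrative assumptions, not TWFF's actual Revision Velocity logic.

```python
def summarize(events, max_chars_per_second=20.0):
    """Flag insertions that arrive faster than plausible typing.

    events: list of (timestamp_seconds, chars_added) in time order.
    The first event only provides a starting timestamp and is not scored.
    """
    injections = []
    deletions = 0
    for (prev_t, _), (t, delta) in zip(events, events[1:]):
        if delta < 0:
            deletions += 1  # backspaces are part of a normal typing rhythm
            continue
        gap = max(t - prev_t, 1e-3)  # guard against zero-length gaps
        if delta / gap > max_chars_per_second:
            injections.append((t, delta))
    return {"injections": injections, "deletions": deletions}


typing_then_paste = [(0.0, 1), (0.2, 1), (0.5, 1), (0.7, -1), (1.0, 1), (3.0, 500)]
print(summarize(typing_then_paste))
```

Here only the 500-character event stands out; the single-character events and the backspace look like ordinary typing. A check this simple is easy to get past with scripted delays, so it illustrates the signal rather than a defense.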

> 'Paste' vs. 'AI' distinction: In v0.1, we just treat paste and ai_interaction as similar events in the log. The 'AI' tag in the demo is just to show what’s possible. In a production spec, it would likely be logged as external_insertion, and the student could then add a citation.

> Editor vs. Integration: Recreating Google Docs is a non-starter; the vision is to be plugin-first. Instead of a new SaaS, it would be a headless logging engine inside an extension for Google Docs or an Overleaf plugin.

Love the idea of popping the latest edit off the stack, will probably add it to the next version.

normanbell|11 days ago

Just to pre-answer some questions I usually get IRL.

Can't I just script a 'human-like' delay and spoof the log? Currently, yes. But v0.1 is about the container. Future iterations will look at signing and making it as computationally expensive to fake the process as it is to just write the text.
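For a sense of what signing could add, here is a minimal sketch that uses an HMAC from the Python standard library as a stand-in. This is an assumption for illustration: the post does not say which scheme TWFF will use, and a real design would more likely use public-key signatures so that verifiers do not need the secret key.

```python
import hashlib
import hmac
import json


def sign_transcript(events: list, key: bytes) -> str:
    """Return a hex HMAC-SHA256 over a canonical JSON encoding of the events."""
    payload = json.dumps(events, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_transcript(events: list, key: bytes, signature: str) -> bool:
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(sign_transcript(events, key), signature)
```

This only shows that a log was not altered after it was signed. It does nothing about a log that was scripted before signing, which is the harder problem raised in the question.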

Is this just surveillance? It’s an Opt-in Declaration. The user owns the file and the log.