top | item 44968186

oehpr | 6 months ago

Just some food for thought: I was recently brainstorming ideas for building a more decentralized moderation system, and one of the ideas I arrived at was using the rules themselves as part of the flagging system.

It would work like this: When you flag a post for breaking the rules, the community's guidelines will pop up. You are then asked in this window to highlight the relevant section or sections of those rules that this post has violated. And I don't mean just "select which rule was violated", I mean "use your cursor and highlight the text of the rules that were violated." (with support for highlighting multiple sections if so desired).

This serves the following functions:

1. Communicates why something was flagged (obviously).

2. Forces the person who's flagging the submission to actually read the rules.

3. The subjectivity of the highlighting system is used to make Sybil attacks more obvious. I'll explain why after this list.

4. It differentiates flagging from downvoting. Downvoting is for saying "I don't like this". Flagging is for saying "This violates our community's rules".

As to why this helps reveal Sybil attacks: There are several subjective points on what, where, and how people will highlight rules. Should punctuation be included or not? Should the key word in the rule be highlighted? The key sentence? The whole section? What about examples? Should we include them? Or only highlight them? Users operating in good faith will cluster around common points in common areas, but will have different ways of doing so. So, if a block of users all have: the same input, in the same way, clustered around the same time, then it was likely a Sybil attack.
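To make that concrete, here is a minimal sketch of the check, not a definitive implementation. It assumes each flag is stored as a user name, a tuple of highlighted character ranges into the rules text, and a timestamp. The `Flag` type, the `suspicious_groups` function, the 10-minute window, and the minimum group size of 3 are all hypothetical choices for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Flag:
    user: str
    spans: tuple      # ((start, end), ...) character offsets into the rules text
    timestamp: float  # seconds since epoch


def suspicious_groups(flags, window_seconds=600.0, min_group=3):
    """Return lists of users whose highlights are byte-for-byte identical
    and were submitted within `window_seconds` of each other."""
    by_spans = defaultdict(list)
    for flag in flags:
        by_spans[flag.spans].append(flag)

    groups = []
    for same in by_spans.values():
        same.sort(key=lambda f: f.timestamp)
        best = []
        left = 0
        # Sliding window over time: keep the largest burst of identical highlights.
        for right in range(len(same)):
            while same[right].timestamp - same[left].timestamp > window_seconds:
                left += 1
            if right - left + 1 > len(best):
                best = same[left:right + 1]
        if len(best) >= min_group:
            groups.append(sorted(f.user for f in best))
    return groups


flags = [
    Flag("a", ((10, 42),), 0.0),
    Flag("b", ((10, 42),), 30.0),
    Flag("c", ((10, 42),), 60.0),
    Flag("d", ((9, 43),), 20.0),      # same area, slightly different selection
    Flag("e", ((10, 42),), 5000.0),   # same selection, but much later
]
result = suspicious_groups(flags)
```

Here `a`, `b`, and `c` would be grouped together, while `d` (a near miss on the same rule) and `e` (same selection, far apart in time) would not. A real system would probably want fuzzier matching than exact equality, but exact matches are the easiest signal to start with.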

This system doesn't require that it de-anonymize the people who submit flags, but it does provide a form of publicly visible transparency as to why something was flagged.

Edit: I forgot to make clear, you would be able to see a heat map of the rules that were highlighted for a flagged post.
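A rough sketch of how the heat map could be computed, assuming highlights are stored as `(start, end)` character offsets into the rules text (end exclusive). The function name `highlight_heat` and that storage format are assumptions for illustration only.

```python
def highlight_heat(rules_text, spans):
    """Count, for each character of the rules, how many highlights cover it."""
    heat = [0] * len(rules_text)
    for start, end in spans:
        # Clamp to the text so a bad offset cannot raise an IndexError.
        for i in range(max(0, start), min(end, len(rules_text))):
            heat[i] += 1
    return heat


rules = "Be kind. No spam."
# Three flaggers: one highlights "Be kind.", one just "kind", one "No spam."
heat = highlight_heat(rules, [(0, 8), (3, 7), (9, 17)])
```

The resulting per-character counts can then be mapped to background colors when rendering the rules next to the flagged post.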

I'd be interested to hear any thoughts on this idea.

Paradigm2020|6 months ago

Except for highlighting the text (think people on mobile, people with disabilities, etc.), it sounds like a good plan.

Then other random "judges" would be asked if the reasons given by the "accuser" are correct. There would have to be some "cost" in karma to flag a post (or a limit of X flags / day for X karma status or something) and some reward in karma for being chosen as a judge/jury.

There's also the need for a minimum flagging weight and a minimum judging weight, and for a way to reconcile conflicting votes.

Anyway, would love to talk about it more, but tbh it's probably not gonna happen, also because most people don't like jury duty... Maybe when AI gets over the "hallucinations", but well, at that point we can also get our individual AIs to read everything and judge for us.

oehpr|6 months ago

I don't find highlighting text on mobile to be too difficult, so I don't see that as a barrier imo.

For disabilities, well... that one I dunno. I don't have a good concept of what kinds of UI are most convenient for each type of accessibility case.

And it's a little tempting to get lost in the weeds of who watches the watchers, but to be honest, even if implemented in Hacker News's case, the mods themselves could vet flags for anomalies. Just this on its own would serve as a force multiplier for HN mods.

For more decentralized forms of moderation, one method might just be a simple flag appeal. It circles back to the community: they can discuss whether the rule that was cited is fair, and if it wasn't, possibly remove or limit the flagging abilities of those who cited the rule incorrectly. And possibly some increased punishment if the appeal fails? There are lots of options there. Big wide design space.

I do think the direct text highlighting has a few important features. The Sybil attack resistance is one. That was one of the OP's primary concerns. Also, clarity on what rule was broken and why is very important, and a given rule can be verbose. It might not be obvious what specifically in a given rule was the reason for the violation. Direct highlighting lets flaggers more directly communicate what the issue is, without opening the communication channel up for a flame war.