opportune | 2 years ago
With some of these models the guardrails are so clumsy and forced that I think almost any typical user will notice them. Because they include outright work-refusal it’s a very frustrating UX to have to “discover” the policy for yourself through trial and error.
And because they’re more about brand management than preventing fraud or a bad UX for other users, the typical failure mode is “someone deliberately engineered a way to get objectionable content generated in spite of our policies.” Obviously some kinds of content are objectionable enough for this to still be worth it, but those are mostly in the porn area. If somebody figures out a way to generate an image that’s merely not PC, despite all the safety features, shouldn’t that be on them rather than the provider?
Even tuning the model for political correctness is not the end of the world in my opinion; a lot of LLMs do a perfectly reasonable job for my regular use cases. But with image generators they go so far as to obviously (there’s no other way that makes sense) insert diversity sub-prompts into some fraction of images, which is simply confusing and amateurish. Everybody who uses these products even a little will notice it. The filters are also so cautious that even mild stuff gets caught (I tried the “now make it even more X” trick with “American” and it stopped after one iteration). You’re going to find out the policies anyway because they’re so broad and likely to be encountered while using the product innocently. Anything a real non-malicious user is likely to get blocked by should be documented.