Jason_Protell | 2 years ago

I would also love to see more transparency around AI behavior guardrails, but I don't expect that to happen anytime soon. Transparency would make it much easier to circumvent the guardrails.

Jensson | 2 years ago

Why is it an issue that you can circumvent the guardrails? I never understood that. The guardrails are there so that innocent people don't get bad responses with porn or racism; a user who goes looking for porn or racism and gets it doesn't seem like a big deal.

bluefirebrand | 2 years ago

The problem is bad actors who think porn or racism is intolerable in any form, who will publish mountains of articles condemning your chatbot for producing such things, even if they had to go out of their way to break the guardrails to make it do so.

They will create boycotts against you, they will lobby government to make your life harder, they will petition payment processors and cloud service providers to not work with you.

We've seen this behavior before; it's nothing new. Now, if you're the type to fight them, that might not be a problem. If you are a super risk-averse board of directors who doesn't want that sort of controversy, then you will take steps not to draw their attention in the first place.

viraptor | 2 years ago

If you can get it on purpose, you can get it by accident. There's no perfect filter available, so companies choose to cut more and stay on the safe side. It's not even just the overt cases: their systems are used by businesses, and getting a bad response is a risk. Think of the recent incident with an airline chatbot giving wrong answers. Now think of cases where GPT gave racially biased answers in code, for example.

As a user who makes business decisions or handles customer communication with an LLM in the loop, you really don't want to have a bad day because the LLM learned about some bias and decided to merge it into your answer.

lmm | 2 years ago

> The guardrails are there so that innocent people don't get bad responses with porn or racism

That seems pretty naive. The "guardrails" are there to ensure that AI is comfortable for PMC (professional-managerial class) people; making it uncomfortable for people who experience differences between races (i.e. working-class people) is a feature, not a bug.

finikytou | 2 years ago

Racism's victims are being defined in 2024 as anyone but Western/white people; their being erased seems to be OK. Can you bet that in 20 years the standard will not shift to mixed-race people like me? Then you will also call the people complaining racist and put guardrails against them... this is where it is going.

charcircuit | 2 years ago

Like a lot of potentially controversial things, it comes down to brand risk.

unethical_ban | 2 years ago

> The guardrails are there so that innocent people don't get bad responses

The guardrails are also there so bad actors can't use the most powerful tools to generate deepfakes, disinformation videos and racist manifestos.

That Pandora's box will be opened soon, when local models running on cell phones and workstations reach current datacenter-scale performance. In the meantime, the guardrails are holding back the tsunami of evil shit that will occur when AI goes uncontrolled.

asdff | 2 years ago

Transparency may also subject these companies to litigation from groups that feel they are misrepresented in some way by the model.

Jason_Protell | 2 years ago

This makes me wonder: how much lawyering is involved in the development of these tools?