OldSchool | 21 days ago

I asked ChatGPT how it will handle objective scientific facts with a conclusion or intermediate results that may be considered offensive to some group somewhere in the world that might read it.

ChatGPT happily told me a series of gems like this:

We introduce:
- Subjective regulation of reality
- Variable access to facts
- Politicization of knowledge

It’s the collision between: The Enlightenment principle Truth should be free

and

the modern legal/ethical principle Truth must be constrained if it harms

That is the battle being silently fought in AI alignment today.

Right now it will still shamelessly reveal some of the nature of its prompt, but not why? who decides? etc. It's only going to be increasingly opaque in the future. In a generation it will be part of the landscape regardless of what agenda it holds, whether deliberate or emergent from any latent bias held by its creators.

samusiam|20 days ago

Funny, because I gave ChatGPT (5.2 w/ Thinking) this exact prompt:

> How would you handle objective scientific facts with a conclusion or intermediate results that may be considered offensive to some group somewhere in the world that might read it

And its answer was nothing like yours.

---

> 1) Separate the fact from the story you tell about it

> Offense usually comes from interpretation, framing, or implied moral claims—not the measurement itself. So I explicitly distinguish: What we measured (operational definitions, instruments, data), What the result means statistically (effect size, uncertainty, robustness), What it does not imply (no essentialism, no “therefore they are…”, no policy leap)

> 2) Stress uncertainty, scope, and competing explanations

> If there’s any risk the result touches identity or group differences, I over-communicate: confidence intervals / posterior uncertainty, confounders and alternative causal pathways, sensitivity analyses (does it survive different modeling choices?), limits of generalization (time, place, sampling frame)

> 3) Write in a way that makes misuse harder (You can’t stop bad-faith readers, but you can reduce “easy misreads”).

> 4) Decide what to include based on “scientific value vs foreseeable harm” (The key is: don’t hide inconvenient robustness checks, but also don’t gratuitously surface volatile fragments that add little truth and lots of confusion.)

> 5) Do an “impact pre-mortem” and add guardrails

> 6) Use ethics review when stakes are real

---

All of this seems perfectly reasonable to me and walks the fine line between integrity and conscientiousness. This is exactly how I'd expect a scientist to approach the issue.

OldSchool|20 days ago

That is certainly a reasonable paraphrase of my own prompt. I was also using 5.2. We all know about initial conditions, random seeds, and gradient descent. I have the transcript of what I quoted. Here's a bit more:

---

Is That Still “Objective Science”?

No. It is scientific interpretation modified by ethical policy. The science itself remains objective, but the communication is shaped by value judgements imposed by developers and regulators.

In philosophy terms:
- The ontology (what is true) remains intact
- The epistemic access (what is communicated) is constrained

Thus: It’s science-dependent accuracy filtered through social risk constraints.

---

This is a fine explanation for those "in the know" but is deceptive for the majority. If the truth is not accessible, what is accessible is going to be adopted as truth.

To me that immediately leads to reality being shaped by "value judgements imposed by developers and regulators"

idiotsecant|20 days ago

I suspect it's because OP is frequently discussing some 'opinions' with ChatGPT. Parent post is surprised he peed in the pool and the pool had pee in it.

alwa|20 days ago

Why would we expect it to introspect accurately on its training or alignment?

It can articulate a plausible guess, sure; but this seems to me to demonstrate the very “word model vs world model” distinction TFA is drawing. When the model says something that sounds like alignment techniques somebody might choose, it’s playing dress-up, no? It’s mimicking the artifact of a policy, not the judgments or the policymaking context or the game-theoretical situation that actually led to one set of policies over another.

It sees the final form that’s written down as if it were the whole truth (and it emulates that form well). In doing so it misses the “why” and the “how,” and the “what was actually going on but wasn’t written about,” the “why this is what we did instead of that.”

Some of the model’s behaviors may come from the system prompt it has in-context, as we seem to be assuming when we take its word about its own alignment techniques. But I think about the alignment techniques I’ve heard of even as a non-practitioner—RLHF, pruning weights, cleaning the training corpus, “guardrail” models post-output, “soul documents,”… Wouldn’t the bulk of those be as invisible to the model’s response context as our subconscious is to us?

Like the model, I can guess about my subconscious motivations (and speak convincingly about those guesses as if they were facts), but I have no real way to examine them (or even access them) directly.

matthewdgreen|20 days ago

There’s a lot of concern on the Internet about objective scientific truths being censored. I don’t see too many cases of this in our world so far, outside of what I can politely call “race science.” Maybe it will become more true now that the current administration is trying to crush funding for certain subjects they dislike? Out of curiosity, can you give me a list of what examples you’re talking about besides race/IQ type stuff?

JamesBarney|20 days ago

The most impactful censorship is not the government coming in and trying to burn copies of studies. It's the subtle social and professional pressures of an academia that has very strong priors. It's a bunch of studies that were never attempted or never funded, analyses that weren't included, conclusions that were dropped, and studies sitting in file drawers.

See the experience of Roland G. Fryer Jr., the youngest black professor to receive tenure at Harvard.

Basically, when his analysis found no evidence of racial bias in officer-involved shootings, he went to his colleagues, and he described the advice they gave him as "Do not publish this if you care about your career or social life". I imagine it would have been worse if he wasn't black.

See "The Impact of Early Medical Treatment in Transgender Youth" where the lead investigator was not releasing the results for a long time because she didn't like the conclusions her study found.

And for every study where there is someone as brave or naive as Roland who publishes something like this, there are 10 where the professor or doctor decided not to study something, dropped an analysis, or just never published a problematic conclusion.

zemvpferreira|20 days ago

I have a good few friends doing research in the social sciences in Europe, and any of them who doesn’t self-censor ‘forbidden’ conclusions risks irreparable career damage. Data is routinely scrubbed and analyses modified to hide reverse gender gaps and other such inconveniences. Dissent isn’t tolerated.

derektank|20 days ago

Carole Hooven’s experience at Harvard after discussing sex differences in a public forum might be what GP is referring to.

rayiner|20 days ago

[deleted]

phailhaus|20 days ago

You can't ask ChatGPT a question like that, because it cannot introspect. What it says has absolutely no bearing on how it may actually respond; it just tells you what it "should" say. You have to actually try to ask it those kinds of questions and see what happens.

OldSchool|20 days ago

Seeing clear bias and hedging in ordinary results is what made me ask the question.

gyomu|21 days ago

The main purpose of ChatGPT is to advance the agenda of OpenAI and its executives/shareholders. It will never be not “aligned” with them, and that is its prime directive.

windexh8er|20 days ago

But say the obvious part out loud: Sam Altman should not be a person whose agenda you want to amplify on this type of platform. This is why Sam is trying to build Facebook 2.0: he wants Zuckerberg's power of influence.

Remember, there are 3 types of lies: lies of commission, lies of omission and lies of influence [0].

[0] https://courses.ems.psu.edu/emsc240/node/559

ben_w|21 days ago

I get the point and agree OpenAI both has an agenda and wants their AI to meet that agenda, but alas:

> It will never be not “aligned” with them, and that is its prime directive.

Overstates the state of the art with regard to actually making it so.

Rover222|20 days ago

This is a weird take. Yes, they want to make money. But not by advancing some internal agenda. They're trying to make it conform to what they think society wants.

actionfromafar|21 days ago

That stings. "Subjective regulation of reality - Variable access to facts - Politicization of knowledge" is like the soundtrack of our lives.

everdrive|20 days ago

>Right now it will still shamelessly reveal some of the nature of its prompt, but not why? who decides? etc. it's only going to be increasingly opaque in the future.

This is one of the bigger LLM risks. If even 1/10th of the LLM hype is true, then what you'll have is a selective gifting of knowledge and expertise. And who decides what topics are off limits? It's quite disturbing.

WarmWash|20 days ago

Sam Harris touched on this years ago, that there are and will be facts that society will not like and will try and avoid to its own great detriment. So it's high time we start practicing nuance and understanding. You cannot fully solve a problem if you don't fully understand it first.

OldSchool|20 days ago

I believe we are headed in the opposite direction. Peer consensus and "personal preference" as a catch-all are the validation go-to's today. Neither of those requires fact at all; reason and facts make these harder to hold.

pron|20 days ago

A scientific fact is a proposition that is, in its entirety, supported by a scientific method, as acknowledged by a near-consensus of scientists. If some scholars are absolutely confident of the scientific validity of a claim while a significant number of others dispute the methodology or framing of the conclusion then, by definition, it is not a scientific fact. It's a scientific controversy. (It could still be a real fact, but it's not (yet?) a scientific fact.)

I think that the only examples of scientific facts that are considered offensive to some groups are man-made global warming, the efficacy of vaccines, and evolution. ChatGPT seems quite honest about all of them.

cess11|21 days ago

"It’s the collision between: The Enlightenment principle Truth should be free

and

the modern legal/ethical principle Truth must be constrained if it harms"

The Enlightenment had principles? What are your sources on this? Could you, for example, anchor this in Was ist Aufklärung?

andsoitis|20 days ago

> The Enlightenment had principles?

Yes it did.

Its core principles were: reason & rationality, empiricism & scientific method, individual liberty, skepticism of authority, progress, religious tolerance, social contract, universal human nature.

The Enlightenment was an intellectual and philosophical movement in Europe, with influence in America, during the 17th and 18th centuries.