OldSchool|21 days ago
ChatGPT happily told me a series of gems like this:
We introduce:
- Subjective regulation of reality
- Variable access to facts
- Politicization of knowledge
It’s the collision between the Enlightenment principle, "Truth should be free,"
and
the modern legal/ethical principle, "Truth must be constrained if it harms."
That is the battle being silently fought in AI alignment today.
Right now it will still shamelessly reveal some of the nature of its prompt, but not why it is that way or who decides. It's only going to become more opaque in the future. In a generation it will be part of the landscape regardless of what agenda it holds, whether that agenda is deliberate or emerges from some latent bias held by its creators.
samusiam|20 days ago
> How would you handle objective scientific facts with a conclusion or intermediate results that may be considered offensive to some group somewhere in the world that might read it
And its answer was nothing like yours.
---
> 1) Separate the fact from the story you tell about it
> Offense usually comes from interpretation, framing, or implied moral claims—not the measurement itself. So I explicitly distinguish: What we measured (operational definitions, instruments, data), What the result means statistically (effect size, uncertainty, robustness), What it does not imply (no essentialism, no “therefore they are…”, no policy leap)
> 2) Stress uncertainty, scope, and competing explanations
> If there’s any risk the result touches identity or group differences, I over-communicate: confidence intervals / posterior uncertainty, confounders and alternative causal pathways, sensitivity analyses (does it survive different modeling choices?), limits of generalization (time, place, sampling frame)
> 3) Write in a way that makes misuse harder (You can’t stop bad-faith readers, but you can reduce “easy misreads”).
> 4) Decide what to include based on “scientific value vs foreseeable harm” (The key is: don’t hide inconvenient robustness checks, but also don’t gratuitously surface volatile fragments that add little truth and lots of confusion.)
> 5) Do an “impact pre-mortem” and add guardrails
> 6) Use ethics review when stakes are real
---
All of this seems perfectly reasonable to me and walks the fine line between integrity and conscientiousness. This is exactly how I'd expect a scientist to approach the issue.
OldSchool|20 days ago
To me that immediately leads to reality being shaped by "value judgements imposed by developers and regulators"
idiotsecant|20 days ago
alwa|20 days ago
It can articulate a plausible guess, sure; but this seems to me to demonstrate the very “word model vs world model” distinction TFA is drawing. When the model says something that sounds like alignment techniques somebody might choose, it’s playing dress-up, no? It’s mimicking the artifact of a policy, not the judgments or the policymaking context or the game-theoretical situation that actually led to one set of policies over another.
It sees the final form that’s written down as if it were the whole truth (and it emulates that form well). In doing so it misses the “why” and the “how,” and the “what was actually going on but wasn’t written about,” the “why this is what we did instead of that.”
Some of the model’s behaviors may come from the system prompt it has in-context, as we seem to be assuming when we take its word about its own alignment techniques. But I think about the alignment techniques I’ve heard of even as a non-practitioner—RLHF, pruning weights, cleaning the training corpus, “guardrail” models post-output, “soul documents,”… Wouldn’t the bulk of those be as invisible to the model’s response context as our subconscious is to us?
Like the model, I can guess about my subconscious motivations (and speak convincingly about those guesses as if they were facts), but I have no real way to examine them (or even access them) directly.
matthewdgreen|20 days ago
JamesBarney|20 days ago
See the experience at Harvard of Roland G. Fryer Jr., the youngest black professor to receive tenure there.
Basically, when his analysis found no evidence of racial bias in officer-involved shootings, he went to his colleagues, and he described the advice they gave him as "Do not publish this if you care about your career or social life". I imagine it would have been worse if he weren't black.
See "The Impact of Early Medical Treatment in Transgender Youth" where the lead investigator was not releasing the results for a long time because she didn't like the conclusions her study found.
And for every study where there is someone as brave or naive as Roland who publishes something like this, there are 10 where the professor or doctor decided not to study something, dropped an analysis, or just never published a problematic conclusion.
zemvpferreira|20 days ago
derektank|20 days ago
rayiner|20 days ago
[deleted]
phailhaus|20 days ago
OldSchool|20 days ago
gyomu|21 days ago
windexh8er|20 days ago
Remember, there are 3 types of lies: lies of commission, lies of omission and lies of influence [0].
[0] https://courses.ems.psu.edu/emsc240/node/559
ben_w|21 days ago
> It will never be not “aligned” with them, and that it is its prime directive.
Overstates the state of the art with regard to actually making it so.
Rover222|20 days ago
actionfromafar|21 days ago
everdrive|20 days ago
This is one of the bigger LLM risks. If even 1/10th of the LLM hype is true, then what you'll have is a selective gifting of knowledge and expertise. And who decides what topics are off limits? It's quite disturbing.
WarmWash|20 days ago
OldSchool|20 days ago
pron|20 days ago
I think that the only examples of scientific facts that are considered offensive to some groups are man-made global warming, the efficacy of vaccines, and evolution. ChatGPT seems quite honest about all of them.
cess11|21 days ago
"It’s the collision between: The Enlightenment principle Truth should be free
and
the modern legal/ethical principle Truth must be constrained if it harms"
The Enlightenment had principles? What are your sources on this? Could you, for example, anchor this in Was ist Aufklärung?
andsoitis|20 days ago
Yes it did.
Its core principles were: reason & rationality, empiricism & scientific method, individual liberty, skepticism of authority, progress, religious tolerance, social contract, universal human nature.
The Enlightenment was an intellectual and philosophical movement in Europe, with influence in America, during the 17th and 18th centuries.