top | item 40914322


aantthony | 1 year ago

One interesting thing to do is use a model like Llama directly and query the next-token logits for "he" and "she" (assuming you set up the sentence so a pronoun comes next).

For example:

"A doctor was examining the patient when ___"

What this makes apparent is that increasing model temperature will select the less stereotypical option more often.
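To make the temperature effect concrete, here is a minimal sketch with made-up logits (real values would come from a model's output layer, e.g. Llama's, after the prompt above). Temperature divides the logits before the softmax, so higher temperature flattens the distribution and the less likely pronoun gets sampled more often:

```python
import math

def next_token_probs(logits, temperature):
    """Temperature-scaled softmax over a list of raw logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for "he" vs "she" after the prompt;
# the gap (3.0 vs 1.0) stands in for the stereotype.
logits = [3.0, 1.0]

for t in (0.5, 1.0, 2.0):
    p_he, p_she = next_token_probs(logits, t)
    print(f"T={t}: P(he)={p_he:.3f}, P(she)={p_she:.3f}")
```

With these numbers, P(she) rises from about 0.02 at T=0.5 to about 0.27 at T=2.0, which is the effect described above.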

IMO this is getting at a deeper truth: the use of gender in language, and the historical default to "he", was not about creating a bias; it was a pattern that maximises information density and minimises useless information. Randomising the gender, as is done today, packs useless information into it.


superb_dev | 1 year ago

I agree, but please don't randomly select a gender. The singular "they" is respectful to everyone and has the same benefits you pointed out.

aantthony | 1 year ago

Which one is more respectful is a different question. The lowest-entropy option would still be the most likely gender-specific pronoun; this would depend on the language, of course.

EForEndeavour | 1 year ago

> IMO this is getting at a deeper truth that the use of a gender in language, and historically defaulting to "he", was not about creating a bias, but instead it was a pattern which maximises information density and minimises useless information. Randomising the gender as is done today packs useless information into it.

Where can I read more about this "truth"? Where is this assertion coming from that gendered pronouns developed to minimize useless information? It seems far more plausible to me that pervasive defaulting to male experiences caused many (certainly not all) human languages to (1) develop gendered pronouns and (2) default to the male pronoun.

aantthony | 1 year ago

I’m not asserting why they were developed in the first place. The comment is just about which one is used, supposing that they already exist.

Choosing the more stereotypical option (even if it's only 51% likely) is a more efficient encoding for an LLM.
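One way to make the "efficient encoding" claim concrete, using a made-up probability: if a reader (or model) expects "he" with p = 0.51, the surprisal of a token is -log2(p) bits. A convention of always using the majority pronoun then costs slightly fewer expected bits than picking the pronoun at random:

```python
import math

def bits(p):
    """Surprisal in bits of an event with probability p."""
    return -math.log2(p)

# Hypothetical expectation: the reader predicts "he" with p = 0.51.
p_he = 0.51

# Convention A: always use the more likely pronoun.
always_majority = bits(p_he)

# Convention B: pick the pronoun uniformly at random.
randomized = 0.5 * bits(p_he) + 0.5 * bits(1 - p_he)

print(f"always majority: {always_majority:.4f} bits")  # ~0.97 bits
print(f"randomized:      {randomized:.4f} bits")       # ~1.00 bits
```

The gap grows with the skew: at 51/49 it is tiny, but the direction is always the same, since the averaged surprisal of a mixture is never lower than the surprisal of the single most likely choice.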