bigdict | 5 months ago

What's the point of the relu in the loss function? Its inputs are nonnegative anyway.

Nevermark|5 months ago

Let's try to keep things positive.

GolDDranks|5 months ago

I wondered the same. Seems like it would just make a V-shaped loss around zero, but abs has that property already!

fancyfredbot|5 months ago

ReLU on its own would have made it flat below zero ( _/ not \/). Applying the abs first just makes the ReLU do nothing.
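
A quick way to check this is to compare the three shapes on a few sample points. This is only a sketch: the thread does not show the actual loss, so `relu`, `samples`, and the list names below are illustrative, not the original code.

```python
def relu(x: float) -> float:
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)

# Hypothetical sample inputs on both sides of zero.
samples = [-2.0, -0.5, 0.0, 0.5, 2.0]

relu_only = [relu(x) for x in samples]         # flat below zero: _/
abs_only = [abs(x) for x in samples]           # V-shaped: \/
relu_of_abs = [relu(abs(x)) for x in samples]  # abs first, then relu

# abs never returns a negative number, so the relu afterwards changes nothing.
print(relu_only)
print(abs_only)
print(relu_of_abs == abs_only)
```

Since `abs(x) >= 0` for every real `x`, and `relu` is the identity on nonnegative inputs, `relu(abs(x))` and `abs(x)` agree everywhere.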

andy_ppp|5 months ago

In reality it's probably not a ReLU; modern LLMs use GELU or something more advanced.
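
For comparison, here is a minimal sketch of the exact (erf-based) GELU using only the standard library. It assumes the textbook definition GELU(x) = x * Phi(x), where Phi is the standard normal CDF; the name `gelu` and the sample values are illustrative and not taken from the code under discussion. Unlike ReLU, GELU is not the identity on positive inputs, so wrapping an abs in it would change the values.

```python
import math

def gelu(x: float) -> float:
    """Exact GELU: x times the standard normal CDF of x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

# Hypothetical nonnegative inputs, as would come out of an abs.
for x in [0.0, 0.5, 1.0, 2.0]:
    # For x > 0 the output is strictly smaller than x.
    print(x, gelu(x))
```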

meindnoch|5 months ago

Sometimes a cosmic ray might hit the sign bit of the register and flip it to a negative value. So it is useful to pass it through a rectifier to ensure it's never negative, even in this rare case.

lblume|5 months ago

Indeed, we should call all idempotent functions twice just in case the first incantation fails to succeed.

In all seriousness, this is not at all how resilience to cosmic interference works in practice, and the probability of any executed instruction or even any other bit being flipped is far greater than the one specific bit you are addressing.

fancyfredbot|5 months ago

I thought the belt and braces approach was a valuable contribution to AI safety. Better safe than sorry with these troublesome negative numbers!

naniwaduni|5 months ago

Well, I guess it's helping to distinguish authors who are doing arithmetic they understand from ones who are copying received incantations around...