qmatch | 11 months ago

I still need to read the details, but removing the norm could be a big deal. It’s always a pain to make sure your network is normalized properly when trying new architectures. Swapping in the tanh will likely have other implications, since the norm is sometimes solving a conditioning problem, but IMO more alternatives are welcome.
