I feel like that blogpost was almost just ragebait for ai researchers. It goes between calling not including the +1 an error (which to me implies it would improve training losses, which it doesn't really https://news.ycombinator.com/item?id=36854613) and saying possibly it could help with some types of quantization (which could very well be true but is a much weaker statement) and the author provides basically no evidence for either.
godelski|1 year ago
I know he's an economist btw. I was also surprised he got a job at anthropic a few months after. I wonder if they're related.