fdalvi's comments

fdalvi | 7 months ago | on: GPT-OSS vs. Qwen3 and a detailed look how things evolved since GPT-2

It is indeed not something the code snippets make clear. In regular feedforward layers, it is common to choose "hidden_dim = 4 * emb_dim", while in GLU feedforward layers the convention is to use "hidden_dim = 2/3 * regular_ffn_hidden_dim" (to keep the overall number of parameters roughly the same, since a GLU layer has three weight matrices instead of two). In the case of gpt-oss, they chose to go a bit more extreme and set "hidden_dim = emb_dim", thus reducing the number of parameters per feedforward layer!
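To make the arithmetic concrete, here is a rough sketch of the weight counts for the three choices. It assumes bias-free layers, two weight matrices for a regular feedforward layer and three (gate, up, down) for a GLU one. The helper names and emb_dim = 768 are illustrative picks of mine, not taken from any model config.

```python
def ffn_params(emb_dim: int, hidden_dim: int) -> int:
    """Weights in a regular feedforward layer: up- and down-projection, no biases."""
    return 2 * emb_dim * hidden_dim


def glu_ffn_params(emb_dim: int, hidden_dim: int) -> int:
    """Weights in a GLU feedforward layer: gate, up- and down-projection, no biases."""
    return 3 * emb_dim * hidden_dim


emb_dim = 768  # illustrative value, chosen so that 2/3 * 4 * emb_dim is an integer

regular_hidden = 4 * emb_dim          # common choice for a regular FFN
glu_hidden = 2 * regular_hidden // 3  # 2/3 of the regular hidden size

regular = ffn_params(emb_dim, regular_hidden)
glu = glu_ffn_params(emb_dim, glu_hidden)
narrow_glu = glu_ffn_params(emb_dim, emb_dim)  # hidden_dim = emb_dim

print(f"regular FFN (4x):     {regular:,}")
print(f"GLU FFN (2/3 of 4x):  {glu:,}")
print(f"GLU FFN (hidden=emb): {narrow_glu:,} ({narrow_glu / regular:.3f} of regular)")
```

In terms of d = emb_dim, the first two come out to 8 * d^2 each, and the hidden_dim = emb_dim variant to 3 * d^2, i.e. 3/8 of the usual size per feedforward layer.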

fdalvi | 3 years ago | on: VALL-E: Microsoft’s new zero-shot text-to-speech model

Previous discussion here: https://news.ycombinator.com/item?id=34270311

fdalvi | 5 years ago | on: How to convert existing web extensions for Safari

Actually, certain functionality necessary for uBlock Origin is still blocked, so the port is not possible yet: https://www.reddit.com/r/uBlockOrigin/comments/hdz0bo/will_u...

fdalvi | 5 years ago | on: Show HN: Quake 1 movement physics reinforcement learning project

It wouldn't be difficult at all if the optimal running technique were known beforehand; I think the goal of many of these RL exercises is to either i) find a better solution than what we may have imagined or ii) confirm that what we already knew was indeed the best possible solution!

fdalvi | 8 years ago | on: Show HN: Can you think like a word vector? A game for exploring word embeddings

This is really awesome!

I don't have any ideas on how to make it more "human-like", but for some words it would have been nice to have their definitions (maybe in a tooltip over the words).