GloamingNiblets's comments
GloamingNiblets | 1 month ago | on: Eat Real Food
GloamingNiblets | 1 month ago | on: Sugar industry influenced researchers and blamed fat for CVD (2016)
GloamingNiblets | 3 months ago | on: Cognitive and mental health correlates of short-form video use
From the paper: "repeated exposure to highly stimulating, fast-paced content may contribute to habituation, in which users become desensitized to slower, more effortful cognitive tasks such as reading, problem solving, or deep learning. This process may gradually reduce cognitive endurance and weaken the brain’s ability to sustain attention on a single task... potentially reinforcing impulsive engagement patterns and encouraging habitual seeking of instant gratification".
GloamingNiblets | 3 months ago | on: Waymo robotaxis are now giving rides on freeways in LA, SF and Phoenix
GloamingNiblets | 3 months ago | on: Waymo robotaxis are now giving rides on freeways in LA, SF and Phoenix
GloamingNiblets | 3 months ago | on: Waymo robotaxis are now giving rides on freeways in LA, SF and Phoenix
GloamingNiblets | 4 months ago | on: Unexpected patterns in historical astronomical observations
GloamingNiblets | 7 months ago | on: A Photonic SRAM with Embedded XOR Logic for Ultra-Fast In-Memory Computing
It's still very niche but could offer enormous power savings for ML inference.
GloamingNiblets | 8 months ago | on: 3D-printed device splits white noise into an acoustic rainbow without power
GloamingNiblets | 8 months ago | on: 3D-printed device splits white noise into an acoustic rainbow without power
GloamingNiblets | 9 months ago | on: Wendelstein 7-X sets new fusion record
[1] https://tae.com/tae-technologies-delivers-fusion-breakthroug...
GloamingNiblets | 9 months ago | on: Compiling a neural net to C for a speedup
GloamingNiblets | 9 months ago | on: Running GPT-2 in WebGL: Rediscovering the Lost Art of GPU Shader Programming
GloamingNiblets | 10 months ago | on: Waymo and Toyota outline partnership to advance autonomous driving deployment
GloamingNiblets | 10 months ago | on: Does RL Incentivize Reasoning in LLMs Beyond the Base Model?
GloamingNiblets | 10 months ago | on: Does RL Incentivize Reasoning in LLMs Beyond the Base Model?
Here's the condensed and formatted transcription:

This is the last thing I want to highlight in this section on why RL works. Here they evaluate two specific metrics: pass@k and maj@k. Maj@k is majority voting: you have a model and a question, and you output not just one answer but an ordered set. You give your top 20 answers, where position 0 is the answer the model most wants to give, then the second, the third, and so on. They could all be correct, just different reformulations of the same answer or different derivations stated in different ways. Pass@k asks whether any of the top k results are correct; maj@k asks how often you would be correct if you took a majority vote over the top k. There's a slight difference between the two, and that difference is made more drastic by reinforcement learning. They say, "As shown in figure 7, reinforcement learning enhances majority at K performance but not pass at K."

These findings indicate that reinforcement learning improves the model's overall performance by making the output distribution more robust. In other words, the improvement comes from boosting the correct response within the top k rather than from enhancing fundamental capabilities. This is something we've come to learn in many different ways from reinforcement learning on language models, and even from supervised fine-tuning: most likely, the capability to do all of these things is already present in the underlying pre-trained language model.

Summary: Reinforcement learning improves language model performance not by enhancing fundamental capabilities but by making the output distribution more robust, effectively boosting correct responses within the top results rather than improving the model's inherent abilities.
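To make the pass@k vs. maj@k distinction concrete, here is a minimal sketch in Python. The helper names and the toy ranked-answer list are my own illustration, not from the paper; it assumes a simplified setup where each question has a single known correct answer string and the model emits a ranked list of candidate answers.

```python
from collections import Counter

def pass_at_k(ranked_answers, correct, k):
    """pass@k: is the correct answer anywhere in the top-k candidates?"""
    return any(a == correct for a in ranked_answers[:k])

def maj_at_k(ranked_answers, correct, k):
    """maj@k: does the most frequent answer among the top-k match the correct one?"""
    most_common_answer, _count = Counter(ranked_answers[:k]).most_common(1)[0]
    return most_common_answer == correct

# Toy example: the model's ranked answers to one question.
ranked = ["42", "41", "42", "42", "7", "42"]
print(pass_at_k(ranked, "42", 4))  # True: "42" appears within the top 4
print(maj_at_k(ranked, "42", 4))   # True: "42" wins the vote among the top 4
```

The paper's observation can be read through this sketch: RL tends to concentrate probability mass on answers the base model could already produce, which lifts maj@k (the vote becomes more decisive) without lifting pass@k (the correct answer was already reachable somewhere in the top k).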
GloamingNiblets | 10 months ago | on: Generate videos in Gemini and Whisk with Veo 2
GloamingNiblets | 11 months ago | on: Bored of It
GloamingNiblets | 11 months ago | on: It’s not mold, it’s calcium lactate (2018)
Just because something has been used since 1955 doesn't mean it's all good.
GloamingNiblets | 1 year ago | on: Farallon Islands live (and controllable) webcam