MAXPOOL | 1 year ago | on: Darwin Machines
MAXPOOL's comments
MAXPOOL | 1 year ago | on: The Engineer’s Guide to Deep Learning: Understanding the Transformer Model
1/ The Annotated Transformer Attention is All You Need http://nlp.seas.harvard.edu/annotated-transformer/
2/ Transformers from Scratch https://e2eml.school/transformers.html
3/ Andrej Karpathy has a really good series of intros: https://karpathy.ai/zero-to-hero.html Let's build GPT: from scratch, in code, spelled out. https://www.youtube.com/watch?v=kCc8FmEb1nY GPT with Andrej Karpathy: Part 1 https://medium.com/@kdwa2404/gpt-with-andrej-karpathy-part-1...
4/ 3Blue1Brown: But what is a GPT? Visual intro to transformers | Chapter 5, Deep Learning https://www.youtube.com/watch?v=wjZofJX0v4M Attention in transformers, visually explained | Chapter 6, Deep Learning https://www.youtube.com/watch?v=eMlx5fFNoYc Full 3Blue1Brown Neural Networks playlist https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_6700...
MAXPOOL | 1 year ago | on: Some geometric intuition for single-layer ReLU networks
Nearly-tight VC-dimension and pseudodimension bounds for piecewise linear neural networks https://arxiv.org/abs/1703.02930
MAXPOOL | 2 years ago | on: Vernor Vinge has died
Firstly, Kurzweil underestimates the number of connections by an order of magnitude.
Secondly, dendritic computation changes things. Individual dendrites and the dendritic tree as a whole can perform multiple separate computations: logical operations, low-pass filtering, coincidence detection, ... One neuronal activation is potentially thousands of operations per neuron.
A single human neuron can be the equivalent of thousands of ANN units.
MAXPOOL | 2 years ago | on: Why do tree-based models still outperform deep learning on tabular data? (2022)
Transformers without positional encoding produce embeddings that are invariant to the input order (strictly, permutation-equivariant). CNNs have translation invariance and can have a little rotational invariance.
It's harder to find similar invariances for tabular data. Maybe applying methods from GNNs would help?
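A minimal sketch of the order-invariance point, assuming a single attention head with identity projections (Q = K = V = the tokens) and a made-up linear position offset. Without position information, permuting the input rows only permutes the output rows; once positions are added, the same reordering changes the result. All names here are illustrative, not taken from any library.

```python
import math
import random

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Single-head dot-product self-attention with identity projections."""
    dim = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(dim)])
    return out

def close(a, b, tol=1e-9):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def add_positions(toks):
    # Hypothetical positional signal: a small offset that grows with position.
    return [[x + 0.1 * pos for x in t] for pos, t in enumerate(toks)]

rng = random.Random(0)
tokens = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
perm = [3, 0, 4, 1, 2]
shuffled = [tokens[i] for i in perm]

# No positions: output rows are permuted exactly like the input rows.
out = self_attention(tokens)
equivariant = close(self_attention(shuffled), [out[i] for i in perm])

# With positions: the same reordering no longer maps onto the original output.
out_pos = self_attention(add_positions(tokens))
order_sensitive = not close(self_attention(add_positions(shuffled)),
                            [out_pos[i] for i in perm])

print(equivariant, order_sensitive)
```

Pooling the outputs (for example, averaging the rows) would turn the equivariance into full invariance to input order.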
MAXPOOL | 2 years ago | on: Effect of exercise for depression: systematic review, meta analysis
Conclusions: Exercise is an effective treatment for depression, with walking or jogging, yoga, and strength training more effective than other exercises, particularly when intense. Yoga and strength training were well tolerated compared with other treatments. Exercise appeared equally effective for people with and without comorbidities and with different baseline levels of depression. To mitigate expectancy effects, future studies could aim to blind participants and staff. These forms of exercise could be considered alongside psychotherapy and antidepressants as core treatments for depression.
MAXPOOL | 2 years ago | on: Introduction to State Space Models (SSM)
Mamba: Linear-Time Sequence Modeling with Selective State Spaces https://arxiv.org/abs/2312.00752
https://github.com/state-spaces/mamba
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model https://paperswithcode.com/paper/vision-mamba-efficient-visu...
MAXPOOL | 2 years ago | on: Making Real-World Reinforcement Learning Practical [video]
A Walk in the Park: Learning to Walk in 20 Minutes With Model-Free Reinforcement Learning
Grow Your Limits: Continuous Improvement with Real-World RL for Robotic Locomotion
Reset-Free Reinforcement Learning via Multi-Task Learning: Learning Dexterous Manipulation Behaviors without Human Intervention
REBOOT: Reuse Data for Bootstrapping Efficient Real-World Dexterous Manipulation
FastRLAP: A System for Learning High-Speed Driving via Deep RL and Autonomous Practicing: https://sites.google.com/view/fastrlap
Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions: https://qtransformer.github.io/
Deep RL at Scale: Sorting Waste in Office Buildings with a Fleet of Mobile Manipulators: https://rl-at-scale.github.io/
MAXPOOL | 2 years ago | on: Ask HN: Is Knuth's TAOCP worth the time and effort?
MAXPOOL | 2 years ago | on: Ask HN: What's the most compelling AI prompt result you've seen?
CHESS IS A FUN SPORT, WHEN PLAYED WITH SHOT GUNS
COWS FLY LIKE CLOUDS BUT THEY ARE NEVER COMPLETELY SUCCESSFUL.
These are from MegaHAL, which entered the 1998 Loebner Prize Contest. MegaHAL was able to produce mind-blowingly insightful sayings, but most were just bs.
It seems that creativity is easy for computers. Just push randomness through some generative algorithm. Curating and selecting the best output makes all the difference. The ability to select, critique, and understand what is generated and what the meaning is is much harder.
MAXPOOL | 2 years ago | on: Ask HN: What is Q* (Q star) at OpenAI and how does it threaten humanity
Q* might be a name derived from Q-learning and the A* search algorithm.
In that case it would be informed best-first search using reinforcement learning.
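To make "informed best-first search" concrete, here is a small A* sketch on a toy grid. Nothing here is known about the actual Q*; the grid, the walls, and `value_estimate` are made up for illustration. In the speculated reading, the heuristic would come from a learned value function (for example, derived from Q-values); below, plain Manhattan distance stands in for it.

```python
import heapq

def a_star(start, goal, neighbors, heuristic):
    """Return the cost of a cheapest path from start to goal, or None if unreachable."""
    frontier = [(heuristic(start), 0, start)]  # (cost so far + estimate, cost so far, node)
    best_cost = {start: 0}
    while frontier:
        _, cost, node = heapq.heappop(frontier)
        if node == goal:
            return cost
        if cost > best_cost.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was found later
        for nxt, step in neighbors(node):
            new_cost = cost + step
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                heapq.heappush(frontier, (new_cost + heuristic(nxt), new_cost, nxt))
    return None

SIZE = 4
WALLS = {(1, 0), (1, 1), (1, 2)}  # a wall that forces a detour
GOAL = (3, 0)

def grid_neighbors(cell):
    x, y = cell
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nxt = (x + dx, y + dy)
        if 0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE and nxt not in WALLS:
            yield nxt, 1

def value_estimate(cell):
    # Stand-in for a learned cost-to-go estimate; here just Manhattan distance.
    return abs(cell[0] - GOAL[0]) + abs(cell[1] - GOAL[1])

path_cost = a_star((0, 0), GOAL, grid_neighbors, value_estimate)
print(path_cost)
```

With a heuristic that never overestimates the remaining cost, A* returns an optimal path; a learned estimate gives no such guarantee unless it happens to satisfy that condition.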
MAXPOOL | 2 years ago | on: Show HN: Convert any screenshot into clean HTML code using GPT Vision (OSS tool)
The learning algorithm used, backpropagation with stochastic gradient descent, is not a universal learner. It is not guaranteed to find the global minimum.
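A toy illustration of the local-minimum point, using plain (non-stochastic) gradient descent on a one-dimensional non-convex function chosen for this example. The function, learning rate, and starting points are assumptions for the sketch, not anything from a real training setup.

```python
def f(x):
    # Non-convex: a local minimum near x = 1.13 and a lower, global one near x = -1.30.
    return x**4 - 3 * x**2 + x

def grad_f(x):
    return 4 * x**3 - 6 * x + 1

def gradient_descent(x, lr=0.01, steps=2000):
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x

from_right = gradient_descent(2.0)   # settles in the shallower basin
from_left = gradient_descent(-2.0)   # settles in the deeper basin

print(f"start  2.0 -> x = {from_right:.3f}, f(x) = {f(from_right):.3f}")
print(f"start -2.0 -> x = {from_left:.3f}, f(x) = {f(from_left):.3f}")
```

Both runs converge to a point where the gradient is zero, but only the start on the left reaches the lower minimum. Which one you get depends on initialization.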
MAXPOOL | 2 years ago | on: Copy is all you need
Faith and Fate: Limits of Transformers on Compositionality https://arxiv.org/abs/2305.18654
Transformers solve compositional reasoning tasks by reducing multi-step compositional reasoning to linearized subgraph matching, without developing systematic problem-solving skills. They can solve problems when they have the reasoning graphs in memory.
MAXPOOL | 2 years ago | on: Modern language models refute Chomsky’s approach to language
That's Chomsky's argument. A small set of constraints for organizing language.
MAXPOOL | 2 years ago | on: Modern language models refute Chomsky’s approach to language
A 20-year-old human has:
* heard ~220 million words and spoken 50 million words.
* read ~10 million words.
* experienced 420 million seconds of wakeful interaction with the environment (can be used to estimate the limit to conscious decisions, or the number of distinct 'epochs' we experience)
From a machine learning perspective, a human life is a surprisingly small set of inputs and actions, just a blip of existence.
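A back-of-the-envelope check of these figures. The 16 waking hours per day is an assumption made here to reproduce the 420 million seconds; the word counts are taken from the list above.

```python
YEARS = 20
DAYS_PER_YEAR = 365.25
WAKING_HOURS_PER_DAY = 16  # assumption, not stated in the comment

days = YEARS * DAYS_PER_YEAR
waking_seconds = days * WAKING_HOURS_PER_DAY * 3600

words_heard = 220_000_000
words_spoken = 50_000_000
words_read = 10_000_000
total_words = words_heard + words_spoken + words_read

print(f"waking seconds: {waking_seconds:,.0f}")
print(f"words heard per day: {words_heard / days:,.0f}")
print(f"total language exposure: {total_words:,} words")
```

That works out to roughly 30,000 words heard per day, and about 280 million words in total across hearing, speaking, and reading.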
MAXPOOL | 3 years ago | on: Chess Investigation Finds U.S. Grandmaster ‘Likely Cheated’ More Than 100 Times
When there is money in the game, there is an incentive to cheat.
> The report says dozens of grandmasters have been caught cheating on the website, including four of the top-100 players in the world who confessed.
There are probably smart cheaters already playing who are able to evade detection.
MAXPOOL | 3 years ago | on: Paradigms of Artificial Intelligence Programming (1992)
Old AI is today's bleeding-edge computer engineering. There are an enormous number of free lunches for computer engineers and software startups in old-school artificial intelligence.
* Modern SAT solver performance is impressive. They can solve huge problems.
* Writing a complex systems configurator with Prolog or Datalog can be like magic.
* Expert systems. There has never been as much use for them as there is today. Whenever you see expensive systems utilizing a complex mess of "business logic" and expensive consultants, you should know there is a better way.
(I use SAT solvers to partially initialize neural network parameters.)
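For anyone who has not played with SAT, here is a toy DPLL-style solver as a sketch. It is nowhere near a modern solver (those use conflict-driven clause learning and heavy engineering), and the example clauses are made up. Clauses are lists of nonzero integers in the DIMACS convention: `3` means variable 3 is true, `-3` means it is false.

```python
def dpll(clauses, assignment=None):
    """Return a satisfying assignment {var: bool}, or None if unsatisfiable."""
    assignment = dict(assignment or {})
    simplified = []
    for clause in clauses:
        remaining = []
        satisfied = False
        for lit in clause:
            var, want = abs(lit), lit > 0
            if var in assignment:
                if assignment[var] == want:
                    satisfied = True
                    break
            else:
                remaining.append(lit)
        if satisfied:
            continue
        if not remaining:
            return None  # every literal in this clause is false: conflict
        simplified.append(remaining)
    if not simplified:
        return assignment  # all clauses satisfied
    # Prefer a unit clause (forced choice); otherwise branch on the first literal.
    lit = next((c[0] for c in simplified if len(c) == 1), simplified[0][0])
    for value in (lit > 0, not (lit > 0)):
        result = dpll(simplified, {**assignment, abs(lit): value})
        if result is not None:
            return result
    return None

def satisfies(clauses, assignment):
    return all(any(assignment.get(abs(l)) == (l > 0) for l in c) for c in clauses)

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3) and (not x1 or not x2)
clauses = [[1, 2], [-1, 3], [-2, -3], [-1, -2]]
model = dpll(clauses)
print(model, satisfies(clauses, model))

# x1 and not x1 has no solution.
print(dpll([[1], [-1]]))
```

A configurator fits the same mold: each option is a variable, each compatibility rule is a clause, and a returned model is a valid configuration.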
MAXPOOL | 3 years ago | on: AI model finds potential drug molecules a thousand times faster
>... Existing methods are computationally expensive as they rely on heavy candidate sampling coupled with scoring, ranking, and fine-tuning steps. We challenge this paradigm with EquiBind, an SE(3)-equivariant geometric deep learning model performing direct-shot prediction of both i) the receptor binding location (blind docking) and ii) the ligand's bound pose and orientation. ...
MAXPOOL | 3 years ago | on: Economic incentives help explain a longstanding puzzle (Flynn effect)
>The stock of human capital plays an important role in economic growth and prosperity (e.g. Bishop 1989, Toivanen and Väänänen 2013, Aghion et al. 2017). Cognitive scientists studying population trends in measured intelligence have tended to emphasise the role of factors affecting the supply of skills. Our analysis suggests that the mix of skills in a society also evolves in response to the society’s demands and suggests the value of economic reasoning in the study of population trends in measured intelligence.
Money and popularity are orthogonal to pathfinding that leads to breakthroughs.