lukemerrick's comments
lukemerrick | 1 year ago | on: New attention mechanisms that outperform standard multi-head attention
While the lazy part of me wants them to explain how their approach compares to these other approaches, their exposition looks pretty clear (quite nice for a preprint!), so I guess I'll just have to actually read the paper for real and see for myself.
Given how well I've seen Simplified Transformer blocks work in my own playground experiments, I would not at all be surprised if other related tweaks work out well even on larger-scale models. I wish some of the other commenters here had a bit more curiosity and/or empathy for these two authors, who did a fine job coming up with and initially testing out some worthwhile ideas.
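For anyone who wants to poke at this stuff themselves, here's the kind of bare-bones baseline I start from in those playground experiments: a minimal single-head scaled dot-product attention in plain Julia. To be clear, this is my own sketch, not the paper's mechanism, and the names (Wq, Wk, Wv) are just illustrative.

    using LinearAlgebra

    # Column-wise softmax (each column sums to 1).
    function softmax(x)
        e = exp.(x .- maximum(x; dims=1))
        return e ./ sum(e; dims=1)
    end

    # Standard single-head scaled dot-product attention over tokens stored
    # as the columns of X -- a reference point for tweaks like the ones the
    # Simplified Transformer line of work explores.
    function attention(X, Wq, Wk, Wv)
        Q, K, V = Wq * X, Wk * X, Wv * X
        A = softmax((K' * Q) ./ sqrt(size(Q, 1)))  # attention weights per query
        return V * A                               # weighted sums of values
    end

    d, n = 8, 5                                  # embedding dim, sequence length
    X = randn(d, n)
    Wq, Wk, Wv = randn(d, d), randn(d, d), randn(d, d)
    Y = attention(X, Wq, Wk, Wv)                 # d×n output, one column per token

Strip out a projection or two, re-run, and you have the flavor of experiment I mean.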
lukemerrick | 2 years ago | on: JupyterLab 4.0
I actually started in notebooks and then learned to love the REPL as a simplified "scratchpad notebook." I'd say that in many ways notebooks are an improvement that caters heavily to REPL-lovers, but that for some quick tasks the extra complexity isn't always worth it.
lukemerrick | 2 years ago | on: Building a Lox Interpreter in Julia
I'm not sure how far down the compiler I'll actually enjoy going vs. exploring ideas around type systems, linters, etc. up near the AST level, but if I do venture down, this advice will certainly come in handy!
lukemerrick | 2 years ago | on: Building a Lox Interpreter in Julia
Reconsidering now, it seems that there might be benefits beyond type dispatch to having a typed syntax tree, so maybe I'll give that a shot as a next step!
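To make that concrete, here's roughly what I have in mind -- a toy sketch (my own made-up types, not actual code from the post) where each node of the Lox expression tree gets its own struct and evaluation happens via multiple dispatch rather than the visitor pattern:

    abstract type LoxExpr end

    struct Literal <: LoxExpr
        value::Any
    end

    struct Binary <: LoxExpr
        left::LoxExpr
        op::Symbol
        right::LoxExpr
    end

    # Multiple dispatch stands in for the visitor pattern from Crafting
    # Interpreters: one evaluate method per node type.
    evaluate(e::Literal) = e.value

    function evaluate(e::Binary)
        l, r = evaluate(e.left), evaluate(e.right)
        e.op === :+ ? l + r :
        e.op === :* ? l * r :
        error("unsupported operator: $(e.op)")
    end

    # (1 + 2) * 3
    tree = Binary(Binary(Literal(1), :+, Literal(2)), :*, Literal(3))
    @assert evaluate(tree) == 9

Whether the extra structure pays off beyond dispatch (for the linter/type-system experiments I mentioned, say) is exactly what I'd want to find out.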
lukemerrick | 2 years ago | on: Building a Lox Interpreter in Julia
I'm not sure exactly how Rust and rust-analyzer keep track of the information needed for their excellent error messages and diagnostics, but I wouldn't be surprised if pinpoint messages were not the primary motivation for rust-analyzer's lossless parsing.
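For anyone unfamiliar with the term, here's a toy illustration (my own made-up types, not rust-analyzer's actual design) of what "lossless" buys you: each token keeps its exact source text plus the trivia around it, so the original file can be reconstructed byte-for-byte from the tree.

    struct Token
        kind::Symbol
        leading::String   # trivia: whitespace/comments before the token
        text::String      # the token's exact source text
    end

    # Concatenating trivia + text for every token reproduces the source
    # exactly -- nothing the lexer saw is thrown away.
    detokenize(tokens) = join(t.leading * t.text for t in tokens)

    tokens = [
        Token(:identifier, "", "x"),
        Token(:equals, "  ", "="),
        Token(:number, " ", "1"),
    ]
    @assert detokenize(tokens) == "x  = 1"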
lukemerrick | 3 years ago | on: Show HN: Yaksha Programming Language
For context on where I'm coming from: about two weeks ago I picked up Crafting Interpreters [1] for fun. I'm finding your clear-yet-concise Compiler internals [2] to be particularly compelling reading, and jumping back and forth between those "how this all works" docs and a live example of the language you actually built doing a WASM-compiled tree-blowing-in-the-wind animation is just... just wow. So freaking cool!
I also enjoyed reading the comment thread that inspired you to start on Yaksha and seeing how this project had a wholesome start as inspiration-by-programming-hero. I hope you recognize that, a few years later, you've now ascended from inspiree to inspirer. I also hope you're still having tons of fun building out Yaksha!
[1] https://www.craftinginterpreters.com/
[2] https://yakshalang.github.io/documentation.html#compiler-int...
lukemerrick | 3 years ago | on: Optimizing utility-scale battery storage dispatch
I was torn about whether to share my own post, but I figured the HN crowd might include a few others who will also really geek out about this topic and appreciate it. It's mathematical optimization and forecasting used to guide giant batteries hooked up to the electrical grid, after all.
There is some accompanying code I got to share publicly, too, if you want to run this yourself [1]. While I'm at it, I'll also mention some papers for anyone who wants a true deep dive [2, 3], plus a tiny toy sketch of the formulation below the links.
[1] https://gist.github.com/lukemerrick/4e1f9921a19ec97f7b949909...
[2] Linear Programming for battery optimization -- https://www.osti.gov/servlets/purl/1244909
[3] Mixed-Integer Linear Programming for battery optimization [PDF] -- https://www.sandia.gov/ess-ssl/wp-content/uploads/2018/08/20...
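For anyone who'd rather see the shape of the optimization before reading a paper, here's a deliberately tiny price-arbitrage dispatch LP in JuMP. To be clear, this is not the code from the gist: the numbers are made up, time steps are one hour (so MW and MWh line up), and it assumes perfect price foresight and a lossless battery just to keep it short.

    using JuMP, HiGHS

    prices = [20.0, 15.0, 30.0, 45.0, 25.0]   # $/MWh, one price per hour
    T = length(prices)
    p_max, e_max = 1.0, 2.0                   # MW power limit, MWh energy capacity

    m = Model(HiGHS.Optimizer)
    @variable(m, 0 <= charge[1:T] <= p_max)      # MW bought from the grid
    @variable(m, 0 <= discharge[1:T] <= p_max)   # MW sold back to the grid
    @variable(m, 0 <= soc[0:T] <= e_max)         # state of charge in MWh

    @constraint(m, soc[0] == 0)                  # start empty
    @constraint(m, [t = 1:T], soc[t] == soc[t-1] + charge[t] - discharge[t])

    # Buy low, sell high.
    @objective(m, Max, sum(prices[t] * (discharge[t] - charge[t]) for t in 1:T))

    optimize!(m)
    println("profit: \$", objective_value(m))

A real dispatch model layers on round-trip efficiency, degradation costs, and sometimes integer variables (e.g., to forbid simultaneous charging and discharging), which is where MILP formulations like [3] come in.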
lukemerrick | 3 years ago | on: Show HN: Investorsexchange.jl – parse trade-level stock market data in Julia
For anyone who wants the naming backstory: InvestorsExchange.jl was originally IEXTools.jl, but the automatic name checks in Julia's package registration didn't like it ("Name does not meet all of the following: starts with an uppercase letter, ASCII alphanumerics only, not all letters are uppercase. Name is not at least five characters long") [1]. So off to Wikipedia I went to find the non-acronym name of the IEX exchange, which is "Investors Exchange" [2]. Thank you all for helping me understand why IEX goes by IEX in all of their branding.
lukemerrick | 6 years ago | on: Why are we using black box models in AI when we don’t need to? (2019)
However, in this article and elsewhere, Professor Rudin has cited compelling evidence of cases in which black box models have been demonstrated to be no more accurate than interpretable alternatives. I feel this fairly justifies the question in the article's title. For example, based on the available evidence, it seems reasonable that some onus should lie on the creators and buyers of COMPAS (a proprietary black box recidivism model) to demonstrate that COMPAS actually is more accurate than an interpretable baseline. It may not be the case, as the article seems to suggest, that every modeling problem has an interpretable alternative of comparable accuracy; but in cases where one exists, there doesn't seem to be any justification for using a black box model.
On the matter of "human-style" interpretability, we are brought to the difference between "interpretability" and "explainability." Humans have a complex capacity for constructing explanations for the thoughts and actions of ourselves and others (among other things). As OP points out, many famous psychological experiments by Kahneman and others have shown that much of our reasoning appears to be post-hoc, often biased, and often inaccurate (in other words, human explanations are not actually true, transparent interpretations of our thoughts and actions). However, we humans also have a powerful capacity to evaluate and challenge the explanations presented by others, and we are able to reject bad ones. For those interested, a great book on this topic is "The Enigma of Reason" by Mercier and Sperber (https://www.hup.harvard.edu/catalog.php?isbn=9780674237827), but the gist is that while explanations are not the same as transparent interpretability, they are still useful.
I would conjecture that at some level of complexity (which some predictive tasks, like pixel-to-label image recognition, seem to exhibit), true end-to-end interpretability is not possible -- the best we can do is construct an explanation. However, two very important points should be observed when considering this conjecture:
1. (Professor Rudin's point in the article) In cases that are not too complex for interpretable models to achieve accuracy comparable to that of black box models, we can and should use them, as they offer super-human transparency at no cost in accuracy.
2. Constructing no explanation (or a bad one) is not the same as reaching the level of semi-transparency that humans offer. If we want to use human interpretability as a benchmark, black box models with no explanations are not up to par.
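To make point 1 concrete, the "interpretable alternatives" from Rudin's group are things like rule lists and points-based scoring systems. Here's a completely invented toy in that spirit (made-up features, made-up thresholds -- real systems fit these from data) just to show the form; every prediction can be audited by reading three lines:

    # An invented points-based scoring rule, illustrating the *form* of an
    # interpretable model (in the spirit of Rudin-style scoring systems,
    # but with fabricated features and thresholds).
    function high_risk(priors::Int, age::Int)
        score = 0
        priors >= 3 && (score += 2)   # +2 points for three or more priors
        age < 25    && (score += 1)   # +1 point for age under 25
        return score >= 2             # flag "high risk" at two or more points
    end

    @assert high_risk(4, 30)          # many priors alone crosses the threshold
    @assert !high_risk(0, 40)         # zero points, not flagged

If a model of this form matches a black box's accuracy on a given task, it's hard to see what the black box buys you.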