fwilliams's comments
fwilliams | 1 month ago | on: Ask HN: Share your personal website
fwilliams | 1 year ago | on: ScanAllFish
Source code is here: https://github.com/fwilliams/unwind
Paper is here: https://arxiv.org/abs/1904.04890
It got accepted to CHI 2020, which was cancelled, so the paper sadly never got presented!
fwilliams | 3 years ago | on: Numba: A High Performance Python Compiler
[1] https://github.com/pybind/pybind11 [2] https://github.com/fwilliams/numpyeigen
fwilliams | 3 years ago | on: The US-Canada border cuts through the Haskell Free Library and Opera House
I grew up in Stanstead. I have fond memories of story time as a child in the library, borrowing movies and comic books, and playing age of empires 2 with my best friend on the two shared computers in the front room.
There’s also a street in the town (aptly named Canusa St.) which is half in the US and half in Canada. Interestingly, the houses on one side have flags reminding you where you are, while there are no flags on the other. Figuring out which side is which is left as an exercise for the reader ;)
fwilliams | 3 years ago | on: Show HN: VoxelChain – An Experimental Voxel Engine
Unfortunately it doesn't work on Firefox on Ubuntu 20.04 with an NVIDIA RTX 3090 Ti. :(
fwilliams | 3 years ago | on: A Case of Plagiarism in Machine Learning Research
For example (Emphasis mine):
> The risks of data memorization, for example, the ability to extract sensitive data such as valid phone numbers and IRC usernames, are highlighted by Carlini et al. [41]. While their paper identifies 604 samples that GPT-2 emitted from its training set, we show that over 1% of the data most models emit is memorized training data. In computer vision, memorization of training data has been studied from various angles for both discriminative and generative models. [...] Deduplicating training data does not hurt perplexity: models trained on deduplicated datasets have no worse perplexity compared to baseline models trained on the original datasets. In some cases, deduplication reduces perplexity by up to 10%. Further, because recent LMs are typically limited to training for just a few epochs [...]
fwilliams | 3 years ago | on: A Case of Plagiarism in Machine Learning Research
> But even putting aside the fact that claiming someone else's writing as one's own is wrong, the value in survey papers is in how they re-frame the field. A survey paper that just copies directly from the prior paper hasn't contributed anything new to the field that couldn't be obtained from a list of references.
Good survey papers can be important contributions in their own right (e.g. [1]). A good survey should contextualize works within a subject area with respect to each other and identify high level trends/ideas in that subject. These connections are not only useful for learning a topic, but also for positioning novel work or identifying under-researched areas to focus on.
If the authors felt that one of the papers they plagiarized concisely expressed what they wanted to say, they could simply quote and cite that work. Otherwise, it could be construed that the authors are claiming to be the ones drawing the conclusions they wrote. Moreover, from the article, the survey in question seems to be pretty egregiously plagiarizing, which deserves to be called out/shamed.
fwilliams | 4 years ago | on: Ask HN: How do you find funds to invest in?
My time horizon is longer than 5 years, and I buy broad market index funds split up as follows: 55% US large cap (e.g. VIIIX, VTSAX, SWTSX), 15% US mid cap (e.g. VMCPX), 10% US small cap (e.g. VSCPX), and 20% international (e.g. VTSNX, SWISX, VXUS).
I also highly recommend dollar cost averaging, i.e. buying a fixed dollar amount of your portfolio at fixed intervals. I have my bank do this automatically every 2 weeks. The benefit of dollar cost averaging is that (1) it takes the emotion out of investing, and (2) over a long time window, more of your assets will be purchased at low prices than at high prices (because you're buying a fixed dollar amount every N days, you will buy fewer shares when prices are high and more when prices are low).
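To make point (2) concrete, here's a minimal sketch with made-up prices (the prices and budget are hypothetical, not a recommendation): investing a fixed dollar amount each period means your average cost per share is the harmonic mean of the prices you paid, which is always at or below their simple average.

```python
# Dollar cost averaging with a fixed dollar budget per period.
# Hypothetical per-share prices over five periods:
prices = [100.0, 80.0, 125.0, 50.0, 100.0]
budget_per_period = 500.0  # fixed dollars invested each period

# A fixed dollar budget buys budget/price shares each period,
# so low prices automatically buy more shares than high prices.
total_shares = sum(budget_per_period / p for p in prices)
total_spent = budget_per_period * len(prices)

avg_cost_per_share = total_spent / total_shares   # harmonic mean of prices
avg_price = sum(prices) / len(prices)             # arithmetic mean of prices

print(f"average cost per share: {avg_cost_per_share:.2f}")  # ~82.64
print(f"average market price:   {avg_price:.2f}")           # 91.00
```

The average cost per share comes out below the average market price precisely because the fixed budget skews share purchases toward the cheap periods.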
fwilliams | 4 years ago | on: The illustrated guide to a Ph.D. (2010)
During my PhD, I got to spend time learning, and attending talks/seminars/conferences. Gaining deeper background knowledge in my field as well as learning how to quickly evaluate and explore new ideas gave me the tools to have the type of job I wanted. I'm a research scientist at an industrial lab now and quite enjoy it.
That being said, I agree with the grandparent post that doing a PhD can be a grueling experience. I had to carry the bulk of the work for many of the papers I submitted. If I took a day off, nobody would pick up the slack. Tight deadlines meant the only way to succeed was putting in long hours. My advisors were also spread very thin so it was difficult to get a lot of time with them. There were times when I felt very alone. This was a really stark contrast to how collaborative engineering in industry was and I don't think I ever fully adjusted to it. My current job feels like a happy middle ground. I publish papers alongside other people and we split the work.
fwilliams | 5 years ago | on: My 90s TV: Browse 90s Television
fwilliams | 6 years ago | on: An Overview of the Python Tooling Landscape
fwilliams | 7 years ago | on: Gradient Descent Finds Global Minima of Deep Neural Networks
The major contribution of the work is showing that ResNet needs a number of parameters which is polynomial in the dataset size to converge to a global optimum in contrast to traditional neural nets which require an exponential number of parameters.
fwilliams | 7 years ago | on: Universal Method to Sort Complex Information Found
fwilliams | 7 years ago | on: Autopsy of a deep learning paper
Section 5 claims "the corresponding CoordConv GAN model generates objects that better cover the 2D Cartesian space while using 7% of the parameters of the conv GAN". There isn't really any quantitative analysis beyond a couple of small graphs. Sections 7.2 and 7.3 visually compare the generator's outputs for interpolated noise vectors in the latent space. The results look good, but without quantitative analysis they are very preliminary.
Generative modeling is tricky, and I think in your first comment the jump from a few nice images to the claim that CoordConv can "significantly improve the quality of the representations" is a big one given the sparsity of evidence in the paper. I'm not saying you're wrong, but your original comment seemed a bit misleading to me.
fwilliams | 7 years ago | on: Autopsy of a deep learning paper
These images look impressive, but without a proper in-depth analysis, more general claims of improvement on a task are hard to make. And while it's totally possible that the improvements are significant in this case, it's dangerous to extrapolate from just a few examples in a paper.
fwilliams | 7 years ago | on: Autopsy of a deep learning paper
This work shows that the network architecture is a strong enough prior to effectively learn this set of tasks from a single image. Note that there is no pretraining here whatsoever.
More to your point, I think a big problem with toy tasks is not so much the tasks but the datasets. A lot of datasets (particularly in my field of geometry processing) have a tremendous amount of bias towards certain features.
A lot of papers will show their results trained and evaluated on some toy dataset. Maybe their claim is that using such-and-such a feature as input improves test performance on such-and-such problem and dataset.
The problem with these papers often comes when you try to generalize to data that is similar but not from the toy dataset. A lot of applied ML papers fail to even moderately generalize, and the authors almost never test or report this failure. As a result, I think we can spend a lot of time designing over-fitted solutions to certain problems and datasets.
On the flip side, there are plenty of good papers which do careful analysis of their methods' ability to generalize and solve a problem, but when digging through the literature it's important to be judicious. I've wasted time testing methods that turn out to work very poorly.
fwilliams | 7 years ago | on: Why ActivityPub is the future
fwilliams | 7 years ago | on: Understanding Machine Learning: From Theory to Algorithms
As others have mentioned this is a fairly theoretical take on machine learning which may not be useful if you just want to use a deep learning library. That said, I think there is a lot of value in having a deeper theoretical grasp of a topic even when practicing.
fwilliams | 8 years ago | on: The differences between tinkering and research (2016)
In machine learning and computer vision, the default is to put your work on ArXiv immediately upon completion. There is a lot of openly available research depending on the field. I find most fields in computer science are good for this.
In the case of the Mario example, I very much doubt that the author was not able to find other work because of closed journals.
Research can seem non-transparent to non-researchers because when problems are new, they are often poorly understood. Academic papers discuss novel problems and contextualize them based on other cutting edge work. Reading and understanding these papers requires a lot of context and takes time. Research is a skill that takes years to learn.
After some time has passed and we collectively gain a better understanding of a problem, academic papers may seem abstruse and overly complicated, but when these works were first published, this was the best way to understand them. For somebody looking for a recipe-style solution to a problem, an academic paper is likely not the ideal place to look, which is why we write books, blog posts, etc. as we come to better understand a problem and its solutions.
I also own https://stonks.money and am looking for good ideas for what to do with it