dplavery92 | 1 year ago | on: Tencent Hunyuan-Large
dplavery92 | 1 year ago | on: C++ proposal: There are exactly 8 bits in a byte
dplavery92 | 2 years ago | on: Kalman Filter Explained Simply
dplavery92 | 2 years ago | on: Kalman Filter Explained Simply
Also common in robotics applications is the particle filter, which uses a Monte Carlo approximation of the uncertainty in the state rather than enforcing a (Gaussian) distribution, as the traditional Kalman filter does. This can be useful when the dynamics are highly nonlinear and/or your measurement uncertainties are, well, very non-Gaussian. Sebastian Thrun (a Stanford robotics professor in the DARPA "Grand Challenge" days of self-driving cars) made an early Udacity course covering particle filters.
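To make that concrete, here is a minimal particle-filter sketch for a toy 1-D problem. Everything in it (the constant true state, the Laplace measurement model standing in for "very non-Gaussian" noise, the particle count) is an illustrative assumption, not from the course or any particular robotics system:

```python
import numpy as np

# Toy 1-D example: estimate a fixed position from measurements with
# heavy-tailed (Laplace, i.e. non-Gaussian) noise -- the regime where a
# particle filter's sample-based posterior pays off over a Kalman filter.
rng = np.random.default_rng(0)

n_particles = 5000
true_x = 2.5
particles = rng.uniform(-10, 10, n_particles)      # vague uniform prior
weights = np.full(n_particles, 1.0 / n_particles)

def laplace_likelihood(z, x, b=0.5):
    # Measurement model: p(z | x) = Laplace(z; loc=x, scale=b)
    return np.exp(-np.abs(z - x) / b) / (2 * b)

for _ in range(20):
    # Predict: diffuse particles with process noise (trivial dynamics here)
    particles = particles + rng.normal(0.0, 0.05, n_particles)
    # Update: reweight each particle by the likelihood of the measurement
    z = true_x + rng.laplace(0.0, 0.5)
    weights = weights * laplace_likelihood(z, particles)
    weights = weights / weights.sum()
    # Resample when the effective sample size collapses (weight degeneracy)
    if 1.0 / np.sum(weights**2) < n_particles / 2:
        idx = rng.choice(n_particles, size=n_particles, p=weights)
        particles = particles[idx]
        weights = np.full(n_particles, 1.0 / n_particles)

estimate = np.sum(weights * particles)
```

The resample step is what keeps the approximation from degenerating to a single heavily weighted particle; the effective-sample-size test is one common trigger for it.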
dplavery92 | 2 years ago | on: Simulating fluids, fire, and smoke in real-time
dplavery92 | 2 years ago | on: OpenAI's board has fired Sam Altman
In the words of Brandt, "well, Dude, we just don't know."
dplavery92 | 2 years ago | on: UHZ1: NASA telescopes discover record-breaking black hole
dplavery92 | 2 years ago | on: Mars has a layer of molten rock inside
dplavery92 | 2 years ago | on: Medieval staircases were not built going clockwise for the defender's advantage
dplavery92 | 2 years ago | on: A non-mathematical introduction to Kalman filters for programmers
Unlike the OP article, it does make use of the mathematical formalism of Kalman filters, but it is a relatively gentle introduction that does a very good job of visualizing and explaining the intuition behind each term. I have gotten positive feedback (no pun intended!) from interns and junior hires who used this resource to familiarize themselves with the topic.
If you are making a deeper study and are ready to dive into a textbook that explores theory and application more thoroughly, there is a book by Gibbs [1] that I have used in the past and that is well regarded in the segments of industry that rely on these techniques for state estimation and GNC.
[1] https://onlinelibrary.wiley.com/doi/book/10.1002/97804708900...
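For anyone who wants the intuition of "each term" in runnable form, here is a minimal 1-D Kalman filter sketch. It is a toy of my own (constant hidden state, scalar noise variances), not an example from the linked article or from Gibbs; the symbols are the conventional ones (x: state estimate, P: estimate variance, Q/R: process/measurement noise, K: Kalman gain):

```python
import numpy as np

# 1-D Kalman filter tracking a constant state from noisy measurements.
rng = np.random.default_rng(1)

true_x = 10.0
Q, R = 1e-4, 0.25        # process and measurement noise variances (assumed)
x, P = 0.0, 1e3          # vague prior: wrong mean, huge variance

for _ in range(50):
    # Predict: the state model is "x stays put", so only variance grows
    P = P + Q
    # Measure, then update: K in [0, 1] sets how much to trust the measurement
    z = true_x + rng.normal(0.0, np.sqrt(R))
    K = P / (P + R)
    x = x + K * (z - x)  # blend prediction and innovation (z - x)
    P = (1.0 - K) * P    # each update shrinks the estimate variance
```

Early on, P >> R, so K is near 1 and the filter essentially copies the measurements; as P shrinks, K falls and the estimate settles down, which is exactly the gain behavior the linked article visualizes.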
dplavery92 | 2 years ago | on: Like diffusion but faster: The Paella model for fast image generation
From the Paella paper[2]: "Our proposal builds on the two-stage paradigm introduced by Esser et al. and consists of a Vector-quantized Generative Adversarial Network (VQGAN) for projecting the high dimensional images into a lower-dimensional latent space... [w]e use a pretrained VQGAN with an f=4 compression and a base resolution of 256×256×3, mapping the image to a latent resolution of 64×64 indices." After training, in describing their token predictor architecture: "Our architecture consists of a U-Net-style encoder-decoder structure based on residual blocks, employing convolutional[sic] and attention in both, the encoder and decoder pathways."
U-Net, of course, is a convolutional neural network architecture [3]. The "down" and "up" encoder/decoder blocks in the Paella code are batch-normed CNN layers [4].
[1] https://arxiv.org/pdf/2012.09841.pdf [2] https://arxiv.org/pdf/2211.07292.pdf [3] https://arxiv.org/abs/1505.04597 [4] https://github.com/dome272/Paella/blob/main/src/modules.py#L...
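The quoted numbers are internally consistent, and checking them is a one-liner: an f=4 VQGAN downsamples each spatial dimension by a factor of 4, turning a 256×256×3 image into a 64×64 grid of discrete codebook indices (one token per latent cell):

```python
# Sanity check of the quoted Paella/VQGAN figures: f=4 spatial compression
# maps a 256x256x3 image to a 64x64 grid of codebook indices.
H, W, C = 256, 256, 3
f = 4
latent_h, latent_w = H // f, W // f
n_tokens = latent_h * latent_w   # tokens the predictor must fill in
```

So the token predictor operates over 4096 discrete tokens per image, which is where the speed claim relative to pixel-space diffusion comes from.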
dplavery92 | 2 years ago | on: Like diffusion but faster: The Paella model for fast image generation
dplavery92 | 2 years ago | on: Like diffusion but faster: The Paella model for fast image generation
dplavery92 | 2 years ago | on: AI Canon
dplavery92 | 2 years ago | on: Translating Akkadian clay tablets with ChatGPT?
dplavery92 | 3 years ago | on: Llama.cpp 30B runs with only 6GB of RAM now
dplavery92 | 3 years ago | on: C++ Neural Network in a Weekend (2020)
They also get tons of use in results-oriented modeling of many other statistical questions on structured data (home prices, resource allocation, voter turnout, etc.), but in this luddite's opinion, these sorts of applications tend to be pretty fraught when the convenience of the model-training paradigm short-changes a deeper understanding of the data phenomenology.
dplavery92 | 3 years ago | on: US Department of Energy: Fusion Ignition Achieved
dplavery92 | 3 years ago | on: US Department of Energy: Fusion Ignition Achieved
dplavery92 | 3 years ago | on: Demo of =GPT3() as a spreadsheet feature
I get the feeling that my visual system and the language I use are, respectively, pretty bad at processing and at conveying precise information from a plot (beyond simple descriptors like "A is larger than B" or "f(x) has a maximum"). So I guess I would find it only mildly surprising if no vision-language model were able to perform those tasks very well, because the representations in question seem pretty poorly suited to them.
I get that popular diffusion models for image generation do a bad job of composing concepts in a scene and keeping relationships consistent across the image--even if Stable Diffusion could write in human script, it would be a bad bet that the contents of a legend would match the pie chart it drew. But other vision-language models, designed for image captioning or visual question answering rather than for generating diverse, stylistic images, are pretty good at that compositional information (up to, again, the "simple descriptions" level of granularity I mentioned before).