Piezoid's comments

Piezoid | 2 months ago | on: The UK is shaping a future of precrime and dissent management (2025)

Not a word on Palantir. Is this because of adept wording by the Ministry of Justice? I highly doubt they are developing this in a vacuum.

As a reminder, in the UK Palantir holds extensive contracts across defense (multi-billion MoD deals for AI-driven battlefield and intelligence systems) and healthcare (a 7-year, £330m+ NHS Data Platform). In France, its involvement is narrower but concentrated on *domestic* intelligence.

Piezoid | 1 year ago | on: Next generation LEDs are cheap and sustainable

The interstimulus interval (ISI) for vision is much longer than the flicker periods or frame intervals of most displays and projectors. However, flicker can still be perceived through temporal aliasing. For lighting, even simple motion in the scene can reveal flicker: waving your spread fingers in front of your eyes is a sure way to detect it.

What you're describing is likely saccadic masking, where the brain suppresses visual input during eye movements. It "freezes" perception just before a saccade and masks the blur, extending the perception of a "frame" up to the point in time of the sharp onset of masking. That's how you get a still of a partially illuminated frame instead of the blended together colors.

I’m no expert in this, but if you're curious, check out the Wikipedia pages on interstimulus interval, saccadic masking, chronostasis, and related research.

Piezoid | 2 years ago | on: Attention Is Off By One

Implementations usually replace the 1 in the denominator with exp(-max(x)) for this reason: once the logits are shifted by max(x) for numerical stability, the former 1 becomes exp(-max(x)).

Piezoid | 2 years ago | on: Scaling Transformer to 1M tokens and beyond with RMT

In neuroscience, predictive coding [1] is a theory that proposes the brain makes predictions about incoming sensory information and adjusts them based on any discrepancies between the predicted and actual sensory input. It involves simultaneous learning and inference, and there is some research [2] that suggests it is related to back-propagation.

Given that large language models perform some kind of implicit gradient descent during in-context learning, it raises the question of whether they are also doing some form of predictive coding. If so, could this provide insights into how to better leverage stochasticity in language models?

I'm not particularly knowledgeable in the area of probabilistic (variational) inference, so I realize that attempting to draw connections to this topic might be a bit of a stretch.

[1] The free-energy principle: a unified brain theory: <https://www.fil.ion.ucl.ac.uk/~karl/The%20free-energy%20prin...>

[2] Predictive Coding: Towards a Future of Deep Learning beyond Backpropagation?: <https://arxiv.org/abs/2202.09467>

Piezoid | 3 years ago | on: New Ghostscript PDF interpreter

Code reuse is achievable by (mis)using the preprocessor. It is possible to build a somewhat usable API, even for intrusive data structures (e.g. the Linux kernel and klib [1]).

I do agree that generics are required for modern programming, but for some, the complexity cost of modern languages (compared to C) and the importance of compatibility seem to outweigh the benefits.

[1]: http://attractivechaos.github.io/klib
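For illustration, here is a minimal sketch of that macro pattern; the names `DEFINE_VEC` and `vec_int` are made up for this example and are not klib's actual API:

```c
#include <stdlib.h>

/* A klib-style generic container in C: a single function-like macro
 * stamps out a typed dynamic-array struct and a push function for
 * any element type. Error handling is omitted for brevity. */
#define DEFINE_VEC(name, type)                                       \
    typedef struct { type *a; size_t n, cap; } vec_##name;           \
    static void vec_##name##_push(vec_##name *v, type x) {           \
        if (v->n == v->cap) {                                        \
            v->cap = v->cap ? v->cap * 2 : 8;                        \
            v->a = realloc(v->a, v->cap * sizeof(type));             \
        }                                                            \
        v->a[v->n++] = x;                                            \
    }

DEFINE_VEC(int, int)   /* instantiates vec_int and vec_int_push */
```

Each instantiation is fully type-checked by the compiler, which is the main advantage over `void *` containers; the downside is the usual macro pain of poor error messages and no debugger visibility into the expanded code.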

Piezoid | 3 years ago | on: BMList – A list of big pre-trained models (GPT-3, DALL-E2...)

I can think of many specialized applications where this versatility is superfluous, while the model's size prohibits inference at the edge.

Do you know if there are methods available for shrinking a fine-tuned derivative of such big models?

Besides generating a specialized corpus with the big model and then training a smaller model on it, is there a more direct way to reduce the matrix dimensions while optimizing for a more specific inference problem? How far can we scale down before a different network topology is needed?
