daanx | 1 year ago | on: Float Self-Tagging
daanx's comments
daanx | 1 year ago | on: Damas-Hindley-Milner inference two ways
About duplicate labels.. one needs to retain the duplicate field at runtime _if_ there is a "remove_l" or "mask_l" operation that drops a field "l". For example, `{x=2,x=True}.remove_x.x` == `True`. (Where the type of `remove_l` is `{l:a|r} -> {r}`)
This comes up with effect systems where we could have 2 exception handlers in scope, and the current effect would be `<exn,exn>` (which corresponds to a runtime evidence vector `evn` of `[exn:h1,exn:h2]` where the h1,h2 point to the runtime exception handlers.). If a user raises an exception it'll select `evn.exn` and raise to `h1`. But a user can also "mask" the inner exception handler and raise directly to `h2` as well as `evn.mask_exn.exn`.
One could design a system though with different primitives and not have a remove or mask operation, such that the duplicate fields do not have to be retained at runtime (I think).
(Anyway, feel free to contact me if you'll like to discuss this more)
daanx | 1 year ago | on: Damas-Hindley-Milner inference two ways
For those interested, I recently have been thinking of a better way to specify type inference with principal derivations that lends itself better for type system extensions:
https://www.microsoft.com/en-us/research/uploads/prod/2024/0...
Still a bit preliminary but hopefully fun to read :-)
daanx | 1 year ago | on: The Essence of Compiling with Continuations (1993) [pdf]
https://www.college-de-france.fr/fr/agenda/seminaire/structu...
daanx | 4 years ago | on: Structurally-Typed Condition Handling
If you want to play with "structurally typed condition handling", then the Koka language has "row-typed algebraic effect handlers" that compile to C: <http://koka-lang.org>
daanx | 4 years ago | on: All you need is call/cc (2013)
That is, you indeed need delimited continuations to implement effect handlers, but they have more structure than, say, shift/reset. In particular, instead of being able to just yield (shift) with any function to an innermost context (reset), you can only perform a certain set of operations that are handled by a function defined in the handler.
This gives various advantages, in particular,
- you can give simple types to effect handlers (while shift/reset for example needs an answer type system), and you can use fine grained effect typing (instead of just a "shift" effect, you have "<exn,console>" effects)
- you can do better reasoning as the set of operations is restricted
- and (as a consequence) you have more opportunity to optimize them. In particular, tail-presumptive operations can be implemented to execute "in-place" without capturing the continuation for example.
- finally, different effect handlers are compositional and can be combined freely (unlike shift/reset; for example shift aways shifts to the innermost reset so we cannot compose arbitrarily compose with multiple resets).
daanx | 4 years ago | on: All you need is call/cc (2013)
Having said that, tracking effect types can be very beneficial, especially if you do interesting control flow like async/await for example.
daanx | 4 years ago | on: All you need is call/cc (2013)
If you are interested in this, Ningning Xie and I give a tutorial about effect handlers (and more) at ICFP'21 tomorrow (Thursday 12:30 EST): <https://icfp21.sigplan.org/details/icfp-2021-tutorials/5/Pro...>
daanx | 5 years ago | on: Exotic Programming Ideas, Part 3: Effect Systems
Still, it is a good extra check and I can see more use for the `div` effect for future verification tools where total functions can be used as a predicates (but non-terminating ones cannot).
daanx | 5 years ago | on: Exotic Programming Ideas, Part 3: Effect Systems
However, since most data types in Koka are inductive, any recursion over such inductive data types are still inferred to be always terminating.
In practice, it looks like about 70% of a typical program can usually be `total`, with 20% being `pure` (which is exceptions + divergence as in Haskell), and the final 10% being arbitrary side effects in the `io` effect.
daanx | 5 years ago | on: Exotic Programming Ideas, Part 3: Effect Systems
fun add-refs( a : ref<h,int>, b : ref<h,int> ) : st<h> int {
a := 10
b := 20
(!a + !b)
}
where indeed the effect is `st<h>` as the updates are observable. How the function was written before, the two arguments use a "rank-2" polymorphic type and the heaps are fully abstract -- in that case it would be unobservable but you cannot create such values :-)daanx | 5 years ago | on: Exotic Programming Ideas, Part 3: Effect Systems
This is an interesting point and comes down to the question -- what is an effect really? I argue that effect types tell you the type signature of the mathematical function that models your program (the denotational semantics). For example, the function
fun sqr(x : int) : total int { x*x }
has no effect at all. The math function that gives semantics to `sqr` would have a type signature that can be directly
derived from the effect type: [[int -> total int]] = Z -> Z
(with Z the set of integers). Now, for a function that raises an exception, it would be: [[int -> exn int]] = Z -> (Z + 1)
That is, either an integer (Z), or (+), a unit value (1) if an exception is raised. Similarly, a function that modifies a global heap h, would look like: [[int -> st<h> int]] = (H,Z) -> (H,Z)
that is, it takes an initial heap H (besides the integer) and returns an updated heap with the result.Now, non-termination as an effect makes sense: a function like "turing-machine" may diverge, so:
[[int -> div int]] = Z -> Z_\bot
That is, its mathematical result is the set of integers (Z) together with an injected bottom value that represents non-termination. (note: We don't use "Z + \bot" here since we cannot distinguish if a function is not terminating or not (in contrast to exceptions)).In a language like Haskell every value may not terminate or cause an exception -- that is a value `x :: Int` really has type `Int_\bot`, and we cannot replace for example `x*0` with `0` in Haskell.
Note that in almost all other languages, the semantic function is very complex with a global heap etc, any function `[[int -> int]]` becomes something like `(H,Z_32) -> (H,(Z_32 + 1))_\bot`. This is the essence of why it much harder for compilers and humans to reason about such functions (and why effect types can really help both programmers and compilers to do effective local reasoning)
daanx | 5 years ago | on: Safe-Linking – Eliminating a 20 year-old malloc() exploit primitive
In [mimalloc](https://github.com/microsoft/mimalloc) we use a similar strategy to protect the free list in secure mode. However, there are some weaknesses associated with using a plain _xor_ -- if the attacker can guess `P` that reveals the key immediately (`P^key^P == key`) or even if there is a read overflow, and the attacker can read multiple encodings, they can xor with each other, say `(P1^key)^(P2^key)` and then we have`(P1^P2)` which may reveal information about the pointers (like alignment).
What we use in _mimalloc_ instead is two keys `key1` and `key2` and encode as `((P^key2)<<<key1)+key1`. Since these operations are not associative, the above approaches do not work so well any more even if the `P` can be guesstimated. For example, for the read case we can subtract two entries to discard the `+key1` term, but that leads to `((P1^key2)<<<key1) - ((P2^key2)<<<key1)` at best. (We include the left-rotation since xor and addition are otherwise linear in the lowest bit).
Just some thoughts. Of course, this may be too much for the use-case. However, we found just little extra overhead for the extra operations (as most programs are dominated by memory access) so it may be of benefit.
daanx | 7 years ago | on: Expresso: A simple expressions language with polymorphic extensible row types
E.g. a type like:
foo :: r/x => { x :: int | r } -> int
foo r = r.x
Gets translated at runtime to a function where the lacks constraint r/x becomes an actual offset parameter.
daanx | 7 years ago | on: Expresso: A simple expressions language with polymorphic extensible row types
For those interested in more of the beautiful theory on extensible row types, this project seems to be based on an earlier paper I wrote on scoped labels: https://www.microsoft.com/en-us/research/wp-content/uploads/...
daanx | 8 years ago | on: Implementing Algebraic Effects in C
As an aside, I feel algebraic effect handlers are safer to use than most coroutine or shift/reset implementations because they offer more structure. As put by others: "Algebraic Effects+Handlers" are to "delimited continuations" as "while" is to "goto"
[1]: https://www.chiark.greenend.org.uk/~sgtatham/coroutines.html
daanx | 8 years ago | on: Implementing Algebraic Effects in C
Just wanted to add that you can find the library at: https://github.com/koka-lang/libhandler
The `dev` branch contains a sample of `libuv` integration. There is still a lot to be done in terms of creating a better interface and providing implementations of standard effects but the core is working quite well by now.
Enjoy!
Thanks for the nice summary -- looking forward to read the paper!
The same idea of self-tagging is actually also used in Koka language [1] runtime system where by default the Koka compiler only heap allocates float64's when their absolute value is outside the range [2e-511,2e512) and not 0, infinity, or NaN (see [2]). This saves indeed many (many!) heap allocations for float intensive programs.
Since Koka only uses 1 bit to distinguish pointers from values, another slightly faster option is to only box negative float64's but of course, negative numbers are still quite common so it saves less allocations in general.
[1] https://koka-lang.github.io/koka/doc/book.html#sec-value-typ...
[2] https://github.com/koka-lang/koka/blob/dev/kklib/src/box.c#L...
ps. If you enjoy reading about tagging, I recently wrote a note on efficiently supporting seamless large integer arithmetic (as used in Koka as well) and discuss how certain hardware instructions could really help to speed this up [3]:
[3] https://www.microsoft.com/en-us/research/uploads/prod/2022/0... (ML workshop 2022)