daanx's comments | WingNews

daanx | 1 year ago | on: Float Self-Tagging

> This idea is so simple, so crazy, so stupid, and works so well, but I never thought of it. Bravo to the authors.

Thanks for the nice summary -- looking forward to read the paper!

The same idea of self-tagging is actually also used in Koka language [1] runtime system where by default the Koka compiler only heap allocates float64's when their absolute value is outside the range [2e-511,2e512) and not 0, infinity, or NaN (see [2]). This saves indeed many (many!) heap allocations for float intensive programs.

Since Koka only uses 1 bit to distinguish pointers from values, another slightly faster option is to only box negative float64's but of course, negative numbers are still quite common so it saves less allocations in general.

[1] https://koka-lang.github.io/koka/doc/book.html#sec-value-typ...

[2] https://github.com/koka-lang/koka/blob/dev/kklib/src/box.c#L...

ps. If you enjoy reading about tagging, I recently wrote a note on efficiently supporting seamless large integer arithmetic (as used in Koka as well) and discuss how certain hardware instructions could really help to speed this up [3]:

[3] https://www.microsoft.com/en-us/research/uploads/prod/2022/0... (ML workshop 2022)

daanx | 1 year ago | on: Damas-Hindley-Milner inference two ways

Haha, thank you for your kind reply :) really enjoyed the blog post as it shows nicely that implementing HM can be straightforward -- many papers on inference are usually quite math heavy which can be a tough read.

About duplicate labels.. one needs to retain the duplicate field at runtime _if_ there is a "remove_l" or "mask_l" operation that drops a field "l". For example, `{x=2,x=True}.remove_x.x` == `True`. (Where the type of `remove_l` is `{l:a|r} -> {r}`)

This comes up with effect systems where we could have 2 exception handlers in scope, and the current effect would be `<exn,exn>` (which corresponds to a runtime evidence vector `evn` of `[exn:h1,exn:h2]` where the h1,h2 point to the runtime exception handlers.). If a user raises an exception it'll select `evn.exn` and raise to `h1`. But a user can also "mask" the inner exception handler and raise directly to `h2` as well as `evn.mask_exn.exn`.

One could design a system though with different primitives and not have a remove or mask operation, such that the duplicate fields do not have to be retained at runtime (I think).

(Anyway, feel free to contact me if you'll like to discuss this more)

daanx | 1 year ago | on: Damas-Hindley-Milner inference two ways

That's a nice overview of Hindley-Milner in practice!

For those interested, I recently have been thinking of a better way to specify type inference with principal derivations that lends itself better for type system extensions:

https://www.microsoft.com/en-us/research/uploads/prod/2024/0...

Still a bit preliminary but hopefully fun to read :-)

daanx | 1 year ago | on: The Essence of Compiling with Continuations (1993) [pdf]

And Andrew Kennedy gave a very nice lecture on this recently as part of Xavier Leroy's seminar on control structures:

https://www.college-de-france.fr/fr/agenda/seminaire/structu...

daanx | 4 years ago | on: Structurally-Typed Condition Handling

Yes -- I think historically the power of condition handling was not well understood and algebraic effect handlers were a "rediscovery" coming from well-studied category theory (Plotkin, Power, and Pretnar).

If you want to play with "structurally typed condition handling", then the Koka language has "row-typed algebraic effect handlers" that compile to C: <http://koka-lang.org>

daanx | 4 years ago | on: All you need is call/cc (2013)

I like the characterization that Andrej Bauer uses: "while relates to goto, as effect handlers to shift/reset" :-)

That is, you indeed need delimited continuations to implement effect handlers, but they have more structure than, say, shift/reset. In particular, instead of being able to just yield (shift) with any function to an innermost context (reset), you can only perform a certain set of operations that are handled by a function defined in the handler.

This gives various advantages, in particular,

- you can give simple types to effect handlers (while shift/reset for example needs an answer type system), and you can use fine grained effect typing (instead of just a "shift" effect, you have "<exn,console>" effects)

- you can do better reasoning as the set of operations is restricted

- and (as a consequence) you have more opportunity to optimize them. In particular, tail-presumptive operations can be implemented to execute "in-place" without capturing the continuation for example.

- finally, different effect handlers are compositional and can be combined freely (unlike shift/reset; for example shift aways shifts to the innermost reset so we cannot compose arbitrarily compose with multiple resets).

daanx | 4 years ago | on: All you need is call/cc (2013)

Yes! a nice aspect of effect handlers is that they have an untyped dynamic semantics.

Having said that, tracking effect types can be very beneficial, especially if you do interesting control flow like async/await for example.

daanx | 4 years ago | on: All you need is call/cc (2013)

The modern way of call/cc and shift/reset is using Effect Handlers instead [1].

If you are interested in this, Ningning Xie and I give a tutorial about effect handlers (and more) at ICFP'21 tomorrow (Thursday 12:30 EST): <https://icfp21.sigplan.org/details/icfp-2021-tutorials/5/Pro...>

[1] <https://koka-lang.github.io/koka/doc/book.html>

daanx | 5 years ago | on: Exotic Programming Ideas, Part 3: Effect Systems

In practice, I have not (yet) found many great use cases for the distinction in Koka. It is nice to have "total" functions, but "pure" (exceptions+divergence) is still a good thing (and what Haskell gives you). And like you say, in practice we can easily have functions that just take a long long time to compute.

Still, it is a good extra check and I can see more use for the `div` effect for future verification tools where total functions can be used as a predicates (but non-terminating ones cannot).

daanx | 5 years ago | on: Exotic Programming Ideas, Part 3: Effect Systems

Just to add to this: Koka can (obviously :-)) not always determine if a function will terminate or not so it generally adds a `div` effect whenever there is the possibility of infinite recursion.

However, since most data types in Koka are inductive, any recursion over such inductive data types are still inferred to be always terminating.

In practice, it looks like about 70% of a typical program can usually be `total`, with 20% being `pure` (which is exceptions + divergence as in Haskell), and the final 10% being arbitrary side effects in the `io` effect.

daanx | 5 years ago | on: Exotic Programming Ideas, Part 3: Effect Systems

Ah, I think Stephen meant to write the following:

  fun add-refs( a : ref<h,int>, b : ref<h,int> ) : st<h> int {
    a := 10
    b := 20
    (!a + !b)
  }

where indeed the effect is `st<h>` as the updates are observable. How the function was written before, the two arguments use a "rank-2" polymorphic type and the heaps are fully abstract -- in that case it would be unobservable but you cannot create such values :-)

daanx | 5 years ago | on: Exotic Programming Ideas, Part 3: Effect Systems

(Daan here, creator of [Koka](https://github.com/koka-lang/koka)

This is an interesting point and comes down to the question -- what is an effect really? I argue that effect types tell you the type signature of the mathematical function that models your program (the denotational semantics). For example, the function

  fun sqr(x : int) : total int { x*x }

has no effect at all. The math function that gives semantics to `sqr` would have a type signature that can be directly derived from the effect type:

  [[int -> total int]]  = Z -> Z

(with Z the set of integers). Now, for a function that raises an exception, it would be:

  [[int -> exn int]] = Z -> (Z + 1)

That is, either an integer (Z), or (+), a unit value (1) if an exception is raised. Similarly, a function that modifies a global heap h, would look like:

  [[int -> st<h> int]] =  (H,Z) -> (H,Z)

that is, it takes an initial heap H (besides the integer) and returns an updated heap with the result.

Now, non-termination as an effect makes sense: a function like "turing-machine" may diverge, so:

  [[int -> div int]] = Z -> Z_\bot

That is, its mathematical result is the set of integers (Z) together with an injected bottom value that represents non-termination. (note: We don't use "Z + \bot" here since we cannot distinguish if a function is not terminating or not (in contrast to exceptions)).

In a language like Haskell every value may not terminate or cause an exception -- that is a value `x :: Int` really has type `Int_\bot`, and we cannot replace for example `x*0` with `0` in Haskell.

Note that in almost all other languages, the semantic function is very complex with a global heap etc, any function `[[int -> int]]` becomes something like `(H,Z_32) -> (H,(Z_32 + 1))_\bot`. This is the essence of why it much harder for compilers and humans to reason about such functions (and why effect types can really help both programmers and compilers to do effective local reasoning)

daanx | 5 years ago | on: Safe-Linking – Eliminating a 20 year-old malloc() exploit primitive

Interesting article -- thanks! I see they used the `key ^ P` encoding where the `key` is `L >> PAGESHIFT` to use the ASLR randomized bits from the free list position `L`.

In [mimalloc](https://github.com/microsoft/mimalloc) we use a similar strategy to protect the free list in secure mode. However, there are some weaknesses associated with using a plain _xor_ -- if the attacker can guess `P` that reveals the key immediately (`P^key^P == key`) or even if there is a read overflow, and the attacker can read multiple encodings, they can xor with each other, say `(P1^key)^(P2^key)` and then we have`(P1^P2)` which may reveal information about the pointers (like alignment).

What we use in _mimalloc_ instead is two keys `key1` and `key2` and encode as `((P^key2)<<<key1)+key1`. Since these operations are not associative, the above approaches do not work so well any more even if the `P` can be guesstimated. For example, for the read case we can subtract two entries to discard the `+key1` term, but that leads to `((P1^key2)<<<key1) - ((P2^key2)<<<key1)` at best. (We include the left-rotation since xor and addition are otherwise linear in the lowest bit).

Just some thoughts. Of course, this may be too much for the use-case. However, we found just little extra overhead for the extra operations (as most programs are dominated by memory access) so it may be of benefit.

daanx | 7 years ago | on: Expresso: A simple expressions language with polymorphic extensible row types

Thanks :-) TREX (by Mark Jones and Benedict Gaster) is great. The beauty is that a lacks constraints can get translated to fields offsets at runtime making it super efficient!

E.g. a type like:

foo :: r/x => { x :: int | r } -> int

foo r = r.x

Gets translated at runtime to a function where the lacks constraint r/x becomes an actual offset parameter.

daanx | 7 years ago | on: Expresso: A simple expressions language with polymorphic extensible row types

Looks like a fun project, very nice :-)

For those interested in more of the beautiful theory on extensible row types, this project seems to be based on an earlier paper I wrote on scoped labels: https://www.microsoft.com/en-us/research/wp-content/uploads/...

daanx | 8 years ago | on: Implementing Algebraic Effects in C

I tried to cite much related work on co-routines and threading libraries in C -- including Simon Tatham's page on co-routines in C [1] -- But I was not yet aware of the work on Icon nor did I know co-routines were used in PuTTY -- thanks! Although now that I reread Simon's page I see he actually mentions PuTTY explicitly :-)

As an aside, I feel algebraic effect handlers are safer to use than most coroutine or shift/reset implementations because they offer more structure. As put by others: "Algebraic Effects+Handlers" are to "delimited continuations" as "while" is to "goto"

[1]: https://www.chiark.greenend.org.uk/~sgtatham/coroutines.html

daanx | 8 years ago | on: Implementing Algebraic Effects in C

Hi, I am the author of this report -- thank you for the interest :-)

Just wanted to add that you can find the library at: https://github.com/koka-lang/libhandler

The `dev` branch contains a sample of `libuv` integration. There is still a lot to be done in terms of creating a better interface and providing implementations of standard effects but the core is working quite well by now.

Enjoy!