
Unix system programming in OCaml (2014)

269 points| jxub | 7 years ago |ocaml.github.io

48 comments

[+] themckman|7 years ago|reply
For those interested, some of the underlying libraries that make up Docker for Mac (and, I think, Windows) are written in OCaml (or have components written in OCaml): VPNKit[0], DataKit[1] and HyperKit[2] (qcow2 support is implemented in OCaml).

0: https://github.com/moby/vpnkit

1: https://github.com/moby/datakit

2: https://github.com/moby/hyperkit

[+] dfee|7 years ago|reply
To be fair, GitHub attributes only 1.5% of HyperKit’s code to OCaml.
[+] a0|7 years ago|reply
This is such a great book. OCaml’s type system feels like a superpower, especially in the context of Unix development, which is traditionally done in C.
[+] emersion|7 years ago|reply
I'm contributing to a linker written in ~OCaml [1], and now I understand writing systems code in OCaml is really a bad idea. It makes things more complicated when you really don't want them to be. As James Mickens says [2]:

>You can’t just place a LISP book on top of an x86 chip and hope that the hardware learns about lambda calculus by osmosis.

[1]: https://github.com/rems-project/linksem

[2]: http://scholar.harvard.edu/files/mickens/files/thenightwatch...

[+] phaer|7 years ago|reply
Can you provide an example of the things OCaml makes more complicated, and explain which languages you are comparing it to? I guess the comparison with C, C++ or maybe Rust looks quite different from the comparison with other garbage-collected 'system programming' languages like Go?
[+] rezeroed|7 years ago|reply
Aargh. Downvoting is so lazy. I'd be interested to hear both sides flesh this out. (Yes, I know the rule - don't complain about downvoting -- why do you think there has to be a rule about it‽)
[+] tntn|7 years ago|reply
Only somewhat related, but I've tried to learn OCaml before and have never been able to figure out when to put what kind of delimiters where. Can anyone recommend a resource that explains the usage of the double comma and the like?
[+] laylomo2|7 years ago|reply
The double semicolon only matters in the REPL (aka the "toplevel").

Off the top of my head, I can think of at least two other situations where punctuation matters:

1) Whenever you have nested match cases:

    match left with
    | Some left ->
      match right with
      | Some right -> Some (left, right)
      | None -> None
    | None -> None

The compiler does not use indentation to figure out scoping, so the above gets treated as:

    match left with
    | Some left ->
      match right with
      | Some right -> Some (left, right)
      | None -> None
      | None -> None

So you need to surround the inner match with parens:

    match left with
    | Some left ->
      (
        match right with
        | Some right -> Some (left, right)
        | None -> None
      )
    | None -> None

2) If you are doing imperative things inside an if statement:

Consider the following function:

    let cache_file file =
      if not (file_exists file) then
        printf "downloading file\n";
        download file
      else
        printf "file already exists\n"

The problem here is that OCaml allows an if statement with no else clause. Since the compiler doesn't use indentation to figure out scoping, the code is essentially treated like the following:

    let cache_file file =
      (if not (file_exists file) then printf "downloading file\n");
      download file;
      else (printf "file already exists\n");

Which is just totally wrong: the else is left without a matching if. So in order to do multiple imperative actions within an if block, you should use either parens or begin/end delimiters (which are treated exactly the same as parens):

    let cache_file file =
      if not (file_exists file) then begin
        printf "downloading file\n";
        download file
      end else
        printf "file already exists\n"
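
Both points can be checked with a small self-contained sketch. The names `pair`, `pair'` and `download` are hypothetical stand-ins (the original snippets leave `file_exists` and `download` undefined; `Sys.file_exists` from the standard library is used here in place of `file_exists`). It also shows one alternative for case 1: matching on a tuple, which avoids the nested match entirely.

```ocaml
(* Sketch only: [pair], [pair'] and [download] are hypothetical names. *)

(* 1) Parenthesise the inner match so the last [None] case
      belongs to the outer match. *)
let pair left right =
  match left with
  | Some l ->
    (match right with
     | Some r -> Some (l, r)
     | None -> None)
  | None -> None

(* Alternative: match on a tuple, so there is nothing to nest. *)
let pair' left right =
  match left, right with
  | Some l, Some r -> Some (l, r)
  | _ -> None

(* Stand-in for a real download function. *)
let download file = Printf.printf "pretending to download %s\n" file

(* 2) begin/end groups the two actions inside the [then] branch. *)
let cache_file file =
  if not (Sys.file_exists file) then begin
    print_endline "downloading file";
    download file
  end else
    print_endline "file already exists"
```

Whether to use parens or begin/end is a matter of taste; many people reserve begin/end for imperative blocks and parens for expressions.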
[+] thedufer|7 years ago|reply
I can't think of any time a double-comma would be syntactically correct. Double-semicolons, which may be what you're referring to, are never required (although in some situations they will cause syntax errors to appear closer to the actual problem). They can be placed after all top-level value bindings.

Probably your best bet is to follow an intro series so you see some examples (https://dev.realworldocaml.org/ is the best at the moment, I believe), although if you're so inclined you could just read the BNF for the language (https://caml.inria.fr/pub/docs/manual-ocaml/language.html).
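
As a minimal illustration (the names `greeting` and `shout` are made up for this example), a source file can be written with no double semicolons at all if top-level side effects are bound with `let () = ...`:

```ocaml
(* Top-level definitions in a source file need no ";;". *)
let greeting = "hello"
let shout s = String.uppercase_ascii s

(* A bare top-level expression after a definition is where ";;" would
   otherwise be needed; binding it to () sidesteps that. *)
let () = print_endline (shout greeting)
```

In the toplevel (REPL) you would still type `;;` to tell it that the phrase is finished and should be evaluated.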

[+] richeyryan|7 years ago|reply
Not a direct answer to your question, but you could look at Reason. It is a C-style syntax that layers on top of the OCaml semantics. Writing in Reason is writing OCaml. However, the new syntax makes it more approachable for people coming from C-style languages and removes idiosyncrasies like the double semicolon.

It's still progressing, and more focus has been given to getting it up and running in a compile-to-JS context, but it should be equally capable of systems work, especially as time goes on.

[+] ofrzeta|7 years ago|reply
For some real-world applications, take a look at libguestfs (http://libguestfs.org), a library that can inspect, mount, and edit VM filesystem images. It also includes tools for p2v and v2v migration.
[+] rpcope1|7 years ago|reply
I think one major turn-off, especially for systems programming, was the lack of multicore threading support (i.e. it had no way to get parallelism using threads). Does anyone know if this has changed?
[+] _0w8t|7 years ago|reply
One of the fastest web servers is nginx, which uses child processes (typically one child per core) and asynchronous IO within each child, without any threads. The same model, with a parent process that monitors threadless child workers, is often used in embedded systems. One can do exactly the same in OCaml, and people do.

Surely it requires more code to set things up, especially if the child processes need to communicate over shared memory to max out performance, but the big plus is that the resulting system is much more robust. In case of trouble one can just let a child die, or even use kill -9, and start a new child. This is not possible with most threading implementations. For example, try to recover from out-of-memory in a multithreaded C++/Java/Go etc. application. It is very hard. With threadless children it is almost trivial.
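
A minimal sketch of that parent/worker shape with the standard `Unix` module (link against the unix library). The names `spawn`, `supervise` and `job` are hypothetical, and a real server would run an event loop in each child and respawn failed workers rather than just reporting them:

```ocaml
(* Sketch of the pre-fork worker model; [spawn], [supervise] and [job]
   are hypothetical names, not taken from any framework. *)

(* Fork one child that runs [job id], exiting 0 on success and 1 if
   the job raises. The parent gets the child's pid back. *)
let spawn job id =
  flush_all ();  (* avoid duplicating buffered output in the child *)
  match Unix.fork () with
  | 0 -> (match job id with () -> exit 0 | exception _ -> exit 1)
  | pid -> pid

(* Start [workers] children, wait for each, and return their statuses.
   A long-running supervisor would respawn a worker here instead. *)
let supervise ~workers ~job =
  let pids = List.init workers (spawn job) in
  List.map (fun pid -> snd (Unix.waitpid [] pid)) pids

let () =
  (* Worker 1 crashes on purpose; the parent and worker 0 are unaffected. *)
  let job id = if id = 1 then failwith "crash" in
  supervise ~workers:2 ~job
  |> List.iteri (fun id status ->
       match status with
       | Unix.WEXITED 0 -> Printf.printf "worker %d finished\n" id
       | _ -> Printf.printf "worker %d failed; restart it here\n" id)
```

Because each worker is a separate process, a crash (or a kill -9) shows up in the parent only as a non-zero status from `Unix.waitpid`, which is what makes the restart logic simple.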

[+] ZirconiumX|7 years ago|reply
While this is an issue, there are cooperative threading libraries (mainly Lwt and Async, though the community prefers Lwt) which can mitigate most of the reasons you'd need multithreading. Hell, because it's all single-threaded, you can mostly ignore locking.
[+] toolslive|7 years ago|reply
You really don't want to use the Unix module directly. If you're serious about system programming in OCaml, use Lwt or Async to allow for concurrency.
[+] testaccount7|7 years ago|reply
IIRC in Coders at Work, Brendan Eich talks about hiring a programmer who wrote an OS in OCaml.