top | item 9677758

The limits of type theory: computation vs. interaction

81 points | danghica | 10 years ago | researchblogs.cs.bham.ac.uk | reply

66 comments

[+] paganel|10 years ago|reply
I'm in no way a computer-science expert, just a mere programmer, but by the looks of it this discussion all boils down to the (quite big, I'd say) impedance mismatch between real life ("computer programs" in this article) and us humans trying to understand said life and apply rules to it (in this case by applying "type theory").

For example, looking at this:

>>> 0 if 1 else 'a'

made me remember one of my older thoughts on this subject, more exactly whether the word 2/two in a sentence like "there are two apples on the table" should be treated as an Integer or a String. Some would say that we should only treat it as an Integer if we intend to do computations on it (for example adding those 2 apples to another 3 apples), to which I'd ask: how come the word's essence (its "transformation" from a String to an Integer) changes depending on the type of action we intend to apply to it? (so to speak, English is not my primary language).
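The distinction can be seen directly in Python (an illustrative sketch; the variable names are mine): the same token only "becomes" a number once we choose to apply a numeric operation to it.

```python
# The token "2" from the sentence, as written: a string.
word = "2"
# The same symbol, reinterpreted because we intend to do arithmetic.
count = int(word)

print(word + word)   # "22" -- string concatenation
print(count + 3)     # 5    -- integer addition
```

Nothing about the symbol itself changed; only the operation we intended to apply to it did.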

Anyway, things in this domain are damn complicated, and have been so for the last 2500 years at least, starting with Socrates (or was it Plato, in fact?) asking about the essence/idea of a table (i.e. about its type): is a table still a table if it only has 3 legs remaining? What about two? It continues with Aristotle's categories and, closer to our days, with Frege and Cantor (somehow the table problem can be connected to the latter's continuum hypothesis, but I digress). So one of my points is that this is not only a computer science problem.

[+] Retra|10 years ago|reply
That 'table' problem is not that complicated once you understand the notions of 'approximation' and 'continuity'. A table with three legs is just a better approximation to what is meant by "table" than one with two, or worse than one with four.

It basically leads to things like this:

http://en.wikipedia.org/wiki/Domain_theory

[+] dllthomas|10 years ago|reply
"I'd ask how come the word's essence (its "transformation" from a String to an Integer) changes depending on the type of action we intend to apply on it?"

I think the problem is viewing "type" as "essence". Type is all about how we intend to use it, and which invariants we wish to enforce.
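Python's `typing.NewType` makes this concrete (a sketch; `UserId` and `Port` are invented names): the runtime value is identical, and the types only record intended use and the invariant a checker should enforce.

```python
from typing import NewType

# Two "types" over the same underlying int. They differ only in how we
# intend to use the value, not in any essence of the value itself.
UserId = NewType("UserId", int)
Port = NewType("Port", int)

uid = UserId(42)
port = Port(42)   # identical runtime value, different intended use

# At runtime both are just 42; a static checker like mypy would reject
# passing a UserId where a Port is expected, enforcing the invariant.
assert uid == 42 and port == 42
```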

[+] dschiptsov|10 years ago|reply
I wouldn't say that heterogeneous ifs are silly for languages with explicitly type-tagged data (Lisps), where a value has a type rather than a variable (actually, there are no variables, only bindings: symbols acting as pointers to type-tagged values). Moreover, I would say that homogeneous conds and cases (pattern matching) with a "default" branch are way more silly.

Also I am very surprised by such a statement from the author of several papers with the word "type" in the title.

I also would argue that the design of CPUs such that any sequence of bits can be loaded into any register and interpreted depending on the current context, instead of having typed registers, is not a shortcoming but a fundamental feature. It makes code small and elegant. The whole of TAOCP shows how.

[+] userbinator|10 years ago|reply
> I also would argue that the design of CPUs such that any sequence of bits can be loaded into any register and interpreted depending on the current context, instead of having typed registers, is not a shortcoming but a fundamental feature. It makes code small and elegant.

Indeed, the fact that digital data is fundamentally "untyped" and its meaning depends entirely on interpretation is the entire reason why computers are so powerful and general-purpose.
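You can demonstrate this from Python with the `struct` module (an illustrative sketch): the very same 32 bits read as an unsigned integer or as an IEEE-754 float give completely different values, because the meaning lives in the interpretation.

```python
import struct

# Four raw bytes, written as a 32-bit little-endian unsigned integer.
bits = struct.pack("<I", 0x40490FDB)

as_int = struct.unpack("<I", bits)[0]    # read back as an unsigned int
as_float = struct.unpack("<f", bits)[0]  # the same bits, read as a float

print(hex(as_int))   # 0x40490fdb
print(as_float)      # approximately pi: those bits happen to encode it
```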

[+] stevan|10 years ago|reply
I'd disagree with the two statements:

> People who think they understand something about the theory of programming languages, including me, tend to agree that what Python does is wrong.

> In fact you can program heterogeneous lists in dependently typed languages, but it’s unreasonably complicated.

Here's some Agda code that shows that both heterogeneous if statements and lists make sense and are easy to work with in type theory.

    data Bool : Set where
      true false : Bool

    if : ∀ {ℓ} {P : Bool → Set ℓ} (b : Bool) → P true → P false → P b
    if true  t f = t
    if false t f = f

    postulate Char : Set
    {-# BUILTIN CHAR Char #-}

    data ℕ : Set where
      zero : ℕ
      suc  : ℕ → ℕ

    {-# BUILTIN NATURAL ℕ #-}

    ex₁ : ℕ
    ex₁ = if {P = λ b → if b ℕ Char} true 0 'a'

    infixr 5 _∷_

    data List {a} (A : Set a) : Set a where
      []  : List A
      _∷_ : A → List A → List A

    data HList : List Set → Set where
      []  : HList []
      _∷_ : ∀ {A As} → A → HList As → HList (A ∷ As)

    ex₂ : HList (ℕ ∷ Char ∷ (ℕ → ℕ) ∷ [])
    ex₂ = 1 ∷ 'a' ∷ suc ∷ []
[+] danghica|10 years ago|reply
I am aware of this -- I teach Agda :) But I think Agda is not quite as accessible as Python, that's what I meant. Cheers!
[+] murbard2|10 years ago|reply
It's great to have an example, but could you add inline comments for those of us who aren't familiar with Agda's syntax?
[+] ngrilly|10 years ago|reply
How can you compare that code with the Python one-liner in the original article? It's interesting, of course, but it's not "easy to work with".
[+] mafribe|10 years ago|reply
Maybe the author plans to write about it in later parts of the article, but it's misleading to assume that (1) there is no work on types for interacting processes and (2) that types always have to be based on some underlying ideas from functional programming.

Much recent research in programming languages is about types for interacting processes. The most well-known, but by no means only, example are the session types pioneered by K. Honda. The key idea is that each process has a bunch of interaction points, and the interaction at each interaction point is constrained by a type. Such types typically specify:

- In what direction does data flow? E.g. channel x is used only for input, while channel y is used only for output.

- What kind of data is exchanged on the channel? E.g. x is used to exchange a boolean, while y exchanges pairs, the first component of which is a double and the second a channel name used for inputting a string.

- How often is a channel used? E.g. x is linear (used exactly once), y is affine (used at most once) while z can be used an arbitrary number of times.

This setup has been investigated from many angles, and one of the most beautiful results in this space is that normal types known from lambda-calculus (e.g. function space) can be recovered precisely as special cases of such interaction types, using Milner's well-known encoding of functional computation as interaction.
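To make the idea tangible, here is a toy runtime sketch in Python (purely illustrative: real session types are checked statically, and the `Channel` class and its protocol encoding are my own invention, not Honda's calculus). Each channel carries a protocol listing the direction and payload type of every step, and it refuses interactions that violate it.

```python
class Channel:
    def __init__(self, protocol):
        # Protocol: ordered steps, e.g. [("send", int), ("recv", str)].
        self.protocol = list(protocol)
        self.queue = []

    def send(self, value):
        # Consume one protocol step; it must be a send of the right type.
        direction, payload_type = self.protocol.pop(0)
        assert direction == "send" and isinstance(value, payload_type)
        self.queue.append(value)

    def recv(self):
        # Consume one protocol step; it must be a receive of the right type.
        direction, payload_type = self.protocol.pop(0)
        assert direction == "recv"
        value = self.queue.pop(0)
        assert isinstance(value, payload_type)
        return value

# A linear channel: send exactly one int, then receive exactly one str.
ch = Channel([("send", int), ("recv", str)])
ch.send(1)                    # step 1: we send an int
assert ch.queue.pop(0) == 1   # stand-in for the other endpoint consuming it
ch.queue.append("ok")         # stand-in for the other endpoint's reply
print(ch.recv())              # step 2: we receive a str -> "ok"
```

Once the protocol list is exhausted, any further `send` or `recv` fails, which is the runtime shadow of linearity.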

[+] pron|10 years ago|reply
> it's misleading to assume that ... types always have to be based on some underlying ideas from functional programming.

One of FP's more annoying achievements is somehow convincing people that it's the only way to make programs more verifiable or more "mathematical". Imperative, stateful computation can be (and is) just as mathematical (whatever that means) and just as verifiable as pure-FP (it must be constrained in some ways, but not as extreme as requiring complete referential-transparency).

It's good to learn about applying types to process calculi. I wasn't aware of that work at all.

[+] danghica|10 years ago|reply
Indeed, I was going to talk about session types in a follow-on post :) Well anticipated! Cheers!
[+] danblick|10 years ago|reply
The article isn't loading for me right now, but I thought I'd mention co-inductive types and how they relate to "interaction".

In Coq, co-inductive types are used to model infinite objects like (never-ending) streams or (non-terminating) program executions.

  Inductive List (A: Set): Set :=
  | Nil: List A
  | Cons : A -> List A -> List A.

  CoInductive Stream (A: Set): Set :=
  | SCons: A -> Stream A -> Stream A.

  Fixpoint len {A: Set} (x: List A): nat :=
    match x with
     | Nil => 0
     | Cons _ y => 1 + len y
    end.

  CoFixpoint ones : Stream nat :=
    SCons nat 1 ones.

If you want to talk about how systems interact with the world, you can use bisimilarity relations to prove "these two systems interact with the world in the same way".

You can also use co-inductive types to embed the temporal logic of actions (TLA) in Coq; TLA is the language Leslie Lamport works with for "Specifying Systems".

* http://www.labri.fr/perso/casteran/RecTutorial.pdf

* http://www.amazon.com/Specifying-Systems-Language-Hardware-E...

* http://en.wikipedia.org/wiki/Bisimulation

[+] belovedeagle|10 years ago|reply
> Each monad indicates one particular kind of interaction, such as using memory or perform input-output.

NO NO NO. Stop. It was cute when people got this wrong in 2010; now it's just ignorant to pretend to know enough about Haskell to tell us something about it and still get this so wrong. There is no inherent connection between monads and statefulness; none whatsoever. Monads just happened to be a useful abstraction to make opaque for the purposes of IO.

[+] cousin_it|10 years ago|reply
People keep getting it wrong because monads are the wrong abstraction. The right abstraction is algebraic effects. They correspond much more directly to "interactions", because the types of effect handlers are exactly the types of allowed "interactions". As a bonus, algebraic effects commute with each other, while monads usually don't. (For example, there's a gratuitous difference between Maybe (List a) and List (Maybe a), and people who want to program with effects shouldn't have to think about such differences.)
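Here's a toy sketch of that point in Python (emphatically not a faithful algebraic-effects implementation; `perform` and `with_handler` are invented names): an effect is just a request resolved by whichever handler is currently installed, so the handler's signature is exactly the type of allowed "interactions", and independent effects don't impose an order on each other.

```python
# A global table of currently installed effect handlers.
handlers = {}

def perform(effect, *args):
    # Performing an effect is just asking the installed handler.
    return handlers[effect](*args)

def with_handler(effect, handler, thunk):
    # Install a handler for the dynamic extent of thunk, then remove it.
    handlers[effect] = handler
    try:
        return thunk()
    finally:
        del handlers[effect]

def program():
    x = perform("ask")            # an "ask" interaction
    perform("log", f"got {x}")    # a "log" interaction, order-independent
    return x + 1

log = []
result = with_handler("ask", lambda: 41,
         lambda: with_handler("log", log.append, program))
print(result, log)   # 42 ['got 41']
```

Note that nesting the two `with_handler` calls in the other order would give the same result, which is the commutativity being claimed for effects but not for monad stacks.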

There was a great quote saying that when you get something almost right but not completely right, you have to explain it to people over and over again. IMO people should pay more attention to that quote; it applies to monads perfectly.

[+] edtechdev|10 years ago|reply
[0; 'a'; fun x -> x + 1];;

The common type is object, or perhaps more specifically, non-nil object (object!), if 0 is not treated as nil (value type).

If you try to interact with an object (add it to a number, call a method, etc.), your language has to decide what to do: fail because that method is not part of the type object; force people to cast to the proper sub-type; allow a kind of dependent typing/casting (if type(it)=int: it+1 else if type(it)=char: it.uppercase); or (like Boo, Python, Ruby, etc.) allow "duck typing": allow any calls/methods on the object and check them at run-time instead of compile-time.
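Python itself illustrates two points in that design space (a sketch using the article's heterogeneous list; the `shout` helper is mine):

```python
items = [0, 'a', (lambda x: x + 1)]   # the article's heterogeneous list

def shout(it):
    # Option: dispatch on the runtime type, a poor man's dependent casting.
    if isinstance(it, int):
        return it + 1
    elif isinstance(it, str):
        return it.upper()
    return it  # leave anything else (e.g. functions) alone

print([shout(it) for it in items[:2]])   # [1, 'A']

# Option: pure duck typing -- just call the method and let it fail at
# run time if the object doesn't support it.
try:
    (0).upper()
except AttributeError as e:
    print("duck typing failure:", e)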

[+] wyager|10 years ago|reply
Not all type systems allow for subtyping. In particular, Hindley-Milner does not. Nor, as far as I'm aware, do any of the consistently-typed proof languages (like Agda, Coq, etc.)
[+] Dewie3|10 years ago|reply
Typing in the Haskell et al. tradition doesn't like subtyping. And they also like their parametricity.
[+] muraiki|10 years ago|reply
"To me this is one methodological weakness of type theory, the commitment of having types for all terms in the language. Why is that? Types are designed to facilitate composition, but the natural unit for program composition is not the term but the function, or even the module. It makes sense to assign types to modules — it is methodologically consistent. But how we calculate these types could conceivably be more flexible, allowing a global perspective within the unit of composition. Types, as properties at the interface could be calculated using the entire arsenal of (decidable!) logics and static analyses."

I admit to not being that familiar with type theory (I've done a little Scala but most of my FP experience is in Clojure) but when I read this my gut reaction was: "Isn't that kind of what Go is doing?"

Meaning that in Go, the focus of compositionality isn't on basic types and the functions that act on them (see Go's lack of generics) but rather the focus is on getting different modules to agree on "what should be done to some stuff" using interfaces. This is putting interaction first but also getting compositionality in a "generic" way given that interfaces in Go are implemented implicitly.

[+] Dewie3|10 years ago|reply
Yeah, Go has interfaces, just like any sane, decently modern statically typed language. And it uses them to declare interfaces between things. A great concept, like the idea behind abstract data types: defining data types by what operations can be performed on them.

But what is so special about Go's approach to interfaces? The only special thing I see is that interfaces are implemented implicitly, which I see more as an "implicit over explicit" decision rather than anything having to do with composition per se. Yet people keep raving about it.

[+] nabla9|10 years ago|reply
> The result is to provide memory elements as black box library components, defined and tested in a different, lower-level, language. But at the interface the two levels of abstraction are reconciled and composition can occur.

I feel the same way. Uniqueness typing can work as the glue between components and upper-level abstractions.

[+] pron|10 years ago|reply
The question of computation vs. interaction is a fascinating one, but I think there's another (perhaps related) question about the applicability of "rich" types to practical programming, namely the role of types and how far they should be stretched. I think there is little question that "simple" type systems (like in C/C++/Java/C#) aid in developing large systems. But much of the fascination with type theory (as the article states) has to do with the Curry-Howard isomorphism, and the ability of programs to carry proofs of some of their properties. This is believed to be a powerful tool in software verification. The question then becomes: to what extent should this tool be used? What properties can be proven? What properties are programmers willing to try to prove? How much harder is it to prove a property than to write a possibly-incorrect algorithm without proof?

The benefit of proofs in code would only be worthwhile if it was relatively easy for common programmers to prove the absence of bugs that would have been more costly to find and fix using other methods, and that that ease would not come at the cost of other properties (such as performance). So far I have a feeling that this is not being delivered. Your run-of-the-mill clever use of types usually prevents very "cheap" bugs, and hard bugs are even harder to prevent with types.

I'll give a concrete and recent real-world example. A few months ago, a bug was discovered in Python's and Java's sequential sorting algorithm[1], Timsort. It was introduced to Python in 2002, and its adoption by Java in 2011 has probably made it the most used sorting algorithm in the world. Yet, it contained a fatal bug that could cause it to crash by overflowing an array. There are a few interesting observations one could make about the bug and its discovery with relation to type systems.

1. The bug was discovered by a verification program that does not rely on type-based proofs in the code. The output from the verification system quickly showed where the bug actually was. AFAIK, the verification program required no assistance with proving the correctness of the algorithm, only that the invariants the algorithm seeks to maintain be stated.

2. Proving the invariants required by the (rather complicated) algorithm with types would have required dependent types, and would have probably been a lot harder than using the verification program used. It is also unclear (to me) whether the problem and the fix would have been so immediately apparent.

3. The interaction of "type-proofs" with program semantics (namely, the requirement of referential transparency) would have made the type-proven algorithm much slower, and this particular algorithm was chosen just for its speed.

4. The corrected version ended up not being used because it could have adversely affected performance, and it was noticed that the "incorrect" algorithm with a slightly larger internal array would still always produce correct results for the input sizes currently supported in Java.

This example of a purely computational real-world bug shows some more limitations of the type-system approach to verification, I think.

Going back to effects and interactions, pure-FP approaches have a hard time dealing with timing properties. I already gave the same example on another discussion today, but the Esterel language (and languages influenced by it) has been used successfully to write 100% verifiable programs for safety-critical domains in the industry for a long time now. It is an imperative language (though it's not Turing complete) with side effects and a lot of interactions, yet it has a verifier that uses Pnueli's temporal logic to guarantee behavior, without requiring the programmer to supply proofs.

To summarize, type systems have shown their usefulness in reducing the cost of software development, but they need to be considered as a component in a larger verification space, and they need to strike a balance between usefulness and ease of use.

[1]: http://www.envisage-project.eu/proving-android-java-and-pyth...

[2]: http://www.envisage-project.eu/key-deductive-verification-of...

[+] sfvisser|10 years ago|reply
> The benefit of proofs in code would only be worthwhile if it was relatively easy for common programmers to prove the absence of bugs that would have been more costly to find and fix using other methods, and that that ease would not come at the cost of other properties (such as performance).

I don't understand why proofs need to be universally easy to write/understand to be useful. Why can't they be useful for a subset of programmers on a subset of projects?

> So far I have a feeling that this is not being delivered. Your run-of-the-mill clever use of types usually prevents very "cheap" bugs, and hard bugs are even harder to prevent with types.

Types can be used to encode complicated invariants of software systems, which will automatically be machine-checked throughout the entire lifetime of your project. There is a vast gradient from simple properties like 'this value is a number' to 'your data migrations are guaranteed to be completely reversible without any information loss' to 'the sequential composition of a set of procedures will terminate within 600 cycles'.
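One point near the cheap end of that gradient, sketched at runtime in Python (the `NonEmptyList` class is an invented illustration; in a typed language this would be enforced statically): a list whose constructor guarantees non-emptiness, so `head` needs no empty case.

```python
class NonEmptyList:
    def __init__(self, first, *rest):
        # At least one element exists by construction: the invariant is
        # baked into the only way to build a value of this type.
        self.items = [first, *rest]

    def head(self):
        return self.items[0]   # total -- no empty case can ever exist

nel = NonEmptyList(1, 2, 3)
print(nel.head())   # 1
```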

Types are not only for preventing bugs. They model invariants, help design and allow easy reasoning over large software projects.

> This example of a purely computational real-world bug shows some more limitations of the type-system approach to verification, I think.

No it doesn't. You just picked one example for which it didn't really work out for some reason. That tells you exactly zero about whether type-based verification could be useful in the general case. (Or in a specific subset of verification cases.)

> To summarize, type systems have shown their usefulness in reducing the cost of software development, but they need to be considered as a component in a larger verification space, and they need to strike a balance between usefulness and ease of use.

Of course they need to. Luckily that's what smart people are working on every day, making types easier to use.

[+] jfoutz|10 years ago|reply
Far and away the most obvious is null checking. It's not a big deal in a thousand line program, but as things grow, the need to always remember to check for null becomes a real problem.

Related to null, you only get one null: the classic assoc problem. If I look up a key in a hash, does the hash not have the key, or is the value of the key null? Optional/Maybe is just wonderful for this case.

Semantic meaning of common types.

    update(String name, String email, String address, ...)
vs

    update(Name name, Email email, Address address, ...)
Strings get overloaded for lots of uses, and it's so easy to swap the order when you have multiple arguments.

Status codes, stuff like Email vs VerifiedEmail. Encoding those random flags in types makes it so much easier to enforce rules.
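A minimal sketch of that Email/VerifiedEmail idea in Python (all names here are illustrative): verification is the only way to produce a `VerifiedEmail`, so a function that demands one can't accidentally be handed an unverified address.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Email:
    address: str

@dataclass(frozen=True)
class VerifiedEmail:
    address: str

def verify(email: Email) -> VerifiedEmail:
    # Stand-in for a real verification step (token round-trip, etc.).
    return VerifiedEmail(email.address)

def send_newsletter(to: VerifiedEmail):
    # The rule "only send to verified addresses" lives in the type.
    assert isinstance(to, VerifiedEmail)
    return f"sent to {to.address}"

print(send_newsletter(verify(Email("a@example.com"))))
```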

Yes, they are all "easy" problems, but we've all lost hours or days to them. The sooner we can stop worrying about bonehead forgot-to-check-for-null errors, the sooner we can concentrate on real problems. They're trivial problems, but we waste a big fraction of our lives dealing with them. Just taking those issues off the table makes it possible for average programmers to find bugs in Timsort. The cognitive load of these dozens of details detracts from the ability of average programmers (like me) to actually detect the hard cases.

Typing isn't a panacea, but it is a help. It's akin to a socket wrench vs. an ordinary wrench: sockets abstract away the need to remove, rotate, and refit the wrench for every stinking quarter turn. You can do it with a wrench, but it takes so much longer.

There are no silver bullets. But there are regular bullets, and a good type system is one of them.

[+] murbard2|10 years ago|reply
One can also create a largely generic type, for example a JSON-like object, and define polymorphic operators on these objects which handle casting as needed.

The program will need to keep track of the types at runtime, but that's also true of Python, so what?
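A toy version of that idea (the `J` wrapper class is an invented illustration): one generic tagged value whose operators inspect the runtime types and cast as needed.

```python
class J:
    """A generic value that tracks its payload's type at runtime."""
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        a, b = self.value, other.value
        # If either side is a string, cast both and concatenate;
        # otherwise fall back to ordinary numeric addition.
        if isinstance(a, str) or isinstance(b, str):
            return J(str(a) + str(b))
        return J(a + b)

print((J(1) + J(2)).value)      # 3
print((J(1) + J("a")).value)    # "1a"
```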

[+] jimmyhmiller|10 years ago|reply
The site appears to be down. Anyone have a mirror?
[+] Dewie3|10 years ago|reply
I wonder what someone like Conor McBride would have to say about this article.
[+] Dewie3|10 years ago|reply
I guess he would downvote me? I don't get HN any more...