What features would a type system require to be able to soundly type check 100% of the patterns that are found in dynamically typed programs? Dependent types? Refinement types? Is it possible?
It is impossible. Dynamically typed programs are able to look at the name of the type, write arbitrary Turing complete code reasoning about it, and then do whatever they want with it.
For example a unit test framework might walk your class hierarchy, identify all classes whose name matches a particular pattern, and then start doing stuff with that.
Of course there is always a way to accomplish the same thing without abusing the type system. But as soon as you do so, it is a different program.
Occurrence typing, perhaps? (In addition to dependent types, rather than instead of them.) That is, the result of certain runtime tests can refine the type of an object.
"Dynamic types are static types", so as long as you've got recursive types, union types and function types (and maybe subtypes, depending on the language) you can type check all dynamic programs. See section IX of Practical Foundations for Programming Languages or https://existentialtype.wordpress.com/2011/03/19/dynamic-lan...
If you did model, say, Python's type system in that way, it wouldn't buy you much. The real problem with dynamic types is that they're trivially true: all dynamic programs are well-typed and semantically meaningful. Any term which a human might point at and say "error" or "meaningless", a dynamic language considers to be a perfectly valid, semantically meaningful value. Sometimes those values are of a particular form, maybe with a name like "exception" or "error message", other times they might be unpredictable artefacts of arbitrary implementation details. There's no mechanism to tell such values apart from "proper" values (if there were, we would call that mechanism a "static type system", and violate our premise of dynamic types); even those values of an "exception" or "error" form may be perfectly valid components of a program, since they may be accepted as arguments, returned from functions, branched on, selected between, "thrown" (in the case of exceptions), etc.
If you did want to restrict the particular form of values in particular places (e.g. "the return value of this function should not have the form of an exception"), you can do that with dependent types. It wouldn't be pretty though, as you would have to manually transform those guarantees on your outputs into preconditions for your inputs; supply proofs of the guarantees, assuming the preconditions; have your callers guarantee to satisfy your preconditions; then repeat the process, over and over, until you reach back to the input/data-creation part of your program.
In my experience, that's basically the hardest way to use dynamic types. It's far easier to write distinct types with "correct by construction" invariants, rather than passing around brittle proof objects. Doing that would mean you're no longer "dynamic" though.
For example, consider the infamous vector type, used as the "hello world" of dependent types:
data Vector (t : Type) : (n : Nat) -> Type where
Nil : Vector 0 t
Cons : (n : Nat) -> t -> (Vector t n) -> Vector t (1+n)
A value of type "Vector Foo 5" contains 5 elements of type "Foo". To get the first element of such a vector, we can use "1+" in our type to completely rule out the empty case (since there is no Natural number before 0):
first : (t : Type) -> (n : Nat) -> Vector t (1+n) -> t
first (Cons _ x _) = x
Now let's consider dynamically typed values instead. The simplest way to model them is using tags:
data Tag : Type where
INT : Tag
STRING : Tag
BOOL : Tag
OBJECT : Tag
-- and so on for the fixed set of "dynamic types" our language has built-in
-- Turn "dynamic types" into static types
typeOf : Tag -> Type
typeOf INT = Int
typeOf STRING = String
typeOf Bool = Bool
-- and so on
-- Values of "dynamic type"
data Dynamic : Type where
Wrap : (t : Tag) -> (v : typeOf t) -> Dynamic
Unfortunately, since all we know about dynamic values is that they are "Dynamic", we have to keep track of all our knowledge about them separately:
getTag : Dynamic -> Tag
getTag (Wrap t _) = t
getValue : (d : Dynamic) -> typeOf (getTag d)
-- Exercise for the reader
We can still implement a safe "first" function, but we have to do a lot more checking:
first : (d : Dynamic) -> getTag d = LIST -> (valueOf d = nil -> Void) -> Dynamic
-- Exercise for the reader
Here we take our "Dynamic" value "d" and two extra arguments: one is a proof that "d" is a "LIST" and the second is a proof that "d" is not "nil". This is the same amount of information as the vector example, but not only is it more tedious and verbode, but our result is just another "Dynamic", so we have to start from scratch to prove whatever we need to about it.
btilly|9 years ago
For example a unit test framework might walk your class hierarchy, identify all classes whose name matches a particular pattern, and then start doing stuff with that.
Of course there is always a way to accomplish the same thing without abusing the type system. But as soon as you do so, it is a different program.
lmm|9 years ago
catnaroek|9 years ago
chriswarbo|9 years ago
If you did model, say, Python's type system in that way, it wouldn't buy you much. The real problem with dynamic types is that they're trivially true: all dynamic programs are well-typed and semantically meaningful. Any term which a human might point at and say "error" or "meaningless", a dynamic language considers to be a perfectly valid, semantically meaningful value. Sometimes those values are of a particular form, maybe with a name like "exception" or "error message", other times they might be unpredictable artefacts of arbitrary implementation details. There's no mechanism to tell such values apart from "proper" values (if there were, we would call that mechanism a "static type system", and violate our premise of dynamic types); even those values of an "exception" or "error" form may be perfectly valid components of a program, since they may be accepted as arguments, returned from functions, branched on, selected between, "thrown" (in the case of exceptions), etc.
If you did want to restrict the particular form of values in particular places (e.g. "the return value of this function should not have the form of an exception"), you can do that with dependent types. It wouldn't be pretty though, as you would have to manually transform those guarantees on your outputs into preconditions for your inputs; supply proofs of the guarantees, assuming the preconditions; have your callers guarantee to satisfy your preconditions; then repeat the process, over and over, until you reach back to the input/data-creation part of your program.
In my experience, that's basically the hardest way to use dynamic types. It's far easier to write distinct types with "correct by construction" invariants, rather than passing around brittle proof objects. Doing that would mean you're no longer "dynamic" though.
For example, consider the infamous vector type, used as the "hello world" of dependent types:
A value of type "Vector Foo 5" contains 5 elements of type "Foo". To get the first element of such a vector, we can use "1+" in our type to completely rule out the empty case (since there is no Natural number before 0): Now let's consider dynamically typed values instead. The simplest way to model them is using tags: Unfortunately, since all we know about dynamic values is that they are "Dynamic", we have to keep track of all our knowledge about them separately: We can still implement a safe "first" function, but we have to do a lot more checking: Here we take our "Dynamic" value "d" and two extra arguments: one is a proof that "d" is a "LIST" and the second is a proof that "d" is not "nil". This is the same amount of information as the vector example, but not only is it more tedious and verbode, but our result is just another "Dynamic", so we have to start from scratch to prove whatever we need to about it.