Why do we need modules at all? (2011)

79 points| norswap | 10 years ago |erlang.org | reply

45 comments

[+] bitwize|10 years ago|reply
This is why computing needs to be reinvented from first principles on up: the existing models are brittle, having been designed for 1975 problems, and they break down under 2015 workloads.

Eliminating modules and namespaces is one of the problems we solved early on in the Burpit project. Modules and namespaces just add unnecessary cruft to the program text, are still prone to collisions, and are not globally unique -- hacks which do not actually solve the problem in the manner of Java's com.xyz.whatever.SomeClass notation notwithstanding.

In the Burpit programming language, Poon, all functions are named according to a globally unique, 1024-bit hash of their representation in the Kock VM, which has been transliterated into a human-pronounceable string of CVC units, such as %pip-yod-rat-bag-hej-mig-wab-lik-sac-duf-top-kek. This is a bit hard to remember, but for the low introductory price of 100 BTC you may reserve a "quark", or block of 32-bit function names such as %pip-yod-rat. We've eliminated the need for tools like git too: all publicly available Poon functions live in the Burpit network itself in a distributed blockchain database indexed by these hash names.

This means you have to give up human-meaningful function names, but in practice this has been shown not to be a problem: the Burpit developers have no problem remembering what %zoq-fot-pik does, and you can search on a function's description in the global database.

There are some functions, called traps in Burpit, which are the equivalent of system calls and do not follow this convention. For example, the trap %sporc creates what is roughly the Burpit equivalent of a process (called a kite in Burpit).

[+] hibikir|10 years ago|reply
Joe talked about this in Strange Loop 2014, and it seems to me that the idea is far less crazy than it seems. The more functional programming I do, the more sense it makes.

A major reason for modules in OO languages is to manage mutability: There's some hidden state, and some conventions surrounding that state, that the module tries to abstract away from us. But the price is very high: We have to make sure our executables are built with the right versions of modules, and avoid incompatibilities: What we know as dependency hell.

But why do we need dependency hell at all? For instance, in Scala, why do I have to do gymnastics to use a modern version of Shapeless along with spray-routing? Because the JVM is not happy with multiple, incompatible versions of shapeless in the same classloader.

Under Joe's plan, all of that disappears. It's just a question of how much we lose in exchange for sidestepping a problem like this. Given that any libraries I write nowadays tend to have just a few hundred lines of production code, I am very tempted.

[+] danarmak|10 years ago|reply
That problem exists because the JVM doesn't have a module system. If it did, then you could load two different versions of Shapeless and use them from different parts of the code without interference. Which some people simulate by using per-library classloaders.
[+] oleks|10 years ago|reply
> Joe talked about this in Strange Loop 2014, and it seems to me that the idea is far less crazy than it seems. The more functional programming I do, the more sense it makes.

Someone (sorry, don't remember who) also mentioned this at ICFP in 2012 wrt. Hackage, the Haskell community's package archive - that the better method of delivery is a function, not a library. This makes sense in the Haskell community where you have Hoogle (https://www.haskell.org/hoogle/) by your side.

[+] moron4hire|10 years ago|reply
>> A major reason for modules in OO languages is to manage mutability...

Where did you get that idea from? That's pretty out there, even for OO.

[+] serichsen|10 years ago|reply
I am not on that mailing list, but I'd like to point out Emacs Lisp, which doesn't have modules (they would rather be called "packages" as in Common Lisp, but anyway, they just are not there).

Instead, the idiom has emerged to prefix every function name with its "package name". I'd take a deep look at the discussions in the Emacs Lisp community, in order to see the matter from the other side.

[+] lispm|10 years ago|reply
Though Common Lisp 'packages' are not really 'modules', they are namespaces for symbols. Common Lisp does not really hide things. It also does not allow fine-grained control of interfaces: the class FOO and the function FOO will both be exported when exporting the symbol FOO. Packages are also not compilation/deployment targets - there is, for example, no mechanism to compile or load a package.
[+] agumonkey|10 years ago|reply
In the last few years there were a few proposals to bring namespaces, and still emacs is a single flat namespace.
[+] danharaj|10 years ago|reply
I would like to structure my codebase as a relational database and have my editor based on a view of that database, rather than subordinating the organization of my code to the file system's model of data organization, or to a flat key-value store.

Modules can then be views and I think they would still be useful. I think the coincidence we currently have where modules are files makes it seem like mutual exclusivity and tree-like hierarchy is a defining feature of modules. We can throw that out. I think a lot of Joe Armstrong's complaints go away now.

Modules are still useful for human purposes. They are useful units of responsibility, accountability, maintenance, learning, and presentation. In fact, these facets of modules are enhanced if modules can intersect as views into a codebase: Overlap and dependence of code can be modeled as intersections and joins of modules-as-views. That these can be computed from the structure of code rather than implicitly understood on top of the directory hierarchy of a file system, or worse, forcibly mangled to fit into a tree-like hierarchy like files is a tantalizing possibility.
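For what it's worth, here is a minimal sketch of the modules-as-views idea using Python's built-in sqlite3. The schema, the table names (`functions`, `membership`) and the sample rows are all made up for illustration; nothing here is from Joe's post or from an existing tool. The point is only that membership becomes many-to-many, and the overlap of two modules is a join rather than something implied by a directory tree.

```python
import sqlite3

# Hypothetical schema: one row per function, plus a many-to-many table
# recording which "module" views each function appears in.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE functions (name TEXT PRIMARY KEY, body TEXT);
    CREATE TABLE membership (module TEXT, name TEXT REFERENCES functions(name));
""")
conn.executemany("INSERT INTO functions VALUES (?, ?)", [
    ("map", "..."), ("filter", "..."), ("parse_date", "..."),
])
conn.executemany("INSERT INTO membership VALUES (?, ?)", [
    ("lists", "map"), ("lists", "filter"),
    ("enumerable", "map"), ("dates", "parse_date"),
])

def module_view(module):
    """Return the sorted function names visible through one module view."""
    rows = conn.execute(
        "SELECT name FROM membership WHERE module = ? ORDER BY name", (module,))
    return [name for (name,) in rows]

def overlap(a, b):
    """Functions shared by two module views, computed with a self-join."""
    rows = conn.execute(
        """SELECT m1.name FROM membership m1
           JOIN membership m2 ON m1.name = m2.name
           WHERE m1.module = ? AND m2.module = ? ORDER BY m1.name""", (a, b))
    return [name for (name,) in rows]
```

Here `map` lives in both `lists` and `enumerable` without being duplicated, and `overlap("lists", "enumerable")` finds it by query.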

[+] derefr|10 years ago|reply
I've been considering writing a "programming system of the future, backwards-compatible with the cranky programmers of today" by having just such a relational (or key-value) underlying representation... and then a FUSE server that presents an editable filesystem+text view of it.

Among a thousand other benefits, the "cutest" one is that all code gets canonicalized on save. The filesystem will literally reject a write(2) that can't be translated into a valid AST to be stored into the underlying DB. It's like a super-powered version of running "go fmt" as a git pre-commit hook.
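A rough sketch of the canonicalize-on-save behaviour, with Python's own `ast` module standing in for the hypothetical AST-backed store. The `try_write` function and the dict used as a "filesystem" are assumptions for illustration; a real FUSE layer would look quite different.

```python
import ast

def canonicalize(source: str) -> str:
    """Parse source into an AST and print it back in one canonical layout.

    Raises SyntaxError if the text is not valid Python, which is the point
    where a hypothetical AST-backed filesystem would refuse the write.
    """
    tree = ast.parse(source)
    return ast.unparse(tree)  # requires Python 3.9+

def try_write(store: dict, path: str, source: str) -> bool:
    """Store canonical text under path; reject text that does not parse."""
    try:
        store[path] = canonicalize(source)
    except SyntaxError:
        return False
    return True
```

One caveat with this particular stand-in: `ast.unparse` drops comments, so a real system would need a richer tree than the one Python's parser produces.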

[+] weinzierl|10 years ago|reply
This has been done before but it didn't take hold. As far as I know the most successful and widespread attempt in this direction was IBM's Visual Age. Eclipse is based on Visual Age but did away with the source code data base.
[+] ddouglascarr|10 years ago|reply
Modules as views on this database of functions is genius.

Most of the problems that Joe talks about with modules go away if modules become just a mutable collection of functions, and functions can be in more than one module.

You can keep your cake and eat it too!

[+] sz4kerto|10 years ago|reply
> there are no "open source projects" - only "the open source Key-Value database of all functions"

There would be a function name rush in open source (similar to domain name rush) to claim cool, short function names.

[+] derefr|10 years ago|reply
I don't imagine plain names would be very usable. The lowest level would be SHAs; then you'd have UUIDs mapping to signed feeds of SHAs (like Freenet's SSKs); and then maybe URLs that resolve to a server that 302-redirects to the UUIDs as URNs (where caching those resolutions becomes part of JITing code.)

A "module", then, would be e.g. a web server: a dictionary under your control, mapping symbols of your choice (paths) to [a particular trusted maintainer's ABI-locked sequence of fixes for] the functions you want them to map to.
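To make the lowest layer concrete, here is a small sketch that assumes SHA-256 over the source text as the identity and a plain dict as the "module". The names `publish`, `resolve` and `my_module` are invented for the example, and the signed feeds and URL resolution described above are left out entirely.

```python
import hashlib

store = {}  # SHA-256 hex digest -> function source text

def publish(source: str) -> str:
    """Store source under the hash of its bytes and return that hash."""
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
    store[digest] = source
    return digest

# A "module" in this reading is just a dictionary the user controls,
# mapping names they chose to the exact versions they trust.
add_v1 = publish("def add(a, b):\n    return a + b\n")
my_module = {"plus": add_v1}

def resolve(module: dict, name: str):
    """Look up a local name, fetch the pinned source, and load the function."""
    namespace = {}
    exec(store[module[name]], namespace)  # illustration only: runs stored code
    functions = [value for key, value in namespace.items()
                 if not key.startswith("__") and callable(value)]
    return functions[0]
```

The local name (`plus`) is independent of whatever the author called the function, and publishing identical source twice yields the same key, so deduplication comes for free.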

A "library" wouldn't be the canonical container for functionality, but rather would provide a particular taxonomy; people in organizations would share "libraries" in order to speak the same design language.

[+] __david__|10 years ago|reply
I don't think so. C has been around since the 70s and there's no shortage of (short) names to use as your prefix.
[+] fortytw2|10 years ago|reply
which, no doubt could be handled by `import coolfunctiondotio from github.com/username/function`, and handled very much how Go does package imports, but on a function level
[+] maerF0x0|10 years ago|reply
if it was opensource the maintainer could simply merge a PR for the correct content..

getPi() returns a link to some stupid pot pie website? Submit a PR that returns 3 (or maybe with more digits)... Community votes or somehow manages getting the "correct" implementation in.

[+] oleks|10 years ago|reply
Sometimes, a module is much more than just a collection of functions. Sometimes, a module is a framework, or a whole embedded domain-specific language. At least, it usually encompasses a way of thinking about a given problem.

Erlang is simple-minded about this: frameworks get a special place in Erlang, and EDSLs are hard to come by since you can't declare, or override syntactic structures.

Taking modules out (in any language), you remove the programmer's option to semantically structure what they deliver into a holistic "thing". This may be a good thing: you can avoid monolithic modules.

So I think both options would be nice.

[+] skybrian|10 years ago|reply
Wikipedia uses a flat namespace for millions of articles, so it can be done. However, it's centrally administered, you need quite a few disambiguation pages, and there needs to be a way to rename articles when someone didn't get it right the first time. Also, by the time a concept is worthy of a Wikipedia article, there's already a pretty good idea about what to call it.

Hierarchical namespaces can certainly be overdone, but they do allow for quick, local decision-making.

[+] PythonicAlpha|10 years ago|reply
Why not go further?

In some old Fortran compilers, names of functions could be at most 6 characters long, and I am not sure if upper and lower case were distinguished; I guess not. But with only lower case letters and digits, you roughly get 1.5 billion different names. That is a lot -- you will have a busy time to use them all!
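The 1.5 billion figure checks out if you assume, as Fortran does, that a name must start with a letter. A quick sketch of the arithmetic (the constant names are mine):

```python
LETTERS = 26          # a-z, case ignored
ALPHANUMERIC = 36     # a-z plus 0-9
MAX_LENGTH = 6

# Assumption: a name must start with a letter, as Fortran identifiers do.
exactly_six = LETTERS * ALPHANUMERIC ** (MAX_LENGTH - 1)

# Counting every allowed length from 1 to 6 characters.
up_to_six = sum(LETTERS * ALPHANUMERIC ** (n - 1)
                for n in range(1, MAX_LENGTH + 1))

# Without the leading-letter rule, six-character names alone give 36**6.
unrestricted_six = ALPHANUMERIC ** MAX_LENGTH
```

That gives 1,572,120,576 names of exactly six characters and 1,617,038,306 of up to six; dropping the leading-letter rule raises the six-character count to 2,176,782,336.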

The only problem is, that the overhead remembering such names is huge, but it can be done!

I still think that in programming, managing complexity is key. You need different levels of simplification -- and this is where modules come in. Even when you want to use some kind of key-naming scheme without modules as described -- you will at least end up better off when you use common prefixes to make life a little bit easier ... (and one could come up with the idea of calling the prefixes "modules").

Beside that, modules bring other advantages into the game besides naming.

[+] jack9|10 years ago|reply
Modules are for us. They provide context. You want to use a function name as a namespace and that creates problems for organization and conceptualization. Any compiler should be able to look at foo() and module:foo() (with identical side effects) and optimize. So what if I duplicate code? There's no clear advantage to focusing on that as a problem. Let the machines do the machine optimizations and you do your code organization (which is a harder problem).
[+] icebraining|10 years ago|reply
> So what if I duplicate code? There's no clear advantage to focusing on that as a problem.

Sure there is: it makes it much more likely that fixes will only be applied to certain copies of the function.

A better solution would be to convert modules into groups: a function is not "in" a module, but it can be added to one or more groups, like emails in a tag system or hardlinks in a Unix filesystem.

[+] tomekowal|10 years ago|reply
The idea is appealing to me, because I always thought that there should be much more metadata on functions than a module can provide.

I always thought about it like about dimensions.

1. "Logic/test dimension" I can have functions doing actual logic and functions for tests. Do I put them in one module (to make it easy to change both if I need to) or in separate modules (to make business logic easier to follow)?

2. "Data structure dimension" I can have functions operating on different data structures like lists and sets.

3. "Operating dimension" I can have functions operating on "Enumerables". Do I put map in the Lists module or Enumerable? I need to know where someone else put it when debugging. A database operating on metadata would solve that problem.

4. There can be metadata for time/space complexity, so I can easily make tradeoffs between functions that do the same thing in different ways.

5. "prod/stg/dev dimension" is another one. Maybe I want to use completely different logging mechanism for stg and prod, because I pay per logline...

6. "Quality dimension" could show, what is the code coverage for the function or if it follows naming conventions/practices.

7. "Popularity dimension" could show, how often is given function referenced, which would show most important functions and where to focus on optimizing.

Some of these problems are solved in different ways. IDEs can jump from function usage to its implementation. If you follow conventions, you can jump to the test code for a given function. In Elixir, you can jump to a protocol implementation. I can use inversion of control for switching implementations between environments. Those problems would have a single solution if there was a central database for functions.

There are many, many more dimensions and even relations between them.
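A toy version of the "dimensions" idea, sketched as a decorator that files each function in a registry together with free-form metadata. The names `REGISTRY`, `register` and `find`, and the metadata keys, are invented for the example and stand in for whatever a central function database would really store.

```python
REGISTRY = {}  # function name -> {"fn": callable, plus metadata fields}

def register(**metadata):
    """Decorator that records a function together with arbitrary metadata."""
    def wrap(fn):
        REGISTRY[fn.__name__] = {"fn": fn, **metadata}
        return fn
    return wrap

@register(kind="logic", operates_on="list", complexity="O(n)")
def total(xs):
    return sum(xs)

@register(kind="test", operates_on="list", covers="total")
def test_total():
    assert total([1, 2, 3]) == 6

def find(**criteria):
    """Names of all functions whose metadata matches every criterion."""
    return sorted(
        name for name, meta in REGISTRY.items()
        if all(meta.get(key) == value for key, value in criteria.items())
    )
```

With this, "where is the test for `total`?" becomes `find(kind="test", covers="total")` instead of a naming convention, and the same function can be found along any other dimension it was tagged with.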

[+] gluczywo|10 years ago|reply
The prerequisite for such a way of programming is to use plain universal interchangeable data structures.

It does not seem possible in mainstream languages where the OOP dogma tells you to obfuscate your sets, lists and maps and turn them into non-reusable classes.

[+] oleks|10 years ago|reply
> The prerequisite for such a way of programming is to use plain universal interchangeable data structures.

"Write programs to handle text streams, because that is a universal interface." - Doug McIlroy

I'm not sure it's a prerequisite though, "as long as the programmers know what they're doing" - as is often the case in Unix as well.

[+] maerF0x0|10 years ago|reply
Or simply have a repo per language. That at least dedupes across each language (where there are often many projects reimplementing the same functions)
[+] al2o3cr|10 years ago|reply
Funny, given that a systems programmer would argue that the "FP dogma" has convinced you that "sets, lists and maps" are "plain data structures" with universally-applicable performance characteristics in terms of runtime and space. And you can't even use no-longer-relevant chunks of their memory as scratch!
[+] jfaucett|10 years ago|reply
This idea works quite nicely for "stdlib" sort of things. At least I've done this with JS - albeit using a very functional / immutable style (so I don't know if this could work for OOP-like languages).

Anyway, you end up with packages named "map" and "filter" or "toString" that export functions of the same name and are all very easy to require and consume/understand/test. Then on top of these you can build your libraries or apps.

"the only place where modules seem useful is to hide a letrec." - JS already has this via require you can easily choose what you want to export.

"The unique names bit is interesting - is this a good idea. Qualified names (ie names like xxx:foo/2) or (a.b.c.foo/2) sounds like a good idea but when I'm programming I have to invent the xxx or the a.b.c which is very difficult." - I agree 100% with Joe here. With a few developers it works beautifully without any namespacing for the sake of it; I can attest to this. You have a canonical "isTextNode" function, for example, which you can use when you want to know if a DOM node is a text node. It's simple, and it's much easier than remembering the whole namespacing of things like org.example.subrouter.MyRidicClass...

So in summary I really like this idea, I've really liked it for a while now in js, I just don't know how well it would work at a larger scale where "anyone?" (or who exactly) can plop funcs into the DB.

[+] moron4hire|10 years ago|reply
The scaffolding to do this in JS and try it out could probably be set up in a weekend. Pair it with one of the various JSFiddle, CodePen, etc. sites and see where it gets you.

Then realize that the mini-trend not so long ago to make NPM modules a single function was a freaking nightmare, scratch the whole idea, and remember that modules were invented for a reason.

If two, seemingly unrelated modules contain an equivalent function, does that not suggest the need for a third module? And if that module happens to only contain that one function, aren't we just at the same point as the suggestion of uniquely naming all functions, yet not stuck with it as the only way to invoke functions?

But seriously, this thread title needs a "(2011)" on it. It's been 4 years. If someone thinks this is a good idea, just freaking try it already and see what happens. Lead by example.

[+] noiv|10 years ago|reply
Interesting. I believe the better question is: How to manage context? In a hierarchy or flat or ...? Natural languages have the same problem and a plethora of solutions mostly based on conventions or rules. You may even break them to end up with a joke or confuse your opponent. Scientists often think 'this makes no sense in given context' and rely on very strictly defined contexts. In common language the situation at hand loosely defines a context, leading to all kinds of misunderstandings when strangers meet. However, communication can drastically improve if a shared context exists and was found.

Is there a programming language where one can construct/negotiate contexts in a clever way? Something more powerful than Python's 'import' or JavaScript's 'with'?

[+] polemic|10 years ago|reply
It sounds like rather than modules, OP wants a vocabulary. That could be interesting but it seems like the inevitable conclusion is function names that are so long and explicit that they're as long or longer than the composition functions they're made of. At which point, there is no point.

Less constructively: what is described gives me flashbacks to pre-OO PHP4, endlessly looking up function docs. Maybe there is a better way, but the proposed solution doesn't seem to provide fundamentally different/improved paradigms from a Big List Of Functions.

[+] gluczywo|10 years ago|reply
"There's a good and bad side to modules: Good: Provides a unit of compilation, a unit of code distribution. unit of code replacement"

I think that Joe ignores the major rationale for modularization.

Modules provide the way to cut out the fragment of a complex project and apply local reasoning to it.

In other words a module is the way to "divide and conquer" the complexity that crosses the border of the single mind's capacity.

As such a module is not about code compilation, distribution and upgrades. It's about human cognitive abilities.

[+] ilaksh|10 years ago|reply
If you really think about this type of thing, eventually you will realize most of the structures originated from the early constraints, i.e. complex textual source in individual files.

If you keep going down that path, eventually even smart people run into a cognitive wall, because the sad reality is that programming is defined by difficult to parse textual code in a bunch of text files.

Good luck trying to break through that wall.