
Tuple Spaces – Good Ideas Don't Always Win (2011)

109 points | panic | 7 years ago | software-carpentry.org

43 comments

[+] nostrademons|7 years ago|reply
The CTO at my first job out of college was the chief architect for JavaSpaces. I think we'd used it in an early version of the product but had migrated away from it by the time I joined full-time.

Aside from the fairness issues that other commenters mention, tuple-spaces work at an awkward level of abstraction. You can easily implement many other distributed concurrency models with them - semaphores, message queues, producer/consumer channels, broadcasts, even MVCC and transactions - but oftentimes, in the application it's more natural to just use the more specific abstractions. For example, you could implement a message queue by putting tuples of [type tag, sequence number, ...data] in and taking out the next message of a given type - but usually you'd want guarantees that there are no name collisions among type tags, and that sequence numbers are never skipped, and a mechanism to resend missing messages if for whatever reason they aren't produced correctly. At that point you'd rather just use golang-style channels or a real message queue library than roll that on top of the tuple-space.
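To make the message-queue example concrete, here is a minimal sketch of a queue layered on a toy, in-process tuple space. The names `TupleSpace`, `TupleQueue`, `out`, and `take` are made up for illustration (this is not the JavaSpaces API), and the sketch assumes a single producer and a single consumer per type tag. It also shows what is missing: nothing prevents tag collisions or recovers a skipped sequence number.

```python
import threading


class TupleSpace:
    """Toy in-process tuple space. None in a template matches any value."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, *tup):
        # Add a tuple to the space and wake any blocked takers.
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def _find(self, template):
        # Return the first stored tuple that matches the template, else None.
        for tup in self._tuples:
            if len(tup) == len(template) and all(
                want is None or want == got for want, got in zip(template, tup)
            ):
                return tup
        return None

    def take(self, *template, timeout=None):
        # Linda's "in": block until a match exists, then remove and return it.
        # Returns None if the timeout expires first.
        with self._cond:
            if not self._cond.wait_for(
                lambda: self._find(template) is not None, timeout
            ):
                return None
            match = self._find(template)
            self._tuples.remove(match)
            return match


class TupleQueue:
    """FIFO queue built from (tag, sequence number, data) tuples."""

    def __init__(self, space, tag):
        self.space = space
        self.tag = tag
        self._send_seq = 0  # next sequence number to write
        self._recv_seq = 0  # next sequence number to read

    def send(self, data):
        self.space.out(self.tag, self._send_seq, data)
        self._send_seq += 1

    def receive(self, timeout=None):
        # Ask for exactly the next sequence number, so order is preserved
        # even though the space itself makes no ordering promise.
        msg = self.space.take(self.tag, self._recv_seq, None, timeout=timeout)
        if msg is None:
            return None
        self._recv_seq += 1
        return msg[2]
```

Two queues with different tags can share one space, and each `receive` returns that queue's messages in send order. If a producer ever fails to write sequence number N, the consumer waits on N forever, which is the kind of gap a real message queue library handles for you.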

There are relatively few problem domains I can think of that map directly to a tuple-space, without building some other concurrency abstraction on top of it. Dependency graphs and workflows, perhaps, but there are other libraries to handle that specific problem which also handle things like tracing, debugging, and error correction.

[+] ralphc|7 years ago|reply
In the early to mid 2000s, a product I worked on had a messaging abstraction that let you use JMS, MQSeries, MSMQ, etc. We tried, really tried, to use JavaSpaces, but it had no FIFO guarantee. You could theoretically add and get tuples and one could stay in the space for days. A non-starter for that.
[+] esfandia|7 years ago|reply
It introduces a nice level of indirection between different actors, so that they don't need to know about each other's existence in order to communicate and solve problems together. The actors can be added and removed dynamically, and so the problems to solve and the problem-solvers can evolve over time. It's a pretty powerful and flexible idea! To use GoF terminology, it's like having a Mediator for a bunch of asynchronous and fully decoupled pub-sub Observers and Observables. It's a way to implement Barbara Hayes-Roth's Blackboard model.

JavaSpaces and TSpaces emphasized the over-the-internet aspects of it, which didn't matter to us for our project. We ended up finding and using something much simpler called LighTS. With all the nice support for concurrency in Java these days though, it wouldn't be much work to put one together from scratch.

[+] yagyu|7 years ago|reply
I used tuplespaces to implement a poor man's distributed computing for Matlab back in 2009 or so.

It would simply put Matlab code and parameters in a tuple, a worker would pick it up, compute, and put the results back. Used it to distribute the heavy function evaluation in a genetic optimization.
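The put-a-task, pick-it-up, put-the-result-back pattern can be sketched roughly like this. It is an in-process Python stand-in, not TSpaces or Matlab: `TupleSpace`, `fitness`, `worker`, and `evaluate_population` are hypothetical names, threads play the role of remote workers, and a negative task id is this sketch's own stop signal.

```python
import threading


class TupleSpace:
    """Toy in-process tuple space. None in a template matches any value."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, *tup):
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def _find(self, template):
        for tup in self._tuples:
            if len(tup) == len(template) and all(
                want is None or want == got for want, got in zip(template, tup)
            ):
                return tup
        return None

    def take(self, *template):
        # Block until a matching tuple exists, then remove and return it.
        with self._cond:
            self._cond.wait_for(lambda: self._find(template) is not None)
            match = self._find(template)
            self._tuples.remove(match)
            return match


def fitness(params):
    # Hypothetical stand-in for the heavy function evaluation.
    return sum(p * p for p in params)


def worker(space):
    # Pick up any task, compute, and put the result back under the same id.
    while True:
        _, task_id, params = space.take("task", None, None)
        if task_id < 0:  # negative id: stop signal used by this sketch
            return
        space.out("result", task_id, fitness(params))


def evaluate_population(space, population, n_workers=3):
    threads = [
        threading.Thread(target=worker, args=(space,)) for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for i, params in enumerate(population):
        space.out("task", i, params)
    # Collect by task id so scores line up with the population order,
    # regardless of which worker finished first.
    scores = [space.take("result", i, None)[2] for i in range(len(population))]
    for _ in threads:
        space.out("task", -1, None)  # one stop signal per worker
    for t in threads:
        t.join()
    return scores
```

The optimizer only ever talks to the space, so workers can be added or removed without the caller changing, which is much of the appeal.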

It was very easy and trouble-free.

Edit: mixed up the implementations, I used TSpaces by IBM http://www.almaden.ibm.com/cs/TSpaces/Version3/ClientProgrGu...

[+] Immortalin|7 years ago|reply
Sounds like the current Function-as-a-service trend?
[+] DonHopkins|7 years ago|reply
In the discussion of "X and NeWS History", I mentioned "PIX", which integrated PostScript with tuple spaces on Transputers, in a thread about how X-Windows is actually just a terribly designed and implemented distributed database with occasional visual side effects and pervasive race conditions:

https://news.ycombinator.com/item?id=15327211

Jon Steinhart: "Had he done some real design work and looked at what others were doing he might have realized that at its core, X was a distributed database system in which operations on some of the databases have visual side-effects. I forget the exact number, but X includes around 20 different databases: atoms, properties, contexts, selections, keymaps, etc. each with their own set of API calls. As a result, the X API is wide and shallow like the Mac, and full of interesting race conditions to boot. The whole thing could have been done with less than a dozen API calls."

To that end, one of the weirder and cooler re-implementations of NeWS was Cogent's PIX for transputers. It was basically a NeWS-like multiprocessing PostScript interpreter for Transputers, with Linda "tuple spaces" as an interprocess communication primitive:

http://ieeexplore.ieee.org/document/301904/

The Cogent Research XTM is a desktop parallel computer based on the INMOS T800 transputer. Designed to expand from two to several hundred processors, the XTM provides a transparent distributed computing environment both within a single workstation and among a collection of workstations. Using Linda tuple spaces as the basis for interprocess communication and synchronization, a Unix-compatible, server-based OS was constructed. A graphic user interface is provided by an interactive PostScript window server called PIX. All processors see the same set of system services, and within protection limits, programs capable of using many processors can spread out over a network of workstations and resource servers, acquiring the services of unused processors.

https://en.wikipedia.org/wiki/Transputer

http://wiki.c2.com/?TupleSpace

https://en.wikipedia.org/wiki/Tuple_space

[+] sitkack|7 years ago|reply
Don, I absolutely love your posts. Please keep them coming!
[+] toast0|7 years ago|reply
This sounds interesting, but feels like magic. How do the tuples get to where they're going? In my experience, computing bits that feel like magic have hidden costs, that are usually rather high.

Contrast this approach with Erlang: there are still a lot of tuples, but you have to (somehow) know where to send them, at a (sometimes high) human cost to developers but usually a low runtime cost.

[+] makmanalp|7 years ago|reply
It's better to think of tuple spaces as a concurrency / communications model than an implementation. So it's more like "the actor model" rather than Erlang's or Java/Akka's specific implementation of it. It's more about "if we had this type of system with these constraints and these features, abstracting away these details, what would we gain or lose?". You're right that in the end a good or bad implementation can make or break things (take a look at this paper: https://arxiv.org/abs/1612.02979), but that's not the point, at least with the original paper.

The interesting thoughts from the paper as far as I can see were: 1) Tuple spaces are programming language or architecture or program independent, and vastly different programs can communicate with each other 2) You don't communicate directly to other agents by address, you write to a topic and read from a topic, which is a form of decoupling producers and consumers 3) The "block when nothing to read in this topic" idea, which makes programming coordination SO easy. I guess it's a bit like unix pipes.
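Points 2 and 3 can be illustrated with a small sketch, again using a toy in-process space rather than any real Linda implementation. `TupleSpace`, `rd`, `take`, and `demo` are illustrative names, and the `"temperature"` topic is invented for the example. The consumer starts first, blocks because nothing matches, and wakes up when some producer it knows nothing about writes a matching tuple.

```python
import threading
import time


class TupleSpace:
    """Toy in-process tuple space. None in a template matches any value."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, *tup):
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def _find(self, template):
        for tup in self._tuples:
            if len(tup) == len(template) and all(
                want is None or want == got for want, got in zip(template, tup)
            ):
                return tup
        return None

    def rd(self, *template):
        # Blocking read: wait for a match and return it, leaving it in place.
        with self._cond:
            self._cond.wait_for(lambda: self._find(template) is not None)
            return self._find(template)

    def take(self, *template):
        # Blocking take (Linda's "in"): wait for a match and remove it.
        with self._cond:
            self._cond.wait_for(lambda: self._find(template) is not None)
            match = self._find(template)
            self._tuples.remove(match)
            return match


def demo():
    space = TupleSpace()
    seen = []

    def consumer():
        # Addresses a topic, not a producer: any writer of a
        # ("temperature", value) tuple will satisfy this.
        seen.append(space.take("temperature", None))

    t = threading.Thread(target=consumer)
    t.start()
    time.sleep(0.1)
    was_blocked = t.is_alive()  # still waiting: nothing has been written yet
    space.out("temperature", 21.5)
    t.join(timeout=2)
    return was_blocked, seen
```

There is no polling loop or callback registration in the consumer. The blocking call is the whole coordination protocol, which is roughly the pipe-like quality described above.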

If tuple spaces don't seem that interesting and novel, it's probably because of the benefit of hindsight and that a lot of these ideas are so subsumed into the tools of today. I can't definitively make the claim that Linda is the cause of this, but I suspect it had some effect. I think the original author also had a lot of wacky ideas around "cyberspace" and all that, but that's another deal and I don't think it's why people find the Linda paper interesting now. The closest useful descendants of Linda seem to be modern Pub / Sub systems or coordination databases like RabbitMQ, Kafka, Zookeeper.

[+] teilo|7 years ago|reply
I immediately thought of ETL when I read this.

I don't see how this tuple space concept can possibly replace robust supervised process management. Also, how is tuple space different than message passing? Somehow a process needs to know what values it is supposed to consume. Something has to manage that, and before you know it, you are passing messages through tuple space and have, essentially, re-invented the wheel.

[+] protomyth|7 years ago|reply
If you are interested in tuple spaces then I recommend reading David Gelernter’s “Mirror Worlds: or the Day Software Puts the Universe in a Shoebox...How It Will Happen and What It Will Mean” and any Jini documentation you can get.
[+] jarpineh|7 years ago|reply
Huh. That's interesting. I remember reading a Wired article about Jini as a budding programmer and thinking "Yes, this can't happen soon enough". And I can still continue thinking...

Article is from 1998 and mentions (among other things) Tuple Spaces:

https://www.wired.com/1998/08/jini/

I wonder what ultimately decided Jini's fate.

[+] sitkack|7 years ago|reply
A few jumping off points if you like this sort of thing

LuaTS — A Reactive Event-Driven Tuple Space https://pdfs.semanticscholar.org/91cb/8c359920682fda35abd9c2...

https://redis.io/ (uses Lua for scripting)

https://en.wikipedia.org/wiki/Comparison_of_triplestores

Comet: An Active Key Value Store

https://vanish.cs.washington.edu/pubs/osdi2010comet.pdf

https://vanish.cs.washington.edu/pubs/osdi2010comet_presenta...

ZeroMQ http://zeromq.org/

I'd love to hear of distributed languages with first class support for tuple space type operations (not Erlang).

[+] riffraff|7 years ago|reply
Ruby had a bundled distributed tuplespace implementation (Rinda) for many years, built on top of druby.

I remember playing with it, and wondering why it wasn't more popular.

[+] mpweiher|7 years ago|reply
"And compile-time analysis of tuple in/out patterns can make it run efficiently in most cases; adhering to some simple patterns can help too."

Sounds like it might be overly generalized, with the developer having to implement actual mechanisms ("simple patterns") on top and the compiler/runtime having to figure out efficient implementations by presumably sophisticated analysis, all the time hoping that the two align.

[+] lukego|7 years ago|reply
Nix reminds me of tuplespaces. Each derivation in the store is a tuple describing how to evaluate a result. Active evaluations also activate their dependencies.
[+] macintux|7 years ago|reply
Somewhere I should still have an old JavaSpaces book I picked up specifically because I found tuplespaces such a compelling idea. I’ve tried to find a good way to use it over the years, but it’s never quite matched any problem I was trying to solve.

I seriously considered it as a way to share monitoring events between various systems that might be interested in consuming them: logging, billing, alerting, etc.

[+] mcguire|7 years ago|reply
The paper mentioned in the article, "Generative communication in Linda" by David Gelernter is available from citeseer: ://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.113.9679

With broken links. Nice. Try http://signallake.com/innovation/p80-gelernter.pdf

[+] mhd|7 years ago|reply
One does wonder how tuple spaces would've fared without the influence of the unabomber…
[+] sulam|7 years ago|reply
I wonder more how it would have fared if DG weren’t kind of an asshole.
[+] dangoor|7 years ago|reply
I worked at a company in 2002 that used tuplespaces for managing distribution of searches to worker machines. It worked really well! I don't remember us ever having trouble with that part of our system.
[+] aghillo|7 years ago|reply
Gosh, for my Bachelor’s final year dissertation in 1990 I implemented a distributed version of Linda / Tuplespace in C++ across Ansaware and Tanenbaum’s Amoeba OS. Seems a lifetime ago!
[+] jmount|7 years ago|reply
With JavaSpaces I remember really getting burned by the lack of queuing fairness guarantees. The ideas were nice, but execution was very laggy.
[+] galaxyLogic|7 years ago|reply
This would still run on top of threads or something. So how could a basic Java-based web-server be better if it took advantage of JavaSpaces?
[+] klodolph|7 years ago|reply
Could we equally ask, “how could something written in machine code be better if it took advantage of the JVM, because the JVM is implemented in machine code?”
[+] DrJosiah|7 years ago|reply
Built a python tuplespace around '06. Realized it was really just a task queue and / or RPC abstraction with a database.

Now mostly just use RPCs and queues directly, depending on whether I need something now, or later.