
Flow-Based Programming

217 points| hypomnemata | 5 years ago |jpaulm.github.io

103 comments

[+] samuell|5 years ago|reply
The core of the FBP principles is the holy grail of true componentized function architecture.

Instead of losing yourself in ever more complex syntax convolutions as has happened in a lot of functional programming, you make the components (even long running stateful ones) self contained with ports for data input and output as the sole means of communicating with them, over buffered channels to allow asynchronous computation, and most importantly keep the network definition separate.

Just this idea in itself is just brilliant. Hats off to Mr Morrison for that!

It allows you to decouple complex software into reusable components, without clever FP syntax.

(Though, FP is perfect for implementing the components themselves. It just doesn't really scale all too well for whole program architecture, in my experience).

One point to note: The visual component of many FBP systems is completely optional, as is the idea of using novel DSLs. You can just as well define your networks and components in pure code. See GoFlow (https://github.com/trustmaster/goflow) and my own little experiment FlowBase (https://flowbase.org) for examples of that, in Go.
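A minimal sketch of what "networks and components in pure code" can look like, written in Python rather than Go and not taken from GoFlow or FlowBase. The component names, the `DONE` sentinel, and the `run_network` helper are all assumptions made for illustration. Each component only touches its ports (bounded queues), and the wiring lives in one separate function.

```python
import queue
import threading

DONE = object()  # end-of-stream marker (an assumption of this sketch)


def reader(out_port, lines):
    """Source component: emits each line on its output port."""
    for line in lines:
        out_port.put(line)
    out_port.put(DONE)


def upper(in_port, out_port):
    """Transform component: upper-cases every packet it receives."""
    while (packet := in_port.get()) is not DONE:
        out_port.put(packet.upper())
    out_port.put(DONE)


def collector(in_port, results):
    """Sink component: appends every packet to a result list."""
    while (packet := in_port.get()) is not DONE:
        results.append(packet)


def run_network(lines, capacity=2):
    """Network definition: the only place that knows who talks to whom."""
    a_to_b = queue.Queue(maxsize=capacity)  # bounded buffer between components
    b_to_c = queue.Queue(maxsize=capacity)
    results = []
    processes = [
        threading.Thread(target=reader, args=(a_to_b, lines)),
        threading.Thread(target=upper, args=(a_to_b, b_to_c)),
        threading.Thread(target=collector, args=(b_to_c, results)),
    ]
    for process in processes:
        process.start()
    for process in processes:
        process.join()
    return results


print(run_network(["rdf", "to", "wiki"]))  # ['RDF', 'TO', 'WIKI']
```

Swapping `upper` for another component, or inserting one between the two queues, only changes `run_network`; the components themselves stay untouched.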

I successfully built a rather complex little app to convert from Semantic web RDF format to (semantic) mediawiki XML dump format, in two weeks straight, of linear development time: for each component (of ca 7), implement, test, go to the next component (See: https://github.com/rdfio/rdf2smw)

The same implementation in procedural PHP took months, and still doesn't have all bugs and strange behaviours ironed out.

[+] the_duke|5 years ago|reply
> you make the components self contained with ports for data input and output as the sole means of communicating with them, over buffered channels to allow asynchronous computation

This concept sounds exactly like actor systems like Erlang/OTP and Akka, only with a different set of terminology.

The submitted site and your comment don't mention those anywhere.

Are there appreciable differences between actor systems and FBP?

[+] megameter|5 years ago|reply
Having studied and implemented FBP systems in the past, one major takeaway I've gleaned is that most automation problems start off as a linear sequence of processes, and so the branching-graph of FBP looks unwieldy and superfluous. But this is deceptive; you probably don't want to have a huge number of branches in the design, but you will want them as an optimization step or a way to combine inputs.

So it's useful to design with FBP in mind but with a linear interface as the entry point.

Another aspect of this is that FBP graphs are static but you may have a need to reconfigure them frequently; that is, you may want to have a graph compilation step drawn from a source language, rather than manually wiring it up.

A way of making that graph compilation more than a syntax is to include a formal constraint solver: Excel, for example, flows the data after determining a solution for how cells relate to each other. The power of the spreadsheet paradigm really lies in these combinations of concepts.
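A rough sketch of the "determine how cells relate, then flow the data" idea. It uses a plain topological sort from Python's standard library as a stand-in for a real constraint solver, and makes no claim about how Excel works internally. The `SHEET` layout and the cell names are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical "sheet": each cell lists the cells it reads and a formula over them.
SHEET = {
    "price": ((), lambda: 20.0),
    "qty": ((), lambda: 3),
    "subtotal": (("price", "qty"), lambda price, qty: price * qty),
    "tax": (("subtotal",), lambda subtotal: subtotal * 0.25),
    "total": (("subtotal", "tax"), lambda subtotal, tax: subtotal + tax),
}


def evaluate(sheet):
    """Order cells so every input is computed first, then flow values through."""
    graph = {name: set(deps) for name, (deps, _) in sheet.items()}
    values = {}
    for name in TopologicalSorter(graph).static_order():
        deps, formula = sheet[name]
        values[name] = formula(*(values[dep] for dep in deps))
    return values


print(evaluate(SHEET)["total"])  # 75.0
```

The cells are declared in no particular order; the evaluation order is derived from the dependencies, which is the "graph compilation step" in miniature.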

Lastly, there isn't really magic in the algorithmic/implementation aspects of FBP. It grew out of 1960's mainframe types of problems, and so it can be implemented in a low level way with static pools of memory and pieces of assembly code. But it remains conceptually just as relevant to today's massive distributed systems.

[+] vitalhead|5 years ago|reply
The visual aspect of FBP is also brilliant and helps to navigate/comprehend inherently parallel execution flow. A network of the same complexity expressed in linear text would require more effort to grasp. And it allows you to leverage your brain's visual capacity, which is quite powerful: humans are visual creatures (https://www.seyens.com/humans-are-visual-creatures). In Excel, for example, the update dependency network is completely hidden, which makes complex Excel models really hard to maintain.

For the same reasons we started looking at FBP and functional reactive programming to simplify the design and maintenance of complex interactive UIs. We ended up implementing Kelp (https://kelp.app) with a visual FBP editor and a reactive framework (https://kefirjs.github.io/kefir/).

[+] codetrotter|5 years ago|reply
Your flowbase.org domain redirects to a different domain. I thought that perhaps you had a typo in the URL or forgot to renew your registration of the domain, but upon a closer look I think it’s probably just a misconfiguration in your webserver config causing it to redirect to another one of your projects.

My guess is that you forgot to put the definition of the domain sans www in your webserver config.

If my guess is correct then this link should work:

https://www.flowbase.org

Edit: doesn’t work either. Perhaps you don’t have a TLS cert for that domain? In which case maybe this link will work instead:

http://www.flowbase.org

Edit 2: Without https it works. And like the person responding to this comment said it redirects to the GH repo.

[+] dustingetz|5 years ago|reply
FP decouples the AST (symbolic functions with input ports and an output port) from the evaluation context, which might be async, or sync, backpressured, stateful, exceptional, incremental/reactive ... mix-n-match whatever behaviors you want, all for the same abstract AST
[+] goliatone|5 years ago|reply
The flowbase.org link is redirecting me to a Polish website, not sure what’s going on there
[+] ledauphin|5 years ago|reply
I don't mean to derail the conversation, but this really does remind me of the game Factorio, though sort of in reverse.

In Factorio, you build a larger and larger factory out of pre-established functional components (assemblers, labs, chemical plants, etc) that take in a limited set of inputs and produce (usually) a single output. Your challenge is not to define the functional core processes, but instead to wire together those functional components by connecting their inputs and outputs in ever-more-automated fashion, starting by hand, then using simple belts (pipes) that eventually allow arbitrary load-balancing via "splitters", and eventually through to trains (the forking and load balancing happening via backpressure in the train system) and robots (where everything is managed essentially as a single state database of requests, and backpressure is provided by output limitations, usually per functional component).

Naively, I think that someday a decent chunk of programming might actually look like this, and parts even be represented visually (though in my opinion likely still defined formally as text). Only I think programmers will continue to write the functional components themselves, unlike in Factorio. They'll just live on different levels of the "codebase", and the "pipes" level will likely be a lot more abstracted than it is in Factorio.

As a software developer, I find this paradigm to map very well to serverless architectures, because you generally want to think a level higher than the per-machine basis. It does require a willingness to forgo handy and well-established tools like the filesystem and Unix pipes in favor of higher level abstractions around transfer and storage of data.

[+] freeqaz|5 years ago|reply
You have, from first principles, reconstructed a huge portion of my thought process for building https://refinery.io

Factorio and Minecraft automation mods are a big inspiration! Check out InfiniFactory too :)

Bridging existing applications to the Serverless paradigm is far from simple. That's one of the biggest struggles I've experienced trying to build a Flow-based software platform.

Learn more every day though. Thank you for the interesting comment!

[+] dgb23|5 years ago|reply
Another video game that works similarly is Oxygen not Included.

One of the most compelling arguments for data-flow/flow-based programming is the mental model and the visualization aspect of it. This opens up opportunities for monitoring and for visual, data-driven programming, and it is whiteboard friendly.

In Elements of Clojure[0], the author discusses the concept of "principled components and adaptive systems". And a flow based design reminds me of exactly that. The semantics of composition and communication are well-defined and universal, but internally the components can (should) be specific and concrete.

Something similar can be said about Smalltalk as well. A primary aspect of its design was the mental model, understanding and learning. The core idea was that learners (especially children) understand things in terms of their operational semantics.

> I don't mean to derail the conversation, but this really does remind me of the game Factorio, though sort of in reverse.

So no, I don't think this is a derailment, but likely one of the most important aspects of paradigms like this.

[0]https://elementsofclojure.com/

[+] dang|5 years ago|reply
[+] homieg33|5 years ago|reply
Very useful list. Thanks for providing.
[+] jgraettinger1|5 years ago|reply
We're building a tool, Estuary Flow, which seeks to be an end-to-end realization of practical, configuration driven, and scale-out flow-based programming -- with an important twist.

The central concept is a "collection", an append-only set of schematized documents, which can be captured and materialized into other systems (e.g. pub/sub, S3 buckets, etc). "Derivations" are collections defined in terms of source collections, and stateful transformations/joins/aggregations which are applied to them.

A key twist is that collections are simultaneously a batch dataset (backed by cloud-storage) and also a real-time stream. They unify the current dichotomy of "historical" vs "streaming" data into a single addressed concept. Declare a new derivation, and it automatically back-fills over history right from S3, then seamlessly transitions to live data.

If this sounds interesting, check out our docs [0]. We're early, but love feedback!

[0] https://estuary.readthedocs.io/en/latest/README.html

[+] bergie|5 years ago|reply
Nice to see this here, as it has been the inspiration for quite a few years of my work: https://noflojs.org/
[+] bmitc|5 years ago|reply
Do you have a shining example of someone using NoFlo? It doesn't have to be big, but one that you think really exemplifies what it excels at.

(By the way, I'm a big fan of visual programming, so I'm just curious.)

[+] leetrout|5 years ago|reply
NoFlo really is impressive. Great work.
[+] hpoe|5 years ago|reply
As I was reading this I was enthusiastically agreeing with the idea, but something felt super familiar about it. Then I realized this is basically the same concept as Unix pipes.

I have my little individual programs, grep, awk, sed, jq, etc, and then I can endlessly mix and match those different components to do what I want.

The limitation that I have seen with Unix pipes often isn't the ability to process or manage the data; it is that it only works if all the data is set up the proper way. The longer I have been in the industry, the more it seems to me that most code that gets written isn't about actual computing but is centered around importing and transforming data that is expressed in different ways.

Is there something that makes reading in disparate data records easier so I can focus on the computing part of computer program and less time on parsing data?

[+] adamnemecek|5 years ago|reply
Unix pipes make it hard to have a large graph. Like most pipes take one thing and produce one thing. You don’t really have complex data flow graphs.
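To make the "one thing in, one thing out" point concrete, here is a small sketch using Python generators. The stream splits into two branches and joins again in a component with two inputs, a shape that is awkward to express as a plain `a | b | c` pipeline. All function names are made up for illustration.

```python
import itertools


def numbers(n):
    """Source: emits 1..n."""
    yield from range(1, n + 1)


def squares(stream):
    for x in stream:
        yield x * x


def doubles(stream):
    for x in stream:
        yield 2 * x


def pair_up(left, right):
    """Two-input component: consumes one item from each branch per step."""
    yield from zip(left, right)


def run(n):
    # Fan-out: one output feeds two downstream branches.
    branch_a, branch_b = itertools.tee(numbers(n))
    # Fan-in: the two branches meet again in a component with two inputs.
    return list(pair_up(squares(branch_a), doubles(branch_b)))


print(run(3))  # [(1, 2), (4, 4), (9, 6)]
```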
[+] akavel|5 years ago|reply
[+] pantsforbirds|5 years ago|reply
Luna is one of the coolest projects I've seen. I've really enjoyed watching the progress.

Curious to see if it (or concepts from it) gets picked up for programming education one day.

[+] samuell|5 years ago|reply
I'm super intrigued by luna/enso. Only that every time I've tried it, the editor has had serious stability problems. I wonder if more traditional tooling and editor support wouldn't be more successful for wide adoption?
[+] mikewarot|5 years ago|reply
Side effects are forbidden by structure, flows could be monitored in a GUI/Debugger, and as a result components can be tested as a unit, instead of a whole system. I love it!

It is easier to design digital circuits when you have a whole catalog of 7400 and 4000 series gates, than it is using individual transistors. It is easier to wire a house when you're not making wires and switches with a hammer and a forge.

I welcome this new higher level of abstraction, and am willing to pay the cost in terms of CPU and Memory to get there, just as I'm willing to waste transistors or copper wire to have something done and working.

[+] teknopurge|5 years ago|reply
https://nodered.org/ - great project for all sorts of needs.
[+] jcims|5 years ago|reply
I recently used nodered for some process automation and it was absolutely fantastic for quickly prototyping, dashboarding, doing sensor integration, etc. Highly recommended!!!
[+] jpaulmorrison|5 years ago|reply
General comment about the Wikipedia article - https://en.wikipedia.org/wiki/Flow-based_programming : given all the interesting discussion on the 'Net, including this thread, it might be time to update the Wikipedia article.

The most substantive addition in recent years is Akaigoro's comment about Actors, added Jan. 2020 (@guitarvydas, care to jump in?!), preceded, I think, by Joe Witt's reference to NiFi in mid-2018. This general lack of activity I feel might lead readers to assume that FBP is an outdated concept, whereas this thread proves it's definitely alive and kicking! I, personally, am not allowed to update the article, due to the WP ban on self-promotion, so I would like to encourage people to add topics, controversies, anything, to the WP article... Freshen it up a bit, as it were! Thanks, and stay safe, everyone!

[+] amelius|5 years ago|reply
Flow-based programming is cool, until the flow starts altering the flow.
[+] cjohnson318|5 years ago|reply
I think at that point it's a complex dynamic system.
[+] devmunchies|5 years ago|reply
Similar concept to what I do when programming in F#/OCaml by default. You create your data types and then they "flow" through functions. Each function is `input -> output`, but bigger workflows are also `input -> output`.
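A small illustration of that shape, in Python rather than F#/OCaml. The `compose` helper and the three step functions are hypothetical, but the point carries over: the composed workflow has the same `input -> output` shape as each of its steps.

```python
from functools import reduce


def compose(*steps):
    """Chain input -> output functions into one bigger input -> output function."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


def parse(line):
    """str -> list of int"""
    return [int(token) for token in line.split(",")]


def keep_even(nums):
    """list of int -> list of int"""
    return [n for n in nums if n % 2 == 0]


def total(nums):
    """list of int -> int"""
    return sum(nums)


workflow = compose(parse, keep_even, total)  # str -> int
print(workflow("1,2,3,4"))  # 6
```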
[+] adamnemecek|5 years ago|reply
The main difference is that with data flow, the flows happen conceptually at the same time.
[+] analog31|5 years ago|reply
It might not tick all of the boxes for being a complete software development tool, but Excel strikes me as being a dataflow programming model. I wonder if it's a reason why it's easy for laypeople to learn.
[+] phreeza|5 years ago|reply
Apache Beam seems to be an implementation of this idea. It works well when the things you have to do match the logic, but it gets tricky if you need to do stuff iteratively or recursively.
[+] samuell|5 years ago|reply
Beam has some overlap, but in my understanding it has a rather involved syntax for defining the data flows, quite far from the simple list of connections between in and out ports that you see in FBP systems following J. P. Morrison's principles more closely.

Apache NiFi comes quite a lot closer, with the main difference that they only have a single in-port, instead of separate named ones. Also it seems to be among the more heavy and complex implementations (for good and bad).

[+] galaxyLogic|5 years ago|reply
In dataflow programming (or something like that), how do you program a decision that depends on the result of some component?

It would seem like I need to suspend my computation then "ask" the result from some component, get the result back, and then alter my computation based on that result.

Pure flow-forward would not seem to support this easily. Or can it? Or does it come down to that we always will need BOTH sync- and async- functions? (unless of course we limit the problem domain)

[+] bmitc|5 years ago|reply
For dataflow programming, you introduce structures. For example, in LabVIEW, one has many types of structures such as for loops, while loops, case structures (for making decisions), event structures (for listening to and responding to user events), type specialization structures (for reacting, at development/compile time, to types in a dynamic way as a type of development/compile time polymorphism), conditional compile structures (for conditionally compiling different parts of code), and disable structures (for disabling certain parts of code).

Here's a simple case structure for a decision based upon a boolean value: https://imgur.com/a/YyzXVhU

This is really no different than the type of dataflow you have in other functional languages like F# or Racket. Values flow into special syntax forms such as if or match that allow branching.
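In a port-based network, one way to express the same decision is a component with two output ports: the value flows in, and the branch taken is simply which port it leaves on. The sketch below is an assumption-laden illustration in Python, not LabVIEW's case structure; `route`, `drain`, and the `DONE` marker are invented names.

```python
import queue

DONE = object()  # end-of-stream marker (an assumption of this sketch)


def route(in_port, predicate, out_true, out_false):
    """Decision component: forwards each packet to one of two output ports."""
    while (packet := in_port.get()) is not DONE:
        (out_true if predicate(packet) else out_false).put(packet)
    out_true.put(DONE)
    out_false.put(DONE)


def drain(port):
    """Demo helper: read a port until the end-of-stream marker."""
    items = []
    while (packet := port.get()) is not DONE:
        items.append(packet)
    return items


def demo(values):
    inbox, small, large = queue.Queue(), queue.Queue(), queue.Queue()
    for value in values:
        inbox.put(value)
    inbox.put(DONE)
    # Run inline for simplicity; the queues are unbounded, so nothing blocks.
    route(inbox, lambda v: v < 10, small, large)
    return drain(small), drain(large)


print(demo([3, 42, 7, 15]))  # ([3, 7], [42, 15])
```

Nothing is suspended and nothing is "asked": the downstream components attached to each port only ever see the packets meant for them.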

[+] pantsforbirds|5 years ago|reply
You can write pretty interesting data pipelines using akka streams with a very similar idea to FBP. It's probably not technically FBP, but the whole reactive stream implementation is similar in thought but allows for things like disparate data source speeds without blowing out part of the flow.
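The "disparate speeds without blowing out part of the flow" property can be sketched with a bounded buffer. This is not Akka Streams' API (which, as far as I know, signals demand without blocking); a blocking `queue.Queue` is used here only as a simple stand-in, and the function names are made up.

```python
import queue
import threading
import time


def fast_source(out_port, count):
    for i in range(count):
        out_port.put(i)  # blocks while the buffer is full: this is the backpressure
    out_port.put(None)  # None marks end of stream in this sketch


def slow_sink(in_port, seen, depths):
    while (item := in_port.get()) is not None:
        depths.append(in_port.qsize())  # record how much is waiting
        time.sleep(0.001)  # simulate a slow consumer
        seen.append(item)


def run(count=50, capacity=4):
    buffer = queue.Queue(maxsize=capacity)
    seen, depths = [], []
    producer = threading.Thread(target=fast_source, args=(buffer, count))
    consumer = threading.Thread(target=slow_sink, args=(buffer, seen, depths))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return seen, max(depths)


seen, deepest = run()
print(len(seen), deepest <= 4)  # 50 True
```

However fast the source is, the amount of in-flight data never exceeds the buffer capacity, and no items are dropped.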
[+] Ericson2314|5 years ago|reply
This is wildly underdeveloped, mathematically, but eventually with enough FP and category theory and things, we will get back there.

The problem is everyone wants to write little nodes and "just" wire them up. But all the real complexity is not in the nodes, but the nature of the wires and their composition.

We'll get there though, stay tuned.