top | item 14634947

Designing state machines

211 points| adipasquale | 8 years ago |drivy.engineering | reply

57 comments

order
[+] idbfs|8 years ago|reply
A paper that changed my approach to designing state machines is "Statecharts: A Visual Formalism for Complex Systems" by Harel (http://www.wisdom.weizmann.ac.il/~harel/papers/Statecharts.p...).

Statecharts (also called hierarchical state machines) are essentially generalized state machines which allow for nesting and parallel composition of states. The 'nesting' part is my favourite, since it allows one to delegate event handling logic shared by multiple states to a 'parent' state, reducing code duplication.

The great thing about this paper is that you can glean most of its key ideas by just looking at the diagrams.

[+] nickpsecurity|8 years ago|reply
Exactly. UML looked like over-complicated crap when I saw it after reading on Statecharts and Yourdon. The latter are still in use on occasion in high-assurance with Tenix's Dats Diode being an example.
[+] vanderZwan|8 years ago|reply
Funny how most visual programming languages (for example, puredata) look like these diagrams.
[+] activatedgeek|8 years ago|reply
While I understand the importance of state machines and its direct usage in complex Business Logic Workflows (and indirectly almost everywhere), there is nothing special in this post. It is a generic comment on State Machines. Am I missing something here?
[+] bri3d|8 years ago|reply
A surprising number of entrepreneurs and developers are unfamiliar with the application of state machines to business logic workflows, so it's useful to see the topic presented in a succinct way. Good learning content tends to get upvoted on HN, especially when the discussion around the concept (for example, the discussion of State Machines and state machine libraries in this thread) is interesting as well.
[+] rebootthesystem|8 years ago|reply
I am always surprised to see how many software developers I come across who have no experience whatsoever applying FSM's in the course of their work.

I can unequivocally say that FSM's have made me tons of money. By that I mean that they have simplified projects, made them far more robust, easily extensible, simple to modify and, in general terms, quicker to develop with less bugs and errors.

I've used FSM's in FPGA's (for example, optimized DDR memory controllers), embedded projects, robotics and system software. In other words, everywhere.

[+] ge96|8 years ago|reply
Is "state machines" a global term that includes finite state machines like how vending machines deal with varying coin amounts that go in to trigger the price and dispense proper change?

Googled looks like it is. That was a cool homework/project.

[+] zcdziura|8 years ago|reply
There was an article posted here a while back on how to implement a state machine in Rust, purely using its type system to facilitate state transitions. It's a neat concept, even if it's a bit more verbose than what you'll find in other languages.

https://hoverbear.org/2016/10/12/rust-state-machine-pattern/

[+] nuclx|8 years ago|reply
How does this compare to the expressivity of Idris' dependent types? I'm intrigued by the idea of type-safe state-machines as e.g. file i/o and network socket code is almost always error-prone. So it feels crucial to rely on the support of a strong type system here.

http://docs.idris-lang.org/en/latest/st/machines.html

[+] estsauver|8 years ago|reply
I can't recommend Akka's state machines enough. It's an incredibly simple and rock solid state machines built right into the library. (http://doc.akka.io/docs/akka/snapshot/scala/fsm.html) We're already on a Play/Scala backend so adding these was incredibly easy for us, but I think they're one of the hidden amazing features of the akka/scala world.
[+] saosebastiao|8 years ago|reply
Hmm. I've tried using them multiple times, but have ended up ditching them for one reason or another. There always seems to be one thing that makes it easier to just implement one via a regular akka actor or a regular scala class.
[+] djb_hackernews|8 years ago|reply
I just looked through and I may have missed it but do you know if this does distributed FSM management?
[+] keithnz|8 years ago|reply
I use it with akka.net, it's pretty cool
[+] karboosx|8 years ago|reply
There is good tool for making diagrams (so state machines as well) called mermaid[1], just by writing simple readable text.

[1] https://github.com/knsv/mermaid

[+] nerdponx|8 years ago|reply
I wanted to like Mermaid, then I realized it was just a neutered version of GraphViz. I had wanted to write a Mermaid-to-Graphviz compiler but never got around to it.

It produces pretty ugly graphs that can't be exported to any other format. Meanwhile just about ever graph manipulation program (e.g. Gephi) can handle GraphViz DOT format.

[+] troels|8 years ago|reply
When dealing with business processes, I feel that the evented nature of a formal fsm can be a bit off-putting. I have on several occasions used a batch-processing approach, that maintains state in a persistent record, giving many of the same characteristics as an fsm. Let's say we have our movie from the example (pseudo-code, obviously):

    class MovieState < Model
      belongs_to :movie
    end

    class InitialStep
      def select
        MovieState.where(state: 'initial')
      end

      def process(movie_state)
        movie_state.state = 'in_production'
      end
    end

    class InProductionStep
      def select
        MovieState.where(state: 'in_production')
      end

      def process(movie_state)
        if movie_state.movie.finished_shooting?
          movie_state.state = 'in_theaters'
        end
      end
    end

    class MovieWorkflow
      def initialize
        @steps = []
        @steps << InitialStep.new
        @steps << InProductionStep.new
      end

      def run
        @steps.each do |step|
          step.select.each do |state|
            step.process(state)
            state.save
          end
        end
      end
    end
This makes it very clear when things are happening. It's also easy to debug and test.

Of course, there are a lot of limitations, most notably that it can't deal with an async event happening. In my experience though, this can be dealt with by storing the payload of that event somewhere in the db, where the step-selector can access it on next run. One thing this model deals very well with, is delays, which is often part of business processes.

Any thoughts? Does this design have a more formal name?

[+] rebootthesystem|8 years ago|reply
That's the problem with kids these days. Everything has to be a friggin class.

You know we did some pretty amazing things with computers before OO. We even landed people on the Moon.

Can anyone program without OO any more?

Just saying.

[+] bushin|8 years ago|reply
Looks like an event loop.
[+] pwm|8 years ago|reply
I have recently created some dead simple code to explain the concept to colleagues whom, as the article notes, were not using this very simple and effective construct at all to control state transitions in business apps.

As the article does not really explain how state machines work let me try to do it here so maybe you will go back to your code and put one in place :) In general, if anywhere in your code you have some property/field called status/state/etc... that you update depending on events then chances are that a state machine would improve the robustness of your code.

Ok, so you have some states. For business apps this usually comes from business. For our dummy example we will simulate a day (using Haskell here for conciseness, but no worry, it's trivial and you don't need to know any Haskell to understand it):

    data State = Awake | Dressed | Fed | Ready | Work | Cafe | Home | Asleep
Now most apps stop here and thus they don't have a machinery to control how to move between these. If we were to draw a graph where vertices are the above states and edges are how we go from one to another then, without any restriction, we would implicitly get a complete graph where you can go from any state to any state. Eg. from "Ready" I could go straight to "Asleep" without "Work" :) Our aim is thus to restrict transitions from one state to another to transitions that actually make sense. Making sense obviously depends on the problem we are solving. For our example let's introduce the following events:

    data Event = Alarm | Dress | Eat | GoToWork | Think | Yawn | Coffee | GoHome | Read | WatchTV
Now that we have states and corresponding events we can draw a graph where vertices are states and edges are events. The graph is now explicit in that we can only move between states (vertices) via events (edges). To put this graph into code, let's create a function that takes a pair of state and event, eg. (Asleep, Alarm) and returns a new state, eg. Awake:

    myDay :: (State, Event) -> State
As said above, myDay is a function that takes a (state, event) pair, which we can also call a transition, and returns a state. This function is nothing more than a lookup table, aka. transition table, where we will write down how to transition between states:

    myDay (Asleep,  Alarm)    = Awake
    myDay (Awake,   Dress)    = Dressed
    myDay (Awake,   Eat)      = Fed
    myDay (Dressed, Eat)      = Ready
    myDay (Fed,     Dress)    = Ready
    myDay (Ready,   GoToWork) = Work
    myDay (Work,    Think)    = Work
    myDay (Work,    Yawn)     = Cafe
    myDay (Cafe,    Coffee)   = Work
    myDay (Work,    GoHome)   = Home
    myDay (Home,    Read)     = Asleep
    myDay (Home,    WatchTV)  = Asleep
    myDay (_, _)              = error "invalid state transition"
That's it, we have codified our graph. We explicitly stated the valid transitions and anything else is by definition invalid. We are pretty much done at this point. A state machine itself can be thought of as a generic function that takes a transition table, a transition and returns a state. This is only to decouple a specific transition table from the general machine itself, but it literally does nothing other than lookup whether the transition is in the table. If yes then it gives back the corresponding new state otherwise we get and error.

You can see a pretty picture of the above graph and some code here: https://github.com/pwm/fsm

[+] pavanto|8 years ago|reply
Brilliant. Superb explanation! Just couple of days I was trying to understand this from my team mates and it took him 30 mins to make it complicated.
[+] abakker|8 years ago|reply
Excellent explanation. Thanks!
[+] Dowwie|8 years ago|reply
Well done. The puml script was a nice addition!
[+] Eclyps|8 years ago|reply
For anyone else working in ruby, I've grown to love the "Workflow" gem[1] for my state machines. It really helps to streamline things, and also has a built-in way to generate a visual of your class's states and transitions.

[1] https://github.com/geekq/workflow

[+] tcopeland|8 years ago|reply
Looks interesting! I've always used https://github.com/aasm/aasm, but this looks pretty good.

More generally I get worried about my ActiveRecord objects getting too bulky and business logic getting all wound up in post-transition callbacks and such. But that's something that can be managed with service objects and being disciplined about triggering state changes.

[+] jugg1es|8 years ago|reply
Can't recommend state machines enough. A well designed state machine makes troubleshooting straightforward and makes adding new features a breeze without refactoring much.
[+] siddhant|8 years ago|reply
In case you're using Python, I can't recommend this library enough - https://github.com/pytransitions/transitions . An absolute pleasure to work with.
[+] carapace|8 years ago|reply
For goodness' sake, no! I just ripped that package out of our production code. It's terrible! It adds methods to your object at runtime, crazy stuff.

I replaced it with just this:

        class StateMachine(object):
	    '''
	    A very simple FSM class.

	    Params:
	        initial: Initial state
	        table: A dict (current, event) -> target
	    '''

	    def __init__(self, initial, table):
	        self.current_state = initial
	        self.state_table = table

	    def __call__(self, event):
	        '''Trigger one state transition.'''
	        self.current_state = self.state_table[self.current_state, event]


	class Foo(object):

	    STATE_TABLE = {
		(current_state, event): next_state,
		...
	    }

	    def __init__(self, xid, token, config):
		self.fsm = StateMachine('start', self.STATE_TABLE)

	    def begin(self):
	        while self.fsm.current_state not in {'success', 'error'}:
	            method = getattr(self, self.fsm.current_state)
	            self.fsm(method())
	        if self.fsm.current_state == 'error':
	            self.report_error()


If you're using Python anything more involved than a dict mapping (current, transition) -> next is just overkill. Use it, don't abuse it. ;-)
[+] jbritton|8 years ago|reply
I typically try to avoid state machines. I like to have a function like model = reduce(model, event) where model is an environment of variable:value bindings. This function then just uses (if expr/else if expr) to find a condition in which to process the event. Processing the event involves changing the values in the model environment. I suppose on some level the two approaches are the same, but they feel and look different to me. If the "if/else if" pattern is deeply nested, then this is kind of like traversing a state machine to reach the correct state for processing the event. So in this case maybe a state machine is more efficient. However, a state machine, involves a history. You really need an external diagram. You need to see how you get to a state and how you transition out of a state. An "if/else if" type approach puts all the decision making in the code right in front of me. I don't need a diagram. The "if/else if" approach is often much more concise as the expressions are a powerful modeling approach. Finally, I prefer to avoid deeply nested if/else if" conditions, because it can be a chore to traverse the tree and really understand the path taken to get to the event handling spot. If the conditions can be written as linear list of rules where each rule can be thought about in isolation, I find this easier. This often involves repeating test conditions. Just flatten the decision tree by repeating the preconditions. This is again less efficient, but if you save the results of the expression computations then it becomes a simple test of Boolean values. I like to think of this approach as a rules engine instead of a state machine. A state machine encodes information in the state diagram so that it doesn't need to be stored in variables. The rules engine encodes all the knowledge in variables and rules.
[+] rebootthesystem|8 years ago|reply
You might want to read-up on table driven state machines. I'm not sure you have a full view of what's possible.
[+] jrmiii|8 years ago|reply
In regard to tip 4, I've been using statesman[1] which allows you to store the transitions as table backed active record objects along with any contextual metadata.

Depending on how much auditability you need on why transitions happened, this may be preferable to papertrail.

[1] https://github.com/gocardless/statesman

[+] agopaul|8 years ago|reply
What a coincidence, I've been working on a PHP state machine implementation myself in the last few days.

Anyway, I find the concept of "events" a bit weird in this case and I prefer to work with "states" only (state transitions rather than events that change the state). It seems like an unnecessary abstraction to think in terms of an event rather than a state.

[+] pwm|8 years ago|reply
Ultimately state is a function of time. Events are thus temporal effects from the outside world that can, depending on their values, cause a transition from a state to another state within the machine. If you model the machine as a graph where vertices are states and edges are events then a transition can be seen as a function from a (state, event) pair to a state, ie. (CurrState, Event) -> NewState. I've posted a simple explanation below, check it out if you like.
[+] sandGorgon|8 years ago|reply
What about the DB side of things ? How do you model the state graph in the DB ?

I have been thinking of using Postgresql CTE to do this. When a cycle occurs, I just create a new step to the dB.

Anyone from CRM startups who have interesting learnings here ?

[+] adipasquale|8 years ago|reply
If you mean modeling the state graph itself, it's usually not modeled in the db but only in the code. It could indeed be interesting to store the graph itself if it evolves often, or at least a version number.

If you were speaking about storing the instances lifecycles, a simple model using a RDBS is to store one row per transition event in a separate table. This is what the papertrail[1] gem does for example.

[1] https://github.com/airblade/paper_trail

[+] felixge|8 years ago|reply
You can use user defined aggregates in postgres to model state machines. I'm using this technique quite successfully and have been meaning to write a blog post about it.
[+] Olap84|8 years ago|reply
Anyone got tips for working with FSMs in REST and/or MVC? Very rarely do you find HTTP verbs match FSM events leading to a fudge somewhere
[+] smizell|8 years ago|reply
You might find it useful to spend some time looking into the hypermedia constraint of REST itself (i.e. hypermedia as the engine of application state). The point of REST is to represent state machines by including hypermedia links in the responses of the API. Those links describe the state transitions a client may invoke in a given state.

Instead then of mapping CRUD operations to HTTP, you would map your application semantics to HTTP methods via link relations. Link relations let you go as far past CRUD as you would like to go depending on your domain and allow you to describe your states and state machine.

I always think a good way to think about a hypermedia state machine is in the context of HTML. Consider a todo app example. You might enter the application and get an empty list of todo items. If you have permissions to add a todo, you might have a "create todo" HTML form. Once you use that form, that todo has a new form called "mark complete." Once invoked, that todo has new transitions called "mark incomplete" or "archive." Once archived, you might see a new form for "unarchive." All of this captures the state machine and transitions in the REST API itself using domain-specific semantics.

Of course, there are other ways of solving this problem, but REST with hypermedia is a great way to work with state machines. There are lots of hypermedia JSON formats out there if you're interested in exploring (e.g. HAL, Siren, Collection+JSON, etc.).