Stately
All blog posts

You don’t need a library for state machines

By David Khourshid on January 20, 2021Edit this page on GitHub

  • #finite state machine
  • #statechart
  • #xstate
  • #state machine

The finite state machine is one of the oldest models of computation in computer science. It’s older than the web, older than any programming language you can think of, and probably older than you. Just ask Mealy (1955) or Moore (1956). Finite state machines (FSMs) can be implemented in any modern language using control-flow statements, yet there’s most likely a state machine library (if not many) in all of those languages.

So do you need a library to create and interpret state machines in your programs?

No. But there are more things to consider.

You probably need state machines

If you’re unfamiliar with finite state machines (FSMs), they are a visual and mathematical way of modeling stateful logic using 3 main building blocks:

  • Finite states, which represent different behaviors
  • Events, which represent something that happened that can change state
  • Transitions, which represent how the state can change and what actions are executed when an event is received

States, events, and transitions

Anything that can be described as changes in state over time due to events, from component-specific logic to application flows and even the orchestration of multiple services can be described with state machines, to some extent.

A state machine might be a different, unfamiliar way of thinking about your application logic, but they’re extremely useful. Instead of approaching logic from a "bottom-up" perspective (imperatively doing things based on events), they take a "top-down" approach and primarily consider behaviors, which describe how the logic will react to events in a given finite state (such as loading, editing, disabled, etc.).

Because of their explicit, declarative nature, state machines force you to think about the entire flow of your logic (including all the edge-cases), and make it virtually impossible to end up in an ”impossible state”, as long as your model doesn’t allow it. Only defined transitions can happen; and if an unexpected transition happens, it means there is an implicit state machine where that transition does exist. The goal of state machines is to eliminate the implicit transitions so that we can know exactly what can happen in any state for any potential event.

State machines are not a solution for everything - just like anything else, they make sense for some use-cases (workflows, processes, modes, statuses, etc.) but not all use-cases. You shouldn’t use state machines everywhere, or even implement them explicitly all of the time (that’s what abstractions are for). They make a good refactor target, and they’re great for visually modeling your logic with pencil and paper, even if you ultimately decide not to use them in your code. But when working with logic that deals with explicit states, events, and transitions (which, surprise, tends to be the majority of app logic), state machines are a brilliant, natural solution.

There are so many other benefits to thinking in terms of states, events, and transitions, but that’s not the point of this post (but it is the point of another post I wrote). Let’s say you’re already convinced in using state machines in parts of your app. Should you reach for a library?

You don’t need a library for state machines

Since state machines are not a new concept and can be implemented in any modern language using built-in language features, it follows that state machine libraries are not necessary. Again, all you need are the 3 building blocks:

  • Finite states
  • Events
  • Transitions

The transitions are what tie everything together. Transitions are represented by a state-transition function that looks like this, mathematically:

𝛅 : 𝑆 𝗑 𝛴 → 𝑆

…which might not make sense (even if you do speak Greek). This might be more understandable:

transition : (state, event) => nextState

In JavaScript, we can represent this as a reducer, which is a function that reduces values (events) to a single accumulated value (state):

function transition(state, event) {
  // state machine goes here, which
  // determines the next state based on the
  // current state + received event
  // ...

  return nextState;
}

Now, let’s draw the rest of the owl implement the rest of the state machine!

Using switch statements

Typically, when we’re determining behavior ("what happens next"), we tend to decide what should happen next based on the event. The finite state is an after-thought, if it’s even a consideration at all. This leads to fragile logic, with if-statements strewn all over the place:

// ❌ Event-first approach
    switch (event.type) {
      case 'DATA_RECEIVED':
        // defensive programming
        if (state.isLoading) {
          // do something
        } else {
          // ...
        }
      }
      // ...
    }

In contrast, state machines group behavior by finite state and narrow down what happens next based on the event received:

// ✅ Finite-state-first approach
switch (state.status) {
  case "loading":
    // narrow based on event
    switch (event.type) {
      case "DATA_RECEIVED":
      // do something, and possibly
      // change the finite state
      // ...
    }
  // ...
}

As the author of the code, the event-first (bottom-up) approach might seem fine to you; after all, if it works, it works. One of the main advantages of taking a ”finite-state-first” (top-down) approach and using state machines is that the logic is not only more clear (since it’s grouped by finite state), it’s more robust: you can ensure that an event won’t be improperly handled in a state that it shouldn’t be handled in. In other words, you prevent impossible states and impossible transitions without having to litter your code with if-statements and excessive defensive programming.

I also like to think of state machines as a formal way of communicating logic. If you were describing the above logic, here’s how it would sound with an event-first approach:

When data is received, do something, but only if the "loading" flag is true.

And with a finite-state-first approach:

In the ”loading” state, when data is received, do something.

Which one sounds more natural and easy to understand? To me, there is less cognitive load with the 2nd statement. Reactions to events are grouped by behavior (finite state) rather than being ungrouped.

Using switch statements with functions

Since finite states can be considered a way to group behavior, another way you can organize your switch statements is by “grouping” each finite state’s behavior into a function:

// 'loading' behavior
    function loadingState(state, event) {
      // switch only on the event
      switch (event.type) {
        case 'DATA_RECEIVED':
          return {
            ...state,
            status: 'success'
          }
        }
        // ...
      }
    }

    function dataMachine(state, event) {
      switch (state.status) {
        case 'loading':
          // handle the event with 'loading' behavior
          return loadingState(state, event);
        }
        // ...
      }
    }

This approach is outlined in the Redux style guide recommendation: Treat Reducers as State Machines. It’s a very organized approach, and each “behavior function” can be individually tested, since they are isolated, pure reducers.

Using objects

Using nested switch statements may feel verbose, and while using functions to organize these switch statements may look cleaner, it’s more tedious. After all, a state transition can be considered a configuration of (at least) 2 things based on the event received:

  • The next finite state, if it changes
  • Any action(s) executed, if any

A simple, built-in way to represent such a configuration is an object. We can create an object structure where each ”state node” represents a finite state with transitions for each event accepted by the state:

const machine = {
  initial: "loading",
  states: {
    // A finite "state node"
    loading: {
      on: {
        // event types
        DATA_RECEIVED: {
          target: "success",
          // actions: [...]
        },
      },
    },
    // ...
  },
};
// ...

This is much more succinct than the nested switch statements! From here, determining the next state based on the current finite state and received event is two key lookups (the finite state and the event type):

// ...
function transition(state, event) {
  const nextStateNode = // lookup next finite state based on event type
    // lookup configuration for current finite state
    machine.states[state.status].on?.[event.type] ?? { target: state.status }; // if not handled, stay on current state

  return {
    ...state,
    status: nextStateNode.target,
  };
}

transition({ status: "loading" }, { type: "DATA_RECEIVED" });
// => { status: 'success', ... }

You might be wondering why I didn’t use an even simpler object here, which you definitely can do:

const transitions = {
  loading: {
    DATA_RECEIVED: "success",
  },
  success: {
    /* ... */
  },
};

function transition(state, event) {
  const nextStateTarget = transitions[state.status][event.type] ?? state.status;

  return {
    ...state,
    status: nextStateTarget,
  };
}

In fact, I would encourage the above implementation as sort of a “transition table lookup”; it works, and it’s simple enough. However, state machines deal with more than just the next finite state; if we want to encode actions (state machine terminology for effects), we need a place to put them, so a little bit more structure is necessary.

For instance, if our DATA_RECEIVED event returns data that we want to save in our overall state, it might be convenient to place that “assign to state” action directly in the machine:

const machine = {
  initial: "loading",
  states: {
    loading: {
      on: {
        // event types
        DATA_RECEIVED: {
          target: "success",
          // represents what "effects" should happen
          // as a result of taking this transition
          actions: [{ type: "saveData" }],
        },
      },
    },
    // ...
  },
};

function transition(state, event) {
  const nextStateNode = machine.states[state.status].on?.[event.type] ?? {
    target: state.status,
  };

  const nextState = {
    ...state,
    status: nextStateNode.target,
  };

  // go through the actions to determine
  // what should be done
  nextStateNode.actions?.forEach((action) => {
    if (action.type === "saveData") {
      nextState.data = event.data;
    }
  });

  return nextState;
}

The implementation above is very small, accomplishes everything we want from a state machine (for this use-case, at least), and as a bonus, you can copy-paste the machine object code directly into the XState Visualizer, even though it's not using XState, or any libraries, at all! (Tip: wrap the object in Machine({ ... }) to get it working).

Kent C. Dodds made a similar implementation is his post Implementing a Simple State Machine Library in JavaScript. It also takes advantage of using objects for describing the state machine structure.

State machines aren't enough

So if we can get our basic state management needs met with a small, declarative, library-free state machine implementation (either using switch statements or objects), why do we need libraries such as XState?

This might be a bit of a shock coming from me, but I’ll say it: state machines are not sufficient for managing and orchestrating state at scale. State machines suffer from a fundamental problem called state explosion: when the number of states in a state machine grow, the transitions between states also tend to grow, exponentially.

Thankfully, an extension to the traditional formalism of state machines, known as statecharts, was invented by Prof. David Harel and published in his paper Statecharts: A Visual Formalism for Complex Systems. The paper is full of diagrams and is quite readable; I strongly encourage you to read it.

You can think of statecharts as essentially being state machines (statecharts can be decomposed into FSMs) with some essential features for better state organization and real-world use-cases:

  • Hierarchy (nested states)
  • Orthogonality (parallel states)
  • History (remembered states)
  • State actions (entry, exit)
  • Guarded transitions
  • Extended state (contextual data)

Notably, the first two features (hierarchy and orthogonality) mitigate the state explosion problem by allowing state nodes to be grouped in a way that reduces the number of transitions necessary to fully express all possible transitions.

For example, if you were creating a state machine to represent editing and asynchronously saving some data, and you wanted to have shared behavior between some ”idle” (before saving) and “error” (failure after saving) state (e.g., SUBMIT to try/retry), then instead of having a flat state machine:

{
      idleNormal: {
        on: {
          SAVE: {
            target: 'saving',
            actions: [{ type: 'saveAsync' }]
          }
        }
      },
      saving: {/* ... */},
      idleError: {
        on: {
          SAVE: {
            target: 'saving',
            actions: [{ type: 'saveAsync' }]
          }
        }
      },
      // ...
    }

You can represent the shared behavior under the same parent state:

{
      idle: {
        // if child states don't handle these events,
        // handle it here, in the parent state
        on: {
          SAVE: {
            target: 'saving',
            actions: [{ type: 'saveAsync' }]
          }
        },
        initial: 'normal',
        states: {
          normal: {/* ... */},
          error: {/* ... */}
        }
      },
      saving: {/* ... */},
      // ...
    }

Overall, the features of statecharts are very useful in many different situations:

  • Nested states are useful for grouping and refining behavior. Different ”finite states” can all share behavior, while all having their own specific behavior.
  • Parallel states are useful for representing behaviors that can occur simultaneously, without directly affecting each other.
  • History states are useful for recalling which nested state the machine was previously in without having to specify all the possible ”remembering” transitions.
  • State actions are useful for specifying actions that should always be executed on any transition that enters/exits a state without having to specify those actions in all incoming/outgoing transitions.
  • Guarded transitions are very important for conditionally taking transitions based on more than just the state and event type. They can take other data (extended state) and/or event data into consideration, as well.
  • Extended state is absolutely necessary. Not all state is finite; ”infinite” state also needs to be quantified. Statecharts allow you to distinguish between finite and extended state.

There's even more features of classic statecharts, such as ”activities” (actions that occur throughout a state), delays, eventless transitions, wildcard transitions, and more. And the more you work with statecharts, the more you realize just how essential most of these features actually are.

Sounds like it would be fun to implement these features on top of our state machines, right?

Implementing statecharts

I hope you have a lot of free time.

Since statecharts are more powerful than state machines, they’re also harder to implement. If you’re really curious and/or eager to implement them yourself, I strongly recommend following the W3 SCXML (Statechart XML) spec. They even include an algorithm in pseudocode for proper SCXML interpretation.

Even implementing something as seemingly straightforward as nested states is a daunting task. There are many rules about selecting transitions, resolving conflicting transitions, traversing the state node tree to determine which nodes are being exited/entered, selecting transitions in compound states if leaf nodes don't handle the event, determining action order, etc. etc.

It’s not easy, and just like you would use a date library to deal with timezones, you definitely want to use a statechart library to deal with all the excellent features that statecharts support.

So do you need a library for statecharts?

Yes.

Closing thoughts

If you’re satisfied manipulating state at any time and sprinkling if-statements to patch up edge-cases, you probably don’t need explicit state machines.

If you want to use simple state machines to help organize app behavior and logic, you don’t need a library.

If you have complex logic and want to take advantage of more powerful state machine features to better manage this logic, you need statecharts.

And you definitely need a library for statecharts. 😉

Copyright © Stately, 2021