The Event-Driven Workflow Layer is Infrastructure

Every non-trivial application eventually builds the same internal subsystem: something that listens for events, schedules work in response, tracks whether that work completed, and handles retries when it didn't. Email sends, payment processing, data pipeline triggers, post-signup onboarding sequences — all of these follow the same pattern. An event occurs. A function needs to run, possibly across multiple steps, with state held between them, with retries on failure, with some way to know if the whole thing succeeded or got stuck.
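To make the pattern concrete, here is a minimal in-memory sketch of it: an event is emitted, a registered function runs, failures are retried, and the outcome is recorded. It does not reflect any particular product's API; `on`, `emit`, `Run`, and `MAX_ATTEMPTS` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

MAX_ATTEMPTS = 3  # assumed retry budget for this sketch


@dataclass
class Run:
    event: str
    attempts: int = 0
    status: str = "pending"  # pending -> completed | failed
    state: dict = field(default_factory=dict)  # data carried between steps


handlers: Dict[str, List[Callable[[dict, dict], None]]] = {}
runs: List[Run] = []


def on(event_name: str):
    """Register a function to run whenever `event_name` is emitted."""
    def register(fn):
        handlers.setdefault(event_name, []).append(fn)
        return fn
    return register


def emit(event_name: str, payload: dict) -> None:
    """Run each registered handler, retrying on failure and recording the outcome."""
    for fn in handlers.get(event_name, []):
        run = Run(event=event_name)
        runs.append(run)
        while run.attempts < MAX_ATTEMPTS:
            run.attempts += 1
            try:
                fn(payload, run.state)
                run.status = "completed"
                break
            except Exception:
                continue  # retry until the budget is exhausted
        else:
            run.status = "failed"  # loop ended without a successful attempt


@on("user.signup")
def onboarding(payload: dict, state: dict) -> None:
    # Step 1: note that the welcome email went out.
    state.setdefault("welcome_sent", True)
    # Step 2: fail on the first attempt to simulate a flaky dependency.
    state["tries"] = state.get("tries", 0) + 1
    if state["tries"] < 2:
        raise RuntimeError("transient failure")
    state["crm_synced"] = True


emit("user.signup", {"email": "a@example.com"})
```

Everything here lives in process memory, so a crash loses every run record, and a retry re-executes the whole function from the top. Those two gaps are what a real implementation has to close.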

The engineering time spent building and maintaining these coordination systems is enormous and almost entirely undifferentiated. Teams rebuild them from scratch using some combination of message queues, background workers, database tables used as makeshift queues, and custom retry logic. Each implementation has its own failure modes, its own debugging surface, its own operational burden.

The case that this is a primitive

The argument that durable workflow coordination is a primitive — rather than a feature to be composed from existing tools — rests on a few observations.

First, the problem is universal across application domains. Payment processors, SaaS products, data platforms, marketplaces, developer tools — all of them eventually build a version of the same workflow coordination system. When you see the same problem appear at the same layer of the stack across fundamentally different application types, that's usually a sign the problem deserves its own abstraction.

Second, a correct implementation is genuinely hard to build. Durability guarantees (the assurance that a workflow will eventually complete even if the execution environment fails midway through) require careful thinking about state persistence, idempotency, execution semantics under partial failure, and clock drift. Most teams that build their own workflow coordination system do so without fully thinking through these properties. The result is a system that works well on the happy path and fails in subtle ways under load or partial failure.
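One common way to get durability is to checkpoint each step's result, so that a restarted run skips the steps that already completed. The sketch below illustrates that idea under simplifying assumptions: a local JSON file stands in for durable storage, and `DurableRun`, `step`, and `workflow` are hypothetical names, not a real library's API.

```python
import json
import os
import tempfile


class DurableRun:
    """Persists each step's result so a restarted run skips completed steps."""

    def __init__(self, path: str):
        self.path = path
        self.results = {}
        if os.path.exists(path):
            with open(path) as f:
                self.results = json.load(f)  # resume from the last checkpoint

    def step(self, name: str, fn):
        if name in self.results:  # completed in an earlier attempt
            return self.results[name]
        result = fn()  # may raise; nothing is recorded in that case
        self.results[name] = result
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:  # write, then rename, so a crash mid-write
            json.dump(self.results, f)  # never leaves a corrupt checkpoint
        os.replace(tmp, self.path)
        return result


charges = []  # stands in for an external side effect, e.g. a payment API


def workflow(run: DurableRun, crash_after_charge: bool) -> str:
    order = run.step("create_order", lambda: {"id": 42, "amount": 100})
    run.step("charge", lambda: charges.append(order["amount"]) or "charged")
    if crash_after_charge:
        raise RuntimeError("process died")
    return run.step("send_receipt", lambda: f"receipt for order {order['id']}")


path = os.path.join(tempfile.mkdtemp(), "run.json")
try:
    workflow(DurableRun(path), crash_after_charge=True)
except RuntimeError:
    pass  # simulated crash; the checkpoint file survives on disk

# A fresh process would construct a new DurableRun from the same path.
receipt = workflow(DurableRun(path), crash_after_charge=False)
```

After the second run, `charges` holds a single entry: the charge step was not repeated. This sketch still gives only at-least-once execution. A crash between `fn()` returning and the checkpoint being written would re-run that step, which is why the step's side effect must itself be idempotent.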

Third, the problem compounds. As an application grows, the number of workflows grows. Each workflow adds state that needs to be tracked, failure cases that need to be handled, dependencies that need to be understood. Companies that built their coordination layer on top of message queues and database tables find themselves, at scale, operating a custom distributed system that nobody wanted to own.

What the Inngest bet is

The bet behind investing in Inngest — and, in the broader category, the bet that workflow coordination becomes a standalone infrastructure layer — is that the cost of the custom-built approach eventually exceeds the cost of adopting a dedicated primitive, and that crossover happens earlier than most teams expect.

The relevant dynamic is that serverless and edge compute environments make the problem more acute, not less. When your execution environment is ephemeral, coordinating state across function invocations requires either very careful use of external storage or a purpose-built coordination layer. Serverless amplifies the need for workflow primitives precisely because the execution environment doesn't retain memory between calls.

Inngest's design reflects this. The coordination layer is external to the function execution environment — which is the correct architecture for serverless systems. Functions don't need to know about their own retry logic or state persistence; that's handled by the platform. This separation is what makes the abstraction clean and what makes it possible to run Inngest-coordinated functions across different execution environments.
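The separation can be illustrated with a toy model. This is not Inngest's SDK or wire protocol; `Coordinator`, `Handler`, and `onboarding` are hypothetical names, and a dictionary stands in for durable external storage. The handler keeps nothing between calls: each invocation receives the state saved so far and performs one step, while the coordinator owns persistence and retries.

```python
from typing import Callable, Dict, Tuple

# A handler is stateless: it receives the event plus all state saved so far
# and returns (updated_state, done). It holds nothing in memory between calls.
Handler = Callable[[dict, dict], Tuple[dict, bool]]


def onboarding(event: dict, state: dict) -> Tuple[dict, bool]:
    if "welcome" not in state:
        return {**state, "welcome": f"sent to {event['email']}"}, False
    if "profile" not in state:
        return {**state, "profile": "created"}, False
    return state, True  # every step is done


class Coordinator:
    """Owns state and retries; stands in for the external platform."""

    def __init__(self, max_attempts: int = 3):
        self.store: Dict[str, dict] = {}  # stand-in for durable storage
        self.max_attempts = max_attempts
        self.invocations = 0

    def run(self, run_id: str, handler: Handler, event: dict) -> dict:
        state = self.store.setdefault(run_id, {})
        done = False
        while not done:
            for attempt in range(self.max_attempts):
                self.invocations += 1
                try:
                    # Each call gets a fresh copy, as an ephemeral function would.
                    state, done = handler(event, dict(state))
                    break
                except Exception:
                    if attempt == self.max_attempts - 1:
                        raise  # retry budget exhausted
            self.store[run_id] = state  # checkpoint after every invocation
        return state


coordinator = Coordinator()
final = coordinator.run("run-1", onboarding, {"email": "a@example.com"})
```

The handler contains no retry or persistence code, so in principle the same function could be invoked from any environment able to receive the event and the saved state.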

The competitive question

The natural question is whether cloud providers will absorb this category. Each major provider has a workflow offering: AWS Step Functions, Azure Durable Functions, Google Cloud Workflows. What matters is whether these platform-native offerings put a ceiling on the independent category or whether the independent products can build durable differentiation.

Our read: provider-native workflow tools are optimized for their own execution environments and tend to have friction at the boundaries — when you want to run functions across providers, in edge environments, or with language-native ergonomics that don't map to the provider's model. Developer experience is also systematically better on the independent side — the open-source projects in this space attract contributors who are solving the problem because they find it interesting, which tends to produce better APIs over time.

The companies in this category that win will do so by having substantially better developer experience and a cleaner abstraction than the platform-native alternatives — and by establishing themselves in the developer workflow before teams have locked in the provider-native tool. That's the race, and it's winnable.