How do you implement (in software) a long-running "business" process?
Think of any long-running process that you want to implement. A process with multiple steps, some of which might require waiting for external events or user input. A process that cannot be enclosed within the scope of a single DB transaction. Steps might fail, need retries or coordination, or have to wait for human input or for a notification from another system. Some steps may run in parallel, some might have to wait for others to complete. Examples:
- Processing a purchase order: a user places the order, which goes through payment processing, inventory/preparation, shipping, tracking, customer follow-up, etc.
- Support ticketing system: a customer reports an issue, it gets assigned, worked on, escalated, resolved. All while complying with "SLAs".
- Hiring new employees: candidates apply, are screened, interviewed, and evaluated; then come background checks, voting/approvals, the offer, and onboarding.
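To make the idea of a process that outlives a single transaction a bit more concrete, here is a minimal sketch of a purchase-order flow as a resumable process instance. Everything in it is an assumption made for illustration: the step names, the `payment_confirmed` event, and the `ProcessInstance` class are hypothetical, and JSON stands in for whatever store a real engine would use. It only shows the shape of the problem: run until a wait state, persist, and resume later when an external event arrives.

```python
import json
from dataclasses import dataclass, field, asdict

# Hypothetical steps for the purchase-order example.
STEPS = ["place_order", "await_payment", "prepare", "ship"]
# Wait states: step name -> external event that releases it (illustrative).
WAIT_STATES = {"await_payment": "payment_confirmed"}


@dataclass
class ProcessInstance:
    order_id: str
    position: int = 0                      # index of the next step to run
    history: list = field(default_factory=list)

    def run(self):
        """Advance until the process finishes or reaches a wait state."""
        while self.position < len(STEPS):
            step = STEPS[self.position]
            if step in WAIT_STATES:
                return f"waiting:{WAIT_STATES[step]}"
            self.history.append(step)      # a real step would do work here
            self.position += 1
        return "completed"

    def signal(self, event):
        """Deliver an external event; resume only if it matches the wait state."""
        step = STEPS[self.position] if self.position < len(STEPS) else None
        if WAIT_STATES.get(step) != event:
            return "ignored"
        self.history.append(f"{step}<-{event}")
        self.position += 1
        return self.run()

    def dump(self):
        """Serialize the instance so it can survive a restart."""
        return json.dumps(asdict(self))

    @classmethod
    def load(cls, raw):
        return cls(**json.loads(raw))


instance = ProcessInstance("order-1")
status = instance.run()                    # stops at the payment wait state
saved = instance.dump()                    # could be hours or days before the event
restored = ProcessInstance.load(saved)
print(status, "->", restored.signal("payment_confirmed"))
```

A real engine adds much more on top of this (timers, parallel branches, versioning, auditing), which is the part you would rather not write yourself.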
In the late 90s, a new category of software package started to appear in the "Enterprise" world: Workflow systems, and then BPM (Business Process Management) systems.
A BPM suite gives you tools to develop software solutions for those kinds of problems. They usually include software libraries, a DSL (domain-specific language), graphical modelers, runtime environments, and tools for auditing and monitoring.
Not only for "Business" processes
The word "business" may seem to limit the applicability of BPM systems. However, think about the following processes:
- The password recovery process behind that "forgot my password" link. See process diagram at the end of: Everything you ever wanted to know about building a secure password reset feature.
- Coordinating the provisioning of infrastructure in the cloud. See HashiCorp using Cadence for provisioning a Consul cluster
- Micro-services orchestration, "Sagas". This article by Bernd Rücker (founder of Camunda) describes Sagas from the point of view of BPM.
- Internet-of-Things: to coordinate behavior across devices
- "BigData": multi-step data extraction, loading, aggregation, transformation
- The build pipeline of a continuous integration system
These examples have less of a "business" flavor, but are equally likely to gain value from a BPM or workflow system. The core of the problem is still the same.
Workflow solutions for every problem
In the last decade there's been an explosion of "workflowish" software tools and libraries to tackle a wide range of needs in the "process" spectrum. Some are flexible and general, while others are more specialized. Here's just a short sample of the currently popular ones:
- "BPM" systems: JBoss jBPM, Activiti, Camunda, Flowable, Camunda's new highly scalable Zeebe (architecture comparable with Cadence).
- Data crunching, "dataflow" pipelines: Apache Airflow, Apache NiFi.
- Build pipelines: GitLab CI/CD, GitHub Actions, Jenkins, GoCD, Spinnaker.
- Service Orchestration: Netflix Conductor, Uber's Cadence (and its recent fork: Temporal), ING's Baker, Microsoft Azure Durable Functions, Amazon AWS Step Functions.
- ...and there are hundreds more. If you want to be overwhelmed, you may start here: Awesome Workflow Engines, Awesome Pipelines, Computational Data Analysis Workflow Systems.
What's in it for me?
So what value can you get from using such a tool? Some that come to mind:
- Easier implementation: Flow-specific APIs, DSLs, and runtime environment to implement long-running processes, with primitives for parallelism and coordination, scheduling, transaction boundaries, wait states, etc.
- Reliability: retries, state persistence, exception handling, compensation flows
- Observability: What process "instances" are running? status/progress? Timing?
- Metrics, analytics: Execution times. Process heat-maps. Identify bottlenecks. What's the distribution of jobs over time?
- Understanding and Communication: Most workflow tools provide direct graphical representations of the execution model. Not mere documentation diagrams: a direct projection of the actual implementation. Some tools generate the diagram from the executable model or code (e.g. most build pipelines), while others give you tools to do the modeling directly in a graphical environment (e.g. BPM tools based on the BPMN standard). It is very valuable to have executable graphical models that can be shared and understood by all stakeholders.
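The "Reliability" item above mentions retries and compensation flows. As a rough illustration of what an engine does for you there, here is a minimal sketch of a saga-style runner: each step is retried a few times, and if it still fails, the steps that already completed are compensated in reverse order. The function names (`with_retries`, `run_saga`) and the order steps are hypothetical, not the API of any of the tools listed above, and a real engine would also persist progress between steps.

```python
import time


def with_retries(action, attempts=3, delay_seconds=0.0):
    """Call action() up to `attempts` times, re-raising the last error."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)


def run_saga(steps, attempts=3):
    """Run (name, action, compensation) steps in order.

    Returns (ok, log). On failure, compensates completed steps in reverse.
    """
    log, completed = [], []
    for name, action, compensation in steps:
        try:
            with_retries(action, attempts)
            log.append(f"done:{name}")
            completed.append((name, compensation))
        except Exception as error:
            log.append(f"failed:{name}:{error}")
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"compensated:{done_name}")
            return False, log
    return True, log


# Illustrative steps: the payment gateway fails once, shipping always fails.
calls = {"charge": 0}

def reserve_stock():
    pass

def charge_card():
    calls["charge"] += 1
    if calls["charge"] < 2:
        raise RuntimeError("gateway timeout")

def ship():
    raise RuntimeError("no carrier")

def noop():
    pass

ok, log = run_saga([
    ("reserve", reserve_stock, noop),
    ("charge", charge_card, noop),   # compensation would be a refund
    ("ship", ship, noop),
])
print(ok, log)
```

In this run the charge succeeds on its second attempt, shipping exhausts its retries, and the charge and the reservation are then compensated, in that order.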
This year I find myself working on two projects that would benefit from a workflow engine. The main need in both of them is to "orchestrate" complex flows of internal and external services, requiring state persistence, retries, coordinated parallelism, and visibility for both end users and for internal operations. Cadence would be a great fit. Embedding a BPM library like Camunda was an easier start.
I used to say that BPM/Workflow engines should have seen broader adoption: despite being general tools (like, say, databases), their usage was limited to niche projects. It probably didn't help that most of the original BPM "suites" were proprietary full-stack solutions that required a lot of investment (both in money and technology lock-in).
Now I'm glad this is no longer the case: there have been many open and modular options available for years now, and workflow systems are now much more common.
Applied to the right problems (an important premise! like any tool really), a workflow engine can bring a lot of value, allowing you to provide better solutions with less effort.
My (condensed) story around BPMSs
I got deeply involved with BPM Systems early in my career, over 20 years ago. It was not a common term in the software industry back then. There were no BPM software systems on the market... at least nothing close to the vision of our company founders. Since then, I see "processes" everywhere.
At my first job (1997), I witnessed the initial efforts on an ambitious new software product. About a year or so after I joined that company, I found myself on the development team of jBPM. No, not JBoss' jBPM, that one came out many years later... after our product went through multiple identity crises, being renamed to eTopware, then Fuegotech, Fuego, ALBPM (after being acquired by BEA Systems) and then Oracle BPM (after Oracle's acquisition of BEA Systems).
The original version was actually implemented in C++. As soon as Java made its appearance, the project was ported: a very risky move at the time, which turned out pretty well in the long run but was VERY painful in the short term. All merit goes to Emilio (as usual). We are talking Java 1.0.2 here: with no JIT compilers, no native threads, shitty APIs, and all that when a typical workstation had 16 MB of RAM.
The company got funding from investors based in the USA, and soon after that I moved there. We flew together with Eduardo, and other teammates joined us later, including my current business partner Cay. Since then, I worked for almost ten years applying BPM solutions on multiple projects across different industries. I even wrote my undergraduate "thesis", back in 2002, about BPM systems and a methodology to apply them.
Over time, and especially after I left Oracle, my involvement with BPM systems slowly decreased. However, I still see processes, everywhere.