Joram Barrez and Paul Holmes-Higgin
What we’re not going to do is suggest that software evolves like biology through random mutations, although we have seen some shockingly random code in our time [pause for someone to point at some of ours]. What we are going to suggest is that the combination of open source and an evolutionary approach to developing software can lead to strong, adaptable and long-lived products.
With open source, you put ideas out there in code, then adapt and adjust that code to the real world as developers try it in a wide range of uses and environments: open source is the only way to get that breadth and volume of feedback. For all software, as adaptations accumulate, you find an accretion of “cruft”, and adjustments become necessary that may even work against some of the original ideas. Usually at this point a new architectural version is created, built from the lessons of the previous versions. Often, this is an opportunity to dump some historical baggage and, all too often, it is more of a revolution than an evolution (how many of us remember the pain of “upgrading” from AngularJS to Angular 2?). These cycles of major architectural versions can be thought of as generations in the lifecycle of a piece of software. The ideal is that each generation brings great enhancements without forcing big changes on those who adopt it.
What you’ll also notice is that the initial generations cycle very quickly, as new concepts and their architectures need to change to meet early feedback, especially in open source. As a software product matures, the major architectural shifts happen far less frequently (and often only once in proprietary software). You then see smaller evolutionary changes, some of which can bring significant improvements; otherwise, stability and consistency are key. However, you still get carbuncles of code added to deal with problems that don’t quite fit the conceptual elegance of the architecture. At some point, the choice for the product is whether to keep accreting “fixes” or somehow manage a revolutionary evolution.
This revolutionary evolution was the approach the architects took for the most recent iteration of Flowable. At version 6.x, it represents the sixth generation in the evolution of open source Java BPM. The story of open source Java BPM really began with jBPM, then continued with Activiti and on to Flowable (though version numbers don’t always reflect the generation in all variants).
Back in 2003, jBPM 1.0 was released. It only ran in J2EE environments and used its own process definition language, jPDL. It was based on ideas from the work of Prof. Wil van der Aalst. As the first generation, this was the revolutionary start and early versions iterated quickly.
The technical landscape in 2003 was vastly different from today. Digging back in our own memories, the kind of projects we were doing at that time included applets and Swing desktop applications, with heavyweight server logic implemented in EJBs. These were early days for Spring (the “Expert One-on-One J2EE Design and Development” book by Rod Johnson came out in 2002 and the ideas in it evolved into what Spring would become).
Since then, these technologies have declined or died, with the exception of the Java/JVM platform, which was under the wing of Sun Microsystems at that time. And although the JVM’s imminent death has been proclaimed many times since its inception, it is still a dominant platform choice today. An impressive feat, more on which later.
Every BPM vendor at the time had their own process language and tooling, and jBPM was no exception with its lightweight jPDL language. It was truly a Wild West. Historical sidenote: probably unknown to many, there actually were efforts to standardize business process APIs on the JVM. JSR-207: Process Definition for Java was an attempt by a group of vendors to come to a consensus on what “processes on the JVM” meant.
In 2004, with version 2.0, jBPM joined JBoss. This release allowed the BPM engine to run in any Java environment, and the significant evolution was having the process runtime as a POJO. The power of BPM was now readily available to all Java developers.
Interestingly enough, the artefacts from versions 1.x and 2.x are still downloadable online (another great virtue of open source). Strolling through the code makes for quite an interesting tour. The switch to a process engine that ran on the JVM alone, without an application server, was the most influential change from the first to the second generation. This is still fundamental in the current versions of Flowable.
Some concepts, such as a clear goal to be able to unit test processes, continue to hold today. A handful of service APIs look vaguely familiar when compared to the current ones. The application server dominance from that era still shines through in many parts of this second generation, though. Not surprising, given the project was under the JBoss umbrella.
Soon after, in 2005, jBPM 3.0 introduced support for BPEL (the Betamax of BPM), which drove the evolution of the engine towards the concept of a “process virtual machine”. This generation was used widely enough that open source feedback ensured features and edge cases were added around the core implementation.
From an architectural point of view, the difference between versions 2 and 3 was huge. In terms of functionality, a leap was made in the kind of operations supported by the engine. It wasn’t just about using state machines in Java for BPM anymore, but about being a foundational framework for the plethora of modeling languages out there.
Persistence was done with Hibernate, which had the concept of a “session object”. This idea was also incorporated in the 3.0 design, which meant that all related interactions with the engine needed to be put in a contextual block. This block laid the foundations for what would become known as commands and command interceptors in later generations.
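To make the idea of a contextual block growing into commands and interceptors a little more concrete, here is a minimal Java sketch of the general pattern. It is an illustration under our own assumptions, not code from jBPM, Activiti or Flowable: the names are chosen for the example and are not meant to mirror any engine’s actual classes or signatures, and the “transaction” is only simulated by recording events.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: hypothetical names, not an actual engine API.
class CommandSketch {

    /** Context shared by everything that happens inside one command. */
    static final class CommandContext {
        final List<String> events = new ArrayList<>();
    }

    /** One unit of work against the engine. */
    interface Command<T> {
        T execute(CommandContext context);
    }

    /** A link in the chain that surrounds every command. */
    interface CommandInterceptor {
        <T> T execute(Command<T> command, CommandContext context);
    }

    /** Innermost link: just runs the command. */
    static final class CommandInvoker implements CommandInterceptor {
        @Override
        public <T> T execute(Command<T> command, CommandContext context) {
            return command.execute(context);
        }
    }

    /** Outer link: plays the role of the contextual block (a pretend transaction). */
    static final class TransactionInterceptor implements CommandInterceptor {
        private final CommandInterceptor next;

        TransactionInterceptor(CommandInterceptor next) {
            this.next = next;
        }

        @Override
        public <T> T execute(Command<T> command, CommandContext context) {
            context.events.add("begin");
            try {
                T result = next.execute(command, context);
                context.events.add("commit");
                return result;
            } catch (RuntimeException e) {
                context.events.add("rollback");
                throw e;
            }
        }
    }

    /** Runs one command through the chain and returns what was recorded. */
    static List<String> runDemo() {
        CommandContext context = new CommandContext();
        CommandInterceptor chain = new TransactionInterceptor(new CommandInvoker());
        Command<String> startProcess = ctx -> {
            ctx.events.add("start process");
            return "process-1";
        };
        String processId = chain.execute(startProcess, context);
        context.events.add("result=" + processId);
        return context.events;
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

The point of the design is that the caller never opens or closes the context by hand: every interaction is a command, and cross-cutting concerns are stacked around it as interceptors.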
After a bit more time, jBPM 4.0 reached alpha release in 2009. The major technical evolution was the Process Virtual Machine (PVM), now at the core of the engine. The business evolution was support for the emerging BPMN standard, alongside the already supported jPDL and BPEL. However, there was never a final release because the team left to start Activiti. The next release, jBPM 5.0, took a completely different approach to BPM, based on the general-purpose Drools rules engine.
Compared to the third generation, major architectural changes were made. The previous version had been used in such a wide range of industries that it had generated many ideas for improvement.
Probably the biggest difference was the switch to the stateless service APIs that are still fundamental in Flowable today. Another major change was the separation of runtime and history data, which had been intermixed before then. Splitting this up solved some scaling problems cleanly by keeping the runtime persistence small and fast.
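As a rough illustration of these two ideas, the following Java sketch shows a service that holds no per-instance state and a store in which runtime and history data are kept apart. It is a toy model under our own assumptions: the class names, the in-memory maps and the record format are hypothetical and do not reflect the actual schema or API of any of the engines discussed here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: hypothetical names, not an actual engine API or schema.
class RuntimeHistorySketch {

    /** Stand-in for the database, with runtime and history kept apart. */
    static final class Store {
        // Only instances that are still in flight: instance id -> current activity.
        final Map<String, String> runtime = new LinkedHashMap<>();
        // Append-only record of everything that has happened.
        final List<String> history = new ArrayList<>();
    }

    /** Stateless service: no per-instance fields, every call reads and writes the store. */
    static final class ProcessService {
        private final Store store;

        ProcessService(Store store) {
            this.store = store;
        }

        void start(String instanceId, String activity) {
            store.runtime.put(instanceId, activity);
            store.history.add(instanceId + " started at " + activity);
        }

        void complete(String instanceId) {
            // Finished instances leave the runtime data, which keeps it small.
            if (store.runtime.remove(instanceId) == null) {
                throw new IllegalArgumentException("Unknown instance: " + instanceId);
            }
            store.history.add(instanceId + " completed");
        }
    }

    /** Starts two instances, completes one, and returns the resulting store. */
    static Store runDemo() {
        Store store = new Store();
        ProcessService service = new ProcessService(store);
        service.start("order-1", "review");
        service.start("order-2", "review");
        service.complete("order-1");
        return store;
    }

    public static void main(String[] args) {
        Store store = runDemo();
        System.out.println("runtime=" + store.runtime.keySet() + ", history=" + store.history.size());
    }
}
```

After the demo runs, only the unfinished instance remains in the runtime data, while the history keeps all three records. Queries against in-flight work therefore never have to wade through completed instances.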
Although versions 3 and 4 were available on the JBoss platforms, many developers had been using them in Spring environments. By this time, Spring had a similar level of developer mindshare to its Java EE counterparts.
In 2010, the first release of Activiti was a completely new codebase built from the lessons of the first four generations. At the time, the LGPL license of jBPM meant this was not a trivial migration: schemas and APIs changed, although the conceptual model was an evolution. Switching to Apache licensing meant this would not need to be a problem for any future generations. Activiti was focused on supporting only BPMN and enhanced the PVM for performance and scale. During this period, a fork was taken from Activiti to create Camunda (currently version 7, although still based on the 5th generation process engine). A number of subsequent minor evolutions added dynamic changes to process parameters; persistence and job optimizations; and multi-tenancy.
The stateless service API of the fourth generation and the POJO approach of the second were worked out to the fullest. Developers could still use the process engine in EE environments, but there was no fundamental difference anymore between that and running it in other environments, such as Spring.
Worthy of note was the rise in that period of Java frameworks for “full application development”. It was the time of the Grails, Dropwizard and Play frameworks, and many others we’ve since forgotten. Spring Boot came out as one of the dominant choices (and still is at the time of writing). The fifth generation BPM architecture, focused on being lightweight, with simple APIs and easy integration into other frameworks, made it a natural fit.
It was early 2017 when Flowable 6.0 was released. There were a number of significant architectural changes, but at its heart was a move away from the PVM approach to a BPMN-native one. The difference and potential this brought are something for multiple blog posts, from truly dynamic process execution to complex process migration. Perhaps counter-intuitively, it also enabled higher performance. Another significant change was a completely abstract data source that opened the possibility of using NoSQL databases, such as MongoDB. Finally, the addition of a new case management engine based on CMMN added a sophisticated tool to model automation from a different perspective. It too deserves several additional posts to describe how it augments and enriches intelligent automation. All of this was done with minimal changes to APIs and schemas: for the paranoid, it even included an embedded 5th Gen micro-engine to allow in-flight processes to continue as they were. With minimal effort it was possible to adopt what is a revolutionary change in core BPM execution.
We mentioned the JVM in the section on the first generation. Throughout all this time, the JVM kept evolving by adding powerful features and deployment options, without losing backwards compatibility. An impressive feat of engineering, for sure. The JVM has continued to prove adaptable to the evolving technical landscape. It underlies many of the cloud tech giants’ services and has been shown to scale beyond what we thought feasible decades ago. Think of the recent rise of GraalVM, frameworks such as Quarkus (and others) and the serverless movement. The Flowable architecture has been able to benefit from these changes: for example, see our blog on running Flowable as a serverless function.
We’re not planning or expecting another generation anytime soon, as there are still plenty of advances possible based on the 6th generation architecture. We’re not planning a revolution either by creating a completely new product on a different architecture and losing all the hard-won benefits of over 15 years of evolution in the real world. We are planning evolutions to make it easier to exploit new technology environments, so we’re not sitting still. Trends in technology can often seem like clothing fashions: a different haircut and a few new layers on top make you look cool, but it doesn’t mean you can do your job any better 🙂