blaubart.com

sidekiq_workflows: a workflows API on top of Sidekiq Pro

Sep 24, 2021
The Stanley Parable Adventure Line™ - Following Stanley

Sidekiq is one of the most popular background job processing systems in the Ruby ecosystem, and I’ve been using it for almost a decade in at least half a dozen projects. The basic version is open source and available under the LGPL license, while the paid Pro and Enterprise versions add functionality on top of it. One major feature introduced with Sidekiq Pro is batches.

While developing the software architecture for an application of a major client of mine, we discovered a pattern: many of the feature requirements boiled down to background processes involving complex job execution paths. Some prerequisite work needed to be performed before the main jobs could begin. Some of those jobs could be parallelized, while others required sequential execution. Finally, upon successful completion, follow-up workers that send notifications and trigger further work had to be launched.

The client was heavily invested in the Ruby ecosystem and already held a Sidekiq Pro license, so it stood to reason to build on the existing solutions.

The problem

I was looking for a way to cleanly define and describe workflows like this within Ruby code:

[Image: diagram of an example workflow]

A number of open-source Ruby gems already existed that extended Sidekiq to enable the definition and execution of such workflows. However, none of them integrated with Sidekiq Pro’s batches, opting to re-implement fundamental building blocks such as batches instead.

Yet the “batches” feature of Sidekiq Pro appeared well suited to the aforementioned requirements. While Sidekiq Pro’s batches are powerful, only a rather low-level API is provided to work with them. Take this example: it requires a lot of complex code scattered across various callbacks to enable a straightforward workflow. It is easy to make mistakes when writing such code, and it’s also hard to debug.
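To give a feel for how callback-driven sequencing spreads a single workflow across several classes, here is a minimal sketch in plain Ruby. It deliberately does not use Sidekiq Pro: ToyBatch, AfterB, AfterDAndE and run_toy_workflow are hypothetical stand-ins that run their jobs inline. With real batches, callbacks are registered via batch.on(:success, SomeCallbackClass, options) and invoked as on_success(status, options); the toy version only imitates that general shape.

```ruby
# Toy stand-in (hypothetical, NOT Sidekiq Pro): runs jobs inline,
# then notifies the registered callbacks.
class ToyBatch
  def initialize
    @jobs = []
    @callbacks = []
  end

  # Register an object whose #on_success is called once all jobs have run.
  def when_done(callback, options = {})
    @callbacks << [callback, options]
  end

  # Queue a unit of work.
  def add(&job)
    @jobs << job
  end

  # Run every job, then fire the callbacks in registration order.
  def run
    @jobs.each(&:call)
    @callbacks.each { |callback, options| callback.on_success(options) }
  end
end

# Callback 1: B has finished, so start D and E and register the next callback.
class AfterB
  def on_success(options)
    log = options[:log]
    batch = ToyBatch.new
    batch.when_done(AfterDAndE.new, options)
    batch.add { log << :D }
    batch.add { log << :E }
    batch.run
  end
end

# Callback 2: D and E have both finished, so run F.
class AfterDAndE
  def on_success(options)
    options[:log] << :F
  end
end

# Entry point: B, then D and E, then F.
def run_toy_workflow
  log = []
  batch = ToyBatch.new
  batch.when_done(AfterB.new, { log: log })
  batch.add { log << :B }
  batch.run
  log
end
```

Even in this tiny example, the order "B, then D and E, then F" cannot be read off in one place: it has to be pieced together from the entry point and two callback classes.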

The solution

Together with some colleagues, I began implementing an API that would abstract away the intricacies of Sidekiq Pro’s batches and provide an interface for defining workflows in a cleaner and simpler way. We ended up with a solution that we extracted and published as an open-source gem called sidekiq_workflows. A workflow can be defined as follows:

class A; include Sidekiq::Worker; def perform(x); end; end
class B; include Sidekiq::Worker; def perform(x, y); end; end
class C; include Sidekiq::Worker; def perform(x, y, z); end; end
class D; include Sidekiq::Worker; def perform(x); end; end
class E; include Sidekiq::Worker; def perform(x); end; end
class F; include Sidekiq::Worker; def perform; end; end

workflow = SidekiqWorkflows.build do
  perform(A, 'first param to perform')
  perform(B, 'first', 'second').then do
    perform(C, 'first', 'second', 'third')
    perform([
      {worker: D, payload: ['first']},
      {worker: E, payload: ['first']}
    ]).then do
      perform(F)
    end
  end
end
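For readers curious how a block-based DSL of this shape can be put together, the following is a minimal sketch in plain Ruby. It is not the sidekiq_workflows implementation: MiniWorkflow, Node and Builder are hypothetical names, symbols stand in for worker classes, and the sketch only builds an in-memory tree without enqueuing anything. It assumes that perform calls within the same block are siblings and that a then block nests follow-up steps under the step it is called on.

```ruby
# Hypothetical mini-DSL for illustration; NOT the sidekiq_workflows code.
module MiniWorkflow
  # One step in the tree: a group of workers plus the steps that follow it.
  Node = Struct.new(:workers, :children) do
    # Evaluate the block with this node as parent, nesting follow-up steps.
    def then(&block)
      Builder.new(self).instance_eval(&block)
      self
    end
  end

  # Evaluates a DSL block, attaching each `perform` to the given parent node.
  class Builder
    def initialize(parent)
      @parent = parent
    end

    # Accepts (worker, *args) or an array of { worker:, payload: } hashes.
    def perform(workers, *payload)
      group = workers.is_a?(Array) ? workers : [{ worker: workers, payload: payload }]
      node = Node.new(group, [])
      @parent.children << node
      node
    end
  end

  # Build a tree; the returned root node is an empty placeholder.
  def self.build(&block)
    root = Node.new([], [])
    Builder.new(root).instance_eval(&block)
    root
  end
end

# Usage: symbols stand in for the worker classes from the example above.
tree = MiniWorkflow.build do
  perform(:A, 'x')
  perform(:B, 'x', 'y').then do
    perform(:C, 'x', 'y', 'z')
    perform([{ worker: :D, payload: ['x'] }, { worker: :E, payload: ['x'] }]).then do
      perform(:F)
    end
  end
end
```

The key design choice in this sketch is that perform returns the node it just created, which is what makes chaining a then block onto it possible.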

The full description of functionality is available on GitHub. Since the workflow is implemented purely using Sidekiq Pro batches, the entire existing, well-tested and reliable batches API can be used for monitoring and manipulation as necessary, which I consider a crucial advantage over other existing solutions. It is also comparatively lightweight, employing only a thin abstraction layer on top of Sidekiq Pro’s batches.

At the time of writing, sidekiq_workflows has been used in production systems for millions of workflows (and billions of individual workers), so I consider the library very mature. I recently published version 1 of the gem, which now fully supports Sidekiq’s retry, success and death semantics as well. Any feedback or pull requests are appreciated!