
Pipeline Exec

A Python module story

Posted by ibrahim on Jan. 8, 2020

Recently I found myself wanting to apply multiple filters to a large set of Django objects. These filters depended on several conditions that were far from trivial. It started as a small but important feature, so I began by writing a single ‘filter’ function with lots of loops and conditions. As you may imagine, this function ended up too large and too difficult to maintain and update. It was a key part of an early-stage product I was working on, one that required regular updates. Besides not being very ‘pythonic’, it was far from ideal.

After multiple update requests and back-and-forth communication with the client, I decided to refactor that key part. My goal was to make it not only more maintainable, but also easier to discuss with the client. It became obvious to me at that point that the code we write has a huge impact on how easily we can communicate with a client. In fact, you can even evaluate your code by how comfortable you are giving a mid-level product description to a project manager or a product owner. Code structure, organization and readability are key ingredients for a great sprint demo.

Back to the technical stuff. I started by splitting the function into smaller functions that I ended up calling funnels, because that’s what they really were. Each funnel produced a data subset from its input. That subset was fed to the next funnel as input, and so on, until the last one produced the final pipeline result. This was a tremendous help in improving the application, and also in describing the solution to the client. We only had to show them an image like the following:

[Image: sikilabs.com pipeline_exec]

There, an image was worth a thousand lines of code.
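For readers who prefer code to pictures, the funnel idea can be sketched in a few lines of plain Python. This is only an illustration of the concept, not the code from the project and not the pipeline-exec API: the `Customer` class stands in for a Django model instance, and `only_active`, `adults` and `run_funnels` are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable, Iterable, List


@dataclass
class Customer:
    # Hypothetical stand-in for a Django model instance.
    name: str
    age: int
    active: bool


# A funnel takes a list of objects and returns a subset of it.
Funnel = Callable[[List[Customer]], List[Customer]]


def only_active(customers: List[Customer]) -> List[Customer]:
    # First funnel: keep active customers only.
    return [c for c in customers if c.active]


def adults(customers: List[Customer]) -> List[Customer]:
    # Second funnel: keep customers aged 18 or more.
    return [c for c in customers if c.age >= 18]


def run_funnels(items: Iterable[Customer], funnels: List[Funnel]) -> List[Customer]:
    # Feed the output of each funnel to the next one, in order.
    return reduce(lambda current, funnel: funnel(current), funnels, list(items))


customers = [
    Customer("Ana", 34, True),
    Customer("Ben", 16, True),
    Customer("Cleo", 52, False),
]
result = run_funnels(customers, [only_active, adults])
print([c.name for c in result])  # ['Ana']
```

Each condition lives in its own small function, so adding, removing or reordering a rule means editing the list passed to `run_funnels` rather than untangling one large loop.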

Before doing that refactoring, I started as usual by checking online for an existing tool, to avoid reinventing the wheel, but none of the ones I found suited this case. That’s why I decided to go the extra mile, make the tool more generic and open source it.

pipeline-exec is available on GitHub and PyPI. It can be defined as an execution pipeline to apply to an object collection: a group of functions run sequentially against a list of objects. As described earlier, the original goal behind this project was to implement a framework to help create a funnel pipeline for Django model instances. We’re sure there can be many more uses.
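To make the "group of functions run sequentially against a list of objects" idea more concrete, here is a rough sketch of how such a pipeline object could look. To be clear, this is not the pipeline-exec API: `FunnelPipeline`, its `run` method and its `report` attribute are assumptions invented for this illustration, so check the project's README for the real interface. The sketch also records how many objects survive each step, which is the kind of number that is handy when walking a client through the funnels.

```python
from typing import Callable, Iterable, List, Tuple


class FunnelPipeline:
    """Hypothetical helper for illustration; not the pipeline-exec API."""

    def __init__(self, funnels: List[Callable[[list], list]]) -> None:
        self.funnels = list(funnels)
        # (step name, number of objects left after that step)
        self.report: List[Tuple[str, int]] = []

    def run(self, items: Iterable) -> list:
        current = list(items)
        self.report = [("input", len(current))]
        for funnel in self.funnels:
            # The output of one funnel becomes the input of the next.
            current = list(funnel(current))
            self.report.append((funnel.__name__, len(current)))
        return current


def even_numbers(numbers: list) -> list:
    return [n for n in numbers if n % 2 == 0]


def greater_than_ten(numbers: list) -> list:
    return [n for n in numbers if n > 10]


pipeline = FunnelPipeline([even_numbers, greater_than_ten])
final = pipeline.run(range(1, 21))

print(final)            # [12, 14, 16, 18, 20]
print(pipeline.report)  # [('input', 20), ('even_numbers', 10), ('greater_than_ten', 5)]
```

Plain integers are used here to keep the example self-contained; the same shape would work with a list of Django model instances.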

The tool is still at an early stage and some features are still missing, but it serves its purpose very well. Feel free to leave feedback and contribute.