How to avoid spaghetti-bowl Alteryx workflows and create readable ones
When I was growing up, spaghetti was one of my mom’s go-to meals. Not the fancy stuff. We’re talking boil the dry noodles and pour a jar of Ragu spaghetti sauce on it. After teaching first graders all day, I imagine the last thing she wanted to do was cook a fancy Michelin star restaurant meal. It was a quick and easy way to feed dad and us three kids.
Just as Ragu-sauced spaghetti is an easy way to feed a family, an Alteryx spaghetti workflow is easy to create, too.
What is an Alteryx spaghetti workflow?
I’m frequently asked to look over other people’s Alteryx workflows, and it feels like looking at a bowl of spaghetti. The connector lines go every which way, tools are placed haphazardly on the Canvas, and I have to start clicking on tools to try to see what they are contributing to the data flow.
I find these workflows disorganized and difficult to understand. It takes time to figure out what is going on, how data is being processed, and where certain changes are being made. This spaghetti system is a far cry from the easy-to-understand analytics that Alteryx can deliver.
But why is this important and what is the alternative?
I came to Alteryx from coding data ETL and analytical processes in Python. There is a saying by the creator of Python, Guido van Rossum: “Code is read much more often than it is written.”
Even though we are dragging and dropping in Alteryx, that quote still holds true. If I can’t follow your workflow logic, I’m going to have a harder time understanding what the workflow is accomplishing and how it is accomplishing its task. If I need to make changes, the time spent trying to understand the flow adds to both the effort and the frustration of making those changes.
I believe in clean, clear, beautiful Alteryx workflows. If I take the spaghetti workflow above and clean it up, just look at the end result. The spaghetti bowl is now organized into a delicious presentation.
So, how did I go about cleaning up the spaghetti?
And how do I develop my own workflows for readability? Here are three ideas I use for a more readable Alteryx workflow:
Idea 1. Designing based on horizontal main-flow lines
Idea 2. Grouping into containers and sub-containers
Idea 3. Using custom notations for tools
Let’s look at each of these.
Idea 1. Designing based on horizontal main-flow lines
Western reading is based on the tradition of left-to-right and top-to-bottom eye movement. Given our trained reading habits, it makes sense to design our workflows with that pattern. To replicate the reading eye movement, put the data inputs at the top left to quickly see where the workflow is starting. From there, readers can read from left to right to see how the data workflow proceeds. Aligning the tools and connectors with this thought is the first thing I work on.
As the first step, I align the tools based on primary, secondary, and tertiary contributions. The primary flow involves the main data set or data sets and the primary outputs. The secondary are enrichment data sources that add information to the main data flow, such as lookup tables. Secondary outputs may also exist for different purposes. Tertiary inputs and outputs I usually think of as temporary data checks.
Taking this one step results in this design.
As a side note, this step often helps me find ways to reduce the number of tools used and any logic problems or duplication of steps happening in the workflow.
And this leads us into idea 2, grouping into containers and sub-containers.
Idea 2. Grouping into containers and sub-containers
There is a reason books are broken up into chapters, paragraphs, and sentences. It facilitates reading. We can use that same idea by breaking the workflow into functional groupings inside containers. A person reading through the workflow can get a quick overview of the individual sections of the process, like chapters. More sub-containers can be added to break the workflow into smaller groups, like paragraphs.
Applying containers to the spaghetti workflow results in this design. Containers are colored here by inputs, outputs, spatial analyses and transformations, and non-spatial transformations. I may use smaller containers within these larger containers to further chunk up the logic for ease of understanding.
But what about the sentences?
Idea 3. Using custom notations for tools
Now that the reader is tracking through the inputs to transformation sections to outputs, we can make following the flow even easier by using notations. Notate each tool to describe how it fits into the transformation. Sentence by sentence, the reader can follow the flow.
Notice what this accomplishes. As we move left-to-right along the workflow flow lines, we group the tools into functional chapters in containers, and with good notation, we know what each of the tools is contributing to the data process. If we want the details, we can click on the tool to see how it’s accomplishing its task.
I’ll be honest. Notation is not my favorite thing. To help, I have a cheat sheet to get me started, such as:
Filter: “Remove” + record description OR “Retain” + record description; Examples: Remove blank records, Retain records with data, Retain records with [animal] = “Cat”
Summarize: Brief description of what is achieved by summary; Examples: Remove duplicates, Calculate number of facilities by state
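As a minimal sketch, the two cheat sheet patterns above could be written as small Python helpers. The function names filter_note and summarize_note are hypothetical and are not part of Alteryx; the output is only the text you would type into a tool's annotation.

```python
# Hypothetical helpers that build annotation text following the cheat sheet.
# Nothing here calls Alteryx; the strings are meant to be typed into a
# tool's annotation by hand.


def filter_note(action: str, record_description: str) -> str:
    """Filter: "Remove" or "Retain" + record description."""
    if action not in ("Remove", "Retain"):
        raise ValueError('action must be "Remove" or "Retain"')
    return f"{action} {record_description}"


def summarize_note(what_is_achieved: str) -> str:
    """Summarize: brief description of what the summary achieves."""
    note = what_is_achieved.strip()
    if not note:
        raise ValueError("describe what the summary achieves")
    # Capitalize only the first character so field names keep their casing.
    return note[0].upper() + note[1:]


if __name__ == "__main__":
    print(filter_note("Remove", "blank records"))
    print(filter_note("Retain", 'records with [animal] = "Cat"'))
    print(summarize_note("calculate number of facilities by state"))
```

Restricting the Filter note to two verbs is the point of the pattern: a reader can tell at a glance whether records are being kept or dropped.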
And that creates our final workflow.
Which brings us to timing…
Should you do this as you are developing the workflow, or after?
Actually, neither. For me, it’s a continual process as the workflow is developed and refined. Input and output tools are pretty solid from the beginning, but other containers may not become clear until after the workflow is 50-75% developed. Also, as development takes place, tools may be regrouped more logically at any point in the process, or even after the workflow is completed. Because notation isn’t my favorite thing, I put in notation as soon as the tool is configured and working properly, so it doesn’t pile up at the end.
My mom’s spaghetti was quick, easy, and did its job…
But it wouldn’t pass in a professional kitchen. As data professionals, let’s move beyond boiling dry noodles and a jar of Ragu spaghetti sauce. Let’s create Alteryx workflows worthy of Michelin star restaurants.
Summary
Whether or not Alteryx workflows are read and edited more often than they are created, it is good practice to make them easy to read and follow.
By using the three ideas of designing based on horizontal main-flow lines, grouping tools into containers and sub-containers, and using custom notations for tools, we can make Alteryx workflows much easier to read and edit later.
These ideas borrow from how readers approach a document: reading left-to-right and dividing a book into chapters, paragraphs, and sentences.
Applying these ideas is a continual process, from the very beginning of workflow creation, through development and refinement, to considerations after the workflow is complete.
How can you apply these three ideas to your workflows?