Concatenating Rows into String Lists in Tableau for #VisualizeNoMalaria

As part of the Tableau Foundation/PATH/Zambia Ministry of Health #VisualizeNoMalaria project to eliminate malaria in Zambia by 2020 we had a request to build a "report" that shows when health facilities have not submitted malaria incidence data so that administrators and staff can be notified in an easy-to-ready way. There are huge challenges for the staff on the ground that lead to delays in reporting such as dealing with other community health issues, lack of the necessary technology (such as spare parts for donated bicycles), etc. so tracking and notification is important. At this time the data is hand-gathered from Excel spreadsheets so it's time-consuming and error prone and automation is desired. A demo output is as follows (we're using Tableau's Superstore sample data for this, with State as a stand-in for the health facility and a week where there were sales as a stand in for a week of no reporting):

Screen Shot 2016-07-29 at 1.37.01 PM

There are other use cases for this in Tableau where we want to go beyond a tabular list such as building our own titles and tooltips, showing multiple values for a single element (such as all the product categories a customer has purchased), etc. Read on for how to do this yourself and learn a bit more about the different levels of detail of data that we work with in Tableau.

The challenge here is that the data is in a transaction level format, there's a record for each report, or in the case of Superstore each order item:

Screen Shot 2016-07-29 at 1.38.27 PM

And we need to convert that into the above image...and we can't get there by drag and drop. Now we can get a view like this in a few clicks:

Screen Shot 2016-07-29 at 1.39.30 PM

...but that view is huge, takes up a lot of space, and making a list of all the weeks of interest takes a lot of looking across and up. Also one key point here is that view has a cell for every combination of State & Week, when what we really want are 4 marks per State, i.e. one for each State & Year combination with the summary of all the weeks there were sales. So we need to work at multiple levels of detail at once in Tableau. The Superstore data is at the level of detail of order item and the end result (what I call the display level of detail) is at the level of State & Year.

Now if Tableau had a "concatenate a bunch of rows into string" function then we might be all set, but Tableau doesn't, we have to build that calculation. And that means that given how Tableau works we need to consider a third level of detail: the viz level of detail (vizLOD). This is made up of the dimensions on Rows, Columns, Pages, and the Marks Card:

Screen Shot 2016-07-29 at 1.43.17 PM

Beyond the fact that Tableau aggregates measures to the vizLOD we also really need the vizLOD as a third level of detail in this case becauser the data has a finer grain (order item) than the State & Week level that we need for the set of calculations we'll build. Then we'll use table calculations to aggregate across the weeks to get the State & Year level results. Now if you have some familiarity with table calculations and Level of Detail (LOD) expressions you might ask "Why not build this in an LOD expression?" The reason why is that one of the few things LODs can't presently do (as of Tableau 10.0) that table calculations can do is iterate over values.

So to work at these different levels of detail what we need to do is:

  • Build a view that in its vizLOD has the necessary dimensions for the display LOD (State & Year) and to get the calculations to work (Week).
  • Identify runs of sequential weeks.
  • One week at a time build out the strings to display.
  • Present the desired display LOD (State & Year) for the final strings by filtering out unneeded marks.

In the rest of the post I'll go through the steps.

Build the Workout View's Level of Detail

For the workout view I created a couple of custom dates for the Year and Week and put those on the Level of Detail Shelf, with ATTR() versions of them on Rows with the State dimension. This is a step to prevent unwanted data densification from the table calculations we'll be using.

Screen Shot 2016-07-29 at 1.44.35 PM

Identify Runs of Sequential Weeks

The technique is based on one I built for doing statistical process control charts http://drawingwithnumbers.artisart.org/user-defined-functions-and-process-control-charts/ where a table calc using PREVIOUS_VALUE() and LOOKUP() is used to detect runs of sequential weeks.

//if this week # succeeds prior week then creating a running sum IF MIN([Order Date (Week numbers)]) == LOOKUP(MIN([Order Date (Week numbers)]),-1) + 1 THEN PREVIOUS_VALUE(0)+1 //restart since we're at a new week ELSE 1 END

PREVIOUS_VALUE() is Tableau's only iterative function, it returns the value of this very measure from prior address aka row in the partition, so with a Compute Using on Week can use it in this case to build that running sum of the # of continuous weeks that will restart each Year and State:

Screen Shot 2016-07-29 at 1.45.13 PM

Note that PREVIOUS_VALUE() requires an argument that would be used if there was no other condition in the first address in the partition, in this case we're not using it so we use 0 because that has the desire data type and is neutral. You'll see below we do a similar thing for strings but use '' as an empty string.

One Week at a Time Build Out the Strings to Display

Now a variation on the building a string list calculation (worksheet #15) in http://drawingwithnumbers.artisart.org/the-next-n-table-calculation/ builds up the displayed string week by week (again using PREVIOUS_VALUE()), and also using a compute using on the Week, here's the formula:

//first row, return week number IF FIRST()==0 THEN STR(MIN([Order Date (Week numbers)])) //at a new run of a single week, so tack on this week number ELSEIF [# of Continuous Weeks] == 1 THEN PREVIOUS_VALUE('') + ', ' + STR(MIN([Order Date (Week numbers)])) //in a run of more than one week ELSEIF [# of Continuous Weeks] > 1 THEN //if we're at the last week then close it out IF LAST()==0 THEN PREVIOUS_VALUE('') + '-' + STR(MIN([Order Date (Week numbers)])) //if the next week is a new run then close it out ELSEIF LOOKUP([# of Continuous Weeks],1) == 1 THEN PREVIOUS_VALUE('') + '-' + STR(MIN([Order Date (Week numbers)])) //we're in the middle of run, preserve the existing string ELSE PREVIOUS_VALUE('') END END

The key bit is that last ELSE PREVIOUS_VALUE(''). The rest of the IF/THEN statement is testing for cases on where we're in the middle of a continuous run and skipping those values, like this for California in 2011 between weeks 2 and 6:

Screen Shot 2016-07-29 at 1.46.23 PM

Now the last address in each partition has the completed string, so that's what we want to show, as in the row for Alabama/2011/Week 53

Screen Shot 2016-07-29 at 1.46.49 PM

Presenting the Display LOD

What we want to do is filter out all the in-process weeks. Given that the last week for each State & Year combination can be different figuring out a regular filter would be a bit challenging *plus* this happens, here I've gotten selected all the weeks but Week 53 and excluded them from the view for Alabama/2011, instead of seeing the long string I'm only seeing week 53.

Screen Shot 2016-07-29 at 1.47.22 PM

The reason why has to do with Tableau's order of operations. Filters on dimensions like Week are processed *before* table calculations are computed. What we need is a filter that is applied after table calculations, and a table calculation filter does just that. In addition Tableau has a function which will help us identify the last week in each year, and it is helpfully called LAST(). The Week String Filter has a Compute Using on Week and the following formula:

LAST()==0

This function only returns 0 in the last address of each State & Year partition, to validate it I put it on Color where we can see that only those final marks are returning True:

Screen Shot 2016-07-29 at 1.47.57 PM

Building the View

Once the data, the calculations, and the workout view have been prepared now we can build the final view entirely with drag-and-drop:

Screen Shot 2016-07-29 at 1.48.38 PM

The only other option I used here was to turn off the tooltips for the Week dimension because it's not needed in the view.

Conclusion

This is a good example of how we can work at multiple levels of detail at once in Tableau. The data is at the level of detail of order item, the display LOD is at the level of State & Year, and the actual vizLOD is at the level of State, Year, and Week. Thinking about working this way in Tableau takes some practice, and it's a great skill to have in your data analysis toolbox. Here's the concatenating string lists workbook on Tableau Public.

In Zambia a version of this view will be sent to administrators, ideally saving them hours of time each month, and variations on this will be sent via email and text messaging to the the staff responsible for capturing the data via VizAlerts  that will essentially say "Hi there, please submit malaria reporting for missing weeks 5, 7, 10, thanks!"

If you'd like to help out the work to eliminate malaria in Zambia you don't have to write table calculations, you can help a mapping project, and even your kids can help, too!