-
Notifications
You must be signed in to change notification settings - Fork 6
First Tutorial Part 2: Basic Scenario
Previous | Main | Next |
---|---|---|
<-- Part 1 | Main |
Part 3 -->
|
Let’s start with a simple application that tracks surveys filled by people online. In this tutorial, we initially look at a fictitious satisfaction survey SAT_01 consisting of three questions (Q1 through Q3) with values from 1 to 10 (from least to most satisfied) and one question, Q4, with free form text for comments. The following tables are a very common representation of that data, where formId is a foreign key to another table containing form-level details such as for example, the data the form was entered, the user who entered the form etc...
refnum | type | fillDate | userRefnum |
---|---|---|---|
111 | SAT_01 | 2019-09-03 | 543567 |
222 | SAT_01 | 2019-09-03 | 634568 |
formRefnum | field | value |
---|---|---|
111 | Q1 | 10 |
111 | Q2 | 8 |
111 | Q3 | 8 |
111 | Q4 | Blah... |
222 | Q1 | 9 |
222 | Q2 | 7 |
222 | Q3 | 7 |
222 | Q4 | Bleh... |
For this tutorial, we’ll look at how to use that data for a variety of purposes. We’ll start with Pivoting. Pivoting data is one of the most powerful data preparation processes. A lot of data is presented in row-based format which becomes a lot easier to use when transformed into column-based data. Much survey of form-based data is available in row-based formats. In healthcare for example, assessment forms such as OASIS, CAHPS or HIS form the basis for many measures and having data presented as columns is critical. Many other situations, like rolling up various counts of things, are easier to digest in an easy-to-use column format.
Conceptually, the simplest way to think about Pivoting is as rotating your data 90 degrees so that rows become columns so that we then get something that looks like this:
formRefnum | formFillDate | formUserRefnum | Q1 | Q2 | Q3 | Q4 |
---|---|---|---|---|---|---|
111 | 2019-09-03 | 543567 | 10 | 8 | 8 | Blah... |
222 | 2019-09-03 | 634568 | 9 | 7 | 7 | Bleh... |
This works obviously when you know in advance the columns you want to create. In real life, hopefully, such data is often supported by some meta-data where you know explicitly the questions, types of values and so on. We have worked with some customers where some forms were frequently changed without any thought about versioning, making any type of pivoting, and standardized analytics, hard to achieve. This generally needs to be remediated organizationally.
<-- Main
<-- Previous: Part 1
--> Next: Part 3