How to "unpivot" a table in BigQuery

Recently, I was presented with a problem in BigQuery: How do I make a table's metrics be returned as rows? Normally, metrics are presented as columns and dimensions as rows. But in this case, I needed the metrics to appear as rows too.

In short, I needed to "unpivot" a table like this:
Dimension1Dimension2Metric1Metric2Metric3
















to this:
Dimension1Dimension2Metric Data


Metric1


Metric2


Metric3


Metric1


Metric2


Metric3


Metric1


Metric2


Metric3

The solutions lay with three BigQuery features:
  • STRUCT type
  • ARRAYs
  • CROSS JOIN with the UNNEST operator

How to get unpivoted results in BigQuery

Let's say I have data like this in a table called "Campaign_Results":

RowSuperheroTitleDateImpressionsClicksConversions
1SupermanMan of Steel2013-06-14567009679619182538
2BatmanBatman v Superman: Dawn of Justice2016-03-25974252587629263812
3Wonder WomanWonder Woman2017-06-02484750642067423428

But the results need to be returned such that the "Impressions", "Clicks" and "Conversions" are in their own rows.

Organise metrics into a STRUCT

The BigQuery documentation doesn't really give a good explanation of what a STRUCT type is. But I've found that it is really a key-value object. If you're a programmer, then you would be familiar with such data types (called "object" in JavaScript, "dict" in Python, "hash" in Ruby, etc.)

Each metric is organised into one STRUCT that has one key, "Metric", and one value, "Data". So in my sample data where I have three Metrics:
  1. Impressions
  2. Clicks
  3. Conversions
Then for each row in my data, I have one STRUCT per metric like so:
  1. STRUCT('Impressions' AS Metric, Impressions AS Data)
  2. STRUCT('Clicks' AS Metric, Clicks AS Data)
  3. STRUCT('Conversions' AS Metric, Conversions AS Data)
This "unwraps" each metric into a pair of Metrics and Data like this:

RowSuperheroTitleDatef0_.Metricf0_.Dataf1_.Metricf1_.Dataf2_.Metricf2_.Data
1SupermanMan of Steel2013-06-14Impressions5670096Clicks796191Conversions82538
2BatmanBatman v Superman: Dawn of Justice2016-03-25Impressions9742525Clicks876292Conversions63812
3Wonder WomanWonder Woman2017-06-02Impressions4847506Clicks420674Conversions23428

Put all STRUCTs into an ARRAY

My next step is to re-arrange the metrics STRUCTs so that they are not "side-by-side", but "up-and-down". An ARRAY helps me do this. Just as with most regular programming languages, an array is a list of values. Unlike those arrays, BigQuery requires all of the values in an ARRAY to be of the same type. The BigQuery documentation describes more limitations of ARRAYs.

So for each row of my data, instead of having three separate STRUCTs, I have one ARRAY containing three STRUCTs like this:

[
    STRUCT('Impressions' AS Metric, Impressions AS Data),
    STRUCT('Clicks' AS Metric, Clicks AS Data),
    STRUCT('Conversions' AS Metric, Conversions AS Data)
] AS Metrics_Data

I have also named my ARRAY as "Metrics_Data". This will be used in the next, final step.

The ARRAY arranges my metrics in a "vertical" manner now:

RowSuperheroTitleDateMetrics_Data.MetricMetrics_Data.Data
1SupermanMan of Steel2013-06-14Impressions5670096




Clicks796191




Conversions82538
2BatmanBatman v Superman: Dawn of Justice2016-03-25Impressions9742525




Clicks876292




Conversions63812
3Wonder WomanWonder Woman2017-06-02Impressions4847506




Clicks420674




Conversions23428

Flatten the ARRAY with a CROSS JOIN and UNNEST

The first two steps organise my data into a "Campaign_Results_Metrics_Data" temporary table that is almost ready to be used.

The final step is to get the dimensions returned with every "row" of data. … Well, that's what non-BigQuery practitioners would say. In BigQuery terms, I want to "flatten" the data.

The BigQuery documentation describes how to perform this flattening, mentioned in the instructions for querying STRUCTs in an ARRAY. It involves a CROSS JOIN with BigQuery's own UNNEST operator.

So in my final SELECT statement, I CROSS JOIN my "Campaign_Results_Metrics_Data" temporary table with its "Metrics_Data" column (the ARRAY of STRUCTs):

SELECT
  ...,
  Metric_Data
FROM
  Campaign_Results_Metrics_Data
CROSS JOIN
  UNNEST(Campaign_Results_Metrics_Data.Metrics_Data) AS Metric_Data


That gets my data into the final form that satisfies my requirement:

RowSuperheroTitleDateMetric_Data.MetricMetric_Data.Data
1SupermanMan of Steel2013-06-14Impressions5670096
2SupermanMan of Steel2013-06-14Clicks796191
3SupermanMan of Steel2013-06-14Conversions82538
4BatmanBatman v Superman: Dawn of Justice2016-03-25Impressions9742525
5BatmanBatman v Superman: Dawn of Justice2016-03-25Clicks876292
6BatmanBatman v Superman: Dawn of Justice2016-03-25Conversions63812
7Wonder WomanWonder Woman2017-06-02Impressions4847506
8Wonder WomanWonder Woman2017-06-02Clicks420674
9Wonder WomanWonder Woman2017-06-02Conversions23428

This form should be sufficient to be used with programming code (e.g. Python, Google Apps Script) and visualisation programs (e.g. Tableau, Data Studio). The latter should also be able to handle aggregation (e.g. sums, counts, averages, etc.) of the unpivoted metrics without further scripting.

BigQuery Standard SQL Recipe for "Unpivoting" a Table


WITH
  My_Table_Metrics_Data AS (
  SELECT
    ...,
    [
        STRUCT('...' AS Metric, ... AS Data),
        STRUCT('...' AS Metric, ... AS Data),
    ] AS Metrics_Data
  FROM
    `My_Dataset.My_Table`
  WHERE
    ...
  )
SELECT
  ...,
  Metric_Data
FROM
  My_Table_Metrics_Data
CROSS JOIN
  UNNEST(My_Table_Metrics_Data.Metrics_Data) AS Metric_Data

Comments

  1. Awesomly helpful! That missing closing parenthesis in line 13 into the last example makes me feel it a bit tastier even.
    Thanks

    ReplyDelete
  2. This was extremely helpful. Clearest explanation I have found in 3 days. Thanks a lot!

    ReplyDelete
  3. Great post, how would you do the reverse of this essentially pivoting the end result into the inital one

    ReplyDelete
  4. This is EXACTLY what I needed. Thanks so much!

    ReplyDelete
  5. Hi, This is helpful. How to handle if I have many columns?

    ReplyDelete
  6. Hi, This is really helpful but how can we do pivot from long to wide table

    ReplyDelete
    Replies
    1. There are some solutions provided in other websites / blogs already. Search for "bigquery pivot table".

      Delete
  7. Thanks man! Super helped me!

    ReplyDelete
  8. You are amazing! Thank you a million times over!

    ReplyDelete
  9. Awesome content, thank you very much!

    ReplyDelete
  10. This was very useful for me, thank you very much!

    ReplyDelete
  11. Hi, I am not able to segregate the individual metric and Value in different columns. Could you please help?
    i am getting this value in my Metric_Data column
    {
    Metric: "Actuals",
    Data: 5000
    }

    ReplyDelete
    Replies
    1. Did you make an ARRAY out of your STRUCTs?

      Delete
  12. Thank's a lot! You solved my problem!

    ReplyDelete
  13. Excellent mate! Keep doing this kind of post!

    ReplyDelete
  14. Very helpful mate.. Exactly what i was looking for..

    ReplyDelete

Post a Comment

Popular posts from this blog

Querying Google Analytics data as flat tables from BigQuery: The Definitive Guide and Recipe

Setting up R Studio Server on Google Compute Engine