Thursday, June 27, 2013

Pivoting multiple columns with ordered names and attributes in sql server 2008 using the PIVOT function.

A friend of mine came across an issue where they needed to take data from a table, that had a label and a value, and then pivot it out into a single row and they were unsure of how to accomplish the goal.  So I created this little example to show them what needed to be done.

In the data they have a table of labeled attributes combined with a sort order.  The sort order is the priority of any individual attribute, and the attributes have labels, and for any foreign key has a value associated with that specific label.  

Key, AttributeName, Attribute Value, SortOrder

An example would be:

FKeyValue
SortOrder
ItemName
ItemValue
1
2
Item2
2
1
3
Item3
3
1
4
Item4
4
1
5
Item5
5
2
1
Item1
1
2
2
Item2
2
2
3
Item3
3
2
14
Item14
14
2
31
Item31
31
2
51
Item51
51

And they want the data to pivot to:


fkeyvalue
att1name
att1value
att2name
att2value
att3name
att3value
att4name
att4value
att5name
att5value
1
Item2
2
Item3
3
Item4
4
Item5
5
NULL
NULL
2
Item1
1
Item2
2
Item3
3
Item14
14
Item31
31

So how do you approach this problem?  Because SQL doesn't allow you to pivot multiple item data, you have to approach the problem in parts.  My solution is to pivot the data twice around the foreign key, and use a row number assigned at the beginning of the operations to manage the column names and column order the data is exported.

The idea is that you first create a table that has the order you want the data to be in, this gives you the attribute number, and aligns the data in the way you want it to appear in the columns.  The data after this initial sort step will look like:

FKeyValue
SortOrder
ItemName
ItemValue
row_num
1
2
Item2
2
1
1
3
Item3
3
2
1
4
Item4
4
3
1
5
Item5
5
4
2
1
Item1
1
1
2
2
Item2
2
2
2
3
Item3
3
3
2
14
Item14
14
4
2
31
Item31
31
5
2
51
Item51
51
6

Then you do two pivots, one pivot is to generate the name attribute columns, and the second pivot is to create the value attribute columns.  Then you join the columns back to their foreign key, and select the order you want the columns to be in.

So the first pivoted table data would look like:

FKeyValue
1
2
3
4
5
1
Item2
Item3
Item4
Item5
NULL
2
Item1
Item2
Item3
Item14
Item31

 And the second pivoted table will look like:


FKeyValue
1
2
3
4
5
1
2
3
4
5
NULL
2
1
2
3
14
31

And after you join both tables together, and do a bit of column renaming you get:


fkeyvalue
att1name
att1value
att2name
att2value
att3name
att3value
att4name
att4value
att5name
att5value
1
Item2
2
Item3
3
Item4
4
Item5
5
NULL
NULL
2
Item1
1
Item2
2
Item3
3
Item14
14
Item31
31

Below is a complete example, one done with temp tables, and one done with a CTE that also has a little bit of column renaming so you can see how to pull out appropriate column names dynamically using the row number from the data.


IF Object_id('tempdb..#tmp_multipivotexample') IS NOT NULL
  BEGIN
      DROP TABLE #tmp_multipivotexample
  END

/* First create some test data*/
CREATE TABLE #tmp_multipivotexample
  (
     fkeyvalue INT NOT NULL,
     sortorder INT,
     itemname  VARCHAR(255),
     itemvalue VARCHAR(255)
  )

INSERT INTO #tmp_multipivotexample
            (fkeyvalue,
             sortorder,
             itemname,
             itemvalue)
SELECT 1       AS KeyValue,
       2       AS SortOrder,
       'Item2' AS ItemName,
       '2'     AS ItemValue
UNION
SELECT 1       AS FKeyValue,
       4       AS SortOrder,
       'Item4' AS ItemName,
       '4'     AS ItemValue
UNION
SELECT 1       AS FKeyValue,
       5       AS SortOrder,
       'Item5' AS ItemName,
       '5'     AS ItemValue
UNION
SELECT 1       AS FKeyValue,
       3       AS SortOrder,
       'Item3' AS ItemName,
       '3'     AS ItemValue
UNION
SELECT 2       AS FKeyValue,
       1       AS SortOrder,
       'Item1' AS ItemName,
       '1'     AS ItemValue
UNION
SELECT 2       AS KeyValue,
       3       AS SortOrder,
       'Item3' AS ItemName,
       '3'     AS ItemValue
UNION
SELECT 2        AS FKeyValue,
       14       AS SortOrder,
       'Item14' AS ItemName,
       '14'     AS ItemValue
UNION
SELECT 2        AS FKeyValue,
       51       AS SortOrder,
       'Item51' AS ItemName,
       '51'     AS ItemValue
UNION
SELECT 2        AS FKeyValue,
       31       AS SortOrder,
       'Item31' AS ItemName,
       '31'     AS ItemValue
UNION
SELECT 2       AS FKeyValue,
       2       AS SortOrder,
       'Item2' AS ItemName,
       '2'     AS ItemValue


--select * from #tmp_multipivotexample
/* Second Now that we have data, lets grab the data and make faux row numbers for the data for each key */
IF Object_id('tempdb..#InitialSortStep') IS NOT NULL
  BEGIN
      DROP TABLE #initialsortstep
  END

SELECT *,
       Row_number()
         OVER(
           partition BY fkeyvalue
           ORDER BY sortorder ASC) row_num
INTO   #initialsortstep
FROM   #tmp_multipivotexample

--select * from #InitialSortStep

/* Now that we have the initial sort step, we can see it sorts the data and properly assigned attribute numbers
so lets do the first pivot */


IF Object_id('tempdb..#PivotedItemNameByFKeyValue') IS NOT NULL
  BEGIN
      DROP TABLE #pivoteditemnamebyfkeyvalue
  END

SELECT *
INTO   #pivoteditemnamebyfkeyvalue
FROM   (SELECT fkeyvalue,
               itemname,
               row_num
        FROM   #initialsortstep) AS InitPivot
       PIVOT ( Max(itemname)
             FOR row_num IN ([1],
                             [2],
                             [3],
                             [4],
                             [5]) ) AS pivoted

--select * from #pivoteditemnamebyfkeyvalue

/*  Now to produce the properly sorted item value data */

IF Object_id('tempdb..#PivotedItemValuesByFKeyValue') IS NOT NULL
  BEGIN
      DROP TABLE #pivoteditemvaluesbyfkeyvalue
  END

SELECT *
INTO   #pivoteditemvaluesbyfkeyvalue
FROM   (SELECT fkeyvalue,
               itemvalue,
               row_num
        FROM   #initialsortstep) AS InitPivot
       PIVOT ( Max(itemvalue)
             FOR row_num IN ([1],
                             [2],
                             [3],
                             [4],
                             [5]) ) AS pivoted

--select * from #pivoteditemvaluesbyfkeyvalue

/* now to bring all of it together */

SELECT PINBFV.fkeyvalue,
       PINBFV.[1] AS att1name,
       PINBFN.[1] AS att1value,
       PINBFV.[2] AS att2name,
       PINBFN.[2] AS att2value,
       PINBFV.[3] AS att3name,
       PINBFN.[3] AS att3value,
       PINBFV.[4] AS att4name,
       PINBFN.[4] AS att4value,
       PINBFV.[5] AS att5name,
       PINBFN.[5] AS att5value
FROM   #pivoteditemnamebyfkeyvalue PINBFV
       JOIN #pivoteditemvaluesbyfkeyvalue PINBFN
         ON PINBFV.fkeyvalue = PINBFN.fkeyvalue
/*now as a CTE, in this CTE I actually use the row number function to do column naming as well, and re-designed the pivot
I think this makes it more readable. */
;

WITH cte_initialsortstep
     AS (SELECT *,
                Row_number()
                  OVER(
                    partition BY fkeyvalue
                    ORDER BY sortorder ASC) row_num
         FROM   #tmp_multipivotexample),
     cte_pivoteditemnamebyfkeyvalue
     AS (SELECT *
         FROM   (SELECT fkeyvalue,
                        itemname,
                        'att' + Cast(row_num AS VARCHAR(10)) + 'name' AS Name
                 FROM   cte_initialsortstep) AS InitPivot
                PIVOT ( Max(itemname)
                      FOR name IN ([att1name],
                                   [att2name],
                                   [att3name],
                                   [att4name],
                                   [att5name]) ) AS pivoted),
     pivoteditemvaluesbyfkeyvalue
     AS (SELECT *
         FROM   (SELECT fkeyvalue,
                        itemvalue,
                        'att' + Cast(row_num AS VARCHAR(10))
                        + 'value' AS Value
                 FROM   #initialsortstep) AS InitPivot
                PIVOT ( Max(itemvalue)
                      FOR value IN ([att1value],
                                    [att2value],
                                    [att3value],
                                    [att4value],
                                    [att5value]) ) AS pivoted)
SELECT PINBFV.fkeyvalue,
       PINBFV.att1name  AS att1name,
       PINBFN.att1value AS att1value,
       PINBFV.att2name  AS att2name,
       PINBFN.att2value AS att2value,
       PINBFV.att3name  AS att3name,
       PINBFN.att3value AS att3value,
       PINBFV.att4name  AS att4name,
       PINBFN.att4value AS att4value,
       PINBFV.att5name  AS att5name,
       PINBFN.att5value AS att5value
FROM   cte_pivoteditemnamebyfkeyvalue PINBFV
       JOIN pivoteditemvaluesbyfkeyvalue PINBFN
         ON PINBFV.fkeyvalue = PINBFN.fkeyvalue