Overcoming the Limitation of PIVOT for Strings
By Kalman Toth, M.Phil. Physics, M.Phil. Comp. Science, MCDBA, MCITP
March 9 , 2008
One of SQL Server 2005 new features, the PIVOT operator, has two big limitations: it is not dynamic for the column list (requires hardwiring), and it does not work for string data, only for numeric. Here is an example from the AdventureWorks2008 sample database using the SUM numeric aggregate operator for PIVOTing:
USE AdventureWorks2008;
WITH cteOrder
AS (SELECT DATENAME(MONTH,OrderDate) AS 'Month',
COUNT(* ) AS 'OrderCount'
FROM Sales.SalesOrderHeader
WHERE YEAR(OrderDate) = 2003
GROUP BY DATENAME(MONTH,OrderDate))
SELECT Period = 'All times Sales Orders', *
FROM cteOrder
PIVOT
(SUM(OrderCount)
FOR [Month] IN ( [January],[February],[March],[April],[May],[June],
[July],[August],[September],[October],[November],[December] ) ) AS pvt
GO
This is the result set:
| Period |
January |
February |
March |
April |
May |
June |
July |
August |
September |
October |
November |
December |
| All times Sales Orders |
309 |
404 |
378 |
368 |
469 |
423 |
609 |
1760 |
1783 |
1779 |
1889 |
2272 |
In the above case the aggregate SUM functions works on numerical data and tabulates the numerical results for each month. Note that the month names are hard-wired, the month list cannot be replaced by a dynamic SELECT. That forces the use of dynamic SQL if hard-wiring is not a good fit which may be the case quite frequently.
What happens if the data is not numeric like the OrderCount in the case above? How can we get the SUM of string data? What is the sum of ‘green’ and ‘blue’? Well, there is no sum for string data, but the equivalent logical operation for business reporting is CONCATENATION. For readability reason the usual way to create the concatenation for business reporting is a comma separated list: ‘green, blue’ . Is that meaningful though? Yes most definitely: “The new dress has green and blue in it”. String lists are really useful and commonly used business reporting. For the PIVOT example the concatenation is for the sales staff names who handled purchase orders for certain date in a given region(territory). Quite a normal expectation in business reporting! On the downside for comma-limited lists, they cannot be too long, otherwise printing and formatting problems will arise. Potential solution is for example using First Initial and Last Name for sales staff to save space.
FOR XML Path has been augmented with the text() concatenation function for SQL Server 2005. The following examples will form a list of department names separated by commas:
USE AdventureWorks2008;
SELECT [Department List] =
isnull(stuff((SELECT ', ',
[text()] = Name
FROM HumanResources.Department
for XML Path ('')),1,1,''),'')
GO
-- Partial results: Document Control, Engineering, Executive,...
The great thing about this concatenation solution is that it does not require a T-SQL variable like an SQL Server 2000 solution did. That means it can be part of a single step query.
Since there is no summary function available for concatenation, we cannot use PIVOT, we use instead a 10-way self-join for the following territories:
Australia
Canada
Central (U.S.A.)
France
Germany
Northeast (U.S.A.)
Northwest (U.S.A.)
Southeast (U.S.A.)
Southwest (U.S.A.)
United Kingdom
This is how a self-join link looks:
left join ( select * from cteOrderDateXrossTerritory where Territory='Northeast' ) o6
on o0.OrderDate=o6.OrderDate
For the actual list selection of sales people who handled a purchase order for that territory on a particular day:
'Germany'=isnull(stuff((select ',', [text()]=o5x.SalesPerson from cteOrderDateXrossTerritory o5x
where o5x.OrderDate=o0.OrderDate and o5x.Territory='Germany' for XML Path ('')),1,1,''),''),
The entire query follows. It is not dynamic (but it can be enhanced into dynamic sql query), it is hard-wired! Yet it is so powerful that it may earn you a raise or promotion. If one does not do it, 10 or 20 BI report like this will! Corporate executives love crosstab reports because it helps them tremendously in business analysis. The power of the query comes from 2 CTE-s, concatenation with XML path for the sales staff in the Territory-OrderDate cell and left-joins which are performing the pivoting of string data. To package it for production, a dynamic SQL stored procedure would be in order.
USE AdventureWorks;
WITH cteOrderDateXrossTerritory
AS (SELECT DISTINCT OrderDate = convert(CHAR(10),OrderDate,112),
Territory = s.Name,
[SalesPerson] = FirstName + ' ' + LastName
FROM Sales.SalesOrderHeader soh
JOIN Person.Contact c
ON c.ContactID = soh.SalesPersonID
JOIN Sales.SalesTerritory s
ON s.TerritoryID = soh.TerritoryID),
cteAllOrderDates
AS (SELECT DISTINCT OrderDate = convert(CHAR(10),OrderDate,112)
FROM Sales.SalesOrderHeader soh
JOIN Sales.SalesTerritory s
ON s.TerritoryID = soh.TerritoryID
AND SalesPersonID IS NOT NULL)
SELECT DISTINCT o0.OrderDate,
'Australia' = isnull(stuff((SELECT ',',
[text()] = o1x.SalesPerson
FROM cteOrderDateXrossTerritory o1x
WHERE o1x.OrderDate = o0.OrderDate
AND o1x.Territory = 'Australia'
for XML Path ('')),1,1,''),''),
'Canada' = isnull(stuff((SELECT ',',
[text()] = o2x.SalesPerson
FROM cteOrderDateXrossTerritory o2x
WHERE o2x.OrderDate = o0.OrderDate
AND o2x.Territory = 'Canada'
for XML Path ('')),1,1,''),''),
'Central' = isnull(stuff((SELECT ',',
[text()] = o3x.SalesPerson
FROM cteOrderDateXrossTerritory o3x
WHERE o3x.OrderDate = o0.OrderDate
AND o3x.Territory = 'Central'
for XML Path ('')),1,1,''),''),
'France' = isnull(stuff((SELECT ',',
[text()] = o4x.SalesPerson
FROM cteOrderDateXrossTerritory o4x
WHERE o4x.OrderDate = o0.OrderDate
AND o4x.Territory = 'France'
for XML Path ('')),1,1,''),''),
'Germany' = isnull(stuff((SELECT ',',
[text()] = o5x.SalesPerson
FROM cteOrderDateXrossTerritory o5x
WHERE o5x.OrderDate = o0.OrderDate
AND o5x.Territory = 'Germany'
for XML Path ('')),1,1,''),''),
'Northeast' = isnull(stuff((SELECT ',',
[text()] = o6x.SalesPerson
FROM cteOrderDateXrossTerritory o6x
WHERE o6x.OrderDate = o0.OrderDate
AND o6x.Territory = 'Northeast'
for XML Path ('')),1,1,''),''),
'Northwest' = isnull(stuff((SELECT ',',
[text()] = o7x.SalesPerson
FROM cteOrderDateXrossTerritory o7x
WHERE o7x.OrderDate = o0.OrderDate
AND o7x.Territory = 'Northwest'
for XML Path ('')),1,1,''),''),
'Southeast' = isnull(stuff((SELECT ',',
[text()] = o8x.SalesPerson
FROM cteOrderDateXrossTerritory o8x
WHERE o8x.OrderDate = o0.OrderDate
AND o8x.Territory = 'Southeast'
for XML Path ('')),1,1,''),''),
'Southwest' = isnull(stuff((SELECT ',',
[text()] = o9x.SalesPerson
FROM cteOrderDateXrossTerritory o9x
WHERE o9x.OrderDate = o0.OrderDate
AND o9x.Territory = 'Southwest'
for XML Path ('')),1,1,''),''),
'United Kingdom' = isnull(stuff((SELECT ',',
[text()] = o10x.SalesPerson
FROM cteOrderDateXrossTerritory o10x
WHERE o10x.OrderDate = o0.OrderDate
AND o10x.Territory = 'United Kingdom'
for XML Path ('')),1,1,''),'')
FROM cteAllOrderDates o0
LEFT JOIN (SELECT *
FROM cteOrderDateXrossTerritory
WHERE Territory = 'Australia') o1
ON o0.OrderDate = o1.OrderDate
LEFT JOIN (SELECT *
FROM cteOrderDateXrossTerritory
WHERE Territory = 'Canada') o2
ON o0.OrderDate = o2.OrderDate
LEFT JOIN (SELECT *
FROM cteOrderDateXrossTerritory
WHERE Territory = 'Central') o3
ON o0.OrderDate = o3.OrderDate
LEFT JOIN (SELECT *
FROM cteOrderDateXrossTerritory
WHERE Territory = 'France') o4
ON o0.OrderDate = o4.OrderDate
LEFT JOIN (SELECT *
FROM cteOrderDateXrossTerritory
WHERE Territory = 'Germany') o5
ON o0.OrderDate = o5.OrderDate
LEFT JOIN (SELECT *
FROM cteOrderDateXrossTerritory
WHERE Territory = 'Northeast') o6
ON o0.OrderDate = o6.OrderDate
LEFT JOIN (SELECT *
FROM cteOrderDateXrossTerritory
WHERE Territory = 'Northwest') o7
ON o0.OrderDate = o7.OrderDate
LEFT JOIN (SELECT *
FROM cteOrderDateXrossTerritory
WHERE Territory = 'Southeast') o8
ON o0.OrderDate = o8.OrderDate
LEFT JOIN (SELECT *
FROM cteOrderDateXrossTerritory
WHERE Territory = 'Southwest') o9
ON o0.OrderDate = o9.OrderDate
LEFT JOIN (SELECT *
FROM cteOrderDateXrossTerritory
WHERE Territory = 'United Kingdom') o10
ON o0.OrderDate = o10.OrderDate
ORDER BY o0.OrderDate
GO
This is part of the result set:
OrderDate |
Australia |
Canada |
Central |
20010701 |
|
Carla Eldridge,Michael Emanuel |
Linda Ecoffey,Maciej Dusza |
20010801 |
|
Carla Eldridge,Michael Emanuel |
Jauna Elson,Linda Ecoffey,Maciej Dusza,Shelley Dyck |
20010901 |
|
Carla Eldridge,Gary Drury,Michael Emanuel |
Linda Ecoffey,Maciej Dusza,Shelley Dyck |
20011001 |
|
Carla Eldridge,Michael Emanuel |
Linda Ecoffey,Maciej Dusza |
------------
Related article:
SQL Server PIVOT examples?
|