Pivots (turning a column’s values into actual columns) is a very common activity. Spreadsheet programs have robust support for it. But Standard SQL? Not so much.
Create Table orders ( orderNumber int, sku char(3), quantity int, salesPerson varchar(10) ); insert into orders values ( 1, 'ZR34', 2, 'Mary'), ( 1, 'AS99', 1, 'Mary'), ( 2, 'ZR34', 1, 'Jim'), ( 2, 'MB01', 1, 'Jim');
The ubiquitous order table with the SKU, quantity and sales person. To keep this simple, I did not normalize. If that bothers you, then think of the orders table as the results of joining between the all the bits.
The ask, is to product a report that shows, for each sales persons, how many of each SKU they sold.
sku Mary Jim Kiki AS99 1 0 0 MB01 0 1 0 ZR34 2 1 0
(Kiki was apparently on vacation.)
Now if your most people, you download the data to a spreadsheet and call it a day. But we’re not most people. We have a hammer (SQL) so we are going to hammer this flat. Plus, we know that we’ll get the same request next week, and the next, etc. And who has time for that?
If we were to do this in standard SQL, it would look like:
select sku, sum([Mary]) as "Mary", sum([Jim]) as "Jim", sum([Kiki]) as "Kiki" from ( select sku, case when salesPerson = 'Jim' then quantity else 0 end as [Jim], case when salesPerson = 'Mary' then quantity else 0 end as [Mary], case when salesPerson = 'Kiki' then quantity else 0 end as [Kiki] from orders ) as bySP group by bySP.sku
A new case statement is required for every salesperson. That’s no fun.
MSSQL has a pivot statement that makes this a bit less painful
select sku, [Mary], [Jim], [Kiki] from (select sku, quantity, salesPerson from orders) s pivot (sum(quantity) for salesPerson in ( [Mary], [Jim], [Kiki])) pvt
Some notes about the syntax:
- The alias for the pivot (eg
pvt) is required.
- The alias for the subselect is also required, even though it isn’t used.
- The values that form the in-list are not strings – they are column names.
- You can use * in the outer select’s return list.
You can use a bare table in the from clause, but be careful. Any column (like
sku) that is not aggregated or used as the pivot column becomes a defacto group-by. In our example
orderNumber becomes another row label
select * from orders pivot (sum(quantity) for salesPerson in ( [Mary], [Jim], [Kiki])) pvt
orderNumber sku Mary Jim Kiki 1 AS99 1 NULL NULL 2 MB01 NULL 1 NULL 1 ZR34 2 NULL NULL 2 ZR34 NULL 1 NULL
It would be nice if we could do something like this:
-- THIS DOESN'T WORK select * from (select sku, quantity, salesPerson from orders) s pivot sum(quantity) for salesPerson in ( select distinct salesPerson from orders ) as pt
But unfortunately, the list of values needs to be given explicitly. The one good thing about this is that you will have a column for a value, even if there is no rows that match (think of poor Kiki).
But this can be done using Dynamic SQL and exec. I’ve built dynamic queries such as this for the standard sql case and it is no fun. Doing so for the pivot operator is a piece of cake.
First create a
salesForce table so Kiki will make an appearance.
create table salesForce( name varchar(10) ); insert into salesForce values ('Kiki'), ('Mary'), ('Jim')
Then use a cursor to build our column list and presto!
declare sp cursor for select [name] from salesForce declare @list varchar(500) = '' declare @query varchar(1000) declare @aName varchar(10) open sp Fetch next from sp into @aName while @@FETCH_STATUS = 0 begin if @list != '' set @list = @list + ', ' set @list = @list + '[' + @aName + ']' fetch next from sp into @aName end close sp deallocate sp set @query = 'select sku, ' + @list + ' from (select sku, quantity, salesPerson from orders) s ' + 'pivot (sum(quantity) for salesPerson in (' + @list + ')) pvt' print @query exec(@query)
Unfortunately, to get rid of the nulls means keeping two parallel list – one for the select that has the
isnull and one for the pivot value list.
I hope this helps you use the pivot statement effectively.