There are lot threads on various forum discussing automatic sorting behaviour of Group By clause, lets put extra mile to prove this myth incorrect.
So here is my query, I am running this dummy
query on AdventureWorks2012 database
select pc.Name, sum( psc.ProductSubcategoryID) as 'Sum of Sub Product Id' from
Production.ProductCategory pc
inner join Production.ProductSubcategory
psc
on pc.ProductCategoryID =
psc.ProductCategoryID
group by pc.Name
Name
Sum of Sub Product Id
--------------------------------------------------
---------------------
Accessories 378
Bikes 6
Clothing 172
Components 147
Let’s see from the prospective of execution
plan,
Now just think about a situation where instead
of Stream Aggregate, optimizer feels to use Hash Aggregate, which don’t require
its input to be in sorted order.
Let’s try to bring Hash Operator in the plan in
place of Stream Operator.
I am not able to stimulate the situation where
Hash Aggregate appears in plan. So lets play a tricky game with Optimizer, and
tell there is no such Stream aggregate operator present thus don’t make it use
for creating plan
You can disable Stream Aggregate operator by
DBCC TRACEON (3604);
DBCC RULEOFF('GbAggToStrm');
GO
But later after your operation don’t forget to Rule on for Stream
Aggregate operator.
DBCC TRACEON (3604);
DBCC RULEOFF('GbAggToStrm');
GO
select pc.Name, sum( psc.ProductSubcategoryID) as 'Sum of Sub Product Id' from
Production.ProductCategory pc
inner join Production.ProductSubcategory
psc
on pc.ProductCategoryID =
psc.ProductCategoryID
group by pc.Name
OPTION (RECOMPILE);
GO
DBCC RULEON('GbAggToStrm');
Name
Sum of Sub Product Id
--------------------------------------------------
---------------------
Bikes 6
Clothing 172
Accessories 378
Components 147
Here we can see the result is didn’t get sorted
by optimizer, lets see how optimizer executed this query.
We can see the Stream Aggregate operator
replaced by Hash Aggregate which don’t require its input to be in sorted manner
and that is the reason why our result resulted in unsorted manner.
So to be in safe side, if you want your group by
data in sorted manner do use order by clause along with group by, this way you
can guarantee your data to be in sorted manner.
DBCC TRACEON (3604);
DBCC RULEOFF('GbAggToStrm');
GO
select pc.Name, sum( psc.ProductSubcategoryID) as 'Sum of Sub Product Id' from
Production.ProductCategory pc
inner join Production.ProductSubcategory
psc
on pc.ProductCategoryID =
psc.ProductCategoryID
group by pc.Name
order by pc.Name
OPTION (RECOMPILE);
GO
DBCC RULEON('GbAggToStrm');
Name
Sum of Sub Product Id
--------------------------------------------------
---------------------
Accessories 378
Bikes 6
Clothing 172
Components 147
Conclusion: Always use order by along with Group By, if
you want your data to be in sorted manner.
SQL Server with a Tea glass makes a
perfect combination to deal with darkness :)
Post Reference: Vikram Aristocratic Elfin Share
No comments:
Post a Comment