About Me

My photo
Mumbai, Maharastra, India
He has more than 7.6 years of experience in the software development. He has spent most of the times in web/desktop application development. He has sound knowledge in various database concepts. You can reach him at viki.keshari@gmail.com https://www.linkedin.com/in/vikrammahapatra/ https://twitter.com/VikramMahapatra http://www.facebook.com/viki.keshari

Search This Blog

Friday, June 13, 2014

Distinct vs Group By, Which is faster?



Group By with all the columns in the group by behaves much the same as DISTINCT when no aggregation function is involved.
But my strong recommendation is

“Use Distinct, if you want distinct record set because it is meant for finding Distinct records, and since Group By is associated with aggregate function, use it when you have to do some aggregation.”

A famous quote which I remember. “You can use knife as a screw driver, but when you have screw driver don’t use knife” but at the same time see also the execution time for both the alternative before choosing your alternative.  

Here in this post, we will conclude which is the fastest way to find distinct record in table. For demonstration we are taking a sample table with few rows in it.

create table table1
(id int identity(1,1),
name varchar(10),
l_name varchar(10))

insert into table1
select 'Babu','Mohanty' union all
select 'Rinny','Das' union all
select 'Binny','Mohapatra' union all
select 'Babu','Mohanty' union all
select 'Binny','Mohapatra'

Now since our table is ready, we can do some odd of practical to bring some concrete on this case.

CASE 1:
We are trying to find the distinct name and l_name list.

select name,l_name from table1
group by name,l_name

select distinct name,l_name from table1
Both result the same distinct records. Let’s see the execution plan of both the query.
So we saw both the query produce exactly similar plan, so this case anyone alternative can be chosen.

CASE 2
In the below example we are again trying to find the distinct records with two alternative.
select name,
       getdate(),
from table1
group by name

select distinct
       name,
       getdate()
from table1

Lets see the execution plan for both the query to decide which one will be more effective.


So here we can see the query are producing different plan, but when it comes to cost both are taking exactly same cost to execute. So again you can use any alternative as far this example n environment is concern.

CASE 3

Again we are trying to find the distinct record with two kindda alternative.

select name,
        rank() over(order by name) as RowNumber
from table1
group by name

select distinct
       name,
       rank() over(order by name) as RowNumber
from table1

Lets see the execution plan to find which is more efficient way to find the distinct records.


So here we saw again it produce different plan but both taking same cost to execute, so again we can choose anyone in this respect.

Conclusion: It all on your various parameter which may affect the outcome cost of query, so better approach would be check the execution plan and select your alternative.

A coder has thousand reason to laugh but zero reason to cry! J
 

Post Reference: Vikram Aristocratic Elfin Share

No comments:

Post a Comment