Aristocratic Elfin Share

Friday, June 13, 2014

Distinct vs Group By, Which is faster?

Group By with all the columns in the group by behaves much the same as DISTINCT when no aggregation function is involved.

But my strong recommendation is

“Use Distinct, if you want distinct record set because it is meant for finding Distinct records, and since Group By is associated with aggregate function, use it when you have to do some aggregation.”

A famous quote which I remember. “You can use knife as a screw driver, but when you have screw driver don’t use knife” but at the same time see also the execution time for both the alternative before choosing your alternative.

Here in this post, we will conclude which is the fastest way to find distinct record in table. For demonstration we are taking a sample table with few rows in it.

create table table1

(id int identity(1,1),

name varchar(10),

l_name varchar(10))

insert into table1

select 'Babu','Mohanty' union all

select 'Rinny','Das' union all

select 'Binny','Mohapatra' union all

select 'Babu','Mohanty' union all

select 'Binny','Mohapatra'

Now since our table is ready, we can do some odd of practical to bring some concrete on this case.

CASE 1:

We are trying to find the distinct name and l_name list.

select name,l_name from table1

group by name,l_name

select distinct name,l_name from table1

Both result the same distinct records. Let’s see the execution plan of both the query.

So we saw both the query produce exactly similar plan, so this case anyone alternative can be chosen.

CASE 2

In the below example we are again trying to find the distinct records with two alternative.

select name,

getdate(),

from table1

group by name

select distinct

name,

getdate()

from table1

Lets see the execution plan for both the query to decide which one will be more effective.

So here we can see the query are producing different plan, but when it comes to cost both are taking exactly same cost to execute. So again you can use any alternative as far this example n environment is concern.

CASE 3

Again we are trying to find the distinct record with two kindda alternative.

select name,

rank() over(order by name) as RowNumber

from table1

group by name

select distinct

name,

rank() over(order by name) as RowNumber

from table1

Lets see the execution plan to find which is more efficient way to find the distinct records.

So here we saw again it produce different plan but both taking same cost to execute, so again we can choose anyone in this respect.

Conclusion: It all on your various parameter which may affect the outcome cost of query, so better approach would be check the execution plan and select your alternative.

A coder has thousand reason to laugh but zero reason to cry! J

Post Reference: Vikram Aristocratic Elfin Share

Thursday, June 12, 2014

Where Predicate in “JOIN ON” Clause

Today it happened that my junior Engineer came to my desk and asks me a good question, his statement was like this

“What difference it makes if we write filter condition with ON CLAUSE along with join”

His mean to say; All those condition which we write in “WHERE” clause, if we write all those condition along with join, will it make any difference in the output of query?

Let’s stimulate the same, by doing a small practical.

Here we are creating two table, table1 as parent table and table2 as child with common field for joining “id” in parent table and “p_id” as child field as foreign table.

create table table1

(id int primary key,

name varchar(10))

Command(s) completed successfully.

create table table2

(id int primary key,

address varchar(15),

p_id int references table1(id))

Command(s) completed successfully.

Let’s insert some records in table1

insert into table1

select 1,'Lopa' union all

select 2, 'Abhipsha' union all

select 3,'Rubi' union all

select 4,'Soumya' union all

select 5,'Diksha'

Now lets insert some child records in table2

insert into table2

select 1,'Vegas', 1 union all

select 2,'New Jersey', 2 union all

select 3,'San Diego',3 union all

select 4,'San Francisco',3 union all

select 5,'Los Angeles',5

select * from table1

id name

----------- ----------

1 Lopa

2 Abhipsha

3 Rubi

4 Soumya

5 Diksha

select * from table2

id address p_id

----------- --------------- -----------

1 Vegas 1

2 New Jersey 2

3 San Diego 3

4 San Francisco 3

5 Los Angeles 5

Lets write inner join query joining table1 with table2 on id of table1 with p_id of table2

select * from table1

inner join table2

on table1.id = table2.p_id

id name id address p_id

----------- ---------- ----------- --------------- -----------

1 Lopa 1 Vegas 1

2 Abhipsha 2 New Jersey 2

3 Rubi 3 San Diego 3

3 Rubi 4 San Francisco 3

5 Diksha 5 Los Angeles 5

Now here we will be adding a condition along with the join. The condition is the p_id should be 3. And the condition is written with WhERE clause.

select * from table1

inner join table2

on table1.id=table2.p_id

where table2.p_id=3

id name id address p_id

----------- ---------- ----------- --------------- -----------

3 Rubi 3 San Diego 3

3 Rubi 4 San Francisco 3

Now we will write the same condition p_id equals 3 but this time the condition is written along with the JOIN clause.

select * from table1

inner join table2

on table1.id = table2.p_id and table2.p_id=3

id name id address p_id

----------- ---------- ----------- --------------- -----------

3 Rubi 3 San Diego 3

3 Rubi 4 San Francisco 3

Here we saw writing condition either with JOIN clause or WHERE did not make any difference with INNER JOIN.

Lets try the same with OUTER join.

select * from table1

left outer join table2

on table1.id = table2.p_id

id name id address p_id

----------- ---------- ----------- --------------- -----------

1 Lopa 1 Vegas 1

2 Abhipsha 2 New Jersey 2

3 Rubi 3 San Diego 3

3 Rubi 4 San Francisco 3

4 Soumya NULL NULL NULL

5 Diksha 5 Los Angeles 5

Now we will put some filter criteria.

select * from table1

left outer join table2

on table1.id=table2.p_id

where table2.p_id=3

id name id address p_id

----------- ---------- ----------- --------------- -----------

3 Rubi 3 San Diego 3

3 Rubi 4 San Francisco 3

Here we saw after left outer join we have added a criteria of p_id =3… it filter the result set to p_id equals 3.

Now we will try to write the same query but remove the where predicate and write the table2.p_id=3 condition along with join clause and check the result.

select * from table1

left outer join table2

on table1.id = table2.p_id and table2.p_id=3

id name id address p_id

----------- ---------- ----------- --------------- -----------

1 Lopa NULL NULL NULL

2 Abhipsha NULL NULL NULL

3 Rubi 3 San Diego 3

3 Rubi 4 San Francisco 3

4 Soumya NULL NULL NULL

5 Diksha NULL NULL NULL

Result are not same in case of Outer Join.

Conclusion: According to Logical Query Processing, the ON clause executed first then the where clause, Now here with the above Outer Join example, the table2 was filtered with p_id =3 condition then the result was joined with table1. So be aware if you are writing condition with ON CLAUSE with OUTER JOIN.

Blend your code with TEA, toast up your logic and enjoy your life with programming forever! J

Post Reference: Vikram Aristocratic Elfin Share

Aristocratic Elfin Share

Pages

About Me

Search This Blog

Friday, June 13, 2014

Distinct vs Group By, Which is faster?

Thursday, June 12, 2014

Where Predicate in “JOIN ON” Clause