About Me

Mumbai, Maharashtra, India
He has more than 7.6 years of experience in software development, most of it spent on web and desktop application development, and has sound knowledge of various database concepts. You can reach him at viki.keshari@gmail.com, https://www.linkedin.com/in/vikrammahapatra/, https://twitter.com/VikramMahapatra or http://www.facebook.com/viki.keshari


Saturday, November 22, 2014

Table Variable Nonclustered Index Not Keeping Correct Statistics

With a table variable, the optimizer fails to predict the right number of rows returned by the query and ends up with the wrong index selection. Let's try it out with an example: first we will see the behaviour with a #temp table, then we will check the @table variable of SQL Server 2014.

Here we are creating a temporary table with a clustered and a nonclustered index on it.

create table #IndexDemo  (
    id int identity primary key clustered,
    city varchar(60),
    stateprovinceid int,
    index ix_stateprovinceid nonclustered (stateprovinceid)
);
insert #IndexDemo (city,stateprovinceid)
select city, stateprovinceid
from person.address;
go

select city
from #IndexDemo
where stateprovinceid=1;

drop table #IndexDemo

In the above script we have a temp table with the stateprovinceid column as the nonclustered index key and id as the clustered index. Now let's look at the execution plan for the query:
select city
from #IndexDemo
where stateprovinceid=1;

[Execution plan screenshot: Index Seek on ix_stateprovinceid with a lookup into the clustered index, estimated rows = 25]

Looking at the execution plan, we can see it predicted 25 rows, so to retrieve them efficiently it uses the nonclustered index, which is fair enough, and then does a lookup into the clustered index to get the city column.
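
If the screenshot is not handy, the same estimate can also be checked in text form with SET STATISTICS PROFILE ON, the same switch used in the later posts below. This is only a minimal sketch; it assumes the AdventureWorks person.address data loaded above and must run before the drop table statement so #IndexDemo still exists.

set statistics profile on

select city
from #IndexDemo
where stateprovinceid=1;
-- The StmtText column should show the seek on ix_stateprovinceid plus a lookup
-- into the clustered index, and EstimateRows should be close to the 25 rows
-- mentioned above.

set statistics profile off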

Now let's implement the same thing using a SQL Server 2014 table variable.

declare @IndexDemo table (
       i int identity primary key clustered,
       city varchar(60),
       stateprovinceid int,
       index ix_stateprovinceid nonclustered (stateprovinceid)
);

insert @IndexDemo (city,stateprovinceid)
select city, stateprovinceid
from person.address;

select city
from @IndexDemo
where stateprovinceid=1;
go

Now let's check the execution plan:

[Execution plan screenshot: Clustered Index Scan on the table variable, estimated rows = 1]

Here the estimated row count shows 1, which is obviously wrong; going by the actual count it should use a nonclustered index seek, but oddly it uses a clustered index scan. So in our example it fails to utilize the nonclustered index.

Note: Be aware of this before you decide to go for a table variable if you are thinking of making use of the nonclustered index feature.
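
One hedged workaround sketch (not part of the original test) is to add OPTION (RECOMPILE), which lets the optimizer see the table variable's actual row count at compile time; with an accurate estimate it may then choose the seek on ix_stateprovinceid plus a lookup, though that still depends on normal costing.

-- Sketch only: must run in the same batch as the declare @IndexDemo statement above.
select city
from @IndexDemo
where stateprovinceid=1
option (recompile);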

Let my rhythm take me with you :)


Post Reference: Vikram Aristocratic Elfin Share

Saturday, July 5, 2014

No Statistic Update for Table Variable


Read Previous Post: Statistic for @Table Variable

In the previous post we checked for the presence of statistics for a @table variable; here we will dig a bit more into the execution plan to see whether the statistics used by the optimizer are up to date or stale.

Again we are using the same setup code to reach our conclusion. Here we are creating our table variable and inserting 5000 records into it.

set nocount on  

declare @tab_var table
(id int primary key clustered) 

declare @i int = 0    

begin tran
 while @i < 5000
 begin
  insert into @tab_var
  values (@i)
  set @i = @i + 1
 end
commit tran  

Let's set STATISTICS PROFILE ON to see whether any statistics are created for it.

set statistics profile on
select * from @tab_var --option(recompile)
set statistics profile off

Rows Executes StmtText                                   EstimateRows
---- -------- --------------------------------------------- ----------
5000 1        select * from @tab_var                        1        
5000 1        |--Clustered Index Scan(OBJECT:(@tab_var))    1        

From the output we can see it uses statistics, but the estimated row count (1) and the actual row count (5000) differ by a huge margin. This means the statistics were not updated by the insertion of 5000 records.

If that is so, let's force a recompile by adding OPTION (RECOMPILE) to the query and check the estimated row count again.

set statistics profile on
select * from @tab_var option(recompile)
set statistics profile off
Rows Executes StmtText                                   EstimateRows
---- -------- --------------------------------------------- ----------
5000 1        select * from @tab_var                        5000        
5000 1        |--Clustered Index Scan(OBJECT:(@tab_var))    5000        

Here we can see the estimated number of rows increases to 5000, which means the plan is now up to date.

So the conclusion we reach here is that statistics are available for a table variable, but they are not updated by DML operations.

So one needs to explicitly add OPTION (RECOMPILE) to make the statistics current for a table variable, and that is the difference between a table variable and a temp table.
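
For contrast, here is a hedged sketch of the same 5000-row test against a temp table (the name #tab_tmp is just illustrative). Because temp tables get real statistics and trigger recompiles when their data changes, the estimated row count should come out close to 5000 even without OPTION (RECOMPILE).

set nocount on

create table #tab_tmp
(id int primary key clustered)

declare @i int = 0

begin tran
 while @i < 5000
 begin
  insert into #tab_tmp
  values (@i)
  set @i = @i + 1
 end
commit tran

set statistics profile on
select * from #tab_tmp   -- EstimateRows is expected to be close to 5000 here
set statistics profile off

drop table #tab_tmp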

The joy of releasing your code is more pleasurable than keeping it with you



Post Reference: Vikram Aristocratic Elfin Share

Statistic for Table Variable


There is always a trade-off with every good thing, and when it comes to using temp tables versus table variables, you will see both camps promoting their case. In this post we won't set ourselves apart; we will take the same route and prove our finding with an example.

Statistics for table variables: this is a point that fuels the muddy fight between table variables and temp tables. In this post we will try to clear the ground on this difference.

Here we are creating a table variable with just a single column and inserting 5000 records into it.

set nocount on  

declare @tab_var table
(id int primary key clustered) 

declare @i int = 0    

begin tran
 while @i < 5000
 begin
  insert into @tab_var
  values (@i)
  set @i = @i + 1
 end
commit tran  

Now we are setting STATISTICS PROFILE ON to see whether any statistics are created for it.

set statistics profile on
select * from @tab_var --option(recompile)
set statistics profile off


Rows Executes StmtText                                   EstimateRows
---- -------- --------------------------------------------- ----------
5000 1        select * from @tab_var                        1        
5000 1        |--Clustered Index Scan(OBJECT:(@tab_var))    1        

Here we can see it produces an estimated row count. But the estimated row count is 1 whereas the actual row count is 5000, a big difference, so I can conclude this is a stale statistic for the table variable.

In the next post we will see a workaround for the stale statistics of a table variable.

Release of code is a paradise but release from a project is a saddle to the coder! :)
 

Post Reference: Vikram Aristocratic Elfin Share

Thursday, June 26, 2014

Logical Query Processing - Fuzzy Query with ORDER BY Clause



It happened today: a fellow programmer called me over, showed me a piece of T-SQL code and asked a few questions about it. I will try to replicate the same code here.
There was a query somewhat like this:

select sub_group,sub_group,name,name from #temp
where sub_group = 'Furniture'
order by sub_group

The query was failing, and he asked me to explain why. That's good, I find it interesting to explain the logic behind the scenes through logical query processing. Yeah ;) I didn't ask for tea this time for this :P

Let’s formulate the same scenario by creating a sample table.

create table #temp
(id int identity(1,1),
sub_group varchar(10),
name varchar(10))

Command(s) completed successfully.

Let's insert a few records into it.

insert into #temp
select 'Furniture','chair' union all
select 'Vehicle','Maruti' union all
select 'Furniture','Desk' union all
select 'Furniture','Dine' union all
select 'Vehicle','Honda' union all
select 'H.Vehicle','Mahindra'

Now we can run the same SELECT query. Let's fire the query and see the output.

select sub_group,sub_group,name,name from #temp
where sub_group = 'Furniture'
order by sub_group
Msg 209, Level 16, State 1, Line 32
Ambiguous column name 'sub_group'.

Now let's take the error and dig into the background of it. It says ambiguous column name.

If you look at the select list, we have referenced this column twice. Now if we walk through logical query processing for this query (the same order is annotated in the query sketch after these steps):
1st the FROM clause gets executed, i.e. #temp, and a row set is built up, say RS1.
2nd the WHERE clause gets executed, which filters the data down to the rows that have 'Furniture' as sub_group and creates a new row set, say RS2.
3rd the SELECT clause gets executed, which forms one more column with the same name sub_group. So the new row set has the columns "sub_group, sub_group, name, name"; call this row set RS3.
4th the ORDER BY clause gets its chance and operates on the RS3 row set. Now if you look at the query, the ORDER BY is on sub_group, so when it tries to order the RS3 data it finds two columns with the same name sub_group, cannot tell which one to pick, and thus throws the ambiguous column error.
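
Here is the same failing query annotated with that phase order, purely as a reading aid:

select sub_group,sub_group,name,name   -- 3rd: SELECT builds the output, with two columns named sub_group (RS3)
from #temp                             -- 1st: FROM produces the initial row set (RS1)
where sub_group = 'Furniture'          -- 2nd: WHERE keeps only the 'Furniture' rows (RS2)
order by sub_group                     -- 4th: ORDER BY runs on RS3, where the name sub_group is ambiguous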

The solution to this problem is to give an alias to one of the sub_group columns.

select sub_group,sub_group SB ,name,name from #temp
where sub_group = 'Furniture'
order by sub_group

Run the query and you will get your result.

sub_group  SB         name       name
---------- ---------- ---------- ----------
Furniture  Furniture  chair      chair
Furniture  Furniture  Desk       Desk
Furniture  Furniture  Dine       Dine


You just have to make time for your coding and keep it balanced :)
 

Post Reference: Vikram Aristocratic Elfin Share