About Me

Mumbai, Maharashtra, India
He has more than 7.6 years of experience in software development, most of it spent on web and desktop application development, and has sound knowledge of various database concepts. You can reach him at viki.keshari@gmail.com, https://www.linkedin.com/in/vikrammahapatra/, https://twitter.com/VikramMahapatra or http://www.facebook.com/viki.keshari


Saturday, November 22, 2014

Table Variable Nonclustered Index Not Keeping Correct Statistics

With a table variable, the optimizer fails to predict the right number of rows returned by the query and ends up with the wrong index selection. Let's try it out with an example: first we will see the behaviour with a #temp table, then we will check the @table variable of SQL Server 2014.

Here we are creating a temporary table with a clustered and a nonclustered index on it.

create table #IndexDemo  (
    id int identity primary key clustered,
    city varchar(60),
    stateprovinceid int,
    index ix_stateprovinceid nonclustered (stateprovinceid)
);
insert #IndexDemo (city,stateprovinceid)
select city, stateprovinceid
from person.address;
go

select city
from #IndexDemo
where stateprovinceid=1;

drop table #IndexDemo

In the above script we have a temp table with the stateprovinceid column as the nonclustered index key and id as the clustered index. Now let's look at the execution plan for the query:
select city
from #IndexDemo
where stateprovinceid=1;

[Execution plan screenshot: Index Seek on ix_stateprovinceid with a lookup into the clustered index, estimated rows = 25]

Looking at the execution plan, we can see it predicted 25 rows, so to retrieve them efficiently it uses the nonclustered index, which is fair enough, and then does a lookup into the clustered index to get the city column.
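
If the screenshot is not handy, the same estimate can also be checked in text form with SET STATISTICS PROFILE ON, the same switch used in the later posts below. This is only a minimal sketch; it assumes the AdventureWorks person.address data loaded above and must run before the drop table statement so #IndexDemo still exists.

set statistics profile on

select city
from #IndexDemo
where stateprovinceid=1;
-- The StmtText column should show the seek on ix_stateprovinceid plus a lookup
-- into the clustered index, and EstimateRows should be close to the 25 rows
-- mentioned above.

set statistics profile off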

Now let's implement the same thing using a SQL Server 2014 table variable.

declare @IndexDemo table (
       i int identity primary key clustered,
       city varchar(60),
       stateprovinceid int,
       index ix_stateprovinceid nonclustered (stateprovinceid)
);

insert @IndexDemo (city,stateprovinceid)
select city, stateprovinceid
from person.address;

select city
from @IndexDemo
where stateprovinceid=1;
go

Now let's check the execution plan:

[Execution plan screenshot: Clustered Index Scan on the table variable, estimated rows = 1]

Here the estimated row count shows 1, which is obviously wrong; going by the actual count it should use a nonclustered index seek, but oddly it uses a clustered index scan. So in our example it fails to utilize the nonclustered index.

Note: Be aware of this before you decide to go for a table variable if you are thinking of making use of the nonclustered index feature.
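
One hedged workaround sketch (not part of the original test) is to add OPTION (RECOMPILE), which lets the optimizer see the table variable's actual row count at compile time; with an accurate estimate it may then choose the seek on ix_stateprovinceid plus a lookup, though that still depends on normal costing.

-- Sketch only: must run in the same batch as the declare @IndexDemo statement above.
select city
from @IndexDemo
where stateprovinceid=1
option (recompile);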

Let my rhythm take me with you :)


Post Reference: Vikram Aristocratic Elfin Share

Saturday, July 5, 2014

No Statistic Update for Table Variable


Read Previous Post: Statistic for @Table Variable

In the previous post we checked for the presence of statistics for a @table variable; here we will dig a bit more into the execution plan to see whether the statistics used by the optimizer are up to date or stale.

Again we are using the same setup code to reach our conclusion. Here we are creating our table variable and inserting 5000 records into it.

set nocount on  

declare @tab_var table
(id int primary key clustered) 

declare @i int = 0    

begin tran
 while @i < 5000
 begin
  insert into @tab_var
  values (@i)
  set @i = @i + 1
 end
commit tran  

Let's set STATISTICS PROFILE ON to see whether any statistics are created for it.

set statistics profile on
select * from @tab_var --option(recompile)
set statistics profile off

Rows Executes StmtText                                   EstimateRows
---- -------- --------------------------------------------- ----------
5000 1        select * from @tab_var                        1        
5000 1        |--Clustered Index Scan(OBJECT:(@tab_var))    1        

From the output we can see it uses statistics, but the estimated row count (1) and the actual row count (5000) differ by a huge margin. This means the statistics were not updated by the insertion of 5000 records.

If that is so, let's force a recompile by adding OPTION (RECOMPILE) to the query and check the estimated row count again.

set statistics profile on
select * from @tab_var option(recompile)
set statistics profile off
Rows Executes StmtText                                   EstimateRows
---- -------- --------------------------------------------- ----------
5000 1        select * from @tab_var                        5000        
5000 1        |--Clustered Index Scan(OBJECT:(@tab_var))    5000        

Here we can see the estimated number of rows increases to 5000, which means the plan is now up to date.

So the conclusion we reach here is that statistics are available for a table variable, but they are not updated by DML operations.

So one needs to explicitly add OPTION (RECOMPILE) to make the statistics current for a table variable, and that is the difference between a table variable and a temp table.
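
For contrast, here is a hedged sketch of the same 5000-row test against a temp table (the name #tab_tmp is just illustrative). Because temp tables get real statistics and trigger recompiles when their data changes, the estimated row count should come out close to 5000 even without OPTION (RECOMPILE).

set nocount on

create table #tab_tmp
(id int primary key clustered)

declare @i int = 0

begin tran
 while @i < 5000
 begin
  insert into #tab_tmp
  values (@i)
  set @i = @i + 1
 end
commit tran

set statistics profile on
select * from #tab_tmp   -- EstimateRows is expected to be close to 5000 here
set statistics profile off

drop table #tab_tmp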

The joy of releasing your code is more pleasurable than keeping it with you



Post Reference: Vikram Aristocratic Elfin Share

Statistic for Table Variable


There is always a trade-off with every good thing, and when it comes to using temp tables versus table variables, you will see both camps promoting their case. In this post we won't set ourselves apart; we will take the same route and prove our finding with an example.

Statistics for table variables: this is a point that fuels the muddy fight between table variables and temp tables. In this post we will try to clear the ground on this difference.

Here we are creating a table variable with just a single column and inserting 5000 records into it.

set nocount on  

declare @tab_var table
(id int primary key clustered) 

declare @i int = 0    

begin tran
 while @i < 5000
 begin
  insert into @tab_var
  values (@i)
  set @i = @i + 1
 end
commit tran  

Now we are setting STATISTICS PROFILE ON to see whether any statistics are created for it.

set statistics profile on
select * from @tab_var --option(recompile)
set statistics profile off


Rows Executes StmtText                                   EstimateRows
---- -------- --------------------------------------------- ----------
5000 1        select * from @tab_var                        1        
5000 1        |--Clustered Index Scan(OBJECT:(@tab_var))    1        

Here we can see it produces an estimated row count. But the estimated row count is 1 whereas the actual row count is 5000, a big difference, so I can conclude this is a stale statistic for the table variable.

In the next post we will see a workaround for the stale statistics of a table variable.

Release of code is a paradise but release from a project is a saddle to the coder! :)
 

Post Reference: Vikram Aristocratic Elfin Share

Thursday, June 26, 2014

Logical Query Processing - Fuzzy Query with ORDER BY Clause



It happened today: a fellow programmer called me over, showed me a piece of T-SQL code and asked a few questions about it. I will try to replicate the same code here.
There was a query somewhat like this:

select sub_group,sub_group,name,name from #temp
where sub_group = 'Furniture'
order by sub_group

The query was failing, and he asked me to explain why. That's good, I find it interesting to explain the logic behind the scenes through logical query processing. Yeah ;) I didn't ask for tea this time for this :P

Let’s formulate the same scenario by creating a sample table.

create table #temp
(id int identity(1,1),
sub_group varchar(10),
name varchar(10))

Command(s) completed successfully.

Let's insert a few records into it.

insert into #temp
select 'Furniture','chair' union all
select 'Vehicle','Maruti' union all
select 'Furniture','Desk' union all
select 'Furniture','Dine' union all
select 'Vehicle','Honda' union all
select 'H.Vehicle','Mahindra'

Now we can run the same SELECT query. Let's fire the query and see the output.

select sub_group,sub_group,name,name from #temp
where sub_group = 'Furniture'
order by sub_group
Msg 209, Level 16, State 1, Line 32
Ambiguous column name 'sub_group'.

Now let's take the error and dig into the background of it. It says ambiguous column name.

If you look at the select list, we have referenced this column twice. Now if we walk through logical query processing for this query (the same order is annotated in the query sketch after these steps):
1st the FROM clause gets executed, i.e. #temp, and a row set is built up, say RS1.
2nd the WHERE clause gets executed, which filters the data down to the rows that have 'Furniture' as sub_group and creates a new row set, say RS2.
3rd the SELECT clause gets executed, which forms one more column with the same name sub_group. So the new row set has the columns "sub_group, sub_group, name, name"; call this row set RS3.
4th the ORDER BY clause gets its chance and operates on the RS3 row set. Now if you look at the query, the ORDER BY is on sub_group, so when it tries to order the RS3 data it finds two columns with the same name sub_group, cannot tell which one to pick, and thus throws the ambiguous column error.
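
Here is the same failing query annotated with that phase order, purely as a reading aid:

select sub_group,sub_group,name,name   -- 3rd: SELECT builds the output, with two columns named sub_group (RS3)
from #temp                             -- 1st: FROM produces the initial row set (RS1)
where sub_group = 'Furniture'          -- 2nd: WHERE keeps only the 'Furniture' rows (RS2)
order by sub_group                     -- 4th: ORDER BY runs on RS3, where the name sub_group is ambiguous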

The solution to this problem is to give an alias to one of the sub_group columns.

select sub_group,sub_group SB ,name,name from #temp
where sub_group = 'Furniture'
order by sub_group

Run the query and you will get your result.

sub_group  SB         name       name
---------- ---------- ---------- ----------
Furniture  Furniture  chair      chair
Furniture  Furniture  Desk       Desk
Furniture  Furniture  Dine       Dine


You just have to make time for your coding and keep it balanced :)
 

Post Reference: Vikram Aristocratic Elfin Share