About Me

My photo
Mumbai, Maharastra, India
He has more than 7.6 years of experience in the software development. He has spent most of the times in web/desktop application development. He has sound knowledge in various database concepts. You can reach him at viki.keshari@gmail.com https://www.linkedin.com/in/vikrammahapatra/ https://twitter.com/VikramMahapatra http://www.facebook.com/viki.keshari

Search This Blog

Sunday, July 7, 2013

Advantage of having a non clustered index same as a clustered index on a table


I found this as a question in one of the forum and thought of coming up with an article with the same. His question was like this

Problem Statement: Is there any advantage of having a non clustered index same as a clustered index on a table?.
Here is a little background to his question question: In our current data warehouse framework (sql server version - 2005), In all of the tables we have one unique non clustered index same as a clustered index. So I was wondering if there is any advantage of doing that.
Solution: The only advantage which I could figure out is that the entries on leaf pages of nonclustered index are not as wide. They only contain index columns while the clustered index' leaf pages are the actual rows of data. Therefore, if you need something like select count (column_name) or count (*) from your_table then scanning the nonclustered index will involve considerably smaller number of data pages.
To prove our statement lets create a scenario, here we are creating table DupIndexDemo.

create table DupIndexDemo 
(id int identity(1,1), 
name varchar(200), 
ordered_qty int, 
date_getdate datetime default getdate()  )

Command(s) completed successfully.
  
Let’s now create index on ID, one is cluster and another we are creating noncluster index on the same column id.

create clustered index idx_nc_id on DupIndexDemo(id) 
create nonclustered index idx_c_id on DupIndexDemo(ordered_qty)  

Command(s) completed successfully.

Now let’s insert few records in DupIndexDemo.

declare @i int =0 
while @i < 10000 
begin    
      insert into DupIndexDemo(name,ordered_qty)    
      values  ( 'DemoNames for ' + cast(@i as varchar(10)),rand(1) * 100 )    
      set @i = @i + 1 
end

Let’s fire count (*) query to check, which index it is picking, later we will justify why it picked non cluster index.

select COUNT(*) from DupIndexDemo



















Here from the explain plan we can see it picked NonClustered Index idx_nc_id and by extending the nonCluster operator we can see the IO cost involved is 0.0157176 and it cost 82% of the total cost to find the count of records.

Lets now lets drop NonCluster Index and fire the count (*) query.

drop index idx_nc_id on DupIndexDemo
Command(s) completed successfully

Now lets run the count(*) query and check the cost measure.

select COUNT(*) from DupIndexDemo

















Here we can see cluster Index is taking 91% of the total cost involved in the query and the IO cost involved is 0.0475694 which is around four times compare to NonCluster Index.

Thus here we can say NonCluster index on the same column is giving performance optimized. But still I will not prefer to have cluster and NonCluster on the same column since it leads to duplicate Indexes which is considered as bad practice.
   
Whereas if you remove both the indexes then it will do table scan.

My girl friend always labels me Idiot; if programmers are idiots then Program Managers must be Idiot boxes ;)  


No comments:

Post a Comment