I found this as a question in one of the forum and thought of
coming up with an article with the same. His question was like this
Problem Statement: Is there any advantage of
having a non clustered index same as a clustered index on a table?.
Here is a little background
to his question question: In our current data warehouse framework (sql server
version - 2005), In all of the tables we have one unique non clustered index
same as a clustered index. So I was wondering if there is any advantage of
doing that.
Solution: The only advantage which I
could figure out is that the entries on leaf pages of nonclustered index are
not as wide. They only contain index columns while the clustered index' leaf
pages are the actual rows of data. Therefore, if you need something like select
count (column_name) or count (*) from your_table then scanning the nonclustered
index will involve considerably smaller number of data pages.
To
prove our statement lets create a scenario, here we are creating table DupIndexDemo.
create table DupIndexDemo
(id int identity(1,1),
name varchar(200),
ordered_qty
int,
date_getdate datetime default getdate() )
Command(s) completed successfully.
Let’s now create index on ID,
one is cluster and another we are creating noncluster index on the same column id.
create clustered index
idx_nc_id on DupIndexDemo(id)
create nonclustered
index idx_c_id on
DupIndexDemo(ordered_qty)
Command(s) completed successfully.
Now let’s insert few records
in DupIndexDemo.
declare @i int =0
while @i < 10000
begin
insert into DupIndexDemo(name,ordered_qty)
values ( 'DemoNames for ' + cast(@i as varchar(10)),rand(1) * 100 )
set @i = @i + 1
end
Let’s fire count (*) query to check, which index it is picking,
later we will justify why it picked non cluster index.
select COUNT(*) from DupIndexDemo
Here from the explain plan we can see it picked NonClustered Index
idx_nc_id
and by extending the nonCluster operator we can see the IO cost involved is 0.0157176 and it cost
82% of the total cost to find the count of records.
Lets now lets drop NonCluster Index and fire the count (*) query.
drop index
idx_nc_id on DupIndexDemo
Command(s) completed successfully
Now lets run the count(*) query and check the cost measure.
select COUNT(*) from DupIndexDemo
Here we can see cluster Index is taking 91% of the total cost
involved in the query and the IO cost involved is 0.0475694 which is around
four times compare to NonCluster Index.
Thus here we can say NonCluster index on the same column is giving
performance optimized. But still I will not prefer to have cluster and
NonCluster on the same column since it leads to duplicate Indexes which is
considered as bad practice.
Whereas if you remove both the indexes then it will do table scan.
My girl
friend always labels me Idiot; if programmers are idiots then Program Managers must
be Idiot boxes ;)
No comments:
Post a Comment