About Me

My photo
Mumbai, Maharastra, India
He has more than 7.6 years of experience in the software development. He has spent most of the times in web/desktop application development. He has sound knowledge in various database concepts. You can reach him at viki.keshari@gmail.com https://www.linkedin.com/in/vikrammahapatra/ https://twitter.com/VikramMahapatra http://www.facebook.com/viki.keshari

Search This Blog

Showing posts with label Non Clustered Index. Show all posts
Showing posts with label Non Clustered Index. Show all posts

Tuesday, November 29, 2016

Max Key length warning in SQL Server 2016 for NC Index

There are certain times when your index gets created with warning message

Warning! The maximum key length for a nonclustered index is 1700 bytes. The index 'test_index' has maximum length of 1705 bytes. For some combination of large values, the insert/update operation will fail.

Remember SQL Server 2016 increase maximum key size for Non Cluster Index from 900 byte to 1700 byte. That seems interesting correct!  You can now have more key column in your NC index Wow!!

Well it is not so wow! The larger index comes with lot of cons, which you need to consider before designing larger NC index, I will write a separate note on this.

Let reciprocate the above error to understand more on it.

CREATE SCHEMA indexDemo GO

CREATE TABLE indexDemo.IndexByteSQL2016 ( MyCol1 varchar(1705) );

Here we created table with a single column of length 1705, just do remember SQL Server 2016 allow only 1700 byte key column length.  


Now lets create NC index
CREATE INDEX test_index ON indexDemo.IndexByteSQL2016 (MyCol1);

Warning! The maximum key length for a nonclustered index is 1700 bytes. The index 'test_index' has maximum length of 1705 bytes. For some combination of large values, the insert/update operation will fail.

NC index created with warning, this warning we cannot ignore since it will affect your insertion. Let try to fire some insert statement to test this warning

INSERT INTO indexDemo.IndexByteSQL2016
SELECT cast(replicate('a', 700) AS varchar(1000)); -- Success
(1 row(s) affected)

INSERT INTO indexDemo.IndexByteSQL2016
SELECT cast(replicate('b', 901) AS varchar(1000)); -- Success
(1 row(s) affected)

INSERT INTO indexDemo.IndexByteSQL2016
SELECT cast(replicate('c', 1703) AS varchar(2000)); -- Fail
Msg 1946, Level 16, State 3, Line 20
Operation failed. The index entry of length 1703 bytes for the index 'test_index' exceeds the maximum length of 1700 bytes for nonclustered indexes.


The above insertion fails, why? Though you are trying to insert record of permissible length defined by your datatype i.e. varchar(1705) but this column is participating in NC index key creation and in SQL Server 2016 the permissible length of NC IndexKey columns is 1700.

So this particular value cast(replicate('c', 1703) AS varchar(2000) won’t get space in index creation thus insertion fails


Enjy coding…SQL J

Post Reference: Vikram Aristocratic Elfin Shar

Saturday, January 31, 2015

Group by clause sort my record set on Group by specified column


We must have experience that whenever we apply group By clause, it automatically sort the record set on the column specified by Group By clause. Lets see whats the basic behind the sorting technique.

Lets fire a query on the AdventureWorks2014 database ProductCategory and ProductSubCategory table.

select  pc.Name, sum( psc.ProductSubcategoryID) from Production.ProductCategory pc
inner join Production.ProductSubcategory psc
on pc.ProductCategoryID = psc.ProductCategoryID
group by pc.Name

Name                                              
-------------------------------------------------- -----------
Accessories                                        378
Bikes                                              6
Clothing                                           172
Components                                         147

Here we can see the records are appearing in sorted order. Lets see the execution plan to check what is happing at the background.


The execution plan above is clearly shown a sort operator which is why we are get our data sorted by Name column. Lets dissect the plan to see how it has executed the complete query.

The series of Logical operation that happened are
1.        First, Clustered Index Scan on ProductionSubCategory to get all the data from the table.
2.        Second, Clustered Index Scan on ProductCategory to get all the data from the table.
3.        Third, Sort operator, which was applied on ProductCategory table to get the data in sorted manner. Which internally do Order By on ProductCategory.Name column.
4.        Forth, then inner join was implemented to get the matching records from both the table.
5.        Fifth, Stream Aggregate was performed on the data which group by data on the bases of PorductCategory.Name column.
So here we can easily say why group by is producing result in sorted manner. because of sorting of ProductCategory table on ProductCategory.Name Column which is internally happening.

Reason for Sort Operator appearance, the stream aggregate operator always wants its input in sorted order before it apply its operation of aggregating.

Lets do one more practical to remove the sort operator, I am interested to remove this because it s taking the most cost among all the operator.

So internally sort operator arrange the data, the samething we can accomplish through indexing also, the benefit will be along with sorting the access to the data from ProductCategory tabke will be faster.

So why wait lets create Non Clustered Index on Name column of ProductCategory.
create unique nonclustered index [ak_productcategory_name] on [production].[productcategory]
(
       [name] asc
)
Command(s) completed successfully.

So now we have index on ProductCategory Name column. Now its time to rerun the same query to see whether we are still getting the sort operator in execution plan.

select  pc.Name, sum( psc.ProductSubcategoryID) from Production.ProductCategory pc
inner join Production.ProductSubcategory psc
on pc.ProductCategoryID = psc.ProductCategoryID
group by pc.Name


So we are successful in removing the sort operator and we can see the Index we created on Name column is now appears in execution plan.

And since index automatically sort the data, optimizer didn’t make use of sort operator explicitly to do the sorting and thus we can see a good plan.

 SQL Server with a Tea glass makes a perfect combination to deal with darkness J


Post Reference: Vikram Aristocratic Elfin Share

Monday, January 26, 2015

Difference between COUNT and BIG_COUNT from the angle of execution plan


We all know when we need to find the the count of rows in a table which exceed the int datatype range, we use BIG_COUNT instead of COUNT, which has return type bigint unlike int in case of COUNT.

Let’s see from the prospective of execution plan, how it differ in plan

select count(*) from Production.ProductCategory pc 



Lets see the text plan to see, what optimizer in detail did to execute the above query.

SET STATISTICS PROFILE ON
select count(*) from Production.ProductCategory pc 



Again a very simple plan, lets go through with each step to see how optimizer executes the query. First step, all rows are read from NonClustered Index Scan, then Stream Aggregate counts the number of rows in the record set and place the record count in Expr1002. Then Compute Scalar Operator takes the record count stored in Expr1002 and convert it to INT Datatype and kept the result in Expr1001.

The output of the Stream Aggregate operator  is a BigInt, but as we know the COUNT aggregate function returns value of type INT, so it use Compute scalar to convert result of count in INT data type. You may remember that, in order to count BigInt values, you should use the COUNT_BIG function. If you change the query above to use the COUNT_BIG, then you will see that the plan no longer uses the compute scalar operator

Lets check out the same query with BIG_COUNT.
select count_big(*) from Production.ProductCategory pc 

 
Here we can see there is no Compute scalar operator in the execution plan. Lets see now whether this plans makes any difference in cost compared to query using COUNT.



Here if we compare the final tree cost the query using BIG_Count has 0.0032893 where as query using COUNT has  0.0032893. So it came out as same. So it makes no or minimal cost change while using count vs big_count.

Conclusion: BIG_COUNT don’t use compute Scalar Operator because the Stream Aggregate operator of COUNT returns result in BIG INT datatype where as in case of COUNT function, since it has return type of INT, the compute scalar operator is used to convert the result returned from  Stream Aggregate from big int to int.
 
SQL Server with a Tea glass makes a perfect combination to deal with darkness :)


Post Reference: Vikram Aristocratic Elfin Share

Saturday, January 24, 2015

Blocking Operator and Non Blocking Operator

Non Blocking Operator: A non blocking operator is one which reads one row from its previous operator and returns the read; Nested loop is a prime example of non blocking operator.

select top 10 pc.ProductCategoryID,pc.Name, psc.ProductSubcategoryID, psc.Name from Production.ProductCategory pc
inner join Production.ProductSubcategory psc
on pc.ProductCategoryID = psc.ProductCategoryID



Blocking Operator: A blocking operator needs to read all the rows from its previous operator to perform  and then return the data. A classic sample of a blocking operator is the SORT operator; it needs to read all rows, sort the data and then return the ordered rows. The execution of the query will wait until all rows to be read and ordered, before continuing with the command

select distinct ProductCategoryID from Production.ProductSubcategory



Here we can see the blocking sort operator reads all rows from clustered index then rearrange the records in ascending fashion and then gives the output to subsequent operator.

Thing starts work, when you are with me… SQL Server  :)


Post Reference: Vikram Aristocratic Elfin Share