About Me

My photo
Mumbai, Maharastra, India
He has more than 7.6 years of experience in the software development. He has spent most of the times in web/desktop application development. He has sound knowledge in various database concepts. You can reach him at viki.keshari@gmail.com https://www.linkedin.com/in/vikrammahapatra/ https://twitter.com/VikramMahapatra http://www.facebook.com/viki.keshari

Search This Blog

Saturday, July 6, 2013

Forwarding Records, Forwarding Pointers Masked Performance Killer for NonCluster Index


Tool Used : sys.dm_db_index_physical_stats DMV
Conclusion : Forwarded records are a type of fragmentation within heap tables that can cause poor SQL Server performance.

In a heap it is possible to get  forwarded records. They occur when a record in a heap expands such that it no longer fits on the page it currently resides on. In this case, the record is moved to a new page, and a small forwarding record is left in the original location. The forwarding record points to the new location of the record, which is known as a forwarded record. This is done as a performance optimization so that all the non clustered indexes on the heap do not have to be altered with the new location of the heap record.

When a larger value is updated in a variable length field in a record, SQL Server will first try to expand the row of the existing page on the chance that there’s enough room to continue to use the original page. If that fails and SQL Server can’t find an existing page with enough room for the larger value, a new page has to be created for the record. The data is moved to the new page and is assigned a new Relative Identifier (RID). The old page now contains a forwarding pointer (forwarded record) that tells SQL Server to jump to the new RID. The new record location also has a back pointer to the old record. 

Forwarded records prevent non-clustered indices on heap tables from being updated with the RID of the new row. The overhead of updating indices is worse than replacing the old records with a pointer. But the drawback to forwarded records is that their presence could result in a large amount of I/O. For each forwarded record, SQL Server has to jump to the target page, then back to the original one. This is a very inefficient way to read a row, and it also wastes space.

There can be non-clustered indexes in the Heap. So in case a record has to move to new page, at the old location of the row it will leave a forwarding pointer. This mechanism is to stop the updating of non-clustered indexes as it can still point to the old location. So the modification is less costly than if all non-clustered indexes needed to reflect the row move.

Let’s now stimulate a condition where we will be getting the forwarding records, here we are creating table Forwarding_Records_Demo

create table Forwarding_Records_Demo 
(id bigint identity(1,1), 
name varchar(200), 
ordered_qty int, 
date_getdate datetime default getdate()  ) 

Command(s) completed successfully.

Here we are creating two non cluster index on date_getdate and ordered_qty

create nonclustered index idx_date_getdate on Forwarding_Records_Demo(date_getdate) 
create nonclustered index idx_ordered_qty on Forwarding_Records_Demo(ordered_qty) 

Command(s) completed successfully.

Now lets insert some records in newly created table

declare @i int =0 
while @i < 100 
begin    
      insert into Forwarding_Records_Demo(name,ordered_qty)    
      values  ( 'DemoNames for ' + cast(@i as varchar(10)),rand(1) * 100 )    
      set @i = @i + 1 
end

Let’s check the insertion

select COUNT(*) as No_of_Records from Forwarding_Records_Demo
No_of_Records
-------------
100


Let’s query the dm_db_index_physical_stats to check forwarded record count.

select
object_name(ps.object_id) as tablename,
i.name as indexname,     
ps.index_type_desc,     
ps.page_count,     
ps.avg_fragmentation_in_percent,     
ps.forwarded_record_count 
from
sys.dm_db_index_physical_stats (db_id(), null, null, null,'detailed') as ps 
inner join sys.indexes as i     
on ps.object_id = i.object_id and ps.index_id = i.index_id 
where object_name(ps.object_id) = 'Forwarding_Records_Demo'

indexname         index_type_desc page_count avg_fragmentation forwarded_
                                             in_percent        record_count
--------------------------------- ---------------------------- ------------
NULL              HEAP            1          0                 NULL
idx_date_getdate  NONCLUSTERED    1          0                 NULL
idx_ordered_qty   NONCLUSTERED    1          0                 NULL

(3 row(s) affected)

Now let’s alter the column size of name, and forces SQL Server to move those rows to another page

alter table Forwarding_Records_Demo
alter column name char(400);

Command(s) completed successfully.

Now since we have modified the column to 400 sizes, let’s re-run the same DMV query to check whether

select
object_name(ps.object_id) as tablename,
i.name as indexname,     
ps.index_type_desc,     
ps.page_count,     
ps.avg_fragmentation_in_percent,     
ps.forwarded_record_count 
from
sys.dm_db_index_physical_stats (db_id(), null, null, null,'detailed') as ps 
inner join sys.indexes as i     
on ps.object_id = i.object_id and ps.index_id = i.index_id 
where object_name(ps.object_id) = 'Forwarding_Records_Demo'

indexname         index_type_desc page_count avg_fragmentation forwarded_
                                              _in_percent      record_count
--------------------------------- ---------------------------- ------------
NULL              HEAP            6          50                85
idx_date_getdate  NONCLUSTERED    1          0                 NULL
idx_ordered_qty   NONCLUSTERED    1          0                 NULL

(3 row(s) affected)

Now we have 85 forwarded records.

Forwarding pointers are common performance problem in Heaps. SQL Server 2008 introduced ALTER TABLE ... REBUILD which removes all forwarding pointers and performs a rebuild

alter table Forwarding_Records_Demo rebuild
Command(s) completed successfully.

Now let us re-examine the forwarded recount again which will be back to zero by running the above DMV query.

indexname         index_type_desc page_count avg_fragmentation forwarded_
                                             in_percent        record_count
--------------------------------- ---------------------------- ------------
NULL              HEAP            7          0                 NULL
idx_date_getdate  NONCLUSTERED    1          0                 NULL
idx_ordered_qty   NONCLUSTERED    1          0                 NULL

(3 row(s) affected)

Now we have no forwarding records.

Every ash of mine will fabricate a novel code which will lessen up the world of programmer hitches         


Post Reference: Vikram Aristocratic Elfin Share

Thursday, July 4, 2013

Merge multiple rows into Single rows


I was asked a very good question from my friend, how can one merge multiple rows into single rows with comma separated output.

Thought entertaining, this means breaking the rule of Normalization 1NF, still if there is business need one has to compromise with it.

There I device two way to achieve the solution for it. Before we discuss solution, let’s first try to stimulate the same problem.

For our example I am taking two table say, deptTest and emptest , lets create these two tables.

create table deptTest
(dept_id int,
dept_name varchar(10))

Command(s) completed successfully.

create table empTest
(emp_id int,
name varchar(10),
dept_id int);

Command(s) completed successfully.

 Now let’s insert some records in both the table.

insert into deptTest values(1,'Software'),(2,'Operation')
(2 row(s) affected)

insert into empTest values(1,'Snehal',1),(2,'Prachi',1),(3,'Bratiti',2)
(3 row(s) affected)

Lets query both the table

select * from deptTest

dept_id     dept_name
----------- ----------
1           Software
2           Operation

(2 row(s) affected)

select * from empTest

emp_id      name       dept_id
----------- ---------- -----------
1           Snehal     1
2           Prachi     1
3           Bratiti    2

(3 row(s) affected)

Problem Statement: we need to find all employees in comma separated values for each department. Like this
name
----------------
Bratiti
Snehal,Prachi

First Method: with XML and Path
Here in the below script, it is performing co related query, for each department in the outer query, the inner query is trying to find employees for the department.

The inner query result is then converted into XML and with the help of data()we are formating values into normal text.
  
select distinct t1.dept_name, name =
replace((SELECT t2.name AS [data()] FROM empTest t2 where t2.dept_id = t1.dept_id FOR XML PATH('')), ' ', ',')
from deptTest t1

Second Method:
Here we are creating a function where we are appending each record in a particular string and then returning the string to the calling function.

create function fn_Merge(@id int)
returns varchar(20)
as
begin
      declare @name1 varchar(20)
      set @name1=''
      select @name1 = @name1 + ',' + t.name
      from empTest t where t.dept_id = @id
      return substring(@name1,2,len(@name1))
end

select dbo.fn_Merge(d.dept_id) as Name from deptTest d

Name
--------------------
Snehal,Prachi
Bratiti

(2 row(s) affected)

For a programmer, the surest lane to wisdom is the noninterventionist coding. Let him do code only.   


Post Reference: Vikram Aristocratic Elfin Share

Monday, June 10, 2013

Retrieving Hierarchical Database table using Recursive CTE



Problem Statement:

Suppose folder structure is der like folder n subfolder so suppose “Main” folder has sub folder “A” and “B” and “A”  has “a1” and B has “b1” so how will you maintain dis data in single table and how you will get all sub folders(“A”,”B”,”a1”,”b1”) of “Main” folder .

Level1      Level2      Level3
---------- ---------- ----------
mail       A          a1


Solution :

create table SnehalKaQuestion
(FolderId int,
FolderName varchar(10),
ParentFolder int null) 


insert into SnehalKaQuestion values(1,'mail',null),(2,'A',1),(3,'a1',2),(4,'B',1),(5,'b1',4)

select * from SnehalKaQuestion
FolderId    FolderName ParentFolder
----------- ---------- ------------
1           mail       NULL
2           A          1
3           a1         2
4           B          1
5           b1         4

(5 row(s) affected)

Now we want to find the level of each folder in the folder hierarchal structure, lets see how we will be achieving this use recursive CTE.

Here we are creating a stored procedure which will use CTE to derive the required result

create procedure proc_FindFolderLevel
@folderName varchar(10)
as
begin
;with cte(folderid,foldername,parentfolder)
as
(
select folderid,foldername,parentfolder from SnehalKaQuestion where foldername = @folderName
union all
select s.folderid,s.foldername,s.parentfolder from cte c,SnehalKaQuestion s
where c.parentfolder = s.folderid)

select 'Level' + cast (ROW_NUMBER() over (order by folderid) as varchar) as Level ,foldername  from cte
end  

This stored Procedure is taking one argument as folder_name for which we need to find the hierarchy.

Let’s execute the procedure to find the result

Test 1 : With ‘a1’, to see the folder hierarchy of a1

exec proc_FindFolderLevel 'a1'

Level                               foldername
----------------------------------- ----------
Level1                              mail
Level2                              A
Level3                              a1

(3 row(s) affected)

Test 2 : With ‘b1’, to see the folder hierarchy of b1

exec proc_FindFolderLevel 'b1'

Level                               foldername
----------------------------------- ----------
Level1                              mail
Level2                              B
Level3                              b1

(3 row(s) affected)

Test 3 : With ‘A’, to see the folder hierarchy of A

exec proc_FindFolderLevel 'A'

Level                               foldername
----------------------------------- ----------
Level1                              mail
Level2                              A

(2 row(s) affected)

Test 4 : With ‘B’, to see the folder hierarchy of B

exec proc_FindFolderLevel 'B'

Level                               foldername
----------------------------------- ----------
Level1                              mail
Level2                              B

(2 row(s) affected)

Test 5: With ‘mail’, to see the folder hierarchy of mail

exec proc_FindFolderLevel 'mail'

Level                               foldername
----------------------------------- ----------
Level1                              mail

(1 row(s) affected)

Cool to witness, we can pull off this with CTE.

Heart beat is just an evidence of programmer, if your code is burning well; heart beat is just an ash...



Post Reference: Vikram Aristocratic Elfin Share

A new column type with Sparse Column XML COLUMN_SET



There’s a new column type which is available to use with sparse columns – an XML COLUMN_SET. This is a column is only comes into picture when selected, and will return all the non-NULL sparse columns in a row as an XML BLOB.

It will also change the behavior of a SELECT * operation – removing all the sparse columns from the resultset, representing all the non-NULL sparse columns.

Let’s Check out with an example, here we are creating a table with sparse

create table SparseTest
(id int sparse,
name varchar(10) sparse,
addrss varchar(15) sparse,
sparseColumns XML COLUMN_SET FOR ALL_SPARSE_COLUMNS)

Command(s) completed successfully.

Lets try to insert few null values in the SparseTest table few records in  table

declare @i int = 1
while  @i <= 3
begin
      insert into SparseTest(id,name,addrss) values(null,null,null)    
      set @i=@i+ 1     
end

Let’s insert few non null values in SparseTest table.

insert into SparseTest(id,name,addrss) values(1,'Gargi','Mumbai')
insert into SparseTest(id,name,addrss) values(1,'Bratiti','Banglore')
insert into SparseTest(id,name,addrss) values(1,'Lopamudra','Delhi')
insert into SparseTest(id,name,addrss) values(1,'Monalisa','Pune')
insert into SparseTest(id,name,addrss) values(1,'Pragayan',null)

Lets query the SparseTest to check out the result set.

select * from SparseTest

sparseColumns
----------------------------------------------------------
NULL
NULL
NULL
<id>1</id><name>Gargi</name><addrss>Mumbai</addrss>
<id>1</id><name>Bratiti</name><addrss>Banglore</addrss>
<id>1</id><name>Lopamudra</name><addrss>Delhi</addrss>
<id>1</id><name>Monalisa</name><addrss>Pune</addrss>
<id>1</id><name>Pragayan</name>

(8 row(s) affected)

Here we can see it changes the behavior of a SELECT * operation – removing all the sparse columns from the resultset, representing all the non-NULL sparse columns, you can define column name in select statement to retrieve the column values, like this

select id,name,sparseColumns from SparseTest

id          name       sparseColumns
----------- ---------- ----------------------------------------------------
NULL        NULL       NULL
NULL        NULL       NULL
NULL        NULL       NULL
1           Gargi      <id>1</id><name>Gargi</name><addrss>Mumbai</addrss>
1           Bratiti    <id>1</id><name>Bratiti</name><addrss>Banglore</addrss>
1           Lopamudra  <id>1</id><name>Lopamudra</name><addrss>Delhi</addrss>
1           Monalisa   <id>1</id><name>Monalisa</name><addrss>Pune</addrss>
1           Pragayan   <id>1</id><name>Pragayan</name>

(8 row(s) affected)

When code burns from heart, it shapes to rebellion   


Post Reference: Vikram Aristocratic Elfin Share

Optimize your Storage while using NULL in your field with SPARSE Column.


If a column is defined with Sparse column then if the column hold null value, sparse column will take much lesser space as compared to regular column.

Few facts on SPARSE column
  •        A column can be declared sparse by specifying the ‘SPARSE’ after the data type in the table.
  •        A table can contain sparse columns of up to 30000. Though a table contains 30000 sparse columns the maximum number of non sparse columns in a table is still 1024 only.
  •        If the value of a column is NULL, it doesn’t consume space/ lesser space at all. It stores the data in a single xml column but for an external application it behaves like a normal column.


Let’s Check out with an example, here we are creating two table one with sparse and second without sparse.

create table SpareTest
(id int sparse,
name varchar(10) sparse,
addrss varchar(15) sparse)

Command(s) completed successfully.

create table withoutsparse
(id int,
name varchar(10),
addrss varchar(15))

Command(s) completed successfully.

Lets try to insert few records in both the table

declare @i int = 1
while  @i <= 100000
begin
      insert into SpareTest values(null,null,null)
      insert into withoutsparse values(null,null,null)
      set @i=@i+ 1     
end

Now let’s see how much space used in both the table to store the null values


sp_spaceused 'SpareTest'
go

name       rows    reserved  data     index_size unused
------------------ --------  -------  ---------- --------
SpareTest  100000  1224 KB   1176 KB  8 KB       40 KB

sp_spaceused 'withoutsparse'
go
name       rows    reserved  data     index_size unused
------------------ --------  -------  ---------- --------
SpareTest  100000  1480 KB   1416 KB  8 KB       56 KB

Conclusion: Here we can easily compare, how much space are being saved when we are using Spare for storing null values.

Code burn; lighten world

Post Reference: Vikram Aristocratic Elfin Share