About Me

Mumbai, Maharashtra, India
He has more than 7.6 years of experience in software development, spent mostly on web and desktop application development, and has sound knowledge of various database concepts. You can reach him at viki.keshari@gmail.com https://www.linkedin.com/in/vikrammahapatra/ https://twitter.com/VikramMahapatra http://www.facebook.com/viki.keshari


Showing posts with label Execution Plan. Show all posts

Thursday, November 22, 2018

Performance of Read Ahead Read with Trace Flag 652

When a user submits a query, SQL Server's database engine first checks whether the requested data pages are already in the buffer cache. If they are, it performs a logical read and returns the data to the user. If they are not, it performs a physical read, fetching the pages from disk; this is an expensive operation involving high IO and waits.

To avoid physical reads at query time, SQL Server has a feature known as Read Ahead Read: it brings data pages into the buffer cache even before the query requests them. Read-ahead is SQL Server's default behavior.
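One way to watch read-ahead activity on an instance is through the Buffer Manager performance counters. The sketch below queries sys.dm_os_performance_counters; the counter name is as exposed on my instance, so verify it on yours:

```sql
-- Read-ahead pages fetched, from the Buffer Manager counters
SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name LIKE 'Readahead pages/sec%';
```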

In this post we will compare the performance of Read Ahead Reads against plain physical reads with the help of Trace Flag 652, which disables read-ahead.

I have created two versions of the query: one that fetches data using read-ahead reads, and one that uses only physical reads.

With Read Ahead Read: here I turn statistics IO on to capture the reads, and clear the plan cache and buffer pool first.

dbcc traceoff(652,-1)  -- make sure trace flag 652 is off, i.e. read-ahead is enabled
dbcc freeproccache     -- do not run this on prod
dbcc dropcleanbuffers  -- do not run this on prod
go
set statistics io on
set statistics time on 
  select * from person.address
set statistics io off
set statistics time off
go

Let's check what the IO statistics say:
(19614 row(s) affected)
Table 'Address'. Scan count 1, logical reads 346, physical reads 1, read-ahead reads 344, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

 SQL Server Execution Times:
   CPU time = 93 ms,  elapsed time = 855 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

Here we can see read-ahead reads is 344, which means that to fetch 19,614 records the storage engine brought 344 eight-KB pages into cache before the plan executed. Now let’s check without Read Ahead Read.

dbcc traceon(652,-1)   -- disable read-ahead reads
dbcc freeproccache     -- do not run this on prod
dbcc dropcleanbuffers  -- do not run this on prod
set statistics io on
set statistics time on
       select * from person.address
set statistics io off
set statistics time off
go

Let’s check what the IO statistics say:

(19614 row(s) affected)
Table 'Address'. Scan count 1, logical reads 345, physical reads 233, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)

 SQL Server Execution Times:
   CPU time = 141 ms,  elapsed time = 3041 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

Here we can clearly see the difference: read-ahead reads give a much better elapsed time (855 ms vs 3,041 ms) than physical reads.

Conclusion: Read Ahead Reads performed well compared to physical reads in our case.

Enjoy coding… SQL :)

Post Reference: Vikram Aristocratic Elfin Share

Monday, November 19, 2018

Logical Read, Physical Read and Buffer Cache Hit

Logical Reads: 
Logical reads are also known as cache hits: pages are read from the buffer cache instead of from disk. The logical-read count is the total number of data pages that had to be accessed in cache to process the query. The same data page may be accessed many times, so the logical-read count can be higher than the actual number of pages in the table. Usually the best way to reduce logical reads is to apply a correct index or to rewrite the query.

Physical Reads 
Physical reads are the total number of data pages read from disk. When no data is in the data cache, the physical-read count equals the logical-read count; this usually happens on the first execution of a query. On subsequent executions of the same query the number drops substantially, because the data pages are already in the data cache.
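This first-run versus second-run behaviour can be sketched with a toy buffer-cache model (illustrative Python, not SQL Server internals; the 150-page table size is just an example):

```python
# Toy buffer cache: every access is a logical read; a page missing
# from the cache additionally costs a physical read.
class BufferCache:
    def __init__(self):
        self.pages = set()
        self.logical = 0
        self.physical = 0

    def read(self, page):
        self.logical += 1
        if page not in self.pages:   # cache miss -> fetch from "disk"
            self.physical += 1
            self.pages.add(page)

cache = BufferCache()
for _ in range(2):            # the same query executed twice
    for page in range(150):   # a table spanning 150 pages
        cache.read(page)
# First run: 150 physical reads; second run: 0, all cache hits.
```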

Buffer Cache Hit Ratio
The larger the share of logical reads served without physical reads, the better the cache hit ratio:
(logical reads – physical reads) / logical reads * 100%. A high buffer cache hit ratio (as close to 100% as possible) indicates good database performance.
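Plugging in numbers like those from the earlier IO output (346 logical reads against a single physical read; figures are only for illustration), the formula works out as:

```python
def buffer_cache_hit_ratio(logical_reads, physical_reads):
    # (logical reads - physical reads) / logical reads * 100, per the formula above
    return (logical_reads - physical_reads) / logical_reads * 100

# 346 logical reads with only 1 physical read is a ~99.7% hit ratio;
# a fully cold cache (physical == logical) would score 0%.
print(round(buffer_cache_hit_ratio(346, 1), 2))
```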

A warning on a high number of logical reads:
A high logical-read count tends to mean high memory usage, but there are various ways to reduce it:
  1. Remove improper/useless indexes: indexes should be built based on how the data is accessed; an index built on columns that no query uses leads to high logical reads and degrades performance when reading and writing data.
  2. Poor fill factor/page density: page space should not be left largely unused, otherwise a large number of pages is needed for a small amount of data, which also leads to high logical reads.
  3. Wide indexes: indexing a large number of columns leads to high logical reads.
  4. Index scans: if a query leads to an index scan on the table, logical reads will be high.


The logical-read count can be obtained in the following ways:
  1. set statistics io on
  2. sys.dm_exec_query_stats
  3. SQL Profiler: by running a trace against the database we can find the logical reads.
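For option 2, here is a sketch of a sys.dm_exec_query_stats query that lists the heaviest cached queries by logical reads (the DMV and its columns are documented; the TOP 10 cut-off is just a choice):

```sql
-- Cached query statistics, heaviest logical readers first
SELECT TOP (10)
    qs.execution_count,
    qs.total_logical_reads,
    qs.total_physical_reads,
    st.text AS query_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_logical_reads DESC;
```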


Example
set statistics io on
set statistics time on
    select * from dbo.person   -- first run: pages brought in from disk (note the read-ahead reads)

    select * from dbo.person   -- second run: served entirely from cache
set statistics io off
set statistics time off
go
(19972 row(s) affected)
Table 'Person'. Scan count 1, logical reads 150, physical reads 0, read-ahead reads 7, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 15 ms,  elapsed time = 963 ms.

(19972 row(s) affected)
Table 'Person'. Scan count 1, logical reads 150, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 564 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
Enjoy coding… SQL :)


Post Reference: Vikram Aristocratic Elfin Share

Friday, February 13, 2015

Myth: Group by always results in sorting the table on the Group by column


There are lots of threads on various forums discussing the automatic sorting behaviour of the Group By clause; let's go the extra mile to prove this myth incorrect.

So here is my query; I am running this dummy query against the AdventureWorks2012 database:

select  pc.Name, sum( psc.ProductSubcategoryID) as 'Sum of Sub Product Id' from Production.ProductCategory pc
inner join Production.ProductSubcategory psc
on pc.ProductCategoryID = psc.ProductCategoryID
group by pc.Name

Name                                               Sum of Sub Product Id
-------------------------------------------------- ---------------------
Accessories                                        378
Bikes                                              6
Clothing                                           172
Components                                         147

Let’s look at this from the perspective of the execution plan.



From the execution plan we can see that the logical Group By operation is implemented by a Stream Aggregate physical operator. Since Stream Aggregate always wants its input in sorted order, a Sort operator appears in the execution plan, sorting the ProductCategory data on ProductCategory.Name. This is the reason we get our result in sorted order when we use a Group by clause in our query.

Now think about a situation where, instead of a Stream Aggregate, the optimizer chooses a Hash Aggregate, which doesn't require its input to be in sorted order.
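Why a hash aggregate emits unordered output can be sketched in a few lines (illustrative Python, not SQL Server internals): groups accumulate in a hash table, so rows come out in whatever order the table yields them, with no sort step anywhere.

```python
# Toy hash aggregate: group totals accumulate in a hash table (a dict).
def hash_aggregate(rows):
    totals = {}
    for name, value in rows:
        totals[name] = totals.get(name, 0) + value
    return list(totals.items())  # emitted in hash-table order, never sorted

rows = [("Bikes", 6), ("Clothing", 172), ("Accessories", 378), ("Components", 147)]
result = hash_aggregate(rows)
# A stream aggregate would need sorted(rows) as input first;
# the hash version consumes rows in any order and emits groups unsorted.
```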

Let’s try to bring a Hash Aggregate into the plan in place of the Stream Aggregate.

I was not able to simulate a situation where a Hash Aggregate appears in the plan naturally. So let's play a trick on the optimizer: tell it the Stream Aggregate rule is unavailable, so it cannot use it when creating the plan.

You can disable the Stream Aggregate rule with:

DBCC TRACEON (3604);
DBCC RULEOFF('GbAggToStrm');
GO

But after your experiment, don't forget to turn the Stream Aggregate rule back on, as the last line of the batch below does.
DBCC TRACEON (3604);
DBCC RULEOFF('GbAggToStrm');
GO
select  pc.Name, sum( psc.ProductSubcategoryID) as 'Sum of Sub Product Id' from Production.ProductCategory pc
inner join Production.ProductSubcategory psc
on pc.ProductCategoryID = psc.ProductCategoryID
group by pc.Name
OPTION (RECOMPILE);
GO
DBCC RULEON('GbAggToStrm');

Name                                               Sum of Sub Product Id
-------------------------------------------------- ---------------------
Bikes                                              6
Clothing                                           172
Accessories                                        378
Components                                         147

Here we can see the result did not come back sorted. Let's see how the optimizer executed this query.



We can see the Stream Aggregate operator replaced by a Hash Aggregate, which doesn't require its input in sorted order, and that is why our result came back unsorted.

So to be on the safe side, if you want your grouped data in sorted order, use an order by clause along with group by; that is the only way to guarantee the order of your data.
 
DBCC TRACEON (3604);
DBCC RULEOFF('GbAggToStrm');
GO
select  pc.Name, sum( psc.ProductSubcategoryID) as 'Sum of Sub Product Id' from Production.ProductCategory pc
inner join Production.ProductSubcategory psc
on pc.ProductCategoryID = psc.ProductCategoryID
group by pc.Name
order by pc.Name
OPTION (RECOMPILE);
GO
DBCC RULEON('GbAggToStrm');

Name                                               Sum of Sub Product Id
-------------------------------------------------- ---------------------
Accessories                                        378
Bikes                                              6
Clothing                                           172
Components                                         147

Conclusion: Always use order by along with Group By if you want your data in sorted order.
 
SQL Server with a glass of tea makes a perfect combination to deal with darkness :)


Post Reference: Vikram Aristocratic Elfin Share