Archive for the ‘Performance Improvement’ Category

Interesting one today:

This article is part 3 in a series on Advanced Query Tuning Concepts, that are good to be familiar with. Full list is here.

Hash Join

When both the data sets are large with unsorted & unindexed data sets, Hash Join is the best operator. This is the most complicated data set to process and Hash Join could process them efficiently.


Hash Joins follow a complicated logic in identifying matching records. We’ll get into the details in a future post.

There are different types of Hash Joins

  1. In-memory hash join
  2. Grace hash join
  3. Recursive hash join


Hash Joins end up being used a lot in intermediary steps. During large table joins, when virtual tables are generated during intermediary steps, the subsequent joins on those intermediary tables are performed using Hash Joins — as these intermediary tables are not indexed or sorted


Hope this helps,

Read Full Post »

Interesting one today:

This article is part 2 in a series on Advanced Query Tuning Concepts, that are good to be familiar with. Full list is here.

Merge Join

When both the inputs are fairly large size with indexed and sorted data sets, Merge Join is very efficient in returning matching records. See the image below:


In this example, we have 2 similarly sized tables (this is important), that are Indexed (and hence sorted — this is also important). When they are joined on ID columns (ON B.ID = S.ID) that are indexed, Sql Server uses Merge Join

Simple Explanation:

From the sorted input lists, Sql Server takes one record from each table and compares them; If they match, it is returned. If not, the lower value row is discarded and next row from the same table is obtained for next comparison. It keeps iterating until the end of that table.

  • Both the tables need to be similarly sized in comparison to each other.
  • The join columns in both the data sets need to be indexed & sorted
    • This is important for efficient processing of the records.
    • If they are not already sorted, Merge Join adds a sorting step resulting in longer processing times — making it inefficient.



Hope this helps,

Read Full Post »

Interesting one today:

This is Part 1 of a series on Advanced Query Tuning Concepts, that are good to be familiar with. Today we’ll cover Nested Loop Joins. Full list is here.

Nested Loop Join

Nested Loop Join is utilized when, in a join, one input is a small data set (fewer than 10 records) and the other is a large data set that is indexed on the columns used in join.


Simple Explanation:

For each record in the SmallTable, it searches entire LargeTable for matching records. It keeps iterating for all the records in the SmallTable. On the first glance it seems like an inefficient method, but it is the most efficient method.

  • Since we are making the SmallTable as the outer input table, it limits the number of times we need to loop through.
  • Since the LargeTable is indexed, a quick Index Seek returns the matching value for the given value from SmallTable (i.e. ON B.ID = S.ID).
    • Hence the reason for Index Scan for SmallTable, with Index Seek for Large table.
    • The other way round would be very inefficient (Scaning BigTable and Seeking on SmallTable)

In this combination of Small vs. Large data sets, Nested Loop Join is the most efficient operator.



Hope this helps,

Read Full Post »

Interesting one today:

Usually, during query performance tuning, we look at query plan to see where most of the time is spent during query execution. Microsoft uses different Graphical Execution Plan Icons to to help easily convey the query execution steps & related costs at each step. Today we’ll look at three of those that are crucial in distilling large data sets and returning only the pertinent records. They are essentially Join Operators used in comparing multiple large data sets and retrieve necessary records.

  1. Nested Loop Join
  2. Merge Join
  3. Hash Join
    • In-memory hash join
    • Grace hash join
    • Recursive hash join

These are part of Advanced Query Tuning Concepts, that are good to be familiar with.

We’ll cover each in an individual post with an example, in this 3 part series.

  1. Nested Loop Join: Best when one of the data sets is small
  2. Merge Join: Best when both the data sets are of similar sizes
  3. Hash Join: Can efficiently handle large data sets, either sorted or not.


Hope this helps,

Read Full Post »

Quick one today:

On a regular bases, on production machines, selected perfmon metrics are captured into local files. Each day’s metrics are captured into an individual file — making it easier to analyze the data as and when needed.

Sometimes, to uncover any patterns, we’d need to combine a few days worth of files into one BLG file. This is rare, but needed. Microsoft provides a command to achieve this action. Enter relog command. This command could do a lot of things, but today we’ll look at file concatenation.

relog SqlCounters_08112017_48.blg SqlCounters_08122017_49.blg -f BIN -o C:\PerfLogs\Sql2014Counters\1\s.blg


The following Perfmon files

  • SqlCounters_08112017_48.blg
  • SqlCounters_08122017_49.blg

are combined into a final binary file called Combined.blg.

  • -f flag indicates the format of the output (concatenated file)
  • -o flag indicated the path of the output file

The following image shows, the output when you run it from command prompt.


Hope this helps,

Read Full Post »

Interesting one today:

Lately, Access Methods performance metrics have been helpful in troubleshooting some recent issues. Access Methods has several important metrics that show the metrics to measure the usage internals of logical data with in Sql Server.

Here we’ll look at 3 of them:

  1. \SQLServer:Access Methods\FreeSpace Scans/sec
  2. \SQLServer:Access Methods\Table Lock Escalations/sec
  3. \SQLServer:Access Methods\Workfiles Created/sec

FreeSpace Scans

Objects in Sql Server are written on database pages. A group of these pages are called Extents (8 pages). Allocation of space occurs in units of extents. Extents are of two types Mixed & Uniform Extents.

Usually small tables (and sometimes Heap tables) are written to Mixed extents. So, when the data in those tables/objects increases, Sql Server needs to find more free space to  accommodate for the growth. In these situation, Sql Server performs Free Space Scans. This counter measure how many times these occur every second, hence FreeSpace Scans/sec.

Not sure what Microsoft’s recommended range on this metric is, but in our environment, this metric stays low i.e. under 5 or 10. So, as a rule of thumb, lets say as long as the number is below 20 we are okay. Anything higher for extended periods of time might need some attention.

So the best approach is to gather baseline first; Then you’ll know what is normal and what is out of the ordinary.

Table Lock Escalations

When a query is trying to read rows from a table, it uses the smallest lock as possible to maintain concurrency. In some rare (but not uncommon) occasions, this lock gets escalated higher level, either Page or Table level. While this reduces concurrency on that table, it improves the efficiency of this particular query execution.

Issuing thousands of locks costs a lot of memory; So it is easier to issue table lock and read as much data as needed. But the down side is that it will prevent other connections from reading this table.

For more, read this previous post on Lock Escalations.

This counter measures, the number of such escalations occur per second. Which this is a common occurrence, higher numbers for extended periods of time are not good. So, look into optimizing queries, so they only query the exact amount of data they need (a.k.a. use better JOINs and WHERE clause conditions)

Workfiles Created

When a large query that handles large data sets is executed, sometimes the intermediary data sets (or virtual tables) are written to disk. This helps with efficient processing of data. Sql Server reads this data into memory as and when needed and completes query processing.

This counter measures, how many work files are created each second. On a busy system, that handles large data set manipulation queries, many WorkFiles & WorkTables are created each second. So, this number needs to be considered in context. Capture a baseline first; Then measure any aberrations from baseline and look into possible reasons.

Usually when a query manipulates large data sets, Sql Server uses Hash Joins to manipulate the data to find matches. So, if you have a lot of queries that perform Hash Joins or Hash Aggregates, this counter spikes up.


Hope this helps,

Read Full Post »

Quick one today:

Occasionally, the need to perform DIFFERENTIAL backup arises. The underlying mechanism of Differential backup is one of the coolest things.


--  Syntax for DIFF backup
	, STATS = 1




Some highlights:

  • Differential backup only captures the data that has changed since the last FULL backup (called base of the differential).
    • Even when we perform multiple DIFF backups, it captures all the data that changed since in the last FULL backup (unlike Transactional log backups, where they only take the changes since the last Transactional backup)
  • The size of DIFF backup files is usually smaller than FULL backups;
    • Obviously, since we are only capturing the changes since last FULL backup.
  • Using Differential Bitmap page, Sql Server maintains a list of extents that were modified since last FULL backup.
    • Using this, Sql is able to quickly run through all the extents that need to be backed-up quickly.
    • See the image from MSDN for clarity
  • bnr-how-diff-backups-work
Hope this helps,

Read Full Post »

Older Posts »