
Archive for the ‘Performance Improvement’ Category

Quick one today:

Once in a while, we need to process deadlock XML files to retrieve pertinent information and uncover the deadlock patterns.

In the past, we've looked at some XML DML queries for parsing XML files.

Here, we'll focus on the capability to filter elements based on their attribute values:

Take a deadlock XML file, for example:

[Image: Deadlock_Filter_Attribute]

The goal is to search for the action element whose name attribute has the value "collect_system_time".

--
-- Parse & filter XML file data
--
SELECT
	  ROW_NUMBER() OVER(ORDER BY DeadLockXML.value('(/event/@timestamp)[1]', 'DATETIME2')) AS [RowN]
	, DeadLockXML.value('(/event/action[@name="collect_system_time"]/value)[1]', 'DATETIME') AS SystemTime
FROM dbo.Deadlocks_07262018_Step2
GO

The square brackets [] allow us to supply the value used to filter the XML elements:

(/event/action[@name="collect_system_time"]/value)[1]
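
Relatedly, the same predicate works with the exist() method to return only the rows whose XML actually contains that attribute value. A minimal sketch, reusing the table and column from above:

--
-- Return only rows where the attribute value is present
--
SELECT DeadLockXML.value('(/event/action[@name="collect_system_time"]/value)[1]', 'DATETIME') AS SystemTime
FROM dbo.Deadlocks_07262018_Step2
WHERE DeadLockXML.exist('/event/action[@name="collect_system_time"]') = 1
GO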


Hope this helps,
_Sqltimes




Interesting one today:

On a production box, the backup jobs have been failing with an interesting and perplexing error. It says "Not enough disk space". As you can guess, this is one of those confusing, misleading error messages that's not what it seems on the surface, making it worth a post of its own.

The backup command and the detailed error message are below:

BACKUP DATABASE DummyDB
TO        DISK = N''
	, DISK = N''
	, DISK = N''
	, DISK = N''
WITH STATS = 1
GO
...
...
...
68 percent processed. 
69 percent processed. 
70 percent processed. 
Msg 3202, Level 16, State 1, Line 1 

Write on "F:\MSSQL\Backup\DummyDB.BAK" failed: 
112(There is not enough space on the disk.) 

Msg 3013, Level 16, State 1, Line 1 
BACKUP DATABASE is terminating abnormally.

This error occurs in backups both with and without compression, and in both FULL and Differential backups.

This is a fairly large database, ranging up to 18 TB, so backups are an ordeal to perform. When even DIFF backups started failing, it was concerning.

After attempting several backups on local & remote storage with plenty of space, a pattern still did not emerge. The only constant was that the backups failed at around 70% completion.

At that point, one of my colleagues (thanks, Michael) pointed out that, as part of the backup operation, Sql Server first runs an algorithm that calculates the amount of space needed for the backup file. If the backup drive has enough free space, well and good; if not, it throws this error.

But, as you can guess, we had plenty of free space, i.e. petabytes of free space.
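
As a quick sanity check from within Sql Server, the widely used (though undocumented) xp_fixeddrives procedure lists the free space per fixed drive; a minimal sketch:

--
-- Free space (in MB) per fixed drive, as seen by Sql Server
--
EXEC master.dbo.xp_fixeddrives
GO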

Occasionally, manual backups are successful. So, I’m not sure what is going on, but here is my theory:

At different points, Sql Server runs the algorithm (a "pre-allocation algorithm") to determine if there is enough space. Initially it comes back saying "yes", and the backup proceeds with writing to the backup file; a little later it checks again, and it comes back with "yes"; but at some point in the subsequent checks (in our case, between 70% and 72% complete), the algorithm decides there is not enough disk space.

So, it turns out there is a trace flag, 3042, that disables this algorithm from making any assessments, so that backups can progress to completion.

From MSDN:

Bypasses the default backup compression pre-allocation algorithm to allow the backup file to grow only as needed to reach its final size. This trace flag is useful if you need to save on space by allocating only the actual size required for the compressed backup. Using this trace flag might cause a slight performance penalty (a possible increase in the duration of the backup operation).

Caution: since we are disabling the algorithm, manually make sure there is plenty of space for the backup to complete.

--
-- Disable pre-allocation algorithm
--
DBCC TRACEON (3042)
GO

BACKUP DATABASE DummyDB
TO        DISK = N''
    , DISK = N''
    , DISK = N''
    , DISK = N''
WITH STATS = 1
GO
DBCC TRACEOFF (3042)
GO
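
To verify whether the flag actually took effect, the trace flag status can be checked with DBCC TRACESTATUS:

--
-- Verify the trace flag status
--
DBCC TRACESTATUS (3042)
GO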

Make sure you test this in a non-production environment, before enabling it in production.

Hope this helps,
_Sqltimes


Interesting one today:

This article is Part 3 in a series on Advanced Query Tuning Concepts that are good to be familiar with. The full list is here.

Hash Join

When both data sets are large, unsorted, and unindexed, Hash Join is the best operator. This is the most complicated kind of input to process, and Hash Join processes it efficiently.

[Image: HashJoin_QueryPlan]

Hash Joins follow complicated logic to identify matching records. We'll get into the details in a future post.

There are different types of Hash Joins:

  1. In-memory hash join
  2. Grace hash join
  3. Recursive hash join

Note:

Hash Joins end up being used a lot in intermediate steps. During large table joins, when virtual tables are generated in intermediate steps, the subsequent joins on those intermediate tables are performed using Hash Joins, as these intermediate tables are not indexed or sorted.
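
As a hypothetical illustration (the table names are made up), joining two large, unindexed heaps on a common column is the kind of query that typically shows a Hash Match operator in the plan:

--
-- Two large, unindexed heaps: the optimizer typically
-- picks a Hash Match (Hash Join) for this shape
--
SELECT H1.ID, H1.ColA, H2.ColB
FROM dbo.LargeHeap1 AS H1
INNER JOIN dbo.LargeHeap2 AS H2
	ON H2.ID = H1.ID
GO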


Hope this helps,
_Sqltimes


Interesting one today:

This article is Part 2 in a series on Advanced Query Tuning Concepts that are good to be familiar with. The full list is here.

Merge Join

When both inputs are fairly large, indexed, and sorted, Merge Join is very efficient at returning matching records. See the image below:

[Image: MergeJoin_QueryPlan]

In this example, we have two similarly sized tables (this is important) that are indexed, and hence sorted (this is also important). When they are joined on their indexed ID columns (ON B.ID = S.ID), Sql Server uses a Merge Join.

Simple Explanation:

From the sorted input lists, Sql Server takes one record from each table and compares them; if they match, the row is returned. If not, the lower-valued row is discarded and the next row from the same table is obtained for the next comparison. It keeps iterating until the end of that table.

  • Both tables need to be of similar size.
  • The join columns in both data sets need to be indexed & sorted
    • This is important for efficient processing of the records.
    • If they are not already sorted, Merge Join adds a sorting step, resulting in longer processing times and making it inefficient.
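
A hypothetical sketch (made-up table names) of a shape that lends itself to a Merge Join: two similarly sized tables, each with a clustered index on ID, so both inputs arrive pre-sorted:

--
-- Both tables are clustered on ID, so both inputs are already
-- sorted: the optimizer can use a Merge Join
--
SELECT B.ID, B.ColA, S.ColB
FROM dbo.BigTable AS B
INNER JOIN dbo.SimilarlyBigTable AS S
	ON B.ID = S.ID
GO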


Hope this helps,
_Sqltimes


Interesting one today:

This is Part 1 of a series on Advanced Query Tuning Concepts that are good to be familiar with. Today we'll cover Nested Loop Joins. The full list is here.

Nested Loop Join

Nested Loop Join is utilized when, in a join, one input is a small data set (fewer than 10 records) and the other is a large data set that is indexed on the join columns.

[Image: NestedLoopJoin_Plan]

Simple Explanation:

For each record in the SmallTable, it searches the entire LargeTable for matching records, iterating through all the records in the SmallTable. At first glance this seems like an inefficient method, but it is actually the most efficient one:

  • Since we make the SmallTable the outer input, it limits the number of times we need to loop.
  • Since the LargeTable is indexed, a quick Index Seek returns the matching records for each value from the SmallTable (i.e. ON B.ID = S.ID).
    • Hence the Index Scan on the SmallTable, with an Index Seek on the LargeTable.
    • The other way round would be very inefficient (scanning the LargeTable and seeking on the SmallTable).

In this combination of Small vs. Large data sets, Nested Loop Join is the most efficient operator.
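
A hypothetical sketch (made-up table names): a tiny outer table joined to a large table indexed on ID typically yields a Nested Loops join, with an Index Seek on the large side:

--
-- Small outer input + large table indexed on ID:
-- the optimizer typically picks a Nested Loops join
--
SELECT S.ID, L.ColA
FROM dbo.SmallTable AS S
INNER JOIN dbo.LargeTable AS L
	ON L.ID = S.ID
GO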


Hope this helps,
_Sqltimes


Interesting one today:

Usually, during query performance tuning, we look at the query plan to see where most of the time is spent during query execution. Microsoft uses different Graphical Execution Plan Icons to help easily convey the query execution steps & related costs at each step. Today we'll look at three of those that are crucial in distilling large data sets and returning only the pertinent records. They are essentially Join Operators, used to compare multiple large data sets and retrieve the necessary records:

  1. Nested Loop Join
  2. Merge Join
  3. Hash Join
    • In-memory hash join
    • Grace hash join
    • Recursive hash join

These are part of the Advanced Query Tuning Concepts that are good to be familiar with.

We'll cover each in an individual post with an example, in this three-part series:

  1. Nested Loop Join: Best when one of the data sets is small
  2. Merge Join: Best when both the data sets are of similar sizes
  3. Hash Join: Can efficiently handle large data sets, whether sorted or not.
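
To see which of these operators the optimizer picked for a given query, the actual execution plan can be captured alongside the results. A minimal sketch, with made-up table names:

--
-- Return the actual execution plan (XML) with the results;
-- the plan shows which join operator was chosen
--
SET STATISTICS XML ON
GO

SELECT B.ID
FROM dbo.BigTable AS B
INNER JOIN dbo.SmallTable AS S
	ON B.ID = S.ID
GO

SET STATISTICS XML OFF
GO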


Hope this helps,
_Sqltimes


Quick one today:

On a regular basis, on production machines, selected perfmon metrics are captured into local files. Each day's metrics are captured into an individual file, making it easier to analyze the data as and when needed.

Sometimes, to uncover any patterns, we need to combine a few days' worth of files into one BLG file. This is rare, but needed. Microsoft provides a command to achieve this: enter the relog command. It can do a lot of things, but today we'll look at file concatenation.

relog SqlCounters_08112017_48.blg SqlCounters_08122017_49.blg -f BIN -o C:\PerfLogs\Sql2014Counters\1\Combined.blg

Explanation:

The following Perfmon files

  • SqlCounters_08112017_48.blg
  • SqlCounters_08122017_49.blg

are combined into a final binary file called Combined.blg.

  • The -f flag indicates the format of the output (the concatenated file)
  • The -o flag indicates the path of the output file
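
Relatedly, the same tool can convert the combined BLG into CSV for analysis in other tools; a sketch reusing the same paths:

relog C:\PerfLogs\Sql2014Counters\1\Combined.blg -f CSV -o C:\PerfLogs\Sql2014Counters\1\Combined.csv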

The following image shows the output of the concatenation command when run from the command prompt.

[Image: Concatenate_Perfmon_BLG]

Hope this helps,
_Sqltimes
