Feeds:
Posts
Comments

Archive for the ‘Performance Improvement’ Category

Quick one today:

Occasionally, the need to perform DIFFERENTIAL backup arises. The underlying mechanism of Differential backup is one of the coolest things.

 

--
--  Syntax for DIFF backup
--
BACKUP DATABASE SampleDB
TO DISK = 'I:\MSSQL\Backup\DIFF\SampleDB.DIFF'
WITH      DIFFERENTIAL
	, COMPRESSION
	, STATS = 1
GO

 

DIFF_Backup_Syntax

 

Some highlights:

  • Differential backup only captures the data that has changed since the last FULL backup (called base of the differential).
    • Even when we perform multiple DIFF backups, it captures all the data that changed since in the last FULL backup (unlike Transactional log backups, where they only take the changes since the last Transactional backup)
  • The size of DIFF backup files is usually smaller than FULL backups;
    • Obviously, since we are only capturing the changes since last FULL backup.
  • Using Differential Bitmap page, Sql Server maintains a list of extents that were modified since last FULL backup.
    • Using this, Sql is able to quickly run through all the extents that need to be backed-up quickly.
    • See the image from MSDN for clarity
  • bnr-how-diff-backups-work
Hope this helps,
_Sqltimes

Read Full Post »

Quick one today:

Every now and then, when good Sql Server resources are available, they are posted here for more people to benefit from them. Continuing on that tradition, today we have a gold mine. Microsoft has released many, many & many e-books open to public to download for free. In the list there are several Sql Server books along with BI, Windows, Office, SharePoint, etc

Happy learning !!

 

Hope this helps,
_Sqltimes

Read Full Post »

In a recent post, we saw details on one of the advanced setting starting Sql Server 2012, called TARGET_RECOVERY_TIME. When set correctly, this interacts with the server level setting called ‘recovery interval‘.

One is a server-level setting (recovery interval) and the other is a database level setting. So we need to understand the interaction or dynamic between these two settings and how they interact when set on the same database.

By default, recovery interval is set to 0 (1 minute) — which kicks the CHECKPOINT every one minute to flush dirty pages to disk.

Note:1 minute is just a general guideline, the actual interval depends on the amount of traffic on the database system. For higher traffic systems, there will be a lot of transactions each second, so there will be more dirty pages. So, the CHECKPOINT (background writer) kicks off more frequently than once a minute.

By default, both TARGET_RECOVERY_TIME & Recovery Interval is set to 0. so CHECKPOINTs occur approximately every 1 minute.

Recovery Interval could be set to 2 or 3 (minutes) or some other higher number to allow longer recovery times after a crash. So, Sql Server waits a longer period before flushing dirty pages to disk —  which results in longer times for recovery after crash; As it needs to roll-forward & roll-back transactions.

--
-- Set recovery interval on Sql Server 2012
--
SELECT *
FROM sys.configurations
WHERE name = 'recovery interval (min)'
GO

EXEC sp_configure 'show advanced options', 1
RECONFIGURE
GO

EXEC sp_configure 'recovery interval', 2
GO

EXEC sp_configure 'show advanced options', 0
RECONFIGURE
GO

TARGET_RECOVERY_TIME could be set in seconds or minutes. This determines the duration allowed for recovery time after a crash; This setting overwrites the system-level setting (recovery interval).

--
--  Change TARGET_RECOVERY_TIME
--
ALTER DATABASE SampleDB
SET TARGET_RECOVERY_TIME = 30 SECONDS;
GO

Relationship between the two settings:

  • When TARGET_RECOVERY_TIME is set, it overrides recovery interval setting.
  • Similarly, when recovery interval setting is configured and TARGET_RECOVERY_TIME is set to 0, then automatic checkpoint (recovery interval) is used.
TARGET_RECOVERY_TIME recovery interval Checkpoint Used
0 0 Automatic CHECKPOINT is used, where target recovery interval is 1 minute
0 >0 Automatic CHECKPOINT us used. The setting comes from (>0) number.
>0 N/A Indirect checkpoint. Setting is based on TARGET_RECOVERY_TIME
Hope this helps,
_Sqltimes

Read Full Post »

Interesting one today:

A few months ago, we saw a point on CHECKPOINT and its counterparts. Now, lets dig a little deeper into it.

Starting Sql Server 2012, we have a new advanced option called TARGET_RECOVERY_TIME; It helps with setting Indirect Checkpoints to change the recovery time after a crash.

Automatic CheckPoints are the default (system level setting in sys.configurations) settings that decide how frequently the dirty pages in buffer pool are written to disk. Usually it is once every minute (generalization), but there are some nunaces to it (see NOTE below).   This helps in reducing the amount of time it takes to bring the system back to working after a crash.

Note:1 minute is just a general guideline, the actual interval depends on the amount of traffic on the database system. For higher traffic systems, there will be a lot of transactions each second, so there will be more dirty pages. So, the CHECKPOINT (background writer) kicks off more frequently than once a minute.

With new Indirect Checkpoints, a database level setting, we could configure a custom checkpoint settings to enable faster & predictable recovery times after crash.

TARGET_Recovery_Time_Setting

Context:

When we UPDATE/INSERT data into Sql Server, it is written to buffer pool, not disk. Only when (automatic/default) CHECKPOINT occurs, all the dirty pages in buffer pool are written to disk. This occurs at one minute intervals (varies based on workload, but 1 min in a good general guideline). So, approximately every minute, at the 60th second, you’ll see a HUGE spike in I/O to the MDF/NDF files as all the dirty pages are being written to disk. Then it waits for another ~60 seconds for the next CHECKPOINT, to write all the dirty pages to disk again. So, you see a pattern here.

The entire dirty page workload is being written to disk in one shot —  then wait (sit idle) for the next 60 seconds; And then again write the next workload to disk in one shot. As you can see, the I/O subsystem will be more active during these CHECKPOINT periods than at anytime in between.

If your storage is designed to handled, let’s say, 100 MB/sec and you have 1000 MB worth of dirty pages since the last checkpoint (1 min), it might take storage subsystem more than 10 seconds to fully process the workload. This results in unnecessary spikes in I/O metrics.

See the image below, where it shows the Maximum reading on the amount of dirty pages written to disk.

Default_CheckPoint_1min_interval

This presents an incorrect picture that there is something wrong with your storage. While staying idle the remaining 50 (to 45) seconds of the minute.

--
--	Change TARGET_RECOVERY_TIME
--
ALTER DATABASE SampleDB
SET TARGET_RECOVERY_TIME = 30 SECONDS;
GO

Advantages

  • I/O bottlenecks:
    • Now, if we could write more frequently, then the same workload could be accomplished without triggering off any false positive metrics (and also reducing recovery time after a crash).
    • In our above example, if the same 1000 MB dirty page workload per minute, could be written 2 times within a minute, we’ll have ~500MB workload every 30 seconds.
    • Now, the same storage metrics will show much better picture.
    • Then we could tweak the design of storage to the requirements of Sql Server dirty page workload.

TARGET_Recovery_Time_30seconds

  • Indirect Checkpoints enable you to control recovery time after a crash to fit within your business requirements

Disadvantages:

  • For OLTP workloads, sometimes this setting could result in performance degradation. Looks like the background writer, that writes dirty pages to disk, increases total write workload for server instance.
    • If different databases have different settings, the instance ends up doing more work, which might result in performance degradation.
    • So, this setting needs to be tested in performance environment before enabling it in Production environments.
In the next post, we’ll see the interaction between ‘recovery interval‘ & TARGET_RECOVERY_TIME setting.
Hope this helps,
_Sqltimes

Read Full Post »

Quick one today:

In the past, we’ve covered a MAXDOP query hint with details and nuances (and here too). This time, we’ll get into another interesting query hint MAXRECURSION.

One of my colleagues needed to come up with a numbers table with values like this, where there are multiple batches and each batch has some records.

Batch RecordsInBatch
1 1
1 2
1 3
1 4
1 2500
2 1
2 2
.. ..
2 2500
.. ..
999 2500

This could be achieved with recursive CTE with a table OPTION (MAXRECURSION = n).

--
--  Numbers table with recursive CTE
--
DECLARE @BN INT = 999, @RN INT = 2500

;WITH BatchNumbers (BatchNum)
AS	(
		SELECT 1 AS BatchNum
		UNION ALL
		SELECT BatchNum + 1 AS BatchNum
		FROM BatchNumbers
		WHERE BatchNum < @BN
	),
RecordNumbers (RecNum)
AS	(
		SELECT 1 AS RecNum
		UNION ALL
		SELECT RecNum + 1 AS RecNum
		FROM RecordNumbers
		WHERE RecNum < @RN
	)
SELECT BatchNumbers.BatchNum, RecordNumbers.RecNum
FROM RecordNumbers
CROSS JOIN BatchNumbers
ORDER BY BatchNumbers.BatchNum, RecordNumbers.RecNum
OPTION (MAXRECURSION 2500)
GO

MAXRECURSION – points to keep in mind

  • Server-wide, the limit is set to 100.
    • Use MAXRECURSION query hint to prevent infinite loops
  • Server-wide default is set to 100
  • When MAXRECURSION 0 is specified, it is same as infinite loops (no limit)
  • In query hint, values for MAXRECURSION range from 0 to 32,767
  • In cases where incorrect code enters into production, that results in infinite loops, MAXRECURSION could be used to limit the loops.
Hope this helps,
_Sqltimes

Read Full Post »

Quick one today:

Years ago, there was a popular comment that Sql Server did not wide variety of functions; Once of them is the much needed IsNumeric functionality. Though such a function existed, there was a case where it resulted in incorrect results. So there were several custom functions like IsReallyNumeric or IsInteger, etc (my personal DBA library also had such custom functions)

Now, there is a powerful function that does more than just IsNumeric capability. It is called TRY_PARSE. Let’s take an example:

--
-- Sample code for TRY_PARSE
--
SELECT	  TRY_PARSE('ABCDEF' AS INT)		AS [INT_Test1]
	, TRY_PARSE('123456' AS INT)		AS [INT_Test2]
	, TRY_PARSE('ABCD-01-01' AS DATETIME)	AS [DateTime_Test1]
	, TRY_PARSE('2016-01-01' AS DATETIME)	AS [DateTime_Test2]
GO

TRY_PARSE.PNG

This function takes input in VARCHAR/NVARCHAR format and attempts to convert into the requested format. If the result is successful, the value is returned in requested data type. If not, NULL is returned. This makes it very useful in data processing logic flows; And makes it more useful function than the legacy (IsNumeric) functions.

Hope this helps,
_Sqltimes

Read Full Post »

Interesting one today:

MSDTC is one of the popular software components that is on all Windows systems. It is one of the Windows  Operating System components that Sql Server relies on it to perform some crucial tasks (when needed).

What does it do?

MSDTC, Microsoft Distributed Transaction Coordinator, is essentially, as name suggests, a coordinator/manager to handle transactions that are distributed over multiple machines. Let’s say we start a transaction, where one of the steps includes querying data from a different Sql Server instance on a different physical machine; MSDTC comes into action with these specific tasks that need transaction coordination across different physical machines. It executes the section of code that is supposed to run on remote machines and brings back the results to local Sql instance. In this process, if any issue were to occur, on the remote machine that results in rollback, MSDTC makes sure the original transaction on this machine also rolls-back safely.

How does it do?

MSDTC comes with necessary Operating System controls and memory structures to carry out these operations independent of the Sql Instances, while keeping integrity of the transaction across the multiple physical Sql machines a.k.a. the complete two-phase distributed commit protocol and the recovery of distributed transactions.

Where does Sql Server use it?

The key point here is that these need to be Sql Instances on different physical machines. Queries that request data across different instances on the same physical box do not go through MSDTC.

MSDTC is used by query activities like

  • Linked Servers
  • OPENROWSET
  • OPENQUERY
  • OPENDATASOURCE
  • RPC (Remote Procedure Calls)
  • Ones with
    • BEGIN DISTRIBUTED TRANSACTION
  • etc…

So, every time we run SQL queries that utilize above techniques, they rely on MSDTC to carry out operation while maintaining transaction integrity.

Who else uses it?

MSDTC is an Operating System resource that is used by applications other than Sql Server, to perform any distributed transaction activities; Like eXtended Architecture applications.

Is MSDTC required?

MSDTC is not required for Sql Server installation or operation. If you are only going to use Database Engine, then it is not required or used. If your Sql uses any of the above mentioned query techniques (Linked Server, OPENQUERY, etc), or SSIS or Workstation Components then MSDTC is required.

If you are installing only the Database Engine, the MSDTC cluster resource is not required. If you are installing the Database Engine and SSIS, Workstation Components, or if you will use distributed transactions, you must install MSDTC. Note that MSDTC is not required for Analysis Services-only instances.

What about Sql Cluster?

Same rules as above apply to Sql Clusters as well with one additional rule. If you have two instances on the same machine (that are clustered across different physical machines), then you’ll need MSDTC. Since the Cluster could failover to remote machine at anytime.

Let’s take an example:

Let’s say Instance1 is on physical machines A & B, with B as active node. Instance2 is on machines B & C, with B as active node. A query going from Instance1 to Instance2 will need MSDTC (even if both the instances are active on the same physical machine B at that given point in time.).

This is because, there is no guarantee that they will remain on the same physical machine at any given time; They might failover to other machines, resulting in instances being on physically different machines. So MSDTC is required (when distributed operations are performed).

Also the recent Sql Server versions do not required MSDTC during Sql Server installations.

Other points in a Clustered Environment

We could have multiple instances of MSDTC as different clustered resource (along with default MSDTC resource).

In scenario with multiple MSDTC, we could configure each Sql Cluster resource to have a dedicated MSDTC instance. If such mapping does not exist, it automatically falls back to use the default MSDTC resource.

Hope this helps,
_Sqltimes

Read Full Post »

Older Posts »