
Quick one today:

Let’s look at an interesting, albeit seemingly anomalous, behavior within Sql Server.

Question

  • When does a query with literals perform faster than the same query with variables, even for a simple query?

Let’s look at some examples:

--
-- Query 1 : Query with literals
--
SELECT Column1, Col2
FROM dbo.SampleTable
WHERE ID BETWEEN 1 AND 100
GO

--
-- Query 2 : Query with variables
--
DECLARE @L INT = 1, @U INT = 100

SELECT Column1, Col2
FROM dbo.SampleTable AS O
WHERE O.ID BETWEEN @L AND @U
GO

The only difference between the two queries is that the first provides literals in the WHERE clause, while the second provides the values through variables. Even for a simple query like this, the execution plan differs, along with the execution duration. This applies only to large tables (for small tables it does not make much of a difference).

Explanation

In the first query, the Query Optimizer has a clear idea of what is being executed. It knows the query and its bounds, so it uses the statistics available on this column to come up with the best possible query plan. Sql Server can estimate the exact number of records that will be returned, so it arrives at the most appropriate plan.

In the second query, the bounds are not available at compile time. Sql Server does not know whether the query will return 100 records or 100 million records. It still uses statistics, but it uses them to come up with a plan that also covers the worst-case scenario. So, if you end up providing values that return only 10 records, too bad: Sql Server still uses the ‘gargantuan query plan’ it designed for the worst case. It ends up being inefficient in most cases.

Alternative

If I run it using dynamic query execution, it runs faster:

--
-- Query 3 : Dynamic Query
--
DECLARE @L INT = 1, @U INT = 100
DECLARE @SQL NVARCHAR(MAX)

SELECT @SQL = 'SELECT Column1, Col2
FROM dbo.SampleTable
WHERE ID BETWEEN ' + CONVERT(VARCHAR(12), @L) + ' AND ' + CONVERT(VARCHAR(12), @U)

EXECUTE (@SQL)
GO

Explanation

Obviously, by the time execution reaches the EXECUTE (@SQL) statement, the variables have already been replaced with literals. So, essentially, Query 3 is the same as Query 1.
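Another option, sketched below against the same sample table, is to keep the variables but add OPTION (RECOMPILE); at recompile time the optimizer can see the current variable values and build a plan as if they were literals, without the string concatenation of dynamic SQL:

--
-- Query 4 : Variables with OPTION (RECOMPILE)
--
DECLARE @L INT = 1, @U INT = 100

SELECT Column1, Col2
FROM dbo.SampleTable AS O
WHERE O.ID BETWEEN @L AND @U
OPTION (RECOMPILE)
GO

The trade-off is a fresh compilation on every execution, so this suits infrequently run statements rather than hot paths.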

Hope this helps,
_Sqltimes

Quick one today:

As Sql DBAs, we need to continuously monitor different performance metrics to make sure the system performs as expected and within acceptable limits. When we have VLDBs, disk is always a critical one to monitor. In one such discussion a few weeks ago, this point came up:

  • What is the difference between LogicalDisk and PhysicalDisk in perfmon metrics?

When you allocate storage for Sql Server, you combine a group of spinning disks into one RAID group, then assign it to the OS as one volume for Sql Server to store its data or log files. You could have 4 spinning disks or 12 in one RAID group. You could carve out a volume from multiple disks, but once they are configured as one group they are treated as one volume.

  • The metric LogicalDisk refers to the metrics gathered from the entire volume as one. Ex: Disk queue length for the entire volume, Disk Read Bytes/Sec (for the entire volume)
  • PhysicalDisk metrics refer to metrics from each individual disk that is part of this one logical volume. Each disk will have its own metrics; They are all combined into one average metric and then displayed for the entire volume/drive attached to the OS.

With powerful SAN storage available everywhere, there may be some grey area in these metrics. SANs behave very differently from DAS or NAS; they are heavily optimized for read/write operations. So, for a SAN, it is better to also rely on the metrics from the SAN administrator.
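From inside Sql Server, per-file I/O numbers that complement these perfmon counters can be pulled from the sys.dm_io_virtual_file_stats DMV; a minimal sketch:

--
-- Per database file: reads, writes and I/O stall times since startup
--
SELECT  DB_NAME(vfs.database_id)  AS DatabaseName,
        vfs.file_id,
        vfs.num_of_reads,
        vfs.num_of_writes,
        vfs.io_stall_read_ms,
        vfs.io_stall_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
GO

Since these counters accumulate from startup, sampling them twice and diffing gives the rate over an interval.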


Hope this helps,
_Sqltimes

 

Interesting topic today:

Ran into a situation, for which, I could not find a better explanation. If anyone has a better idea or proof, please share with me.

If it matters, the following tests are run on Sql Server 2008 R2.

Question:

Are these two Rollbacks different?

Scenario 1: Run an ALTER TABLE statement on a very large table. Obviously, it takes a long time. Halfway through, hit the “Cancel Executing Query” button. Since a lot of work has already been performed, Sql Server will try to roll that work back. Since rollback is single-threaded, it takes a long while.

--
--  Changing the column size from 10 to 100 (takes a lot of time, based on the number of records in the table)
--
ALTER TABLE dbo.SampleTable ALTER COLUMN SomeColumn NVARCHAR(100)
GO

Scenario 2: Run the same ALTER statement on the same large table, but this time make sure the LDF file is not large enough to support the full execution of the command. Also, disable “autogrowth”. At some point, the execution hits a wall (out of log space), performs a rollback, and stops almost instantly. You’ll see the error below:

The statement has been terminated
Msg 9002, Level 17, State 4, Line 1
The transaction log for database 'SampleDB' is full.
To find out why space in the log cannot be reused, see the log_reuse_wait_desc column in sys.databases

 

So the question is:

  1. How are these two rollbacks different? One is instantaneous and the other takes a LONG time to roll back.
  2. What happens behind the scenes that makes these two different?
  3. Why does the log file usage vary so much? Sometimes it increases, then decreases, then increases again. This happens a few times; why?

Conjecture

As of now, I do not have a clear explanation for this. But the following are my estimates based on the artifacts gathered during the tests.

  • Rollback happens in both scenarios.
    • One is explicit rollback (user hitting Cancel button), the other is implicit (Sql Server making that decision internally).
  • In both scenarios, the traffic going to the log file is consistent. See the images below:
    • Scenario 1: ALTER Statement with Explicit Rollback
    • Scenario 2: ALTER Statement with Implicit Rollback and Eventual Error Message

 

  • One artifact that reinforced this line of thinking was the Sql Trace captured during both scenarios.
    • Scenario 1 is self-evident, a.k.a. when we hit ‘Cancel’, it rolls back.
    • In Scenario 2, the error message is displayed after performing the ‘rollback’ implicitly. In Sql Trace, we see the error message “The transaction log for database ‘SampleDB’ is full” a long time before the message is displayed on the screen. So, my guess is that rollback happens in both scenarios, but the error message in Scenario 2 is displayed only after completely performing the rollback.
    • Scenario 2 seems to take longer because it progresses much further along, so the rollback takes longer.
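As a side note, the progress of a session that is already rolling back can be checked with KILL … WITH STATUSONLY (the session id 52 below is hypothetical); for a rolling-back session it only reports an estimated percent complete and time remaining, it does not kill anything new:

--
-- Check rollback progress of a session that is already rolling back
--
KILL 52 WITH STATUSONLY
GO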

Unexplained behavior

  • Why does the log file usage vary so much? It increases to 90%, then drops to 85%, then goes up to 99% and hovers there for a long time. I see it going up and down like this several times: 99.2 %, 99.8 %, 99.1 %, 99.7 %. Why does this happen?
    • One possible explanation is that there might be a background process (something like Log Flush) that cleans up the log file every few minutes. And every time it kicks in, some entries are cleared, resulting in more free space.
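Related to the error message in Scenario 2, sys.databases shows what is currently preventing log space from being reused; a quick sketch:

--
-- Why can't log space be reused?
--
SELECT name, log_reuse_wait_desc
FROM sys.databases
WHERE name = 'SampleDB'
GO

During these tests one would expect to see ACTIVE_TRANSACTION here while the ALTER or its rollback is in flight.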

Any ideas to help explain this behavior in a better way are welcome.

Hope this helps,
_Sqltimes

Quick one today:

Very often, we get questions about how to check the progress of certain long-running activities. There are several ways to check different processes, but there is no single answer. The best answer, as always, is ‘it depends’, because different processes take different techniques.

Today, we’ll look at one such technique.

Starting with Sql Server 2005, Microsoft has provided a DMV that reports the status or progress of certain tasks. In later versions (Sql Server 2008, 2008 R2, 2012, etc.), the scope has been expanded to cover more tasks. One exception is that it does not report progress for data-retrieval queries.

The DMV ‘sys.dm_exec_requests’ exposes several important and useful internal details, including ‘percent_complete’ and ‘estimated_completion_time’. Both are helpful in planning and preparation.

--
--  sys.dm_exec_requests
--
SELECT *
FROM sys.dm_exec_requests
GO

Percent Complete

Based on MSDN, we could query status or progress for the following tasks:

  • ALTER INDEX REORGANIZE
  • AUTO_SHRINK option with ALTER DATABASE
  • BACKUP DATABASE
  • DBCC CHECKDB
  • DBCC CHECKFILEGROUP
  • DBCC CHECKTABLE
  • DBCC INDEXDEFRAG
  • DBCC SHRINKDATABASE
  • DBCC SHRINKFILE
  • RECOVERY
  • RESTORE DATABASE
  • ROLLBACK
  • TDE ENCRYPTION

In the past, we had a post that shows how to check the progress of a backup job using this DMV.
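A slightly more targeted sketch, filtering to only the requests that report progress (estimated_completion_time is in milliseconds):

--
-- Only requests that report progress
--
SELECT session_id,
       command,
       percent_complete,
       estimated_completion_time / 1000 / 60 AS estimated_minutes_remaining
FROM sys.dm_exec_requests
WHERE percent_complete > 0
GO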

 

Hope this helps,
_Sqltimes

Quick one today:

In a previous post, we discussed different ways to clean the Sql Server cache. Now we look at another one: DBCC FREESYSTEMCACHE.

DBCC FREESYSTEMCACHE('ALL')
GO


As the name suggests, FREESYSTEMCACHE allows us to free up any unused cache in Sql Server memory. Sql Server usually cleans up unused entries from memory on its own, to free up space for new ones. But if we need to do this on demand, we can use this command. In addition, we can specify a particular pool to clear, or ALL pools.

Example 1: To clear all unused entries from the ‘default’ pool

--
-- Clear the 'default' pool un-used entries
--
DBCC FREESYSTEMCACHE('ALL', 'default')
GO

 

Example 2: To clear all entries from all pools

--
-- Clear unused entries from all pools
--
DBCC FREESYSTEMCACHE('ALL')
GO

 

Quick one today:

A few weeks ago, we ran into an interesting error. This nebulous error caused some confusion in the team, as it occurred during a high-pressure situation. After the event passed and everyone came back to a normal state, it became apparent that this was an innocuous error.

Lock request time out period exceeded. (Microsoft SQL Server, Error: 1222)

One of our deployment steps changes the table definition of a large table. During this time, it looks like the database catalog is locked. That makes sense, since we are attempting to change the table definition, which lives in the catalog. Since this was a large table, the ALTER TABLE took more than a few minutes, and for the whole duration of the change the catalog remained locked.

So, when we clicked on the tree structure in SSMS, there was no response; SSMS froze. After a few seconds, it came back with this error message. Surprisingly, if you query the catalog using TSQL, it works. So, that anomaly is not clear yet.
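For example, a minimal sketch of pulling the same metadata via the catalog views while the SSMS designer is blocked (the table name is hypothetical):

--
-- Catalog metadata via TSQL, even while SSMS's designer times out
--
SELECT t.name, c.name AS column_name, c.max_length
FROM sys.tables AS t
JOIN sys.columns AS c ON c.object_id = t.object_id
WHERE t.name = 'SampleTable'
GO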

Within SSMS, under the Options menu, go to ‘Designers‘ >> ‘Table and Database Designers‘. On the right-hand side, look for ‘Override connection string time-out value for table designer updates: Transaction time-out after: 30 seconds‘. Change it to 60 or 90 seconds (as you see fit). This way, it will wait longer before timing out.

My preference is still 30 seconds. Once it errors out, I use TSQL to gather any information I need.

For more information, please refer to this MSDN article.

 

Hope this helps,
_Sqltimes

Quick one today:

Sometimes there is a need to track the usage patterns of the transaction log file. There is a quick and efficient way to gather this data on a regular basis.

--
-- Check the log file space usage
--
DBCC SQLPERF(LOGSPACE)
GO

It returns Database Name, Log Size (MB), Log Space Used (%) and Status. See the sample image below:
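To capture this on a regular basis, the output can be inserted into a table via INSERT … EXECUTE; a minimal sketch (the table name is made up):

--
-- Capture DBCC SQLPERF(LOGSPACE) output for trending
--
CREATE TABLE dbo.LogSpaceHistory
(
    DatabaseName    SYSNAME,
    LogSizeMB       FLOAT,
    LogSpaceUsedPct FLOAT,
    Status          INT,
    CapturedAt      DATETIME DEFAULT GETDATE()
)
GO

INSERT INTO dbo.LogSpaceHistory (DatabaseName, LogSizeMB, LogSpaceUsedPct, Status)
EXECUTE ('DBCC SQLPERF(LOGSPACE)')
GO

Scheduled from a job every few minutes, this builds the usage history the earlier rollback post was trying to reason about.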

Log Space Usage

Previously, we saw a different usage of DBCC SQLPERF to reset DMV counters.

Hope this helps,
_Sqltimes
