
Archive for the ‘Operating System’ Category

Interesting one today:

When we run a query with SET STATISTICS TIME ON, we see CPU Time and Elapsed Time statistics for a query. What is the difference?

SET STATISTICS TIME ON
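A minimal sketch of how these numbers show up in SSMS (the sample query and the timing values in the comments are purely illustrative):

--
-- Display CPU Time and Elapsed Time for a query
--
SET STATISTICS TIME ON;
GO

SELECT COUNT(*)
FROM sys.objects;      -- any sample query
GO

-- The Messages tab then reports something like:
--   SQL Server Execution Times:
--      CPU time = 15 ms,  elapsed time = 112 ms.

SET STATISTICS TIME OFF;
GO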

 

CPU Time is the amount of CPU cycles (expressed in time) spent by the workload thread on a particular processor/CPU. This is not the duration since the query began running, because the thread is not on a CPU the whole time (there are some nuances, covered below).

Elapsed Time is the total duration of the query, the time it took from start to finish.

Simple Example:

Let’s say there is a workload thread for a simple query. The query runs for a bit (say 2 seconds), gets put on hold for a while (say 5 seconds), and then runs again (say for 1 second) all the way to completion. Only the time the thread actually spent on a CPU counts towards CPU Time (the two runs, 2 + 1 = 3 seconds total). Elapsed Time is the total duration from when the query was issued to when the result was returned: 8 seconds in total.

Parallel Processing:

Let’s say there are 4 threads carrying out the workload of a query. Each runs on a processor for 1 second, and together they complete all the work in one shot.

Total CPU Time = 4 * 1 = 4 seconds (4 CPUs at 1 second each)

Elapsed Time = 1 second (from start to finish)

Summary

CPU Time and Elapsed Time measure different aspects of the same query: one measures the CPU cycles spent (expressed in time) and the other measures the overall duration. The difference is important in figuring out what problem we are trying to solve (a DMV sketch after the list below shows one way to spot these patterns).

  1. If a query has high Elapsed Time but low CPU Time, there is CPU contention or some other resource contention (the query waited for resources to become available before it could get onto a CPU).
    1. We want to look for ways to optimize the way the query is written: is it requesting more resources (rows) than it needs? End users will feel the pain of waiting if this happens to the majority of queries.
    2. Adding to this, if the CPUs are all busy all the time, we definitely need either more CPU capacity for the workload or better optimized queries.
    3. Conversely, if the CPUs are not too busy at that time, the thread is just waiting for other resources (concurrency, such as locked tables) to be released before the query can progress. Optimize the queries.
  2. If CPU Time is higher than Elapsed Time, then a good amount of parallelism is in play.
    1. As long as the CPUs are optimally utilized, this is not a problem; but if you notice more than 80% utilization all the time (the threshold is even lower for virtual CPUs), we need to add CPUs to alleviate the situation.
    2. This may not be a problem right now, but it can evolve into one soon. One extra-long query can snowball everything into high delays, to the point where users notice.
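
To see which of these patterns applies to queries already in the plan cache, here is a rough sketch against sys.dm_exec_query_stats; the DMV reports times in microseconds, and the TOP count and ordering below are just one way to slice it:

--
-- Compare CPU time vs. elapsed time for cached queries
-- (total_worker_time and total_elapsed_time are reported in microseconds)
--
SELECT TOP (20)
      qs.total_worker_time  / 1000  AS [TotalCpuTime_ms]
    , qs.total_elapsed_time / 1000  AS [TotalElapsedTime_ms]
    , qs.execution_count
    , st.text                       AS [QueryText]
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY (qs.total_elapsed_time - qs.total_worker_time) DESC   -- biggest gap (most waiting) first
GO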

 

Hope this helps,

Read Full Post »

Quick one today:

Every time we build a new cluster in our lab environment, we start by running the cluster validation report before starting the Sql Server installation. Once in a while, we need to go back and refer to older validation reports.

All cluster validation reports are saved, by default, at the location below on the active node from which the validation was executed:

C:\Windows\Cluster\Reports

ValidationReport

 

Hope this helps,
_Sqltimes

 

Read Full Post »

Interesting one today:

Occasionally we need to connect to Sql Server over Windows Authentication using a different Windows user. This is a good way to manually check a few things about the validity or privilege level of that other Windows account (there are other ways too).

One of the quickest ways is to connect to Sql Server over Windows Authentication as that other Windows account.

By default, when you open SSMS (Sql Server Management Studio) under Windows Authentication, it pre-populates our current Windows account. There are a couple of options to work around that.

Solution 1: Use runas

From a Windows command prompt or a Run window, run the following runas command with the other Windows user:

runas /user:domain\user Ssms.exe

A new SSMS window opens up with the above-mentioned Windows user account pre-populated.

Solution 2: Use RunAs from the Windows menu

Same steps as above, but we use the Windows user interface to perform them.

Right click on SSMS to bring up the pop-up menu:

SSMS_DifferentUser_1

On “Microsoft Sql Server Management Studio 17”, press Shift + Right click to see the “Run as different user” option.

SSMS_DifferentUser_2

In the new SSMS window, you can specify the other Windows account (or domain account) to use for login.
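
Once connected as the other account, a quick sanity check confirms which login Sql Server sees and what it is allowed to do (a minimal sketch; the sysadmin role check is just an example):

--
-- Verify which Windows account the connection is using
-- and spot-check its privilege level
--
SELECT SUSER_SNAME()                   AS [LoginName]
    , IS_SRVROLEMEMBER('sysadmin')     AS [IsSysAdmin]   -- 1 = member, 0 = not a member
GO

-- List the server-level permissions granted to this login
SELECT *
FROM fn_my_permissions(NULL, 'SERVER')
GO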

Hope this helps,
_Sqltimes

 

Read Full Post »

Interesting one today:

Earlier, while performing some cluster maintenance work, this error popped up on the screen.

The specified disk or volume is managed by the Microsoft Failover cluster 
component. The disk must be in the cluster maintenance mode and the 
cluster resource status must be online to perform this operation

The error appeared when attempting to format a new LUN with a 64K allocation unit size. Since the LUN/drive had already been added to the cluster, the format could not be performed.

Resolution:

As the verbose error message suggests, put this particular disk into “Maintenance Mode”, then perform the formatting steps.

Go to Failover Cluster Manager and then Storage > Disks; identify the particular disk, right click it, and go to More Actions >> Turn On Maintenance Mode.

Cluster_LUN_Error

Once the disk is in maintenance mode, the Status column shows Online (Maintenance Mode).

Now we are free to format the disk. In Computer Management >> Storage >> Disk Management, go to the individual disk, right click, and choose Format.

FormatClusterLUN

Formatting now works; once it completes, go back to Failover Cluster Manager and take the disk back out of Maintenance Mode.

 

Hope this helps,
_Sqltimes

Read Full Post »

Interesting one today:

We have a bunch of lab Sql Server machines, and sometimes after a fresh Sql Server install, when we try to open Activity Monitor, we run into this problem.

Error:

 

TITLE: Microsoft SQL Server Management Studio
 ------------------------------

The Activity Monitor is unable to execute queries against server DC2POLTPS02.
 Activity Monitor for this instance will be placed into a paused state.
 Use the context menu in the overview pane to resume the Activity Monitor.

Access is denied. (Exception from HRESULT: 0x80070005 (E_ACCESSDENIED)) (mscorlib)

------------------------------

Since these are lab machines, we log into them remotely, and it looks like some setting prevents Activity Monitor from opening successfully. Activity Monitor provides great detail on what is going on with Sql Server at any given point in time, and gathering that detail needs “high level insight” into the Operating System and Sql Server; such “high level” permissions are not enabled by default for user accounts.

The following steps show a way to enable these elevated permissions when logged in remotely. From what I could gather from Microsoft Connect, this comes down to elevated permissions on the remote operating system’s DCOM. So we need to enable Remote Launch & Remote Activation permissions on the remote Operating System (the lab machine).

Resolution:

RDP to the remote machine and

  1. Open Component Services (DCOMCNFG) from start menu
  2. In the left hand tree, under Console Root, expand Component Services, expand Computers, right-click on My Computer and go to Properties
  3. In My computer Properties window, go to COM Security tab.
  4. In the Launch and Activation Permissions section, click on Edit Limits button.
    1. In the Security Limits tab, see if your user/group name exists. If not add to the list by clicking on Add button.
    2. Once user is added, highlight the user and make sure it has both Remote Launch & Remote Activation permissions checked.
  5. In the Access Permissions section, click on Edit Limits button
    1. In the Security Limits tab, see if your user/group name exists. If not add to the list by clicking on Add button.
    2. Once user is added, highlight the user and make sure it has Remote Access permissions checked.
  6. Click OK to save changes.
  7. Now expand My Computer in the left-hand tree and go to DCOM Config.
    1. Find Windows Management and Instrumentation and go to Properties.
    2. Go to Security tab and under Launch and Activation Permissions section, click on Edit button
    3. In the Security tab, see if your user/group name exists. If not add to the list by clicking on Add button.
    4. Once user is added, highlight the user and make sure it has both Remote Launch & Remote Activation permissions checked.
    5. (See the image below)
  8. Save all changes and re-open Activity Monitor

Activity Monitor Error

 

 

 

Hope this helps,
_Sqltimes

Read Full Post »

Interesting one today:

On a production box, the backup jobs have been failing with an interesting and perplexing error. It says “Not enough disk space”; as you can guess, this is one of those confusing or misleading error messages that is not what it seems on the surface, making it worthwhile for a post of its own.

Detailed error message is below:

BACKUP DATABASE DummyDB
TO        DISK = N''
	, DISK = N''
	, DISK = N''
	, DISK = ''
WITH STATS = 1
GO
...
...
...
68 percent processed. 
69 percent processed. 
70 percent processed. 
Msg 3202, Level 16, State 1, Line 1 

Write on "F:\MSSQL\Backup\DummyDB.BAK" failed: 
112(There is not enough space on the disk.) 

Msg 3013, Level 16, State 1, Line 1 
BACKUP DATABASE is terminating abnormally.

This error occurs in backups both with & without compression, and in both FULL & Differential backups.

This is a fairly large database, ranging up to 18 TB, so backups are an ordeal to perform. When DIFF backups started failing too, it was a bit concerning.

After attempting several backups on local & remote storage with plenty of space, a pattern still did not emerge. The only constant was that the backup fails at around 70% completion.

At that point, one of my colleagues (thanks, Michael) pointed out that, as part of the backup operation, Sql Server first runs an algorithm that calculates the amount of space needed for the backup file. If the backup drive has enough free space, well and good; if not, it throws this error.

But, as you can guess, we had plenty of free space, i.e. petabytes of free space.

Occasionally, manual backups are successful. So I’m not sure exactly what is going on, but here is my theory:

At different points, Sql Server runs this algorithm (the “pre-allocation algorithm”) to determine if there is enough space. Initially it comes back saying “yes”, and the backup proceeds with writing to the backup file; a little later it checks again and still comes back with “yes”; but at some subsequent check (in our case between 70% and 72% complete), the algorithm decides there is not enough disk space.

So, it turns out there is a trace flag, 3042, that bypasses this pre-allocation algorithm so the backup file grows only as needed, which lets the backup progress to completion.

From  MSDN:

Bypasses the default backup compression pre-allocation algorithm to allow the backup file to grow only as needed to reach its final size. This trace flag is useful if you need to save on space by allocating only the actual size required for the compressed backup. Using this trace flag might cause a slight performance penalty (a possible increase in the duration of the backup operation).


Caution: manually make sure there is plenty of space for the backup to complete, since we are bypassing the pre-allocation check.
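
One way to eyeball free space from inside Sql Server is sys.dm_os_volume_stats; this is only a sketch, and it reports only the volumes hosting this instance’s database files, which may not include the backup target:

--
-- Check free space on the volumes hosting this instance's database files
--
SELECT DISTINCT
      vs.volume_mount_point
    , vs.logical_volume_name
    , vs.available_bytes / 1024 / 1024 / 1024 AS [FreeSpace_GB]
FROM sys.master_files AS mf
CROSS APPLY sys.dm_os_volume_stats(mf.database_id, mf.file_id) AS vs
GO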

--
-- Disable pre-allocation algorithm
--
DBCC TRACEON (3042)
GO

BACKUP DATABASE DummyDB
TO        DISK = N''
    , DISK = N''
    , DISK = N''
    , DISK = ''
WITH STATS = 1
GO
DBCC TRACEOFF (3042)
GO
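
To confirm the flag actually took effect before kicking off the backup, DBCC TRACESTATUS can be used (a quick check, not part of the original script):

--
-- Verify the trace flag is enabled
--
DBCC TRACESTATUS (3042)
GO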

Make sure you test this in a non-production environment before enabling it in production.

Hope this helps,
_Sqltimes

Read Full Post »

Interesting one today:

A few months ago, we had an issue with a database in our lab environment where the database ended up in SUSPECT mode due to storage issues. Once the storage was fixed, we were able to perform troubleshooting steps on the database. Of those, today we’ll only cover the step used to rebuild the transaction log file (the rest of the steps will be covered in a future post).

Error Message:

An error occurred while processing the log for database 'SampleDB'. 
If possible, restore from backup. 
If a backup is not available, it might be necessary to rebuild the log.

 

NOTE: Rebuilding the transaction log is always the last option. There are other, safer options to troubleshoot databases in SUSPECT mode. Use this approach only after you’ve exhausted all other options, such as:

  • CHECKDB
  • Restore from valid backup
  • Repairing with EMERGENCY mode

 

NOTE: Rebuilding the transaction log file breaks the restore chain, so earlier transaction log backups cannot be applied together with backups taken after the rebuild.
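
Before rebuilding, it helps to confirm the current state of the database; here is a quick check against sys.databases (not part of the original steps, just a sanity check):

--
-- Confirm the current state of the database
--
SELECT name
    , state_desc          -- e.g. SUSPECT, EMERGENCY, ONLINE
    , user_access_desc    -- e.g. SINGLE_USER, MULTI_USER
FROM sys.databases
WHERE name = 'SampleDB'
GO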

Rebuild Transactional Log File

 

Step 1: Let’s identify the transaction log file’s logical name and path. Use the following query:

--
-- Gather Logical & Physical names of the database
--
SELECT D.name AS [DatabaseName]
	, M.name AS [LogicalName]
	, M.physical_name AS [PhysicalName]
FROM sys.master_files AS M
INNER JOIN sys.databases AS D 
	ON d.database_id = m.database_id
	AND D.name = 'SampleDB'
GO

Let’s use the name & path from the results in the script below.

 

Step 2: Prepare the database

Before we rebuild the transaction log file, we need to set the database to EMERGENCY mode & SINGLE_USER mode. Use the script below:

--
-- Set database in EMERGENCY mode & SINGLE_USER mode
--
ALTER DATABASE SampleDB  SET EMERGENCY
GO
ALTER DATABASE SampleDB  SET SINGLE_USER WITH ROLLBACK IMMEDIATE
GO

 

 

Step 3: Rebuild Transactional Log

This is an undocumented & unsupported command, so use caution before running it in production. Steps like this should be taken only upon guidance from Microsoft CSS.

With this script, we can create a new transaction log file. After you run it, make sure you run CHECKDB & take fresh backups.

--
-- Rebuild log file
--
ALTER DATABASE SampleDB 
REBUILD LOG
ON 
	( NAME = 'SampleDB_log'
	, FILENAME = 'L:\MSSQL\LOGS\SampleDB_log.LDF'
	)
GO

 

Step 4: Post Rebuild

Upon a successful log rebuild, let’s take a few precautions to make sure everything is still good.

The following script runs CHECKDB, puts the database back into MULTI_USER mode, checks its status, and takes a fresh full backup.

--
-- Post rebuild steps
--
DBCC checkdb(SampleDB)
GO

ALTER DATABASE SampleDB  SET MULTI_USER
GO

SELECT DATABASEPROPERTYEX('SampleDB', 'Status')
GO

BACKUP DATABASE SampleDB TO DISK = N'Z:\MSSQL\Backup\SampleDB_AfterTLogRebuild.BAK'
GO

 

Hope this helps,
_Sqltimes

Read Full Post »

Quick one today:

On a regular basis, on production machines, selected perfmon metrics are captured into local files. Each day’s metrics are captured into an individual file, making it easier to analyze the data as and when needed.

Sometimes, to uncover patterns, we need to combine a few days’ worth of files into one BLG file. This is rare, but needed. Microsoft provides a command to achieve this: relog. This command can do a lot of things, but today we’ll look at file concatenation.

relog SqlCounters_08112017_48.blg SqlCounters_08122017_49.blg -f BIN -o C:\PerfLogs\Sql2014Counters\1\s.blg

Explanation:

The following Perfmon files

  • SqlCounters_08112017_48.blg
  • SqlCounters_08122017_49.blg

are combined into a single binary output file (the file specified with the -o flag).

  • -f flag indicates the format of the output (concatenated) file
  • -o flag indicates the path of the output file

The following image shows the output when you run it from a command prompt.

Concatenate_Perfmon_BLG

Hope this helps,
_Sqltimes

Read Full Post »

Interesting one today:

During one of the Sql Cluster setups in our lab environment, we ran into this interesting (and common) error message.

The Sql Server failover cluster instance name '' already exists as a clustered resource. 
Specify a different failover cluster name.

Sql Cluster Set up Error

Sql Cluster Set up Error

Resolution

Turns out, the sysadmins had created a Resource Group with the same name and allocated all the cluster disks to it. So, when I attempted to create a new Resource Group (for Sql Server resources), it threw this error.

 

SqlCluster Install Resolution

SqlCluster Install Resolution

 

Solution: Once the dummy resource group was removed, the cluster setup wizard progressed without any errors.

The moral of the story: before creating a resource group, make sure a resource group with the same name does not already exist.

 

 

Hope this helps,
_Sqltimes

Read Full Post »
