Feeds:
Posts
Comments

Archive for the ‘Operating System’ Category

Interesting one today:

Earlier, when performing some cluster maintenance work, this error popped-up on the screen.

The specified disk or volume is managed by the Microsoft Failover cluster 
component. The disk must be in the cluster maintenance mode and the 
cluster resource status must be online to perform this operation

When attempting to format a new LUN to 64K allocation unit, this error popped up. Since the LUN/Drive is already added to the cluster, new format changes could not be made.

Resolution:

As the verbose error message suggests, assign this particular disk into “Maintenance Mode”, then perform formatting steps.

Go to Failover Cluster Manager and go to Storage > Disks; Identify the particular disk and right click and go to More Actions >> Turn On Maintenance Mode.

Cluster_LUN_Error

Once the disk is in maintenance mode, you’ll see under Status column as Online (Maintenance Mode).

Now we are free to format the disk. Computer Management >> Storage >> Disk Management go to the individual disk and right click and Format.

FormatClusterLUN

Now formatting works and once completed, go back to Failover Cluster Manager and set the disk back out of Maintenance Mode.

 

Hope this helps,
_Sqltimes
Advertisements

Read Full Post »

Interesting one today:

We have a bunch of lab Sql Server boxes machines and sometimes after a fresh Sql Server install, when we try to open Activity Monitor, we run into this problem.

Error:

 

TITLE: Microsoft SQL Server Management Studio
 ------------------------------

The Activity Monitor is unable to execute queries against server DC2POLTPS02.
 Activity Monitor for this instance will be placed into a paused state.
 Use the context menu in the overview pane to resume the Activity Monitor.

Access is denied. (Exception from HRESULT: 0x80070005 (E_ACCESSDENIED)) (mscorlib)

------------------------------

Since these are lab machines, we are remotely logged into the machines and looks like there is some setting that prevents Activity Monitor from opening successfully. Activity Monitor provides great detail on what is going on with Sql Server at any given point-in-time and such activity needs “high level insight” into the Operating System and Sql Server; Such “high level” permissions are not enabled by default for user accounts.

Following steps show a way to enable elevated permissions when logged in remotely.  From what I could gather from Microsoft Connect this seems like elevated permissions on remote operating system’s DCOM. So we need to enable Remote Launch & Remote Activation permissions on remote Operating System (lab machine)

Resolution:

RDP to the remote machine and

  1. Open Component Services (DCOMCNFG) from start menu
  2. In the left hand tree, under Console Root, expand Component Services, expand Computers, right-click on My Computer and go to Properties
  3. In My computer Properties window, go to COM Security tab.
  4. In the Launch and Activation Permissions section, click on Edit Limits button.
    1. In the Security Limits tab, see if your user/group name exists. If not add to the list by clicking on Add button.
    2. Once user is added, highlight the user and make sure it has both Remote Launch & Remote Activation permissions checked.
  5. In the Access Permissions section, click on Edit Limits button
    1. In the Security Limits tab, see if your user/group name exists. If not add to the list by clicking on Add button.
    2. Once user is added, highlight the user and make sure it has Remote Access permissions checked.
  6. Hit Okay to save changes.
  7. Now expand the My Computer in the left-hand tree and go to DCOM Config.
    1. Find Windows Management and Instrumentation and go to Properties.
    2. Go to Security tab and under Launch and Activation Permissions section, click on Edit button
    3. In the Security tab, see if your user/group name exists. If not add to the list by clicking on Add button.
    4. Once user is added, highlight the user and make sure it has both Remote Launch & Remote Activation permissions checked.
    5. (See the image below)
  8. Save all changes and re-open Activity Monitor
Activity Monitor Error

Activity Monitor Error

 

 

 

Hope this helps,
_Sqltimes

Read Full Post »

Interesting one today:

On a production box, the backup jobs have been failing with an interesting and perplexing error. Its says “Not enough disk space“; As you can guess, this is one of those confusing or misleading error messages that’s not what it seems on the surface — Making it worthwhile for a post of its own.

Detailed error message is below:

BACKUP DATABASE DummyDB
TO        DISK = N''
	, DISK = N''
	, DISK = N''
	, DISK = ''
WITH STATS = 1
GO
...
...
...
68 percent processed. 
69 percent processed. 
70 percent processed. 
Msg 3202, Level 16, State 1, Line 1 

Write on "F:\MSSQL\Backup\DummyDB.BAK" failed: 
112(There is not enough space on the disk.) 

Msg 3013, Level 16, State 1, Line 1 
BACKUP DATABASE is terminating abnormally.

This error occurs in both backups with & without compression; And in FULL & Differential backups.

This is a fairly large database, ranging up to 18 TB. So, backups are an ordeal to perform. So, when DIFF backups started failing, it was a bit concerning too.

After attempting several backups on local  & remote storage with plenty of space, a pattern still did not emerge. The only constant is that it fails around 70% completion progress.

At that point, one of  my colleagues (Thanks Michael) pointed out that, as part of backup operation, Sql Server will first run some algorithm that calculates the amount of space needed for the backup file. If the backup drive has enough free space well  and good, if not, it throws this error.

But, as you can guess, we had plenty of free space i.e. peta bytes of free space.

Occasionally, manual backups are successful. So, I’m not sure what is going on, but here is my theory:

At different points, Sql  Server  runs the algorithm (“pre-allocation algorithm”) to determine if there is enough space. Initially it comes back saying “yes” — and the backup proceeds with writing to the backup file; Again a little later, it checks, and it comes back with “Yes”; But at someone on subsequent checks (in our case between 70% to 72% complete), the algorithm decides there is  not enough disk space.

So, turns out there is a TRACE FLAG called 3042 that could disable this algorithm from making any assessments — that way backups could progress to completion.

From  MSDN:

Bypasses the default backup compression pre-allocation algorithm to allow the backup file to grow only as needed to reach its final size. This trace flag is useful if you need to save on space by allocating only the actual size required for the compressed backup. Using this trace flag might cause a slight performance penalty (a possible increase in the duration of the backup operation).

Using this trace flag might cause a slight performance penalty (a possible increase in the duration of the backup operation).

Caution: Manually make sure there is plenty of space for backup to complete — since we are disabling the algorithm.

--
-- Disable pre-allocation algorithm
--
DBCC TRACEON (3042)
GO

BACKUP DATABASE DummyDB
TO        DISK = N''
    , DISK = N''
    , DISK = N''
    , DISK = ''
WITH STATS = 1
GO
DBCC TRACEOFF (3042)
GO

Make sure you test this in a non-production environment, before enabling it in production.

Hope this helps,
_Sqltimes

Read Full Post »

Interesting one today:

A few months ago, we had an issue with a database in our lab environment where the database ended up in SUSPECT mode due to storage issues. Once the storage was fixed, we were able to perform troubleshooting steps on the database. Of those, today we’ll only cover the step that was used to rebuild the transactional log file (rest of the steps will be covered in a future post).

Error Message:

An error occurred while processing the log for database 'SampleDB'. 
If possible, restore from backup. 
If a backup is not available, it might be necessary to rebuild the log.

 

NOTE: Rebuilding the transaction log is always the last option. There are other, safer options to troubleshoot when databases are in SUSPECT mode. Use this mode only after you’ve exhausted all other options like

  • CHECKDB
  • Restore from valid backup
  • Repairing with EMERGENCY mode

 

NOTE: Rebuilding transactional log file will break the restore chain; So any previous transactional log files could not be applied with the backups going forward.

Rebuild Transactional Log File

 

Step 1: Let’s identify the transactional log file name and path. Use the following query:

--
-- Gather Logical & Physical names of the database
--
SELECT D.name AS [DatabaseName]
	, M.name AS [LogicalName]
	, M.physical_name AS [PhysicalName]
FROM sys.master_files AS M
INNER JOIN sys.databases AS D 
	ON d.database_id = m.database_id
	AND D.name = 'SampleDB'
GO

Let’s use the name & path in the script below.

 

Step 2: Prepare the database

Before we rebuild the transactional log file, we need to set the database in EMERGENCY mode & SINGLE_USER mode. Use the script below:

--
-- Set database in EMERGENCY mode & SINGLE_USER mode
--
ALTER DATABASE SampleDB  SET EMERGENCY
GO
ALTER DATABASE SampleDB  SET SINGLE_USER WITH ROLLBACK IMMEDIATE
GO

 

 

Step 3: Rebuild Transactional Log

This is an undocumented & unsupported command, so use caution before you run this in production. Steps like this must be taken only upon guidance from Microsoft CSS.

With this script, we could create a new transactional log file. After you runs it, make sure you run CHECKDB & backups.

--
-- Rebuild log file
--
ALTER DATABASE SampleDB 
REBUILD LOG
ON 
	( NAME = 'SampleDB_log'
	, FILENAME = 'L:\MSSQL\LOGS\SampleDB_log.MDF'
	)
GO

 

Step 4: Post Rebuild

Upon successful log rebuild, lets take a few precautions to make sure everything is till good.

With this script, we could create a new transactional log file. After you runs it, make sure you run CHECKDB & backups.

--
-- Post rebuild steps
--
DBCC checkdb(SampleDB)
GO

ALTER DATABASE SampleDB  SET MULTI_USER
GO

SELECT DATABASEPROPERTYEX('SampleDB', 'Status')
GO

BACKUP DATABASE SampleDB TO DISK = N'Z:\MSSQL\Backup\SampleDB_AfterTLogRebuild.BAK'
GO

 

Hope this helps,
_Sqltimes

Read Full Post »

Quick one today:

On a regular bases, on production machines, selected perfmon metrics are captured into local files. Each day’s metrics are captured into an individual file — making it easier to analyze the data as and when needed.

Sometimes, to uncover any patterns, we’d need to combine a few days worth of files into one BLG file. This is rare, but needed. Microsoft provides a command to achieve this action. Enter relog command. This command could do a lot of things, but today we’ll look at file concatenation.

relog SqlCounters_08112017_48.blg SqlCounters_08122017_49.blg -f BIN -o C:\PerfLogs\Sql2014Counters\1\s.blg

Explanation:

The following Perfmon files

  • SqlCounters_08112017_48.blg
  • SqlCounters_08122017_49.blg

are combined into a final binary file called Combined.blg.

  • -f flag indicates the format of the output (concatenated file)
  • -o flag indicated the path of the output file

The following image shows, the output when you run it from command prompt.

Concatenate_Perfmon_BLG

Hope this helps,
_Sqltimes

Read Full Post »

Interesting one today:

During one of the Sql Cluster set up in our lab environment, we ran into this interesting (and common) error message.

The Sql Server failover cluster instance name '' already exists as a clustered resource. 
Specify a different failover cluster name.
Sql Cluster Set up Error

Sql Cluster Set up Error

Resolution

Turns out, the sysadmins created a Resource Group with the same name and allocated all the cluster disks to it. So, when I attempt to create a new Resource Group (for Sql Server resources), it throws this error.

 

SqlCluster Install Resolution

SqlCluster Install Resolution

 

Solution: Once the dummy resource group was removed, the cluster set up wizard progressed without any errors.

The moral of the story is, when we are about to create a resource group, make sure a resource group by the same name does not already exist.

 

 

Hope this helps,
_Sqltimes

Read Full Post »

Interesting one today:

Lately, Access Methods performance metrics have been helpful in troubleshooting some recent issues. Access Methods has several important metrics that show the metrics to measure the usage internals of logical data with in Sql Server.

Here we’ll look at 3 of them:

  1. \SQLServer:Access Methods\FreeSpace Scans/sec
  2. \SQLServer:Access Methods\Table Lock Escalations/sec
  3. \SQLServer:Access Methods\Workfiles Created/sec

FreeSpace Scans

Objects in Sql Server are written on database pages. A group of these pages are called Extents (8 pages). Allocation of space occurs in units of extents. Extents are of two types Mixed & Uniform Extents.

Usually small tables (and sometimes Heap tables) are written to Mixed extents. So, when the data in those tables/objects increases, Sql Server needs to find more free space to  accommodate for the growth. In these situation, Sql Server performs Free Space Scans. This counter measure how many times these occur every second, hence FreeSpace Scans/sec.

Not sure what Microsoft’s recommended range on this metric is, but in our environment, this metric stays low i.e. under 5 or 10. So, as a rule of thumb, lets say as long as the number is below 20 we are okay. Anything higher for extended periods of time might need some attention.

So the best approach is to gather baseline first; Then you’ll know what is normal and what is out of the ordinary.

Table Lock Escalations

When a query is trying to read rows from a table, it uses the smallest lock as possible to maintain concurrency. In some rare (but not uncommon) occasions, this lock gets escalated higher level, either Page or Table level. While this reduces concurrency on that table, it improves the efficiency of this particular query execution.

Issuing thousands of locks costs a lot of memory; So it is easier to issue table lock and read as much data as needed. But the down side is that it will prevent other connections from reading this table.

For more, read this previous post on Lock Escalations.

This counter measures, the number of such escalations occur per second. Which this is a common occurrence, higher numbers for extended periods of time are not good. So, look into optimizing queries, so they only query the exact amount of data they need (a.k.a. use better JOINs and WHERE clause conditions)

Workfiles Created

When a large query that handles large data sets is executed, sometimes the intermediary data sets (or virtual tables) are written to disk. This helps with efficient processing of data. Sql Server reads this data into memory as and when needed and completes query processing.

This counter measures, how many work files are created each second. On a busy system, that handles large data set manipulation queries, many WorkFiles & WorkTables are created each second. So, this number needs to be considered in context. Capture a baseline first; Then measure any aberrations from baseline and look into possible reasons.

Usually when a query manipulates large data sets, Sql Server uses Hash Joins to manipulate the data to find matches. So, if you have a lot of queries that perform Hash Joins or Hash Aggregates, this counter spikes up.

 

Hope this helps,
_Sqltimes

Read Full Post »

Older Posts »