Archive for the ‘Error Messages’ Category

Interesting one today:

On one of our production machines, we recently added a new LUN to a SQL cluster. A task like this is a team effort. Sysadmins perform some steps and DBA carry out the remaining. In this article, the main focus is on covering the steps after the LUN is added to the OS & Sql Cluster by the sysadmins.  For context, we’ll start with high level steps before going into details.

Sysadmins steps

  1. Add new storage to the machine/OS as an available storage
  2. Format the available drive with appropriate settings (cluster size) and add it as a new drive
  3. Make drive available to the Cluster using “Add Disk” screen in FailOver Cluster Management tool.

DBAs steps

  1. Add available storage to Sql Cluster
  2. Configure dependency (and check the report before & after)
  3. Add data file on the new cluster storage drive

Here we’ll cover the DBA steps in detail:

Some of these steps were covered under a different recent article as part of dealing with an error message, but here we’ll cover it as a task by itself (which it is).

Add New Storage

Once sysadmins have made the new storage as an ‘available storage’ to OS Cluster, it needs to be added as a new storage location to the SQL Cluster.

In FailOver cluster manager, go to Sql Server Resource Group for this SQL Cluster and right click for detailed options and choose “Add Storage” (see image below)


Once successful, go to Storage\Disks under in FailOver Cluster Manager to confirm the availability. See image below:


Configure Dependency

Adding the storage is an important step, and equally important step is adding the new drive to Sql Cluster Dependency Chain. Dependency Chain informs Sql Sever “how to act”, when any resource in the Cluster becomes unavailable. Some resources automatically trigger cluster failover to other node; some resources do not. This decision is made based on the configurations in Dependency Chain.


Critical: Data drive/LUN that has database files is critical for optimal availability of the Sql Cluster. So, if it becomes unavailable, failing over to other available nodes is imperative to keep the cluster available.

Non-Critical: In some scenarios, Sql Server Agent is not considered as Critical. So if it stops for some reason, Cluster will make multiple attempts to start it on the same node, but may not necessarily cause failover.

This is a business decision. All these “response actions” will be configured in Cluster settings.

Now, check the dependency report (before); We can see that new drive exists in Cluster, but is not yet added to the Dependency Chain.


To Configure Dependency Chain, go to the Sql Server Resource Group under Roles in FailOver Cluster Manager. See the image below for clarity:

Then go to the bottom section for this Resource Group, where all the individual resources that are part of this Resource Group are displayed.

Under “Other Resources“, right click on Sql Server Resource and choose properties.

do As show


In the “Sql Server Properties” window, we can see the existing resources already added to dependency chain logic.


Now, go to the end of the logic list and choose “AND” for condition and pick the new Cluster Storage to be included. See image below for clarity:


After saving the settings, regenerate the Dependency Chain report. Now, we’ll see the new drive as part of the logic.


Add Database Data File to New Cluster Storage

Now, that the new drive is ready, we could easily add a new data file to the new location.

-- Add data file to new storage location
USE [master]
		  NAME 			= 	N'SampleDB_Data3'
		, FILENAME 		= 	N'U:\MSSQL\Data\SampleDB_Data3.NDF'
		, SIZE 			= 	3500 GB
		, FILEGROWTH 	= 	100 GB
		, MAXSIZE 		= 	3900 GB
Hope this helps,

Read Full Post »

Quick one today:

Earlier today, we ran into this error in our lab environment with replication. Two different errors, on the same environment, due to similar sequence of events.

Msg 20584, Level 16, State 1, Procedure sp_MSrepl_check_server, Line 67
Cannot drop server because it is used as a Subscriber to remote Publisher in replication.

Cannot drop server because it is used as a distributor in replication

This type of errors are rare, but one of these three options below might provide some relief.

  1. Reset Dist flag
  2. Force drop distributor
  3. Manually remove Publishers (from Distributor properties)

Also, see the notes in the bottom section (points to keep in mind)

Reset Dist Flag for Distributor Instance

Sometimes, the dist flag for Distributor stays set, when not supposed to. So manually resetting it helps.

--  Reset Distributor flag
EXEC master.dbo.sp_ServerOption   @server = 'REPLDist'
				, @OptName = 'dist'
				, @OptValue = 'false'

Force Drop Distributor

One of the last options is to ‘force’ drop distributor. This allows removing distributor database without running through the proper checks. This step is a high risk, high reward option; Since all checks were previously performed and all components of replication are removed, except distributor, this step might provide relief.

EXEC sp_dropdistributor @no_checks = 1, @ignore_distributor = 1

Manually Remove Publishers

Usually, the above two steps will resolve the issue. But in rare occasions, the issue persists, like in our lab. For those stubborn situation, this technique has given some relief.

Go to Replication >> Distributor Properties >> Go to Publishers (tab)

Now, un-check each Publisher and save. Important to note that, this needs to be done for one publisher at a time.


Points to keep in mind:

Sometimes, just the one of these steps might be enough to resolve the issue; Sometimes, it might take two of the steps. In our lab, we ended up needing all the 3 steps. the first step is easy to implement.
The second and third steps might need repeated attempts. Something like, try Step 2, if it fails, then try step 3; If Step 3 succeeds, then go back to Step 2. Once Step 2 is successful, the distributor is fully cleared and cleaned.
Hope this helps,

Read Full Post »

Quick one today:

Given the usage of UNPIVOT is not that common as PIVOT (which itsel is more uncommon than other techniques in TSQL), the occurance of this error is even more remote. Luckly, we ran into it earlier:

Msg 8167, Level 16, State 1, Line 5
The type of column conflicts with the type of other columns specified in the UNPIVOT list.

Following is the UNPIVOT statement that threw the error. As you can see, on the surface, everything looks correct.

--	UNPIVOT syntax that throws error
SELECT ID, Metrics, Measures
	SELECT	  Daily_Health_Check_ID
			, Environment
			, DataCenter
	FROM SampleTable
	) AS Source
	Measures FOR Metrics IN (Environment, DataCenter)

Now let’s change the datatype and size of all the columns to match.

--	After converting all colums to the same datatype & size
SELECT ID, Metrics, Measures
	SELECT	  Daily_Health_Check_ID
			, CONVERT(VARCHAR(20), Environment) AS [Environment]
			, CONVERT(VARCHAR(20), DataCenter) AS [DataCenter]
	FROM SampleTable
	) AS Source
	Measures FOR Metrics IN (Environment, DataCenter)

Voilà !! The key is to have uniformed data type & size of all columns in the source data set

Hope this helps,

Read Full Post »

Interesting one today:

Setting up a new SSRS machine on one of the virtual machines in our lab, we ran into this interesting error after walking through the initial steps of install & configuration. Turns out, this is a common error after fresh installs. Following is the error message:

The permissions granted to user are insufficient for performing this operation. 
(rsAccessDenied) Get Online Help

Following steps have helped resolve it.


SSRS service runs under the security of a user account; My guess is that this account needs permissions to access the content on the SSRS site & folders. So we need to carry out two steps to assign proper permissions.

  1. Folder Permissions
  2. Site Permissions
  3. Trusted Site Settings

Folder Permissions

Step 1:

Open browser with Administrative Privileges and point it to Reports Manager URL. Go to ‘Folder Settings’.

ssrs_permissions_error_foldersettings_1 Step 2:

In the properties page, go to the Security tab and click on ‘New Role Assignment


Step 3:

In the new page, add the user account for ‘Group or User Name’ field. Then assign ‘Content Manager‘ permissions to the user account. And hit okay to save the settings.


Site Settings

Step 1:

Use the same browse (with administrative permissions), to go to Site Settings on the right.


Step 2:

In the settings page, go to Security tab and click on New Role Assignment.


Step 3:

In the new page, add the windows user to the ‘Group or user name’; Assign System Administrator role and hit OK.


Trusted Site Settings

Go to Internet Explorer, Tools -> Internet Options -> Security tab > Click on Trusted Sites. In the new window, enter the URL for Report Manager and close.

Now you can open browser and open Report Manager without any security warning.



Once the permissions are set, close the browser and re-open it without administrative permissions. Going forward, it would work without needing to open browser with elevated permissions.

For more information, please refer to this BoL article.

Hope this helps,

Read Full Post »

Interesting one today:

Recently, on one of the lab machines, we had to rebuild SSRS machine on a new VM. After walking through the regular install and other configuration steps, when browser is opened and pointed to the SSRS URL, we get this error.

Error Message:

The report server cannot decrypt the symmetric key that is used to access sensitive
or encrypted data in a report server database. You must either restore a backup key
or delete all encrypted content.

This error message seems cryptic and familiar at the same time. After pacing back and forth for a few minutes near my desk, it dawned on me. This VM with pre-installed SSRS must have some encryption keys set up already, that need recreation or just deletion.


  1. Open Reporting Services Configuration Manager
  2. Connect to the SSRS engine on the VM
  3. Open Encryption Keys tab on the left
  4. Click on ‘Delete’ button to remove the keys (from previous set up)
  5. Now open browser with Administration privileges
    1. Open SSRS URL

Now the error message is resolved.


Note: Since this is a lab machine (and freshly minted VM), these aggressive steps are warranted. But in a production environment, you want to restore the encryption keys with the correct backup keys.

Hope this helps,

Read Full Post »

Interesting error today:

Recently on one of out performance test environment, we ran into an interesting error that looked new (unfamiliar) and a bit perplexing. Below is the error:

Error: 7886, Severity: 20, State: 1
A read operation on a large object failed while sending  data to the client. 
A common cause for this is if the  application is running in READ UNCOMMITTED isolation level.  
This connection will be terminated.

When this error occurred?

Upon further digging (and rereading the error message a few times), it became clear that were performing some long-running NOLOCK query, which was not uncommon in this environment. As it turns out, this is retrieving some large objects and sending to client applications. But in our situation, both of them are not the root of the problem.

What could be the problem?

Since these kind of operations are not uncommon in this environment, it could not have been these operations that could have caused it. May be they contributed, but they did not seem to be the root cause. We ran the same operations several times, sometimes they throw this error and other times it runs fine. So, the problem is some where else.


Luckily, someone already talked about this error in detail here. It turns out, this particular database, before every performance test, is being rest to ‘AutoGrowth’ with small increments (a.k.a. ~ 1 MB). Once we corrected out preparatory steps (for performance test) to configure the database size properly, this error got resolved.

What the root cause?

As of now, it is not clear as to why this issue occurs, but this is my working theory. When a large read operations (reading LOB and sending to client) occur with NOLOCK isolation, it reads dirty data. It might be possible that, someone else is attempting to change the this LOB at the same time. This by itself would not be a problem, since NOLOCK allows reading uncommitted data; But might be a contributor.

Adding to that, there is not enough free space left in the data file to accommodate the new changes coming in. If its a larger LOB, with size greater than the ‘AutoGrowth’ setting of 1 MB, it might have caused some problem where Sql Server has to take a breather to expand the file properly before allowing changes to occur properly. So the best option is to stop the NOLOCK read connection, which is the least important or light weight task. Doing this would cause least negative impact to the database integrity, while keeping the cost minimal (connection close).

Not sure if this theory is correct, but just a made up theory.

Hope this helps,

Read Full Post »

Interesting article today on troubleshooting replication errors.

A few weeks ago, on production, we received an alert with on replication failures. Upon further inspection the error looks like this:

Command attempted:
if @@trancount > 0 rollback tran
(Transaction sequence number: 0x001031C200001B06000700000000, Command ID: 1)

Error messages:
The row was not found at the Subscriber when applying the replicated command. (Source: MSSQLServer, Error number: 20598)
Get help: http://help/20598
The row was not found at the Subscriber when applying the replicated command. (Source: MSSQLServer, Error number: 20598)
Get help: http://help/20598

Identify Root Cause:

To understand the course of action, we need to first understand the underlying issues. Go to Replication Monitor >> Open Details Window on the subscription that has errors. Go to ‘Distributor to Subscriber’ tab for more details. See the image below:


Now we see that, replication is attempting to perform an action on the Subscriber, but the row does not exist. So, lets find out more.

Find the exact command that is being replicated (or executed on Subscriber as part of replication) that throws this error. Use replication procedure sp_browsereplcmds for that.

Query the Distribution agent ID from dbo.MSdistribution_agents and use it in the query below.

-- Uncover the replication command throwing error
EXEC sp_browsereplcmds @xact_seqno_start = '0x001031C200001A620004000000000000'
                     , @xact_seqno_end = '0x001031C200001A620004000000000000'
                     , @agent_id = 49
                     , @publisher_database_id = 3

You’ll see something like this:


Under the command column, we’ll see the exact command that is running into this error.

--  Error 20598 occurring in
{CALL [mtx_rpldel_ReportCriterion] (908236,71357,250,-1)}

Now, lets go to that stored procedure ‘mtx_rpldel_ReportCriterion’ and see what tables are involved in manipulation. In my scenario, the table ReportCriterion does not have the record with ID = 908236


Once you understand the root cause, we have a few options.

  1. Data Integrity: Looks like we have synchronization issues between Publisher and Subscriber. If it is a non-production environment or an environment where reinitializing is an option, then we could take that route to sync up the data first.
    1. Once data integrity issues are resolved, all subsequent replication commends would be successful.
  2. Manual fix: Manually insert the missing record at Subscriber and then allow replication to perform its operations on the new record.
    1. With this option, the more records we uncover as missing, the more manual operation would be required. Its not ideal, but it is a workaround to get things going again.
  3. Ignore, for now: In some situations, until further solution is identified, we may need to ignore this one record and move forward with rest of the replication commands.
    1. Take necessary precautions to make sure there are no more such missing records. Or gather a list of all missing ones.
    2. Configure replication to ignore error 20598 using skiperrors parameter. There are a couple of ways to achieve this; here we’ll look at one.
    3. Go to the Agent Profile for this particular Distributor Agent. One of the profiles allows us to skip certain errors. See the image below.
    4. replication_error_20598_resolution

For more information, please refer to Microsoft Support article on similar issue.

Hope this helps,

Read Full Post »

Older Posts »