Feeds:
Posts
Comments

Archive for the ‘TSQL’ Category

Interesting one today:

On one of our production machines, we recently added a new LUN to a SQL cluster. A task like this is a team effort. Sysadmins perform some steps and DBA carry out the remaining. In this article, the main focus is on covering the steps after the LUN is added to the OS & Sql Cluster by the sysadmins.  For context, we’ll start with high level steps before going into details.

Sysadmins steps

  1. Add new storage to the machine/OS as an available storage
  2. Format the available drive with appropriate settings (cluster size) and add it as a new drive
  3. Make drive available to the Cluster using “Add Disk” screen in FailOver Cluster Management tool.

DBAs steps

  1. Add available storage to Sql Cluster
  2. Configure dependency (and check the report before & after)
  3. Add data file on the new cluster storage drive

Here we’ll cover the DBA steps in detail:

Some of these steps were covered under a different recent article as part of dealing with an error message, but here we’ll cover it as a task by itself (which it is).

Add New Storage

Once sysadmins have made the new storage as an ‘available storage’ to OS Cluster, it needs to be added as a new storage location to the SQL Cluster.

In FailOver cluster manager, go to Sql Server Resource Group for this SQL Cluster and right click for detailed options and choose “Add Storage” (see image below)

sqlcluster_addnewstorage_to_os_cluster

Once successful, go to Storage\Disks under in FailOver Cluster Manager to confirm the availability. See image below:

sqlcluster_addnewdrive

Configure Dependency

Adding the storage is an important step, and equally important step is adding the new drive to Sql Cluster Dependency Chain. Dependency Chain informs Sql Sever “how to act”, when any resource in the Cluster becomes unavailable. Some resources automatically trigger cluster failover to other node; some resources do not. This decision is made based on the configurations in Dependency Chain.

Example:

Critical: Data drive/LUN that has database files is critical for optimal availability of the Sql Cluster. So, if it becomes unavailable, failing over to other available nodes is imperative to keep the cluster available.

Non-Critical: In some scenarios, Sql Server Agent is not considered as Critical. So if it stops for some reason, Cluster will make multiple attempts to start it on the same node, but may not necessarily cause failover.

This is a business decision. All these “response actions” will be configured in Cluster settings.

Now, check the dependency report (before); We can see that new drive exists in Cluster, but is not yet added to the Dependency Chain.

SqlCluster_DependencyReport.PNG

To Configure Dependency Chain, go to the Sql Server Resource Group under Roles in FailOver Cluster Manager. See the image below for clarity:

Then go to the bottom section for this Resource Group, where all the individual resources that are part of this Resource Group are displayed.

Under “Other Resources“, right click on Sql Server Resource and choose properties.

do As show

sqlcluster_addnewstorage_add_to_dependency

In the “Sql Server Properties” window, we can see the existing resources already added to dependency chain logic.

SqlCluster_Dependency_Before.PNG

Now, go to the end of the logic list and choose “AND” for condition and pick the new Cluster Storage to be included. See image below for clarity:

SqlCluster_Dependency_After.PNG

After saving the settings, regenerate the Dependency Chain report. Now, we’ll see the new drive as part of the logic.

sqlcluster_dependencyreport_after

Add Database Data File to New Cluster Storage

Now, that the new drive is ready, we could easily add a new data file to the new location.

--
-- Add data file to new storage location
--
USE [master]
GO
ALTER DATABASE [SampleDB]
ADD FILE
	(
		  NAME 			= 	N'SampleDB_Data3'
		, FILENAME 		= 	N'U:\MSSQL\Data\SampleDB_Data3.NDF'
		, SIZE 			= 	3500 GB
		, FILEGROWTH 	= 	100 GB
		, MAXSIZE 		= 	3900 GB
	)
TO FILEGROUP [PRIMARY]
GO
Hope this helps,
_Sqltimes

Read Full Post »

Quick one today:

Earlier today, we ran into this error in our lab environment with replication. Two different errors, on the same environment, due to similar sequence of events.

Msg 20584, Level 16, State 1, Procedure sp_MSrepl_check_server, Line 67
Cannot drop server because it is used as a Subscriber to remote Publisher in replication.

Cannot drop server because it is used as a distributor in replication

This type of errors are rare, but one of these three options below might provide some relief.

  1. Reset Dist flag
  2. Force drop distributor
  3. Manually remove Publishers (from Distributor properties)

Also, see the notes in the bottom section (points to keep in mind)

Reset Dist Flag for Distributor Instance

Sometimes, the dist flag for Distributor stays set, when not supposed to. So manually resetting it helps.

--
--  Reset Distributor flag
--
EXEC master.dbo.sp_ServerOption   @server = 'REPLDist'
				, @OptName = 'dist'
				, @OptValue = 'false'
GO

Force Drop Distributor

One of the last options is to ‘force’ drop distributor. This allows removing distributor database without running through the proper checks. This step is a high risk, high reward option; Since all checks were previously performed and all components of replication are removed, except distributor, this step might provide relief.

EXEC sp_dropdistributor @no_checks = 1, @ignore_distributor = 1
GO

Manually Remove Publishers

Usually, the above two steps will resolve the issue. But in rare occasions, the issue persists, like in our lab. For those stubborn situation, this technique has given some relief.

Go to Replication >> Distributor Properties >> Go to Publishers (tab)

Now, un-check each Publisher and save. Important to note that, this needs to be done for one publisher at a time.

Distributor_Properties_RemovePublishers.PNG

Points to keep in mind:

Sometimes, just the one of these steps might be enough to resolve the issue; Sometimes, it might take two of the steps. In our lab, we ended up needing all the 3 steps. the first step is easy to implement.
The second and third steps might need repeated attempts. Something like, try Step 2, if it fails, then try step 3; If Step 3 succeeds, then go back to Step 2. Once Step 2 is successful, the distributor is fully cleared and cleaned.
Hope this helps,
_Sqltimes

Read Full Post »

Quick one today:

Given the usage of UNPIVOT is not that common as PIVOT (which itsel is more uncommon than other techniques in TSQL), the occurance of this error is even more remote. Luckly, we ran into it earlier:

Msg 8167, Level 16, State 1, Line 5
The type of column conflicts with the type of other columns specified in the UNPIVOT list.

Following is the UNPIVOT statement that threw the error. As you can see, on the surface, everything looks correct.

--
--	UNPIVOT syntax that throws error
--
SELECT ID, Metrics, Measures
FROM	(
	SELECT	  Daily_Health_Check_ID
			, Environment
			, DataCenter
	FROM SampleTable
	) AS Source
UNPIVOT
(
	Measures FOR Metrics IN (Environment, DataCenter)
) AS UnPVT
GO

Now let’s change the datatype and size of all the columns to match.

--
--	After converting all colums to the same datatype & size
--
SELECT ID, Metrics, Measures
FROM	(
	SELECT	  Daily_Health_Check_ID
			, CONVERT(VARCHAR(20), Environment) AS [Environment]
			, CONVERT(VARCHAR(20), DataCenter) AS [DataCenter]
	FROM SampleTable
	) AS Source
UNPIVOT
(
	Measures FOR Metrics IN (Environment, DataCenter)
) AS UnPVT
GO

Voilà !! The key is to have uniformed data type & size of all columns in the source data set

Hope this helps,
_Sqltimes

Read Full Post »

Quick one today:

Years ago, there was a popular comment that Sql Server did not wide variety of functions; Once of them is the much needed IsNumeric functionality. Though such a function existed, there was a case where it resulted in incorrect results. So there were several custom functions like IsReallyNumeric or IsInteger, etc (my personal DBA library also had such custom functions)

Now, there is a powerful function that does more than just IsNumeric capability. It is called TRY_PARSE. Let’s take an example:

--
-- Sample code for TRY_PARSE
--
SELECT	  TRY_PARSE('ABCDEF' AS INT)		AS [INT_Test1]
	, TRY_PARSE('123456' AS INT)		AS [INT_Test2]
	, TRY_PARSE('ABCD-01-01' AS DATETIME)	AS [DateTime_Test1]
	, TRY_PARSE('2016-01-01' AS DATETIME)	AS [DateTime_Test2]
GO

TRY_PARSE.PNG

This function takes input in VARCHAR/NVARCHAR format and attempts to convert into the requested format. If the result is successful, the value is returned in requested data type. If not, NULL is returned. This makes it very useful in data processing logic flows; And makes it more useful function than the legacy (IsNumeric) functions.

Hope this helps,
_Sqltimes

Read Full Post »

Interesting article today on troubleshooting replication errors.

A few weeks ago, on production, we received an alert with on replication failures. Upon further inspection the error looks like this:

Command attempted:
if @@trancount > 0 rollback tran
(Transaction sequence number: 0x001031C200001B06000700000000, Command ID: 1)

Error messages:
The row was not found at the Subscriber when applying the replicated command. (Source: MSSQLServer, Error number: 20598)
Get help: http://help/20598
The row was not found at the Subscriber when applying the replicated command. (Source: MSSQLServer, Error number: 20598)
Get help: http://help/20598

Identify Root Cause:

To understand the course of action, we need to first understand the underlying issues. Go to Replication Monitor >> Open Details Window on the subscription that has errors. Go to ‘Distributor to Subscriber’ tab for more details. See the image below:

replication_error_20598

Now we see that, replication is attempting to perform an action on the Subscriber, but the row does not exist. So, lets find out more.

Find the exact command that is being replicated (or executed on Subscriber as part of replication) that throws this error. Use replication procedure sp_browsereplcmds for that.

Query the Distribution agent ID from dbo.MSdistribution_agents and use it in the query below.

--
-- Uncover the replication command throwing error
--
EXEC sp_browsereplcmds @xact_seqno_start = '0x001031C200001A620004000000000000'
                     , @xact_seqno_end = '0x001031C200001A620004000000000000'
                     , @agent_id = 49
                     , @publisher_database_id = 3
GO

You’ll see something like this:

replication_error_20598_investigation

Under the command column, we’ll see the exact command that is running into this error.

--
--  Error 20598 occurring in
--
{CALL [mtx_rpldel_ReportCriterion] (908236,71357,250,-1)}

Now, lets go to that stored procedure ‘mtx_rpldel_ReportCriterion’ and see what tables are involved in manipulation. In my scenario, the table ReportCriterion does not have the record with ID = 908236

Resolution

Once you understand the root cause, we have a few options.

  1. Data Integrity: Looks like we have synchronization issues between Publisher and Subscriber. If it is a non-production environment or an environment where reinitializing is an option, then we could take that route to sync up the data first.
    1. Once data integrity issues are resolved, all subsequent replication commends would be successful.
  2. Manual fix: Manually insert the missing record at Subscriber and then allow replication to perform its operations on the new record.
    1. With this option, the more records we uncover as missing, the more manual operation would be required. Its not ideal, but it is a workaround to get things going again.
  3. Ignore, for now: In some situations, until further solution is identified, we may need to ignore this one record and move forward with rest of the replication commands.
    1. Take necessary precautions to make sure there are no more such missing records. Or gather a list of all missing ones.
    2. Configure replication to ignore error 20598 using skiperrors parameter. There are a couple of ways to achieve this; here we’ll look at one.
    3. Go to the Agent Profile for this particular Distributor Agent. One of the profiles allows us to skip certain errors. See the image below.
    4. replication_error_20598_resolution

For more information, please refer to Microsoft Support article on similar issue.

Hope this helps,
_Sqltimes

Read Full Post »

Quick one today:

In a recently post, we saw the value and usage examples for another undocumented command DBCC IND. Here, we’ll cover another such valuable command DBCC PAGE. The output of DBCC IND often is used as the input parameter for DBCC PAGE command.

This command allows us to read the contents of the given page.

Let’s look at an example:

--
-- Create a dummy table to test DBCC IND
--
IF EXISTS (SELECT * FROM sys.objects WHERE name = 'Uniquifier_Test' AND type = 'U')
     DROP TABLE dbo.Uniquifier_Test
GO

CREATE TABLE dbo.Uniquifier_Test (
      ID    INT         NOT NULL    DEFAULT (1)
    , Name  VARCHAR(5)  NOT NULL    DEFAULT('aa')
)
GO

--
-- Create Clustered and NonClustered indexes
--
CREATE CLUSTERED INDEX CI_Uniquifier_Test_ID
    ON dbo.Uniquifier_Test(ID ASC)
GO

CREATE NONCLUSTERED INDEX nCI_Uniquifier_Test_Name
    ON dbo.Uniquifier_Test(Name ASC)
GO

--
-- Let's insert some dummy records
--
INSERT INTO dbo.Uniquifier_Test
DEFAULT VALUES
GO 2

SELECT *
FROM dbo.Uniquifier_Test
GO

The results look something like this (nothing unusual):

DBCC IND

DBCC IND

Using the technique mentioned in the previous post on DBCC IND, we’ll get PageID for non-clustered index as 34538 where PageType is 2.

Now, lets look at the syntax for DBCC PAGE:

DBCC PAGE (Database Name, Page File ID, PageID, PrintOptions)

  • Database Name is the name of the database where the object exists
  • Page File ID is the file id in the target database where this page ID exists
  • PageID is the ID of the page that we want to retrieve
  • PrintOptions:
    • 0 says to only display the page header information.
    • 1 says to display page header with some row hex details
    • 2 says to display the entire page hex
    • 3 says to display page header and details on each row
      • This is the option used to retrieve data in tabular fashion.

Note: By default, when you run this command, the output is sent to the Sql Server Error Log. If you want the output to be displayed in SSMS, enable this trace flag : 3604

--
-- Enable trace flag
--
DBCC TRACEON (3604)
GO

Now, lets move to retrieving the page contents:

--
-- Retrieve page contents in tabular fashion
--
DBCC TRACEON (3604)
DBCC PAGE(test, 1, 34538, 3)
GO

dbcc_page

As you can see, we see the contents of the page (the actual records stored) a.k.a. 1, aa & 1, aa in the page 34538. Also shows other details like PageID, File ID, etc. We’ll talk about UNIQUIFIER column in a separate post.

Hope this helps,
_Sqltimes

Read Full Post »

Older Posts »