High Availability | Playing with database servers...

Archive for the ‘High Availability’ Category

Sql Server : Login failed for user. Reason: Attempting to use an NT account name with SQL Server Authentication. [CLIENT: ]

Posted in DBA, Error Messages, High Availability, Replication, Security, Tips and Techniques on September 12, 2019

Interesting one today:

In our lab, this error cropped up in one of the environments that relies heavily on replication.

Problem Definition:

The Sql Agent jobs for all the LogReader Agents & Distribution Agents keep logging this error message while stuck in an endless “retry” loop.

Login failed for user. 
Reason: Attempting to use an NT account name with SQL Server Authentication.
[CLIENT: 'machine IP']

The error makes sense, but it does not seem actionable.

Context:

In simple terms, replication moves data from Publisher to Distributor to Subscriber. This work is carried out by Sql Agent jobs running on the Distributor.

Each agent runs under the security context of a user account (local OS or domain account). Since some of these agents need to talk to the Publisher, the Subscriber, or both, they need proper privileges on all of those machines to carry out their operations.

Usually, when all the machines are in a domain, a single account runs the Agent jobs and also has permissions to connect to Subscriber & Publisher. But we have an extra option in the replication UI.

We could set up one account to run the agent on the Distributor, and another to connect to the Publisher or Subscriber.

For LogReader agent, it needs permissions on Distributor & Publisher
For Distribution Agent, it needs permissions on Distributor & Subscribers

See the image below for how security is set up for LogReader Agent:

ReplicationAgent_Security

In the window above, the top section shows the user context under which the agent runs on the Distributor. Since it also needs to connect to the Publisher, the bottom section shows the account used to connect to the Publisher.

To make things easier, we have two options:

By Impersonating the process account:
1. We could reuse the account mentioned in the top section, instead of giving a new account.
Use the following Sql Server Login:
1. This new account needs to be a Sql Server Login (and not a Windows or domain account).

Resolution:

The requirement under the second option above (shown in red in the original post) was the root of the problem.

Someone selected “Use the following Sql Server Login” to “Connect to the Publisher” and provided a Domain account, rather than a Sql Server Login.

Hence the error message :

Attempting to use an NT account name with SQL Server Authentication.

Either provide a correct Sql Login or choose “By Impersonating the process account”.

More Info: When you “impersonate process account” it uses the same account under which the LogReader agent runs on Distributor — hence called Process Account
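The same check and fix could also be done from T-SQL. This is only a rough sketch under assumptions: SampleDB stands in for the publication database, and CONTOSO\repl_logreader and repl_sql_login are hypothetical account names. The procedures sp_helplogreader_agent and sp_changelogreader_agent are run at the Publisher, in the publication database.

```sql
--
-- Hypothetical names: SampleDB, CONTOSO\repl_logreader, repl_sql_login
--
-- 1. At the Publisher: what kind of principal is the account?
--    A domain account shows up as WINDOWS_LOGIN, not SQL_LOGIN.
SELECT name, type_desc
FROM sys.server_principals
WHERE name IN (N'CONTOSO\repl_logreader', N'repl_sql_login');
GO

-- 2. In the publication database: show the LogReader Agent security settings
--    (publisher_security_mode: 0 = Sql Server Authentication, 1 = Windows Authentication)
USE [SampleDB];
GO
EXEC sp_helplogreader_agent;
GO

-- 3. Equivalent of "By Impersonating the process account":
--    connect to the Publisher using Windows Authentication
EXEC sp_changelogreader_agent @publisher_security_mode = 1;
GO
```

After a change like this, the LogReader Agent job most likely needs to be stopped and restarted before the new setting is picked up.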

Hope this helps,
_Sqltimes

Sql Server: The specified disk or volume is managed by the Microsoft Failover cluster component. The disk must be in the cluster maintenance mode and the cluster resource status must be online to perform this operation

Posted in DBA, Error Messages, High Availability, Operating System, Sql Clustering, Sql Server 2008, Sql Server 2008 R2, Sql Server 2012, Sql Server 2014, Sql Server 2016, Tips and Techniques, Virtual Machines on June 26, 2018

Interesting one today:

Earlier, when performing some cluster maintenance work, this error popped-up on the screen.

The specified disk or volume is managed by the Microsoft Failover cluster 
component. The disk must be in the cluster maintenance mode and the 
cluster resource status must be online to perform this operation

This error popped up when attempting to format a new LUN with a 64K allocation unit size. Since the LUN/drive was already added to the cluster, format changes could not be made.

Resolution:

As the verbose error message suggests, put this particular disk into “Maintenance Mode”, then perform the formatting steps.

In Failover Cluster Manager, go to Storage > Disks, identify the particular disk, right click on it and go to More Actions >> Turn On Maintenance Mode.

Cluster_LUN_Error

Once the disk is in maintenance mode, the Status column shows Online (Maintenance Mode).

Now we are free to format the disk. Go to Computer Management >> Storage >> Disk Management, right click on the individual disk and choose Format.

FormatClusterLUN

Now formatting works. Once it completes, go back to Failover Cluster Manager and take the disk out of Maintenance Mode.

Hope this helps,
_Sqltimes

Sql Server Cluster Set up Error : Network Binding Error. The domain network is not the first bound network. This will cause domain operations to run slowly and can cause timeouts that result in failures. Use the Windows network advanced configurations to change the binding order

Posted in DBA, Error Messages, High Availability, Operating System, Sql Clustering, Sql Server 2008, Sql Server 2008 R2, Sql Server 2012, Sql Server 2014, Sql Server 2016, Tips and Techniques, Virtual Machines on August 12, 2017

Interesting one today:

Earlier this month, during Sql Server Cluster set up on a new set of VMs, we ran into this interesting warning message.

Network binding Order generated a warning.

The domain network is not the first bound network. This will cause domain operations to run slowly and can cause timeouts that result in failures. Use the Windows network advanced configurations to change the binding order.

Cluster Setup Network Binding Order

Upon further investigation, it became clear that the NIC that connects to the Domain network was not given the highest priority, which is what a Sql Cluster needs.

Resolution

In Clustered environments, it is recommended to have the network interfaces properly ordered for maximum efficiency.

Go to “Network connections” and open Advanced Settings. See the image below:

Network Connection – Advanced Settings

In the resultant window, under Adapters and Bindings tab, make sure the network interfaces are ordered according to the recommendation. Domain network needs to be on the top, then Heartbeat Network and Remote Access Connections. See the image below, for the recommended order.

Network Binding Proper Order

After saving the new order, go back to “Install Failover Cluster Rules” and re-run the checks. This blog has more info on the rest of the cluster set up, if interested.

Hope this helps,
_Sqltimes

Relationship between TARGET_RECOVERY_TIME & Recovery Interval

Posted in Cool Script, DBA, DBA Interview, High Availability, New Features, Performance Improvement, Sql Server 2012, Sql Server 2014, Sql Server 2016, Technical Documentation, Tips and Techniques, TSQL on July 15, 2017

In a recent post, we saw details on one of the advanced settings available starting with Sql Server 2012, called TARGET_RECOVERY_TIME. When set, it interacts with the server-level setting called ‘recovery interval‘.

One is a server-level setting (recovery interval) and the other is a database-level setting (TARGET_RECOVERY_TIME). So we need to understand how the two interact when both apply to the same database.

By default, recovery interval is set to 0, which means a target recovery time of about 1 minute. In practice, this kicks off an automatic CHECKPOINT roughly every minute to flush dirty pages to disk.

Note: 1 minute is just a general guideline; the actual interval depends on the amount of traffic on the database system. Higher traffic systems have a lot of transactions each second, so there are more dirty pages, and the automatic CHECKPOINT kicks off more frequently than once a minute.

On Sql Server 2012 and 2014, both TARGET_RECOVERY_TIME & recovery interval are set to 0 by default, so automatic CHECKPOINTs occur approximately every 1 minute. Starting with Sql Server 2016, newly created databases default to TARGET_RECOVERY_TIME = 60 seconds (indirect checkpoints), while databases upgraded from older versions keep 0.

Recovery interval could be set to 2 or 3 (minutes) or some higher number to allow longer recovery times after a crash. Sql Server then waits longer before flushing dirty pages to disk, which results in a longer recovery after a crash, as there are more transactions to roll forward & roll back.

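--
-- Set recovery interval on Sql Server 2012
--
-- Check the current value
SELECT *
FROM sys.configurations
WHERE name = 'recovery interval (min)'
GO

-- 'recovery interval (min)' is an advanced option
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
GO

-- Set recovery interval to 2 minutes and apply the change
EXEC sp_configure 'recovery interval (min)', 2
RECONFIGURE
GO

EXEC sp_configure 'show advanced options', 0
RECONFIGURE
GO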

TARGET_RECOVERY_TIME could be set in seconds or minutes. It sets the upper bound on recovery time for that database after a crash. For that database, this setting overrides the server-level setting (recovery interval).

--
--  Change TARGET_RECOVERY_TIME
--
ALTER DATABASE SampleDB
SET TARGET_RECOVERY_TIME = 30 SECONDS;
GO

Relationship between the two settings:

When TARGET_RECOVERY_TIME is set to a value greater than 0, it overrides the recovery interval setting for that database.
Similarly, when TARGET_RECOVERY_TIME is 0, automatic checkpoints are used and their frequency is governed by recovery interval.

TARGET_RECOVERY_TIME	recovery interval	Checkpoint Used
0	0	Automatic CHECKPOINT is used, where target recovery interval is 1 minute
0	>0	Automatic CHECKPOINT is used. The target recovery interval comes from the (>0) number.
>0	N/A	Indirect checkpoint. Setting is based on TARGET_RECOVERY_TIME
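To see which row of the table each database falls into, something like the query below could help. It is a sketch only: the labels in the checkpoint_used column are my own wording, not anything Sql Server returns. It reads target_recovery_time_in_seconds from sys.databases and the server-level value from sys.configurations.

```sql
--
-- Which checkpoint type does each database end up with?
--
SELECT d.name                            AS database_name
     , d.target_recovery_time_in_seconds AS target_recovery_time_sec
     , c.value_in_use                    AS recovery_interval_min
     , CASE
           WHEN d.target_recovery_time_in_seconds > 0
               THEN 'Indirect checkpoint'
           WHEN CONVERT(int, c.value_in_use) = 0
               THEN 'Automatic checkpoint (~1 minute target)'
           ELSE 'Automatic checkpoint (recovery interval)'
       END                               AS checkpoint_used
FROM sys.databases AS d
CROSS JOIN sys.configurations AS c
WHERE c.name = 'recovery interval (min)'
ORDER BY d.name;
GO
```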

Hope this helps,
_Sqltimes

Sql Server Advanced Options: TARGET_RECOVERY_TIME & Checkpoint

Posted in DBA, DBA Interview, High Availability, New Features, Operating System, Performance Improvement, Sql Server 2012, Sql Server 2014, Sql Server 2016, Technical Documentation, Tips and Techniques, TSQL on July 1, 2017

Interesting one today:

A few months ago, we saw a post on CHECKPOINT and its counterparts. Now, let’s dig a little deeper into it.

Starting Sql Server 2012, we have a new advanced option called TARGET_RECOVERY_TIME; It helps with setting Indirect Checkpoints to change the recovery time after a crash.

Automatic CheckPoints are the default. A system-level setting (in sys.configurations) decides how frequently the dirty pages in the buffer pool are written to disk. Usually it is once every minute (a generalization), but there are some nuances to it, as the actual interval varies with workload. This helps reduce the time it takes to bring the system back to a working state after a crash.

With the new Indirect Checkpoints, a database-level setting, we could configure custom checkpoint settings to enable faster & more predictable recovery times after a crash.

TARGET_Recovery_Time_Setting

Context:

When we UPDATE/INSERT data in Sql Server, the change is made to pages in the buffer pool (and recorded in the transaction log); the data files on disk are not updated right away. Only when an (automatic/default) CHECKPOINT occurs are all the dirty pages in the buffer pool written to disk. This occurs at about one minute intervals (it varies based on workload, but 1 minute is a good general guideline). So, approximately every minute, you’ll see a HUGE spike in I/O to the MDF/NDF files as all the dirty pages are written to disk. Then it waits another ~60 seconds for the next CHECKPOINT to write all the dirty pages to disk again. So, you see a pattern here.

The entire dirty page workload is written to disk in one shot, followed by a wait (sitting idle) for the next 60 seconds, and then the next workload is again written to disk in one shot. As you can see, the I/O subsystem is far more active during these CHECKPOINT periods than at any time in between.

If your storage is designed to handle, let’s say, 100 MB/sec and you have 1000 MB worth of dirty pages since the last checkpoint (1 min), it takes the storage subsystem at least 10 seconds (1000 MB / 100 MB/sec) to fully process the workload. This results in unnecessary spikes in I/O metrics.

Avg. Disk Write Queue Length will spike to abnormal levels
Disk Writes/sec will spike to the full capacity of the LUN
Avg. Disk Sec/Transfer will also spike, showing bad performance

See the image below, where it shows the Maximum reading on the amount of dirty pages written to disk.

Default_CheckPoint_1min_interval

This presents an incorrect picture that there is something wrong with your storage, even though the storage stays idle for the remaining 45 to 50 seconds of the minute.
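One rough way to watch this pattern from inside Sql Server is to sample the Buffer Manager counters twice and compare. The sketch below assumes a 10 second sampling window (an arbitrary choice). The per-second counters in sys.dm_os_performance_counters are cumulative, so the difference between two samples is divided by the window length.

```sql
--
-- Average pages written by checkpoint vs. background writer over 10 seconds
--
DECLARE @before TABLE (counter_name nvarchar(128), cntr_value bigint);

-- First sample
INSERT INTO @before (counter_name, cntr_value)
SELECT RTRIM(counter_name), cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Buffer Manager%'
  AND counter_name IN ('Checkpoint pages/sec', 'Background writer pages/sec');

WAITFOR DELAY '00:00:10';

-- Second sample, compared with the first
SELECT RTRIM(p.counter_name)                AS counter_name
     , (p.cntr_value - b.cntr_value) / 10.0 AS avg_pages_per_sec
FROM sys.dm_os_performance_counters AS p
INNER JOIN @before AS b
        ON b.counter_name = RTRIM(p.counter_name)
WHERE p.object_name LIKE '%Buffer Manager%';
GO
```

With automatic checkpoints, most of the writes should show under 'Checkpoint pages/sec'; with indirect checkpoints, expect more of them under 'Background writer pages/sec'.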

--
--	Change TARGET_RECOVERY_TIME
--
ALTER DATABASE SampleDB
SET TARGET_RECOVERY_TIME = 30 SECONDS;
GO

Advantages

I/O bottlenecks:
- Now, if we could write more frequently, the same workload could be handled without triggering any false positive metrics (while also reducing recovery time after a crash).
- In our example above, if the same 1000 MB per minute dirty page workload is written 2 times within a minute, we’ll have a ~500 MB workload every 30 seconds.
- Now, the same storage metrics will show a much better picture.
- Then we could tune the storage design to the requirements of the Sql Server dirty page workload.

TARGET_Recovery_Time_30seconds

Indirect Checkpoints enable you to control recovery time after a crash to fit within your business requirements

Caution:

Adjustment Period:
- When we first change this setting, Sql Server adjusts the amount of dirty pages it keeps in memory. Initially, this may result in some pages being sent to disk immediately to achieve the new recovery time.
- During this adjustment period, we’ll see extra disk activity.
- Once the adjustment period is completed, Sql Server will go back to a predictable schedule in its attempts to send pages to disk.
- So, do not be alarmed during this adjustment period.
Preparation:
- For OLTP workloads, this setting could sometimes result in performance degradation. It looks like the background writer, which writes dirty pages to disk, increases the total write workload for the server instance.
- If different databases have different settings, the instance ends up doing more work, which might result in performance degradation.
- So, this setting needs to be tested in a performance environment before enabling it in Production environments.

In the next post, we’ll see the interaction between ‘recovery interval‘ & TARGET_RECOVERY_TIME setting.

Hope this helps,
_Sqltimes

Sql Server: How to increase Replication Transaction Retention & History Retention

Posted in DBA, DBA Interview, High Availability, Replication, Sql Server 2008, Sql Server 2008 R2, Sql Server 2012, Sql Server 2014, Sql Server 2016, Technical Documentation, Tips and Techniques on June 10, 2017

Quick one today:

In replication, the transactions that come into the system are stored in the distribution database first; from there they are replicated to each Subscriber. The data stays there for a configurable number of hours, determined by the ‘Transaction Retention‘ setting on the Distributor.

Similarly, as the data is replicated, history entries are logged in the distribution database. These entries also have a retention policy, which could be set using the ‘History Retention‘ setting.

On the Distributor instance, go to Replication >> Right click >> Distributor Properties

Distributor Retention Properties

Under the General tab, we see these settings. Change them as needed to achieve longer/shorter retention of both data & history log.

Retention Policy
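The same settings could also be viewed and changed with T-SQL at the Distributor. A minimal sketch, assuming the distribution database is named distribution (the usual default) and using made-up values of 96 and 72 hours:

```sql
--
-- Assumed distribution database name: distribution
--
-- Show current values (min_distretention, max_distretention, history_retention; in hours)
EXEC sp_helpdistributiondb @database = N'distribution';
GO

-- Transaction Retention (upper bound): keep transactions for up to 96 hours
EXEC sp_changedistributiondb @database = N'distribution'
                           , @property = N'max_distretention'
                           , @value    = N'96';
GO

-- History Retention: keep agent history for 72 hours
EXEC sp_changedistributiondb @database = N'distribution'
                           , @property = N'history_retention'
                           , @value    = N'72';
GO
```

Longer retention means a bigger distribution database, so the numbers above are placeholders to adjust, not recommendations.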

Hope this helps,
_Sqltimes

Sql Server Cluster : How to add new drive to Sql Cluster Dependency

Posted in Cool Script, DBA, DBA Interview, Error Messages, High Availability, Operating System, Sql Clustering, Sql Server 2008, Sql Server 2008 R2, Sql Server 2012, Sql Server 2014, Sql Server 2016, Technical Documentation, Tips and Techniques, TSQL, Virtual Machines on March 25, 2017

Interesting one today:

On one of our production machines, we recently added a new LUN to a SQL cluster. A task like this is a team effort: sysadmins perform some steps and DBAs carry out the rest. In this article, the main focus is on the steps after the LUN is added to the OS & Sql Cluster by the sysadmins. For context, we’ll start with the high level steps before going into details.

Sysadmins steps

Add new storage to the machine/OS as an available storage
Format the available drive with appropriate settings (cluster size) and add it as a new drive
Make drive available to the Cluster using “Add Disk” screen in FailOver Cluster Management tool.

DBAs steps

Add available storage to Sql Cluster
Configure dependency (and check the report before & after)
Add data file on the new cluster storage drive

Here we’ll cover the DBA steps in detail:

Some of these steps were covered under a different recent article as part of dealing with an error message, but here we’ll cover it as a task by itself (which it is).

Add New Storage

Once sysadmins have made the new storage available to the OS Cluster as ‘available storage’, it needs to be added as a new storage location to the SQL Cluster.

In FailOver cluster manager, go to Sql Server Resource Group for this SQL Cluster and right click for detailed options and choose “Add Storage” (see image below)

sqlcluster_addnewstorage_to_os_cluster

Once successful, go to Storage\Disks in FailOver Cluster Manager to confirm the availability. See image below:

sqlcluster_addnewdrive

Configure Dependency

Adding the storage is an important step, and an equally important step is adding the new drive to the Sql Cluster Dependency Chain. The Dependency Chain informs Sql Server “how to act” when any resource in the Cluster becomes unavailable. Some resources automatically trigger cluster failover to another node; some resources do not. This decision is made based on the configuration in the Dependency Chain.

Example:

Critical: Data drive/LUN that has database files is critical for optimal availability of the Sql Cluster. So, if it becomes unavailable, failing over to other available nodes is imperative to keep the cluster available.

Non-Critical: In some scenarios, Sql Server Agent is not considered as Critical. So if it stops for some reason, Cluster will make multiple attempts to start it on the same node, but may not necessarily cause failover.

This is a business decision. All these “response actions” will be configured in Cluster settings.

Now, check the dependency report (before); We can see that new drive exists in Cluster, but is not yet added to the Dependency Chain.

To Configure Dependency Chain, go to the Sql Server Resource Group under Roles in FailOver Cluster Manager. See the image below for clarity:

Then go to the bottom section for this Resource Group, where all the individual resources that are part of this Resource Group are displayed.

Under “Other Resources“, right click on Sql Server Resource and choose properties.

sqlcluster_addnewstorage_add_to_dependency

In the “Sql Server Properties” window, we can see the existing resources already added to dependency chain logic.

Now, go to the end of the logic list and choose “AND” for condition and pick the new Cluster Storage to be included. See image below for clarity:

After saving the settings, regenerate the Dependency Chain report. Now, we’ll see the new drive as part of the logic.

sqlcluster_dependencyreport_after

Add Database Data File to New Cluster Storage

Now that the new drive is ready, we could easily add a new data file to the new location.

--
-- Add data file to new storage location
--
USE [master]
GO
ALTER DATABASE [SampleDB]
ADD FILE
	(
		  NAME 			= 	N'SampleDB_Data3'
		, FILENAME 		= 	N'U:\MSSQL\Data\SampleDB_Data3.NDF'
		, SIZE 			= 	3500 GB
		, FILEGROWTH 	= 	100 GB
		, MAXSIZE 		= 	3900 GB
	)
TO FILEGROUP [PRIMARY]
GO
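A couple of quick checks could go along with this step. This is a sketch, assuming the SampleDB name used above. The first query lists the shared drives the clustered instance can use (worth running before the ALTER DATABASE, to see if the new drive shows up); the second confirms where the new file landed.

```sql
--
-- Shared drives available to this clustered instance
--
SELECT DriveName
FROM sys.dm_io_cluster_shared_drives;
GO

--
-- Confirm the new data file and its location
-- (size is stored in 8 KB pages; cast to bigint to avoid overflow on large files)
--
SELECT name
     , physical_name
     , type_desc
     , CAST(size AS bigint) * 8 / 1024 AS size_mb
FROM sys.master_files
WHERE database_id = DB_ID(N'SampleDB');
GO
```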

Hope this helps,
_Sqltimes

Sql Server : What is MSDTC and is it required?

Posted in DBA, DBA Interview, High Availability, New Features, Operating System, Performance Improvement, Replication, Sql Clustering, Sql Server 2008, Sql Server 2008 R2, Sql Server 2012, Sql Server 2014, Sql Server 2016, Technical Documentation, Tips and Techniques on January 28, 2017

Interesting one today:

MSDTC is one of the popular software components present on all Windows systems. It is one of the Windows Operating System components that Sql Server relies on to perform some crucial tasks (when needed).

What does it do?

MSDTC, Microsoft Distributed Transaction Coordinator, is essentially, as the name suggests, a coordinator/manager that handles transactions distributed over multiple machines. Let’s say we start a transaction where one of the steps includes querying data from a different Sql Server instance on a different physical machine. MSDTC comes into action for these specific tasks that need transaction coordination across different physical machines. It coordinates the part of the work that runs on the remote machine as part of the same transaction. In this process, if any issue on the remote machine results in a rollback, MSDTC makes sure the original transaction on this machine also rolls back safely.

How does it do it?

MSDTC comes with the necessary Operating System controls and memory structures to carry out these operations independent of the Sql instances, while keeping the integrity of the transaction across the multiple physical Sql machines. In other words, it implements the complete two-phase distributed commit protocol and the recovery of distributed transactions.

Where does Sql Server use it?

The key point here is that these need to be Sql Instances on different physical machines. Queries that request data across different instances on the same physical box do not go through MSDTC.

MSDTC is used by query activities like

Linked Servers
OPENROWSET
OPENQUERY
OPENDATASOURCE
RPC (Remote Procedure Calls)
Transactions started with BEGIN DISTRIBUTED TRANSACTION
etc…

So, whenever queries that use the above techniques run as part of a distributed transaction, they rely on MSDTC to carry out the operation while maintaining transaction integrity.
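As an illustration, a distributed transaction over a linked server could look like the sketch below. All the names are made up for the example: the linked server [RemoteSqlServer], the database SampleDB and the table dbo.Orders with its OrderStatus and OrderID columns.

```sql
--
-- Hypothetical names: [RemoteSqlServer], SampleDB, dbo.Orders
--
SET XACT_ABORT ON;   -- roll back the whole transaction if any statement fails
GO

BEGIN DISTRIBUTED TRANSACTION;

    -- Local change
    UPDATE dbo.Orders
    SET    OrderStatus = 'Shipped'
    WHERE  OrderID = 1001;

    -- Remote change through the linked server; MSDTC enlists it in the same transaction
    UPDATE [RemoteSqlServer].SampleDB.dbo.Orders
    SET    OrderStatus = 'Shipped'
    WHERE  OrderID = 1001;

COMMIT TRANSACTION;
GO
```

Either both updates commit or both roll back; MSDTC is the piece that coordinates that outcome across the two instances.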

Who else uses it?

MSDTC is an Operating System resource that is also used by applications other than Sql Server to perform distributed transaction activities, such as XA (eXtended Architecture) applications.

Is MSDTC required?

MSDTC is not required for Sql Server installation or operation. If you are only going to use the Database Engine, then it is not required or used. If your Sql Server instance uses any of the above mentioned query techniques (Linked Server, OPENQUERY, etc.), or SSIS or Workstation Components, then MSDTC is required.

If you are installing only the Database Engine, the MSDTC cluster resource is not required. If you are installing the Database Engine and SSIS, Workstation Components, or if you will use distributed transactions, you must install MSDTC. Note that MSDTC is not required for Analysis Services-only instances.

What about Sql Cluster?

The same rules as above apply to Sql Clusters as well, with one additional rule. If you have two instances on the same machine (that are clustered across different physical machines), then you’ll need MSDTC, since the Cluster could fail over to a remote machine at any time.

Let’s take an example:

Let’s say Instance1 is on physical machines A & B, with B as active node. Instance2 is on machines B & C, with B as active node. A query going from Instance1 to Instance2 will need MSDTC (even if both the instances are active on the same physical machine B at that given point in time.).

This is because there is no guarantee that they will remain on the same physical machine at any given time. They might fail over to other machines, resulting in the instances being on physically different machines. So MSDTC is required (when distributed operations are performed).

Also, recent Sql Server versions do not require MSDTC during Sql Server installation.

Other points in a Clustered Environment

We could have multiple instances of MSDTC as different clustered resources (along with the default MSDTC resource).

In a scenario with multiple MSDTC instances, we could configure each Sql Cluster resource to have a dedicated MSDTC instance. If such a mapping does not exist, it automatically falls back to the default MSDTC resource.

Hope this helps,
_Sqltimes

Sql Server Replication : Source: MSSQLServer, Error number: 20598

Posted in DBA, DBA Interview, Error Messages, High Availability, Replication, Sql Server 2008, Sql Server 2008 R2, Sql Server 2012, Sql Server 2014, Sql Server 2016, Technical Documentation, Tips and Techniques, TSQL on January 14, 2017

Interesting article today on troubleshooting replication errors.

A few weeks ago, in production, we received an alert on replication failures. Upon further inspection, the error looked like this:

Command attempted:
if @@trancount > 0 rollback tran
(Transaction sequence number: 0x001031C200001B06000700000000, Command ID: 1)

Error messages:
The row was not found at the Subscriber when applying the replicated command. (Source: MSSQLServer, Error number: 20598)
Get help: http://help/20598
The row was not found at the Subscriber when applying the replicated command. (Source: MSSQLServer, Error number: 20598)
Get help: http://help/20598

Identify Root Cause:

To understand the course of action, we need to first understand the underlying issues. Go to Replication Monitor >> Open Details Window on the subscription that has errors. Go to ‘Distributor to Subscriber’ tab for more details. See the image below:

replication_error_20598

Now we see that replication is attempting to perform an action on the Subscriber, but the row does not exist. So, let’s find out more.

Find the exact command that is being replicated (or executed on Subscriber as part of replication) that throws this error. Use replication procedure sp_browsereplcmds for that.

Query the Distribution agent ID from dbo.MSdistribution_agents and use it in the query below.
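The lookup could be done along these lines. It is a sketch that assumes the distribution database is named distribution (the usual default); the column aliases are only for readability.

```sql
--
-- Assumed distribution database name: distribution
--
USE [distribution];
GO

-- Distribution Agent IDs (use as @agent_id)
SELECT id AS agent_id
     , name
     , publisher_db
     , publication
     , subscriber_db
FROM dbo.MSdistribution_agents;
GO

-- Publisher database IDs (use as @publisher_database_id)
SELECT id AS publisher_database_id
     , publisher_db
FROM dbo.MSpublisher_databases;
GO
```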

--
-- Uncover the replication command throwing error
--
EXEC sp_browsereplcmds @xact_seqno_start = '0x001031C200001A620004000000000000'
                     , @xact_seqno_end = '0x001031C200001A620004000000000000'
                     , @agent_id = 49
                     , @publisher_database_id = 3
GO

You’ll see something like this:

replication_error_20598_investigation

Under the command column, we’ll see the exact command that is running into this error.

--
--  Error 20598 occurring in
--
{CALL [mtx_rpldel_ReportCriterion] (908236,71357,250,-1)}

Now, let’s go to that stored procedure ‘mtx_rpldel_ReportCriterion’ and see what tables it manipulates. In my scenario, the table ReportCriterion does not have the record with ID = 908236.

Resolution

Once we understand the root cause, we have a few options.

Data Integrity: Looks like we have synchronization issues between Publisher and Subscriber. If it is a non-production environment, or an environment where reinitializing is an option, then we could take that route to sync up the data first.
1. Once data integrity issues are resolved, all subsequent replication commands would be successful.
Manual fix: Manually insert the missing record at the Subscriber and then allow replication to perform its operations on the new record.
1. With this option, the more missing records we uncover, the more manual work is required. It’s not ideal, but it is a workaround to get things going again.
Ignore, for now: In some situations, until a further solution is identified, we may need to ignore this one record and move forward with the rest of the replication commands.
1. Take necessary precautions to make sure there are no more such missing records, or gather a list of all the missing ones.
2. Configure replication to ignore error 20598 using the skiperrors parameter. There are a couple of ways to achieve this; here we’ll look at one.
3. Go to the Agent Profile for this particular Distribution Agent. One of the profiles allows us to skip certain errors. See the image below.
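The available profiles could also be listed with T-SQL at the Distributor. A small sketch: the profile_id of 4 below is only a placeholder, so substitute the id returned by the first query. On a default install, the built-in profile named "Continue on data consistency errors." is the one that skips errors 2601, 2627 and 20598.

```sql
--
-- List Distribution Agent profiles (agent_type 3 = Distribution Agent)
--
EXEC sp_help_agent_profile @agent_type = 3;
GO

--
-- Show the parameters (including SkipErrors, if set) of one profile
-- 4 is a placeholder; use a profile_id from the result above
--
EXEC sp_help_agent_parameter @profile_id = 4;
GO
```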

For more information, please refer to Microsoft Support article on similar issue.

Hope this helps,
_Sqltimes

Windows Cluster : How to generate Cluster Logs in Windows Server 2012 (powershell)

Posted in Cool Script, DBA, DBA Interview, Error Messages, High Availability, New Features, Operating System, Sql Clustering, Sql Server 2008, Sql Server 2008 R2, Sql Server 2012, Sql Server 2014, Sql Server 2016, Technical Documentation, Tips and Techniques on December 31, 2016

Quick one today:

In a previous post, we covered one of the techniques used to generate a text version of the cluster log files using the command prompt. Today, we’ll cover another technique, a more common one going forward: using PowerShell.

Context:

In Windows Server 2012, it looks like the cluster.exe command prompt interface is not installed by default when you install FailOver Cluster.

failovercluster_commandlineinterface

PowerShell:

So, we’ll use PowerShell cmdlets to generate these cluster logs.

#
#  Generate cluster log from SampleCluster and save in temp folder.
#
Get-ClusterLog -Cluster SampleCluster -Destination "c:\temp\"

When you run it in a PowerShell window, the response looks something like this:

powershell_clusterlog

Hope this helps,
_Sqltimes
