Archive for November, 2013

Question for everyone:

What steps could we take to optimize table copy from one database to another?

A few days ago, I was running a large table copy from one database to another. Something like this:

SELECT Col1, Col2 INTO OtherDB.dbo.TableCopy
FROM ThisDB.dbo.Table

This is a huge table: close to 6 billion records and about 300 GB in size. So, I wanted to find a way to optimize such a huge data transfer.

Some of the things considered:

  • Table hints / query hints to take an exclusive lock on the table and use parallelism, but it wasn’t any faster.
  • Data compression to reduce the I/O. It was a bit faster, but still took a long time.
  • Multiple data files on separate drives. Made it faster, but not fast enough.
  • Made sure the MDF/NDF/LDF files have enough space so there is no AUTO_GROWTH.
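For reference, the hinted variant of the copy looked roughly like this (the TABLOCKX hint and the MAXDOP value are illustrative; as noted above, they did not make it noticeably faster):

```sql
-- Illustrative: exclusive table lock on the source plus a parallelism hint
SELECT Col1, Col2
INTO   OtherDB.dbo.TableCopy
FROM   ThisDB.dbo.Table WITH (TABLOCKX)
OPTION (MAXDOP 8);
```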

This is my second question: when a backup of the entire database can complete in 60 minutes, why does copying just one table take longer than that? This is a lab environment, so there is no external traffic.

Any thoughts?


Read Full Post »

A few days ago, I ran into several interesting issues, one of which was documented in a previous post here. Today, I’m going to write about another issue that occurred under similar circumstances: THREADPOOL waits. Sql Server has its own scheduling mechanism that is part of its ‘operating system’, called SQLOS. It is responsible for scheduling and synchronization of threads, along with several other things.

Upon successful authentication of a login, Sql Server looks for an available worker thread to assign to the new login/request. When all the available threads are taken, the new connection accrues THREADPOOL wait while Sql Server tries to find an available thread. In simpler terms, Sql Server cannot issue a thread for new connections. At that point, it makes an entry in the Sql Server error log with the following message:

New queries assigned to process on Node 0 have not been picked up by a worker thread 
in the last 60 seconds. Blocking or long-running queries can contribute to this 
condition, and may degrade client response time. Use the "max worker threads" 
configuration option to increase number of allowable threads, or optimize current 
running queries. 

As you can see, it could be because of several things. If ‘max worker threads’ is reached, Sql Server cannot create any more new threads. We either need to increase that number or find out why the existing connections are not freeing up their threads. Maybe there is blocking that is preventing any progress. Using a DAC connection helps in these situations to resolve (or KILL) the offending processes (SPIDs) and free up resources. DAC has its own scheduler, memory node, etc., so a connection is usually successful even in these situations.
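As a sketch, once connected over DAC (for example, sqlcmd -S ServerName -A), a query like this can surface the blocking sessions:

```sql
-- Find sessions that are blocked and who is blocking them
SELECT wt.session_id, wt.wait_type, wt.wait_duration_ms, wt.blocking_session_id
FROM   sys.dm_os_waiting_tasks AS wt
WHERE  wt.blocking_session_id IS NOT NULL;

-- KILL 57;  -- then terminate the offending SPID (57 is a placeholder)
```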

If you instead want to increase ‘max worker threads’, we need to keep other factors in mind. By default, this setting is 0 (zero), which lets Sql Server decide how many threads to maintain: it is smart enough to know when to create new threads and destroy old (inactive) ones. So, leaving it at the default value (0) is not a bad idea. But if you constantly run into the issues I mentioned in this previous post, re-configuring it to a more appropriate value may be fruitful. Microsoft suggests the following values as a guideline:

Number of CPUs     32-bit computer    64-bit computer
<= 4 processors    256                512
8 processors       288                576
16 processors      352                704
32 processors      480                960

Keep a close watch on the production system and adjust this setting based on what works for your environment. There is no one setting that works for all.
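If you do decide to change it, the setting is configured with sp_configure; a sketch using the 16-processor, 64-bit value from the table above:

```sql
-- 'max worker threads' is an advanced option
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- 704 matches 16 processors on a 64-bit computer (adjust per the table above)
EXEC sp_configure 'max worker threads', 704;
RECONFIGURE;
```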

Hope this helps,


Read Full Post »

A few days ago, I was setting up virtual machines on one of my lab servers and I ran into this interesting error.

It’s a Windows Server 2008 R2

 "virtual machine could not start because the hypervisor is not running".

This is a Dell PowerEdge T610 and I’m trying to build multiple virtual machines using Hyper-V tools natively available from within Windows Server 2008 R2. The error message is a bit nebulous. It seems like there is a hypervisor service that needs to be started. But that’s not it.

As soon as I saw that message, I realized that when I built this machine, I had made sure to ‘Disable’ the ‘Virtualization Technology’ setting in the BIOS for the processors. Now it’s coming back to bite me !! So, I shut down the machine; powered it back up; went into the BIOS settings; there, under processor settings, is an option for ‘Virtualization Technology’. Enabled this setting and brought the machine back up. Voila !! Now the virtual machines are running fine.

For other hardware machines (HP, etc), the setting could be something different. But once you go into BIOS you can easily figure it out.

Some more help if anyone is interested.

Hope this helps,


Read Full Post »

A few days ago, I was running a large table copy from one database to another. Something like this:

SELECT Col1, Col2 INTO OtherDB.dbo.TableCopy
FROM ThisDB.dbo.Table

This is a huge table: close to 6 billion records and about 300 GB in size. As the copy progressed, I saw a lot of I/O going from one drive to the other. But every few minutes, I/O dropped to zero, stayed there for a few minutes, and then ramped back up to higher numbers. After a few more minutes, the same pattern repeated.

On further digging in, I noticed that every time I/O dropped, the SPID was waiting on PREEMPTIVE_OS_WRITEFILEGATHER. Now it makes sense. PREEMPTIVE_OS_WRITEFILEGATHER indicates an Auto Growth event. As the table copy happens, the database files run out of space and Auto Growth kicks in to increase the file sizes. Since this is a large database, the 10% increase is close to 100 GB, and that takes a lot of time. We have a couple of options here:

  • Make sure the database files are pre-sized correctly, so the chances of auto growth are minimized. Assigning the size ahead of time lets Sql Server get contiguous space on disk, which in turn makes it faster.
  • Reduce auto growth from a percentage to a fixed size, something like 1 GB each time (rather than 10%). But I do not recommend this: frequent small growths create more fragmentation and, in my opinion, cause more harm than good. Still, this may be a good option for some environments.
  • Enable Instant File Initialization. This is an amazing feature that I wrote more about in a previous post. It allows Sql Server to skip the step of zeroing out newly allocated disk space before using it.
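The first two options can be done with ALTER DATABASE; a minimal sketch, assuming a logical file name of OtherDB_Data (name and sizes are illustrative):

```sql
-- Pre-size the data file and switch growth from 10% to a fixed increment
ALTER DATABASE OtherDB
MODIFY FILE (NAME = OtherDB_Data, SIZE = 400GB, FILEGROWTH = 1GB);
```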

This PREEMPTIVE_OS_WRITEFILEGATHER wait could occur in other similar situations that require allocation of new space. For example, database restores.
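A quick sketch of the diagnosis described above, assuming the copy runs under session 55 (a placeholder SPID):

```sql
-- Watch what the copy session is currently waiting on
SELECT session_id, status, command, wait_type, wait_time, last_wait_type
FROM   sys.dm_exec_requests
WHERE  session_id = 55;  -- placeholder SPID
```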

Hope this helps,


Read Full Post »

Quick one today:

To insert the output of a stored procedure into a temporary table, I use the following steps.

  • Create a temporary table with necessary table structure
  • Execute the Stored procedure while inserting its output into this table.


-- Create temporary table matching sp_who2's output
CREATE TABLE #who2 (
      SPID        varchar(255)
    , Status      varchar(255)
    , Login       varchar(255)
    , HostName    varchar(255)
    , BlkBy       varchar(255)
    , DBName      varchar(255)
    , Command     varchar(255)
    , CPUTime     varchar(255)
    , DiskIO      varchar(255)
    , LastBatch   varchar(255)
    , ProgramName varchar(255)
    , SPID2       varchar(255)
    , RequestID   varchar(255)
);

-- Query stored procedure results into the temporary table
INSERT INTO #who2
EXEC sp_who2;

SELECT * FROM #who2;

Hope this helps,

Read Full Post »

Quick one today.

One thing that I regularly recommend is to optimize TempDB. There are several things we can do to improve the overall performance of every application hitting the database instance just by improving the performance of TempDB, and it does not require changing a single line of application code. Every improvement you make here adds to the performance of every application that hits a database on this instance. A few days ago, I wrote a post about improving performance by correctly balancing the number of data files for TempDB. In this post, I’d like to add some ‘behind the scenes’ understanding of why we need multiple data files.

As you can imagine, when you create a temporary table, it is an actual table created in TempDB. When you have hundreds of connections executing thousands of queries, a lot of temporary tables are created. This generates a lot of activity within TempDB that needs to be effectively managed to facilitate optimal performance. So let’s take one such incident of creating a temporary table.

When a query requests to create a temporary table and store some data (or records) in it, Sql Server tries to allocate some pages. These pages are allocated in extents; an extent is a group of 8 pages. Before it can allocate pages (or extents), it has to perform some metadata checks to see which pages are available and where. Once an extent is available, Sql Server assigns it to the temporary table request and updates the metadata to indicate that the extent is now allocated to a particular process (and its execution context). Here lies the trick: there may be thousands of extents available, but the metadata pages where this tracking happens (the allocation bitmaps, such as PFS) exist only once per data file. So, if you have only one data file, all the temporary table requests fight to update the same allocation bitmaps, resulting in heavy delays (resource contention). When you have multiple data files, each file has its own allocation bitmaps, resulting in (relatively) reduced contention.

The number of data files you create for TempDB is an important decision. Read this article to know more.

As always, make sure the TempDB data files are of the same size.
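For completeness, adding a data file to TempDB looks roughly like this (file name, path, and sizes are illustrative; keep them identical across all the data files):

```sql
-- Add another equally-sized TempDB data file; FILEGROWTH = 0 disables
-- autogrowth so the files stay the same size
ALTER DATABASE tempdb
ADD FILE (NAME = tempdev2,
          FILENAME = 'T:\TempDB\tempdev2.ndf',
          SIZE = 4GB,
          FILEGROWTH = 0);
```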

Hope this helps,

Read Full Post »

Quick one:

As I wrote in one of my previous posts, I am not a fan of running COUNT(*) on any large table. I prefer alternative methods that rely on metadata rather than scanning the table just to get a count. No matter the size of the table, you can get the row count from a metadata view.
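One such metadata-based alternative, assuming the table is DBA.TableName (a sketch; sys.partitions counts can lag slightly on actively changing tables):

```sql
-- Row count from metadata, no table scan
-- index_id 0 = heap, 1 = clustered index
SELECT SUM(p.rows) AS row_count
FROM   sys.partitions AS p
WHERE  p.object_id = OBJECT_ID('DBA.TableName')
  AND  p.index_id IN (0, 1);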

This post is only about COUNT_BIG(*), so:

When you run COUNT(*) on tables that have more than 2,147,483,647 rows (the maximum value of an int), the result overflows the int data type, so you need COUNT_BIG(*), the BIGINT version of COUNT(*).

SELECT COUNT(*) FROM DBA.TableName -- smaller tables

SELECT COUNT_BIG(*) FROM DBA.TableName  -- very large tables

Hope this helps,

Read Full Post »

Quick one today

Sql Server has its own operating system layer that manages several things before commands are sent to the Windows operating system. Sql Server components such as the DB Engine rely on SQLOS for:

  1. Memory Management
  2. Scheduling
  3. Synchronization
  4. Memory Brokers
  5. Exception Handling
  6. Deadlock Detection
  7. Extended Events
  8. Asynchronous I/O

While it performs all of the above tasks and more, Memory Management and Scheduling are its primary responsibilities.
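You can peek at the SQLOS schedulers through a DMV; a quick sketch:

```sql
-- One visible scheduler per logical CPU, plus hidden/internal ones
SELECT scheduler_id, cpu_id, status,
       current_tasks_count, runnable_tasks_count
FROM   sys.dm_os_schedulers;
```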

Hope this helps,

Read Full Post »

Sql Server: Use Query Hints

Quick one today

Query hints are sometimes very useful. They are great when you need them, but you want to be careful in your usage. Test thoroughly before embarking on this path. As MSDN says, “Because the SQL Server query optimizer typically selects the best execution plan for a query, we recommend only using hints as a last resort for experienced developers and database administrators.”

Query hints impact the entire query: they affect the way the whole statement is executed (the query plan, etc.), unlike table hints, which affect only a single table or indexed view. Query hints are specified with the OPTION clause.

Below is an example of using a query hint in a query (the join column and the hint chosen are illustrative):

SELECT AL.ID, AL.Name, A.Activity
FROM dbo.ActivityExtLog AS AL
INNER JOIN dbo.Activity  AS A
    ON A.ID = AL.ActivityID
OPTION (HASH JOIN);
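Another query-hint example, this time with OPTIMIZE FOR (the parameter name and values are illustrative):

```sql
-- Compile the plan for a specific parameter value instead of the sniffed one
DECLARE @Name varchar(50) = 'SomeValue';

SELECT ID, Name, Activity
FROM   dbo.ActivityExtLog
WHERE  Name = @Name
OPTION (OPTIMIZE FOR (@Name = 'SomeOtherValue'));
```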

Hope this helps,

Read Full Post »

On one of our production Sql Servers, we ran into this interesting problem.

"A connection was successfully established with the server, but then an error occurred 
during the login process. (provider: TCP Provider, error: 0 - An existing connection 
was forcibly closed by the remote host.)"

I was attempting to login from SSMS on my laptop using an admin account. So, I RDP’ed into the machine, fired up SSMS, and tried to login with the same admin account. Same error again. Since login attempts from the same machine use ‘Shared Memory’, I thought I’d have better luck, but no success.

Upon further analysis, I saw that memory usage was at 99.3%. That’s a big red flag. Surprisingly, our monitoring system did not capture this memory uptick and did not raise any alert emails when it was consistently higher than 90%. Made a mental note to talk to the operations team. Moving on..

I recollected that ‘max worker threads’ on this Sql Server was configured as ‘0’ (zero), i.e. unlimited threads a.k.a. unlimited connections. But in reality, unlimited is not possible: Sql Server continues to allow connections until it runs out of resources to allocate for new ones. This is one such situation. It is one thing to know this in theory and another to experience the issue. I had never run into this issue before; very interesting.

When a request comes to Sql Server it goes through several validation steps before it is granted connection into the instance. Once it successfully goes through all the tests, necessary resources are allocated and connection is established. Every connection gets a worker thread assigned to it to perform the task.

As part of it, Sql Server checks:

  • If there are any available threads. If yes, it selects one of them for the new connection.
  • If not, it checks whether the ‘max worker threads’ threshold is reached.
    • If not, it increases the limit and creates new threads.
    • If the limit is reached, no new connection is established, and you get this error. [The error message says “A connection was successfully established with the server, but then an error occurred“. It was able to establish a connection, but then ran into issues.]

But in our case, the limit had not been reached; the value is set to unlimited (0 – zero). But when Sql Server tried to create a new thread, it did not have any resources to assign to it. Each thread needs about 0.5 MB of memory on a 32-bit machine and about 2 MB on a 64-bit machine. I guess Sql Server did not have any more free memory to allocate to the new thread. Hence the error.
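The configured ceiling and the current thread usage can both be checked from DMVs; a sketch:

```sql
-- Configured maximum ('0' in sp_configure shows here as the computed value)
SELECT max_workers_count FROM sys.dm_os_sys_info;

-- Worker threads currently alive
SELECT COUNT(*) AS current_workers FROM sys.dm_os_workers;
```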

Side Note: Sql Server kills worker threads that are idle for more than 15 minutes.

So, I used DAC (Dedicated Administrator Connection) to login to Sql Server. Since it has its own dedicated resources (scheduler, memory node, etc), I was able to login and kill some blocking processes and idle SPIDS. Voila !! memory usage came back down and I was able to login using my other accounts.

Good Times !!

Best Practices

  • Every DBA that has access to production should have a dedicated account. Do not let anyone use the ‘sa’ account. Where needed, assign sysadmin permissions to dedicated accounts rather than sharing ‘sa’. Using ‘sa’ is easier than creating accounts for each person, but it is not good practice.

Hope this helps,

Read Full Post »

Older Posts »