
Archive for November, 2013

Question for everyone:

What steps could we take to optimize table copy from one database to another?

A few days ago, I was running a large table copy from one database to another. Something like this:


SELECT Col1, Col2 INTO OtherDB.dbo.TableCopy
FROM ThisDB.dbo.Table

This is a huge table: it has close to 6 billion records and is close to 300 GB in size. So, I wanted to find a way to optimize such a huge data transfer.

Some of the things considered:

  • Table hints / Query hints to take an exclusive lock on the table and use parallelism, but it wasn’t any faster.
  • Data compression to reduce I/O. It was a bit faster, but still took a long time.
  • Multiple data files on separate drives. Made it faster, but not fast enough.
  • Made sure the MDF/NDF/LDF files have enough free space so there is no AUTO_GROWTH event.
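One more option worth sketching out (file names and sizes below are hypothetical and would need to match your environment): pre-size the destination database so growth never happens mid-copy, and load into a pre-created heap with INSERT ... WITH (TABLOCK) instead of SELECT ... INTO, which under the SIMPLE or BULK_LOGGED recovery model can qualify for minimal logging.

```sql
-- Pre-size the destination files so the copy never waits on growth
-- (logical file name and size are hypothetical):
ALTER DATABASE OtherDB MODIFY FILE (NAME = OtherDB_Data, SIZE = 350GB);

-- Minimally logged load into a pre-created heap (SIMPLE/BULK_LOGGED recovery):
INSERT INTO OtherDB.dbo.TableCopy WITH (TABLOCK) (Col1, Col2)
SELECT Col1, Col2
FROM ThisDB.dbo.[Table];
```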

This is my second question: when a backup of the entire database can complete in 60 minutes, why does copying just one table take longer than that? This is a lab environment, so there is no external traffic.

Any thoughts?

Thanks,
_Sqltimes


A few days ago, I ran into several interesting issues; one of them was documented in a previous post here. Today, I’m going to write about another issue that occurred under similar circumstances: the THREADPOOL wait type. Sql Server has its own scheduling mechanism that is part of its ‘operating system’, called SQLOS. It is responsible for scheduling and synchronization of threads, along with several other things.

Upon successful authentication of a login, Sql Server looks for an available worker thread to assign to this new login/request. When all the available threads are taken, Sql Server puts the new connection into a THREADPOOL wait while it tries to find an available thread. In simpler terms, Sql Server is stuck and cannot issue a thread for new connections. At that point, it makes an entry into the Sql Server error log with the following message.

New queries assigned to process on Node 0 have not been picked up by a worker thread 
in the last 60 seconds. Blocking or long-running queries can contribute to this 
condition, and may degrade client response time. Use the "max worker threads" 
configuration option to increase number of allowable threads, or optimize current 
running queries. 

As you can see, this could be due to several things. If ‘max worker threads’ is reached, Sql Server cannot create any more new threads. We either need to increase that number or find out why the existing connections are not freeing up their threads. Maybe there is blocking that is preventing any progress. Using a DAC connection helps in these situations to resolve (or KILL) the offending processes (SPIDs) and free up resources. DAC has its own scheduler, memory node, etc., so a connection is usually successful in these situations.
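For example, once connected through the DAC (sqlcmd -S admin:ServerName -E), a quick look at blocking might go like this (a sketch; the SPID in the KILL comment is hypothetical):

```sql
-- Find requests that are blocked, and who is blocking them:
SELECT session_id, blocking_session_id, wait_type, wait_time, command
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0;

-- If one SPID is the root of the blocking chain, it can be killed:
-- KILL 53;   -- hypothetical SPID
```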

If you instead want to increase ‘max worker threads’, we need to keep other factors in mind. By default, this setting is set to 0 (zero), which means Sql Server dynamically determines the appropriate number of threads at startup. Sql Server is smart enough to know when to create new threads and destroy old (and inactive) ones. So, leaving it at the default value (0) is not a bad idea. But if you constantly run into the issues I mentioned in this previous post, maybe re-configuring it to a more appropriate setting is fruitful. Microsoft documents the following values:

Number of CPUs     32-bit computer    64-bit computer
<= 4 processors    256                512
8 processors       288                576
16 processors      352                704
32 processors      480                960
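If you do decide to change it, the setting is exposed through sp_configure (the value 704 below assumes a 16-processor, 64-bit machine per the table above; pick the value appropriate for your hardware):

```sql
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max worker threads', 704;
RECONFIGURE;
```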

Keep a close watch on the production system and adjust this setting based on what works for your environment. There is no one setting that works for all.

Hope this helps,

_Sqltimes


A few days ago, I was setting up virtual machines on one of my lab servers and I ran into this interesting error.

It’s a Windows Server 2008 R2 machine, and the error read:

 "virtual machine could not start because the hypervisor is not running".

This is a Dell PowerEdge T610 and I’m trying to build multiple virtual machines using Hyper-V tools natively available from within Windows Server 2008 R2. The error message is a bit nebulous. It seems like there is a hypervisor service that needs to be started. But that’s not it.

As soon as I saw that message, I realized that when I built this machine, I had made sure to ‘Disable’ the ‘Virtualization Technology’ setting in the BIOS for the processors. Now it’s coming back to bite me!! So, I shut down the machine; powered it back up; went into the BIOS settings; under processor settings, there is a setting for ‘Virtualization Technology’. Enabled this setting and brought the machine back up. Voila!! Now the virtual machines are running fine.

For other hardware (HP, etc.), the setting could be named something different. But once you go into the BIOS you can easily figure it out.


Hope this helps,

_Sqltimes


A few days ago, I was running a large table copy from one database to another. Something like this:


SELECT Col1, Col2 INTO OtherDB.dbo.TableCopy
FROM ThisDB.dbo.Table

This is a huge table: it has close to 6 billion records and is close to 300 GB in size. As the copy progressed, I saw a lot of I/O going from one drive to the other. But every few minutes, I/O dropped to zero, stayed there for a few minutes, and then ramped back up to higher numbers. After a few more minutes, the same pattern repeated.

On further digging, I noticed that every time I/O dropped, the SPID was waiting on PREEMPTIVE_OS_WRITEFILEGATHER. Now it makes sense: PREEMPTIVE_OS_WRITEFILEGATHER indicates an Auto Growth event. As the table copy happens, the database files run out of space and Auto Growth kicks in to increase the file sizes. Since this is a large database, the 10% increase is close to 100 GB and that takes a lot of time. We have a couple of options here:

  • Make sure the database files are pre-sized correctly, so the chances of auto growth are minimized. Assign the size ahead of time, so Sql Server can get contiguous spaces on disk, which in turn makes it faster.
  • Reduce auto growth from a percentage to a fixed amount, something like 1 GB each time (rather than 10%). But I do not recommend this: it leads to more frequent growth events and more fragmentation, and in my opinion causes more harm than good. Still, this may be a good option for some environments.
  • Enable Instant File Initialization. This is an amazing feature that I wrote more about in a previous post. It allows Sql Server to skip the step of zeroing out the newly allocated disk space before using it.
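To confirm this is what is happening, the wait type can be checked while the copy runs, and the growth increment changed ahead of time. A sketch (the session id and logical file name are hypothetical):

```sql
-- Check what the copying SPID is currently waiting on:
SELECT session_id, wait_type, wait_time, last_wait_type
FROM sys.dm_exec_requests
WHERE session_id = 53;   -- hypothetical SPID of the copy

-- Switch auto growth from 10% to a fixed increment (hypothetical file name):
ALTER DATABASE OtherDB MODIFY FILE (NAME = OtherDB_Data, FILEGROWTH = 10GB);
```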

This PREEMPTIVE_OS_WRITEFILEGATHER wait could occur in other similar situations that require allocation of new space. For example, database restores.

Hope this helps,

_Sqltimes


Quick one today:

To insert the output of a stored procedure into a temporary table, I use the following steps.

  • Create a temporary table with necessary table structure
  • Execute the Stored procedure while inserting its output into this table.

Example:


--
-- Create Temporary Table
--
IF OBJECT_ID('tempdb..#who2') IS NOT NULL
    DROP TABLE #who2;
GO
CREATE TABLE #who2
(
      SPID        varchar(255)
    , Status      varchar(255)
    , Login       varchar(255)
    , HostName    varchar(255)
    , BlkBy       varchar(255)
    , DBName      varchar(255)
    , Command     varchar(255)
    , CPUTime     varchar(255)
    , DiskIO      varchar(255)
    , LastBatch   varchar(255)
    , ProgramName varchar(255)
    , SPID2       varchar(255)
    , RequestID   varchar(255)
);
GO

--
-- Query stored procedure results into the temporary table
--
INSERT #who2
Exec sp_who2;
GO

select * from #who2;
GO
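Once captured, the output can be filtered like any other table. For example, to look only at sessions in a particular database:

```sql
SELECT SPID, Status, HostName, DBName, Command
FROM #who2
WHERE DBName = 'tempdb';
GO
```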

Hope this helps,
_Sqltimes


Quick one today.

One point that I regularly recommend is to optimize TempDB. There are several things we can do to improve the overall performance of all the applications hitting the database instance just by improving the performance of TempDB; it does not require changing a single line of application code. Every improvement you make here adds performance improvement for every application that hits a database on this instance. A few days ago, I wrote a post about improving performance by correctly balancing the number of data files for TempDB. In this post, I’d like to add some ‘behind the scenes’ understanding of why we need multiple data files.

As you can imagine, when you create a temporary table, it is an actual table created in TempDB. When you have hundreds of connections executing thousands of queries, a lot of temporary tables are created. This creates a lot of activity within TempDB that needs to be effectively managed to facilitate optimal performance. So let’s take one such incident of creating a temporary table.

When a query requests to create a temporary table and store some data (records) in it, Sql Server tries to allocate some pages. These pages are allocated in extents; an extent is a group of 8 pages. Before it can allocate pages (or extents), it has to perform some metadata checks to see what pages are available and where. Once an extent is available, Sql Server assigns the extent to this temporary table request and updates metadata to indicate that the extent is now allocated to a particular process (and its execution context). Here lies the trick: there may be thousands of extents available, but the allocation pages where this tracking happens (PFS, GAM, and SGAM pages) belong to each data file. So, if you have only one data file, all the requests for temporary tables will be fighting to update these allocation pages, resulting in heavy delays (resource contention). When you have multiple data files, each file has its own allocation pages, resulting in (relatively) reduced contention.
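Adding data files is straightforward. A sketch (the logical name, path, and sizes are hypothetical, and every data file should be the same size):

```sql
-- Add an equally sized second data file to spread allocation-page contention:
ALTER DATABASE tempdb
ADD FILE (NAME = tempdev2, FILENAME = 'T:\TempDB\tempdb2.ndf',
          SIZE = 4GB, FILEGROWTH = 512MB);
```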

The number of data files you create for TempDB is an important decision. Read this article to know more.

As always, make sure the TempDB data files are of the same size.

Hope this helps,
_Sqltimes


Quick one:

As I wrote in one of my previous posts, I am not a fan of running COUNT(*) on any table. I prefer alternative methods that depend on metadata rather than running a scan on the table just to get a count. No matter the size of the table, you can get the row count from a metadata view.
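For example, the row count can be read from the allocation metadata instead of scanning the table (DBA.TableName is a placeholder):

```sql
-- Metadata-based row count; no table scan:
SELECT SUM(p.row_count) AS TotalRows
FROM sys.dm_db_partition_stats AS p
WHERE p.[object_id] = OBJECT_ID('DBA.TableName')
  AND p.index_id IN (0, 1);   -- heap or clustered index only
```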

This post is only about COUNT_BIG(*), so:

When you need to perform COUNT(*) on tables that have more than 2,147,483,647 rows (the maximum value an INT can hold), you need the BIGINT version of COUNT(*), because COUNT(*) returns an INT and would overflow on such tables.


SELECT COUNT(*) FROM DBA.TableName -- smaller tables

SELECT COUNT_BIG(*) FROM DBA.TableName  -- very large tables

Hope this helps,
_Sqltimes

