Feeds:
Posts
Comments

Archive for April, 2013

DELETE and UPDATE are one of the most powerful statements in SQL. Being responsible for production systems, a lot of times I need to make changes to data (or meta data) in production directly. Any not every system has a development environment where I could test my SQL statements. Scary !! but sadly true. So usually, I take extra care when I write or run these in a production environment. I take multiple layers of protection to undo any unintended consequences.

But the point of this post is to document a popular and very useful feature of these statements.

Let’s define a data set. Two tables T1 and T2 with just one IDENTITY column with 100 records each. T1 has records starting from 1 to 100 and T2 has records starting from 10 to 109. Records from 1 to 10 match between these two tables. Now, I want to DELETE the records in T1 that match with T2.


CREATE TABLE dbo.t1 (ID INT IDENTITY(1,1), col1 VARCHAR(20))
CREATE TABLE dbo.t2 (ID INT IDENTITY(10,1), col1 VARCHAR(20))
GO

INSERT INTO dbo.t1 DEFAULT VALUES
GO 100
INSERT INTO dbo.t2 DEFAULT VALUES
GO 100

SELECT * FROM dbo.t1
SELECT * FROM dbo.t2

Since this is a simple example, there are many ways to accomplish this. But the technique I am interested in is to combine these two tables in a join and delete from one. The tables in the FROM clause, are computed first and the resultant virtual table is used to perform matching delete in T1.

DELETE FROM dbo.t1
FROM dbo.t1
INNER JOIN dbo.t2
ON t1.ID = t2.ID

Example 2: Update records is T1 that match with T2. Again, because this is a simple example, there are many ways to do this. But I am interested in this particular one.

UPDATE dbo.t1
SET Col1 = 'updated'
FROM dbo.t1
INNER JOIN dbo.t2
ON t1.ID = t2.ID

Hope this helps,
_Sqltimes

Read Full Post »

Update on 9/3/2013:  Added a continuation of this post here for Instance Database File Initialization.

Welcome to the world of Database File Initialization. It’s a wonderful concept. Its not something you’ll encounter every day, but when you do, you can see it in action (or perceived inaction) every single time .

Database File Initialization: Let’s set the stage

When you create a new database or add a new file(s) to an existing database, for example, using the following command:

 ALTER DATABASE [SomeDB]
 ADD FILE
 (
 NAME = 'StopGap_Primary.ndf'
 , FILENAME = 'T:\SQLData\SomeDB\StopGap_Primary.ndf'
 , SIZE = 100 GB
 , MAXSIZE = 200 GB
 , FILEGROWTH= 10 GB
 )
 TO FILEGROUP [PRIMARY]

Sql Server goes to the disk drive (T:\SQLData\SomeDB\) and tries to create a MDF/NDF/LDF file of specified size, in this case a 100 GB file. Creating a file, of any size, by itself is not time consuming. It’s mostly metadata operation where it communicates with the disk subsystem that the disk address locations (clusters/extents/sectors/stripes) from here to there are assigned to a particular file. But Sql Server does more than that. It actually goes to each disk location and fills it with zeroes. This process of initializing a new file with zeroes is what takes time. This is an important step and there is a reason for it.

Two reasons:

  • When you  start writing data to this file, the operation is faster. Sql Server does not need to zero-out the file locations before writing new data.
  • Not doing it is not a sound practice (described below – behind the scenes)

Behind the scenes:

When you delete a file, Operating System does not go to the disk location and delete the contents stored in the disk location. It just does a meta data operation that says, from now on the address locations (previously occupied by this file) are now free and available for reuse. But the content still exists on the disk and if the disk were to fall in the wrong hands, one could still recover data on the disk and retrieve contents of the deleted file.

To keep it simple, I am stopping here, but you can dig deeper and find out more by asking ‘why’ at every step. Ex: Why does Sql need to worry about previously deleted files? etc. But if this is a problems for you, you could take advantage of Instant file Initialization.

Instant file Initialization

Instant File Initialization allows you to quickly allocate file space by skipping the step of zeroing the allocated disk space. This is much faster and when you start writing new data it overwrites any existing content on the disk. To do this, assign SE_MANAGE_VOLUME_NAME permission to the user account under which Sql Server Service runs.

Note:

  • Please note that Log files cannot be initialized instantaneously.
  • When TDE (Transparent Data Encryption) is enabled the data files cannot be initialized instantaneously.

From the MSDN article, the following actions trigger File Initialization.

  • Create a database.
  • Add files, log or data, to an existing database.
  • Increase the size of an existing file (including autogrow operations).
  • Restore a database or filegroup.

Hope this helps,
_Sqltimes

Read Full Post »

Recently, I encountered an interesting error message.

For a database, we needed to add a new file to existing filegroup to help alleviate storage limitation of allocated subsystem. but as I ran the following script, I got the following error.

ALTER DATABASE [DBName]
ADD FILE
(
  NAME		=	'StopGap_Primary.ndf'
, FILENAME	=	'T:\SQLData\Data\StopGap_Primary.ndf'
)
TO FILEGROUP [PRIMARY]

Msg 3023, Level 16, State 2, Line 1
Backup, file manipulation operations (such as ALTER DATABASE ADD FILE) and encryption changes on a database must be serialized. Reissue the statement after the current backup or file manipulation operation is completed.

This is the first time I saw this error. After re-reading the error message several times to decipher the possible reaasons behind the error, I started following up on each leads.

I checked if the

  • LOG backup is running
  • Database backup (FULL/DIFF) is running
  • If anyone else is logged into server running any ALTER DATABASE queries.

It’s a small team that has direct access or has ‘sa’ access to this server, so it was easy to rule out that nobody is running any major database level operation.

Of all the above, I found out that the database backup that started this morning was still running. It was supposed to have completed 30 minutes after starting. but its still running after 10 hours.

Okay, now. I have a new problem at hand. Got to go fix the backup process first before adding any more files.

 

Hope this helps,
_Sqltimes

Read Full Post »

An interesting titbit, that I encountered today is with COLLATION with in Sql Server. As we all know collation is an important and interesting element in data storage and retrieval. It directly affects the way character data is stored, sorted and inturn retrieved.

By default Sql Server collation is set to SQL_Latin1_General_CP1_CI_AS (Latin, case insensitive and accent sensitive) but you can change it during installation or anytime after. Each  database, in an instance, could also be set to a different collation as needed. But I would not recommend setting it to something other than default, unless there is a strong business reason for it. But the beauty here is to know that we can set it at any level: Server, database or even each query or an expression within a query.

Sometimes, not very often, but sometime, I run into the need to add COLLATION to a particular condition with in WHERE clause to retrieve the exact data I need. So in this case I use the following style to set COLLATION to an individual condition expression within a query.

SELECT * FROM dbo.TableName
WHERE FileName = 'TestFileName.JPG' COLLATE SQL_Latin1_General_CP1_CS_AS

For this case, I need to compare character data with case sensitivity, so chose to add ‘COLLATE SQL_Latin1_General_CP1_CS_AS‘, but you can set it to any collation applicable to your situation.

  • Note: Please know that you do not want to do this as a general practice everywhere in application code. If you see a repeated need to sort character data in a particular fashion then it is better to restructure the table to store data in new collation. This helps query performance too. I do this once in a while in production systems, so I do not need to restructure the table.

When I apply new COLLATION to the WHERE clause conditions, the underlying indexes are not utilized. The table data is SCANed rather than SEEKed (or INDEX SEEK). This may be okay for once in awhile ad hoc queries,  but not for constantly running application code. If you need to query the data in a particular collation constantly, its better to restructure the table with new collation so retrieval becomes easier with use of any available indexes.

Hope this helps,
_Sqltimes

Read Full Post »