EXCEPT is a great operator. More importantly it is a set operator. So, by design it is a bit more efficient. This is part of the great set operators like:
- UNION ALL
- Returns values from left query (or data set) that are not in the right query (or data set)
- Return distinct values (like UNION does)
It checks the data from both the data sets and returns distinct values from left side that do not exist in right side data set. This is a great operator to use when cleaning data or comparing data to perform some clean up operations i.e. de-dup or data-sanitize. See the venn diagram below:
This is great for several reasons, but the best reasons are ones based on its prerequisites:
- Data types must match
- Columns and order must match
- Not a prerequisite, but it can compare NULL values. [i.e. SET ANSI_NULLS OFF]. It can compare NULL values as if they are regular literals.
It automatically makes sure the data type matches and you can pick and choose the data set you generate to compare.
Let’s take an example:
Step 1: Create tables
-- -- Create MainTable (left hand side table in comparision) -- DROP TABLE dbo.MainTable GO CREATE TABLE dbo.MainTable ( ID INT NOT NULL IDENTITY(1,1) PRIMARY KEY CLUSTERED , Name VARCHAR(20) NULL , Age INT NULL , Date_Of_Birth DATETIME NULL ) GO -- -- Create 'Compare' table. Right hand side table -- DROP TABLE dbo.Compare GO CREATE TABLE dbo.Compare ( ID INT NOT NULL IDENTITY(1,1) PRIMARY KEY CLUSTERED , Name VARCHAR(20) NULL , Age INT NULL , Date_Of_Birth DATETIME NULL ) GO
Step 2: Load some records
-- -- Load data into both the tables -- INSERT INTO dbo.MainTable (Name, Age, Date_Of_Birth) VALUES ('Jon', 20, NULL) , ('Ben', 40, '2012-02-02') , ('Test', 45, '2013-05-05') GO INSERT INTO dbo.Compare (Name, Age, Date_Of_Birth) VALUES ('Jon', 20, GETDATE()) , ('Ben', 40, '2012-02-01') , ('Test', 45, '2013-05-05') GO
Step 3: Let’ compare the data
Looking at the data, it could be noticed that there are two records that are different between the tables (ID 1 & 2). But in dbo.MainTable, for record ID = 1, has NULL values in Date_of_Birth column. So, lets see how Sql Server performs the comparision.
-- -- Use EXCEPT operator to compare -- SELECT M.ID , M.Name , M.Age , M.Date_Of_Birth FROM dbo.MainTable AS M EXCEPT SELECT C.ID , C.Name , C.Age , C.Date_Of_Birth FROM dbo.Compare AS C GO
Step 4: Results
As you can see the results match with our expectations. It identifies both ID 1 & 2 are different from data set on the right. This is a great tool in many ways.
Hope this helps,