Although the day started as just any other day, it is ending with a great surprise leaving much desired satisfaction and motivation. I had an opportunity to use MERGE in a production environment.
It was a few years ago (I think so), that I first heard about MERGE statement and got excited and learnt all about it and ran it several times in Play and Dev environments, but never was in a situation where using MERGE seemed appropriate and perfect !! Feels like all my efforts have been stacking up to this event.
Now, we are in the middle of a data migration effort that involves migrating selected seuset of data from old PROD database to new PROD database, before ‘un-plugging’ the old database. Though, both the databases point to (or used by) the same application, the underlying data model is much different. Another reason why DBAs must be involved in data model design discussions. Agile does not give anyone a free pass to do careless development and assure yourself by saying “we’ll change it later”. Anyway, I digress.
After realizing that developers could not migrate the data themselves, they called us, the DBAs.
Back to MERGE. So after building out SQL scripts to accomplish this migration on a set-based approach (not procedural), I asked the developers to run some tests. After all the testing one major flaw was identified. There were duplicate entries in one table. One table had some overlapping accounts between old and new database. Now, how do I solve this without making the code any more complicated.
Off the top of my head, there are two ways:
- Use MERGE
- Use two separate transactions; One does INSERT, the second does UPDATE for matching records.
Perfect !! Use MERGE.
WITH Accounts_CTE (Merchant_ID, Old_Account_ID, Old_Enterprise_ID)
SELECT Merchant_ID AS [Merchant_ID]
WHERE A.Account_Type IN ('pp')
MERGE Sales.dbo.Account AS TGT -- Target Table
USING ( SELECT Merchant_ID
FROM Accounts_CTE) as SRC
(Merchant_ID, Old_Account_ID, Old_Enterprise_ID) -- this is the source table
ON (TGT.merchant_id = SRC.merchant_id) -- important condition for MERGE
WHEN MATCHED THEN
UPDATE SET TGT.Old_Account_ID = SRC.Old_Account_ID
, TGT.Old_Enterprise_ID = SRC.Old_Enterprise_ID
WHEN NOT MATCHED THEN
INSERT (merchant_id, Old_Account_ID, Old_Enterprise_ID)
VALUES ([Merchant_ID, Old_Account_ID, Old_Enterprise_ID);
This simple and intuitive code, helped me avoid two separate transaction; In worst case a procedural looping method; On top of it all MERGE is optimized in recent version to be more efficient than any other individual queries. You can read all about it in this MSDN article.
Note: End MERGE statement with a semicolon ;
As expected, with the above logic, using MERGE, I am able to INSERT new records into the table (from CTE) and update records where there is matching. Do all of this in one swell swoop. I think MERGE deserves a separate post of its own detailing the benefits and applicable scenarios.
Hope this helps,
Read Full Post »