Feeds:
Posts
Comments

Archive for December, 2013

Does it matter where you add a condition? WHERE clause or JOIN clause. How does it change in “INNER JOIN” or “OUTER JOIN”.

Let’s take an example:

CREATE TABLE dbo.LeftTable (ID INT)
CREATE TABLE dbo.RightTable (ID INT)
GO

INSERT INTO dbo.LeftTable (ID) VALUES (1)
INSERT INTO dbo.LeftTable (ID) VALUES (2)
INSERT INTO dbo.LeftTable (ID) VALUES (3)
INSERT INTO dbo.LeftTable (ID) VALUES (4)
INSERT INTO dbo.LeftTable (ID) VALUES (5)
INSERT INTO dbo.LeftTable (ID) VALUES (6)
INSERT INTO dbo.LeftTable (ID) VALUES (7)

INSERT INTO dbo.RightTable (ID) VALUES (8)
INSERT INTO dbo.RightTable (ID) VALUES (2)
INSERT INTO dbo.RightTable (ID) VALUES (3)
INSERT INTO dbo.RightTable (ID) VALUES (9)
INSERT INTO dbo.RightTable (ID) VALUES (5)
INSERT INTO dbo.RightTable (ID) VALUES (6)
GO
Condition Comparision Figure 1

Condition Comparison Figure 1

Now, let’s see how the results differ based on where you put the condition clause.

For INNER JOIN, the end result does not change much. This is because of the order in which Sql Server processes the query and fiilters the records at every step.

SELECT L.ID -- Left table
 , R.ID -- Right table
FROM dbo.LeftTable AS L
INNER JOIN dbo.RightTable AS R
     on L.ID = R.ID
WHERE L.ID = 2

-- compare with below query

SELECT L.ID -- Left table
 , R.ID -- Right table
FROM dbo.LeftTable AS L
INNER JOIN dbo.RightTable AS R
    ON L.ID = R.ID
    AND L.ID = 2
Condition Compare Figure 2

Condition Compare Figure 2

For OUTER JOIN, the results changes depending on where you put the condition.

SELECT L.ID -- Left table
     , R.ID -- Right table
FROM dbo.LeftTable AS L
LEFT OUTER JOIN dbo.RightTable AS R
    on L.ID = R.ID
WHERE R.Id IS NULL

-- Compare the result with query below
SELECT L.ID -- Left table
     , R.ID -- Right table
FROM dbo.LeftTable AS L
LEFT OUTER JOIN dbo.RightTable AS R
    on L.ID = R.ID
    AND R.Id IS NULL
Condition Compare Figure 3

Condition Compare Figure 3

Again, the result is different because of the order in which each section of the query is executed. What is more important is ‘what are you trying to do?’.

This query says that you are looking for records in ‘LEFT’ table, that do not exist (or no matching record) in ‘RIGHT’ table. Hence the condition “WHERE R.Id IS NULL” is added in the WHERE clause. An intermediary virtual table is generated when  ‘LEFT’ and ‘RIGHT’ tables are joined in a “LEFT OUTER JOIN” on condition “ID = ID”. This result has all the records from ‘LEFT’ table. And where ever there is a match with RIGHT table, the record is also displayed. But the important point is that all the records from LEFT are displayed. Now when you apply the condition in WHERE clause “WHERE R.Id IS NULL”, it filters the above intermediary virtual table to just those records with RIGHT column values are NULL. Hence a smaller subset.

SELECT L.ID -- Left table
     , R.ID -- Right table
FROM dbo.LeftTable AS L
LEFT OUTER JOIN dbo.RightTable AS R
    on L.ID = R.ID
    AND R.Id IS NULL

Where as this query is different. It says, I want all the records between LEFT and RIGHT table joined in a LEFT OUTER JOIN based on two conditions.

  • ID matches “ID = ID”
  • AND RIGHT ID is NULL

This does not make any business case sense, does it? It other words we are saying, we want all the matching records while ID on RIGHT must be NULL — all at the same time. So you just get all records from LEFT table with corrpsponding records from RIGHT table listed as NULL. Even when there are matching records (example: records 2, 3, etc), you do not see them because their ID is not NULL. So its self-conflicting, so all you get is just records from LEFT table.

SELECT L.ID -- Left table
 , R.ID -- Right table
FROM dbo.LeftTable AS L
LEFT OUTER JOIN dbo.RightTable AS R
 on L.ID = R.ID
WHERE R.Id IS NULL

Sometime soon, I’ll add apost on the order in which Sql Server processes each section of a query. That would help analyse these situations.

Hope this helps,
_Sqltimes

Read Full Post »

Something interesting I ran into a few days ago, were these new keywords (new to me) in Sql Server.

  • SOME
  • ANY
  • ALL

ALL is used several places so, its not all too new (ex: REBUILD INDEXES, GRANT ALL, etc), but SOME and ANY are particularly new and interesting. What is the purpose? Could it not be done using IN clause?

Talking just about the intent of these clauses, SOME says that ‘return any matching records from parent query, where it matches with some of the result set from the subquery‘. It works in TRUE or FALSE logic. The left hand side (LHS) of SOME logic looks at the scalar list on the right hand side and says if the values in the LHS have matching with some of the scalars values on the RHS. Some is the key here, if there is a match, it returns TRUE orelse FALSE.

Similarly ANY and ALL have similar logic. ANY says, if there is even one match then returns TRUE. ALL needs it to match with all the scalar values. One other requirement is that for SOME/ANY the RHS needs to be a subquery.

There are some situations, we can replace SOME/ANY with IN clause. But there are situations where having SOME/ANY are more appropriate. Also, with SOME/ANY we can use more logical operators =, <, >, <>, etc than with IN clause.

Some Examples:

--
-- Example for SOME - Returns TRUE
--
IF 3 > SOME (
 SELECT 1
 UNION
 SELECT 2
 UNION
 SELECT 3
 UNION
 SELECT 4
 )
PRINT 'TRUE'
ELSE
PRINT 'FALSE' ;

--
-- Example for ALL - Returns FALSE
--
IF 3 > ALL (
 SELECT 1
 UNION
 SELECT 2
 UNION
 SELECT 3
 UNION
 SELECT 4
 )
PRINT 'TRUE'
ELSE
PRINT 'FALSE' ;

--
-- Example for ANY - Returns TRUE
--
IF 3 < ANY (
 SELECT 1
 UNION
 SELECT 2
 UNION
 SELECT 3
 UNION
 SELECT 4
 )
PRINT 'TRUE'
ELSE
PRINT 'FALSE' ;

Example, where using IN clause makes more sense than using SOME. As you can see below, the execution plan is not very different either way. But keep in mind that actual execution plan for your data set might vary (execution plan below is based on a small data set). Let’s say you want to retrieve records that match with a list of scalar values:

SELECT *
FROM dbo.Table
WHERE ID IN (
 SELECT ID
 FROM dbo.AR_Account
 WHERE CustomerPK IN (2, 5)
 )

This could be written using SOME like this:

SELECT *
FROM dbo.Table
WHERE ID = SOME (
 SELECT ID
 FROM dbo.AR_Account
 WHERE CustomerPK IN (2, 5)
 )

Execution Plan:

Execution Plan difference

Execution Plan difference

Hope this helps,
_Sqltimes

Read Full Post »