You will see The difference in rows returned for 'NOT IN' if table1 or table2 has NULL value(s) for this 'a' column
For Performance, the JOIN LEFT/ NULL will be the slowest and be more significant the more rows the tables have.
Details below:
A) Difference in Values returned:
1) NOT IN:
SELECT a FROM table1
WHERE a NOT IN (SELECT a FROM table2)
the IN predicate can return three values TRUE, FALSE or NULL:
TRUE is returned when the value in column 'a' is found in tablw2
FALSE is returned when the value in column 'a' is not found in table2 AND table2 does not contain any NULL values
NULL is returned when the value in column 'a' is NULL, OR the value of 'a' is not NULL and is not found in table2 and table2 contains at least one NULL value
So, because of the NULL value, the IN predicate does not give a definitive answer to whether or not the expression is contained in table2
as long as there are NULL values on either side of the expression, returning NULL instead.
this becomes problem with the negation 'NOT IN' condition since negation of NULL is NULL
As a result, NOT IN condition will never hold for any table that has any NULL value in
a) If table1 has NULL 'a' values, those values will not be returned by NOT IN
b) if table2 has NULL 'a' values, no rows will be returned
2) NOT EXISTS
SELECT a FROM table1
WHERE NOT EXISTS (SELECT * FROM table2 WHERE table1.a = table2.a)
EXISTS returns two values only, TRUE or FALSE
It return TRUE as soon as it finds only a single matching row in table2, or FALSE, if it find none.
NOT EXISTS will always return values from table1 even if there is one or more rows with value NULL in table2 since
NULL values do not satisfy the equality condition
.
3) LEFT JOIN WHERE IS NULL:
I am afraid there is typo in the problem statement for LEFT JOIN SQL, "WHERE table1.a IS NULL" should be "WHERE table2.a IS NULL"
SELECT a FROM table1
LEFT JOIN table2
ON table1.a = table2.a
WHERE table2.a IS NULL
LEFT JOIN / IS NULL will return TRUE only if no row satisfying the equality condition is found in table2 so it will behave same as NOT EXISTS
B ) Performance
Bes way to measure Performance is to use Execution Plan. It will depend on how SQL server optimizer works scanning the tables for matching the rows.
but expect LEFT JOIN / NULL is going to be slower that NOT EXISTS and NOT IN since it will not apply the filter until after pulling all data from Table2
In Summary,
if you know that your data has NO NULL values, then either NOT IN and NOT EXISTS would be better option for performance than LEFT JOIN / NULL
If NULL values are possible, best to use NOT EXISTS as it will return accurate rows and have better performance then LEFT JOIN / NULL