0

I am running the following queries in MS SQL Server (version 2016 if that matters).

SELECT DISTINCT Job_Number
FROM table_A
WHERE Job_Number IS NOT NULL

returns 2376 rows

SELECT *
FROM table_A
WHERE Job_Number IN (SELECT DISTINCT [House Bill] FROM table_B)

returns 137 rows

However,

SELECT *
FROM table_A
WHERE Job_Number NOT IN (SELECT DISTINCT [House Bill] FROM table_B)`

returns 0 rows. I would expect this would return 2239 rows (i.e. 2376 - 137).

Is there any fundamental concept of NOT IN that I have yet to understand here?

Gordon Linoff
  • 1,198,228
  • 53
  • 572
  • 709
cbasah
  • 458
  • 3
  • 10

4 Answers4

2

NOT IN will fail to return results if there are any NULL values returned by your following query: (SELECT DISTINCT [House Bill] FROM table_B)

My suggestion is to modify your query to:

(SELECT DISTINCT [House Bill] FROM table_B WHERE [House Bill] IS NOT NULL)

The reason this happens is because whenever a comparison is made between a row in table_A with NULL value in table_B the logical result is UNKNOWN.

There's more information about this:

1 - here and

2 - here

Radu Gheorghiu
  • 19,131
  • 15
  • 70
  • 102
0

The IN() clause ignores NULL values.

juergen d
  • 195,137
  • 36
  • 275
  • 343
0

The possible reason would be null values in tableb in [House Bill] column :

The correct way would be to use NOT EXISTS OR EXISTS instead of NOT IN AND IN clause :

SELECT a.*
FROM table_A a
WHERE NOT EXISTS (SELECT 1 FROM table_B b WHERE a.Job_Number = b.[House Bill]);
Yogesh Sharma
  • 49,081
  • 5
  • 23
  • 49
0

NOT IN returns no rows at all if the subquery returns NULL -- even once.

For that reason, I strongly recommend using NOT EXISTS. This behaves as you expect:

SELECT a.*
FROM table_A a
WHERE NOT EXISTS (SELECT 1 NFROM table_B b WHERE a.Job_Number = b.[House Bill])
Gordon Linoff
  • 1,198,228
  • 53
  • 572
  • 709