3

I have a table containing a list of numbers. How do I find numbers that are missing from the table?

For instance:

numbers   id
 1         1
 5         2
 3         3
 6         4
 7         5
 8         6
 9         7
 20        8
 ....

How can I find missing numbers? Like 4 and there are no numbers between 9 and 20.

I have tried nothing, but want to know.

Erwin Brandstetter
  • 175,982
  • 27
  • 439
  • 600
john.li
  • 31
  • 1
  • 1
  • 2
  • i have tried it in postgresql, below is the result, how to do it in mysql. select s.i as missed from generate_series(1,100) s(i) where s.i not in (select id from employee); – john.li Aug 08 '16 at 06:43
  • 1
    You should add this as an answer, instead of a comment. – András Váczi Aug 08 '16 at 08:15
  • 1
    This question is not well defined: How can I find missing numbers? You would need to define more closely which numbers are candidates Negative numbers? Only integer numbers? Lower / upper bound? The best answer depends on it. It also depends on the RDBMS yo actually use. Typically, we expect you to declare what you use - except you have reason to need a solution for multiple systems - or a DB-agnostic pure SQL solution? – Erwin Brandstetter Aug 09 '16 at 01:38
  • 1
    Which engine are you using??? The answer varies! – Rick James Aug 12 '16 at 18:20

5 Answers5

4

Assuming you want all missing integer numbers between the minimum and maximum existing id in your table - in a current Postgres 9.5 installation like you commented (or at least 9.3):

@Seb3W already suggested generate_series(). It's more efficient to retrieve min and max in a single query, though. It's also preferable to use like generate_series() in the FROM list instead of the SELECT list (conforming to standard SQL and less error prone).

SELECT id
FROM  (SELECT min(id) AS a, max(id) AS z FROM numbers) x, generate_series(a, z) id
LEFT   JOIN numbers n1 USING (id)
WHERE  n1.id IS NULL;

Once you have the complete set of candidate numbers, use a run-of-the-mill technique to ...

About the LATERAL join:


If you don't care about standard SQL and want to squeeze out the last drop of performance (or in Postgres 9.2 or older without LATERAL joins) you can use generate_series() in the SELECT list, but still make it a single SELECT:

SELECT id
FROM  (SELECT generate_series(min(id), max(id)) FROM numbers) n(id)
LEFT   JOIN numbers n1 USING (id)
WHERE  n1.id IS NULL;

If you are after performance, you should have an index on numbers.id, of course:

CREATE INDEX numbers_id_idx ON numbers (id);
Erwin Brandstetter
  • 175,982
  • 27
  • 439
  • 600
2

In postgresql you can use generate_series :

SELECT
  generate_series FROM GENERATE_SERIES(
    (select min(id) from numbers), (select max(id) from numbers)
  ) 
WHERE
  NOT EXISTS(SELECT id FROM numbers WHERE id = generate_series)

In mysql your need to generate the serie yourself : generate serie mysql

Seb3W
  • 196
  • 4
1

In sql server,we can use the system table master..spt_values.

USE tempdb
GO

create table numbers(id int)
GO

insert into numbers
select 1
insert into numbers
select 1
insert into numbers
select 5
insert into numbers
select 2
insert into numbers
select 3
insert into numbers
select 3
GO

select number from master..spt_values
where type='p' and number <=(select max(id) from numbers)
and number not in (select id from numbers)
Aasim Abdullah
  • 2,915
  • 4
  • 23
  • 40
slade.liu
  • 59
  • 5
0

In MSSQL you can do the following:

--create a your table
CREATE TABLE ##seq
(id INT IDENTITY (1,1)
, txt VARCHAR(50))

--Build your dataset
INSERT INTO ##seq
        (  txt )
VALUES  ( 'boy'), ('girl'),('parent'),( 'boy'), ('girl'),('parent'),( 'boy'), ('girl'),('parent'),( 'boy'), ('girl'),('parent'),( 'boy'), ('girl'),('parent'),( 'boy'), ('girl'),('parent')

--remove your sequence ids to demonstrate
DELETE FROM ##seq WHERE id IN (2,5,7,8,9)

--Query your sequence for gaps 
SELECT (t1.id + 1) as gap_starts_at, 
       (SELECT MIN(t3.id) -1 FROM ##seq t3 WHERE t3.id > t1.id) as gap_ends_at
FROM ##seq t1
WHERE NOT EXISTS (SELECT t2.id FROM ##seq t2 WHERE t2.id = t1.id + 1)
0

For MariaDB, it is something like

SELECT seq
    FROM seq_1_to_20
    LEFT JOIN tbl AS t  ON t.numbers = seq
    WHERE t.numbers IS NULL

reference

Rick James
  • 78,038
  • 5
  • 47
  • 113