In Postgres 9.3.5, I'm importing records from an external source where duplicates are VERY rare, but they do happen. Given a readings table with a unique compound key on (real_time_device_id, recorded_at), the following will fail once in a blue moon:
INSERT INTO readings (real_time_device_id, recorded_at, duration) VALUES
('150', TIMESTAMP WITH TIME ZONE '2014-11-01 23:06:33 -0700', 10.0),
... many more records ...
('150', TIMESTAMP WITH TIME ZONE '2014-11-01 23:06:43 -0700', 10.0);
(FWIW, the above fails 'properly' with a duplicate key violation.)
I know that handling exceptions is expensive, but as I said, duplicate entries are very rare. So to keep the code simple, I followed an example given in "Optimal way to ignore duplicate inserts?":
BEGIN
INSERT INTO readings (real_time_device_id, recorded_at, duration) VALUES
('150', TIMESTAMP WITH TIME ZONE '2014-11-01 23:06:33 -0700', 10.0),
... many more records ...
('150', TIMESTAMP WITH TIME ZONE '2014-11-01 23:06:43 -0700', 10.0);
EXCEPTION WHEN unique_violation THEN
-- silently ignore inserts
END;
Running the above produces two errors:
psql:sketches/t15.sql:11: ERROR: syntax error at or near "INSERT"
LINE 2: INSERT INTO readings (real_time_device_id, recorded_...
^
psql:sketches/t15.sql:14: ERROR: syntax error at or near "EXCEPTION"
LINE 1: EXCEPTION WHEN unique_violation THEN
^
Can anyone set me straight on the correct syntax? Or is my error deeper than mere syntax? (For example, will all of the inserted rows be ignored if there is a single duplicate?)
Generally, what's a good way to do bulk inserts where very few (< 0.1%) are duplicates?
That BEGIN ... EXCEPTION ... END block is pl/pgsql syntax, so it only works inside a pl/pgsql function. You can either create a real function using it, or use a DO block. However, be careful: this way you won't silently skip just the failing row. The whole thing won't throw an error, but you won't have any lines inserted, either. – András Váczi Dec 02 '14 at 09:16
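For illustration, a minimal sketch of the DO-block form the comment describes, using the table and sample rows from the question (this is Postgres 9.3, which predates the INSERT ... ON CONFLICT clause added in 9.5):

DO $$
BEGIN
    INSERT INTO readings (real_time_device_id, recorded_at, duration) VALUES
        ('150', TIMESTAMP WITH TIME ZONE '2014-11-01 23:06:33 -0700', 10.0),
        ('150', TIMESTAMP WITH TIME ZONE '2014-11-01 23:06:43 -0700', 10.0);
EXCEPTION WHEN unique_violation THEN
    NULL;  -- swallow the error; as noted above, the whole INSERT is rolled back, not just the duplicate row
END;
$$;

As the comment warns, the exception handler discards the entire statement, so a single duplicate loses the whole batch. If the goal is instead to keep the non-duplicate rows, one 9.3-era alternative is to filter the batch with an anti-join before inserting; a sketch under the same assumed schema:

-- depending on the actual column types, the VALUES rows may need explicit casts
INSERT INTO readings (real_time_device_id, recorded_at, duration)
SELECT v.real_time_device_id, v.recorded_at, v.duration
FROM (VALUES
    ('150', TIMESTAMP WITH TIME ZONE '2014-11-01 23:06:33 -0700', 10.0),
    ('150', TIMESTAMP WITH TIME ZONE '2014-11-01 23:06:43 -0700', 10.0)
) AS v (real_time_device_id, recorded_at, duration)
WHERE NOT EXISTS (
    SELECT 1
    FROM readings r
    WHERE r.real_time_device_id = v.real_time_device_id
      AND r.recorded_at = v.recorded_at
);

Note that the anti-join only guards against rows already in the table; duplicates within the VALUES list itself, or rows inserted concurrently, can still raise a unique_violation.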