Handling INSERT race condition for generated unique column value

Question

I am generating a unique slug when a record is created, like so (based on this answer):

INSERT INTO ks.shares (
      slugbase
    , slugindex
    , slug
    , userid
)
SELECT
      Pslugbase
    , COALESCE((SELECT(max(slugindex) + 1)
            FROM  ks.shares s
            WHERE s.slugbase = Pslugbase), 0)
    , Pslugbase
        || COALESCE((SELECT '-'::text || (max(slugindex) + 1)::text
            FROM  ks.shares s
            WHERE s.slugbase = Pslugbase), '')
    , Puserid
RETURNING id, modified, slug INTO Vnewshare;

When testing under load this can raise a unique_violation: two concurrent threads both read the current slug index before either inserts a record with the next available slug, which leaves the second thread with a slug that is no longer valid.

I attempted to solve the issue by looping over the INSERT and catching any unique constraint violations:

LOOP
BEGIN
    INSERT INTO ks.shares (
          slugbase
        , slugindex
        , slug
        , userid
    )
    SELECT
          Pslugbase
        , COALESCE((SELECT(max(slugindex) + 1)
                FROM  ks.shares s
                WHERE s.slugbase = Pslugbase), 0)
        , Pslugbase
            || COALESCE((SELECT '-'::text || (max(slugindex) + 1)::text
                FROM  ks.shares s
                WHERE s.slugbase = Pslugbase), '')
        , Puserid
    RETURNING id, modified, slug INTO Vnewshare;

    -- INSERT into two other tables

    RETURN json_build_object (
        'data',json_build_object (
            'id',lpad(Vnewshare.id::text,public.padding_constant(),'0'),
            'modified',Vnewshare.modified,
            'slug',Vnewshare.slug
        )
    );
    EXCEPTION WHEN unique_violation THEN
        -- do nothing, and loop to try the INSERT again
END;
END LOOP;

(I took this pattern from an answer on implementing UPSERT but can't find it!)

However, when I deployed this function, my DB instance got stuck twice on long-running queries in which this function was being called concurrently, so I seem to have replaced a race condition with a deadlock or an infinite loop:

 pid  |    duration     |                       query                        | state
------+-----------------+----------------------------------------------------+--------
 8187 | 01:47:35.477316 | select ks.post_share($1::text,$2::text,$3::bigint) | active
 1188 | 01:57:56.955747 | select ks.post_share($1::text,$2::text,$3::bigint) | active
(2 rows)

Is the approach to use a LOOP here the wrong way to go?
Is there any way to understand what is causing the query to get stuck? It does not happen during load testing but appeared twice in production (which seems counterintuitive).
Should I just return the conflict to the calling process and let the application handle retry?

Any thoughts would be greatly appreciated.

Why do you use INSERT .. VALUES instead of INSERT .. SELECT? Additionally, you can lock the table during the query, and then there will be no conflicts... — Akina, Dec 11 '18 at 09:31
@a_horse_with_no_name Instead of the exception? So put the ON CONFLICT clause inside the loop and leave it empty? — Russell Ormes, Dec 11 '18 at 11:00
@Akina I don't have a specific reason. Is INSERT...SELECT better? Will locking the table have any adverse effect on performance under heavy load? — Russell Ormes, Dec 11 '18 at 11:02
@a_horse_with_no_name Ah, that's the bit I don't understand then. What would I put in the ON CONFLICT clause to call the same insert and then, if it conflicts again, call it again? Essentially I want it to keep trying until it finds a slug that is not used. (We have many people saving to this table with the same slug base.) — Russell Ormes, Dec 11 '18 at 11:04
Is INSERT...SELECT better? Of course - one query is better than 2 queries. Will locking the table have any adverse effect on performance under heavy load? Of course. — Akina, Dec 11 '18 at 11:22

score 1 · Answer 1 · answered Dec 11 '18 at 14:56

You don't mention the isolation level, but if it's run under the REPEATABLE READ isolation level, the SELECT part will never see the new values that other transactions inserted concurrently, so this code may be stuck in its loop forever.
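Both points can be checked from another session while the calls are stuck. The following is only a sketch: it assumes PostgreSQL 9.6 or later (for wait_event_type, wait_event and pg_blocking_pids) and filters on the function name from the question. Note that SHOW reports the setting of the session that runs it, so it should be run over the same kind of connection the application uses.

```sql
-- Isolation level of the current transaction and the session default.
SHOW transaction_isolation;
SHOW default_transaction_isolation;

-- What are the long-running post_share calls doing right now?
SELECT pid,
       state,
       wait_event_type,
       wait_event,
       pg_blocking_pids(pid) AS blocked_by,   -- empty array if not blocked
       now() - xact_start    AS xact_age,
       query
FROM   pg_stat_activity
WHERE  query ILIKE '%post_share%'
AND    pid <> pg_backend_pid();               -- exclude this monitoring session
```

A backend that is active with a NULL wait_event and an empty blocked_by is busy executing, which would be consistent with a loop that never exits. A wait_event_type of Lock together with a non-empty blocked_by would point to a lock wait instead.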

Should I just return the conflict to the calling process and let the application handle retry?

Yes, it would make sense. The key is to retry in a different transaction, so it gets a fresh new snapshot, plus it releases whatever locks that may be blocking other transactions.
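If a few quick retries inside the function are still wanted, one option is to cap them and then re-raise the error so the caller sees the conflict. This is a minimal sketch under stated assumptions: the function name ks.post_share_bounded and the limit of 3 attempts are hypothetical, and ks.shares is assumed to need no columns beyond the four from the question.

```sql
-- Sketch only: bounded retry that hands unique_violation back to the caller.
CREATE OR REPLACE FUNCTION ks.post_share_bounded(Pslugbase text, Puserid bigint)
RETURNS text AS $$
DECLARE
    Vattempts    int := 0;
    Vmaxattempts CONSTANT int := 3;   -- illustrative limit
    Vslug        text;
BEGIN
    LOOP
        -- Local variables keep their values when the inner block is rolled back.
        Vattempts := Vattempts + 1;
        BEGIN
            INSERT INTO ks.shares (slugbase, slugindex, slug, userid)
            SELECT Pslugbase,
                   COALESCE(max(s.slugindex) + 1, 0),
                   Pslugbase || COALESCE('-' || (max(s.slugindex) + 1)::text, ''),
                   Puserid
            FROM   ks.shares s
            WHERE  s.slugbase = Pslugbase
            RETURNING slug INTO Vslug;

            RETURN Vslug;
        EXCEPTION WHEN unique_violation THEN
            IF Vattempts >= Vmaxattempts THEN
                RAISE;   -- re-raise the original error (SQLSTATE 23505)
            END IF;
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```

The application can then catch SQLSTATE 23505, roll back, and call the function again in a new transaction. Under REPEATABLE READ the in-function retries would all see the same snapshot, so there only the application-level retry helps.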

Another approach would be to use a sequence instead of max(slugindex)+1, provided the numeric part of the slug may be globally unique rather than unique per slug base. The answer you linked is based on the premise that they want abc-1 and xyz-1 as opposed to abc-1 and xyz-2, which is why a sequence is not an option there. But it's not obvious from your SELECT that you share that requirement.
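If per-base numbering does turn out to be required, a narrower alternative to locking the whole table (as suggested in the comments) is a transaction-level advisory lock keyed on the slug base. This is a different technique from the one in the linked answer, shown only as a sketch. It assumes READ COMMITTED, that every writer goes through the same function, and that ks.shares needs no columns beyond the four from the question. The function name is hypothetical, and hashtext() is an undocumented built-in used here to turn the text into a lock key.

```sql
-- Sketch only: serialize inserts per slug base with an advisory lock.
CREATE OR REPLACE FUNCTION ks.post_share_locked(Pslugbase text, Puserid bigint)
RETURNS text AS $$
DECLARE
    Vslug text;
BEGIN
    -- Blocks until no other transaction holds the lock for this key.
    -- The lock is released automatically at commit or rollback.
    PERFORM pg_advisory_xact_lock(hashtext(Pslugbase));

    -- Under READ COMMITTED this statement takes a fresh snapshot, so it
    -- sees rows committed by whoever held the lock before us.
    INSERT INTO ks.shares (slugbase, slugindex, slug, userid)
    SELECT Pslugbase,
           COALESCE(max(s.slugindex) + 1, 0),
           Pslugbase || COALESCE('-' || (max(s.slugindex) + 1)::text, ''),
           Puserid
    FROM   ks.shares s
    WHERE  s.slugbase = Pslugbase
    RETURNING slug INTO Vslug;

    RETURN Vslug;
END;
$$ LANGUAGE plpgsql;
```

Two different slug bases that happen to hash to the same key would simply wait for each other, which costs some concurrency but does not affect correctness. Writers with the same base queue up for the duration of the transaction, so the transaction should be kept short.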

score 0 · Answer 2 · answered Dec 12 '18 at 12:20

Thanks to comments from @Akina and the answer from @Daniel I refactored my solution to avoid the infinite loop.

I am sure that Daniel's suggestion about transaction isolation level is relevant and my attempt at retrying the index lookup was getting stuck (although it is not clear why this only happens some of the time and only under high load).

I could not think how to implement this using ON CONFLICT as it does not seem to have the concept of retrying the same INSERT with a different value for the conflicting column. It seems to support UPDATE on the conflicting row, not creation of a new record with adjusted values.
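For completeness, ON CONFLICT also has a DO NOTHING form, and in PL/pgSQL the FOUND variable tells you whether the INSERT actually added a row, so the retry can be written without an exception block. The following is a sketch I have not load tested. It assumes a unique constraint or index on ks.shares.slug, uses the sequence created further down in this answer, and the function name is hypothetical.

```sql
-- Sketch only: retry with ON CONFLICT DO NOTHING instead of catching unique_violation.
CREATE OR REPLACE FUNCTION ks.post_share_on_conflict(Pslug text, Puserid bigint)
RETURNS text AS $$
DECLARE
    Vslugidx bigint := 0;
    Vslug    text   := Pslug;   -- first attempt uses the bare slug
    Vnewslug text;
BEGIN
    LOOP
        INSERT INTO ks.shares (slugbase, slugindex, slug, userid)
        VALUES (Pslug, Vslugidx, Vslug, Puserid)
        ON CONFLICT (slug) DO NOTHING
        RETURNING slug INTO Vnewslug;

        -- FOUND is true only if the INSERT added a row.
        EXIT WHEN FOUND;

        -- Slug already taken: take the next global index and try again.
        Vslugidx := nextval('ks.shares_slugindex_seq');
        Vslug    := Pslug || '_' || Vslugidx::text;
    END LOOP;

    RETURN Vnewslug;
END;
$$ LANGUAGE plpgsql;
```

Because each retry takes a new value from the sequence, the loop does not depend on seeing rows inserted by other transactions.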

I took on Daniel's point about using one global index rather than a separate index per slug base (the per-base numbering was a carry-over from the legacy system), and after a quick check with the product owner I used that to eliminate the race condition.

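CREATE SEQUENCE ks.shares_slugindex_seq OWNED BY ks.shares.slug;
-- Set the sequence to the highest index we already have.
-- Although this would eventually happen anyway from the first loop!
-- GREATEST(..., 1) avoids an out-of-bounds error from setval when the highest
-- index is 0 or the table is empty (the sequence minimum defaults to 1).
SELECT setval('ks.shares_slugindex_seq', GREATEST((SELECT max(slugindex) FROM ks.shares), 1));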


CREATE OR REPLACE FUNCTION ks.post_share(
      Pslug        text     -- whatever preceded this parameter was omitted from the post
    , Puserid      bigint
)
RETURNS JSON AS $$
DECLARE
    Vnewshare RECORD;
    Vslugidx  int := 0;
    Vslugbase text := Pslug;
BEGIN
    CASE
        WHEN Pslug IS NULL THEN RAISE null_value_not_allowed USING MESSAGE = 'Pslug parameter is required.'; -- 22004
        WHEN Puserid IS NULL THEN RAISE null_value_not_allowed USING MESSAGE = 'Puserid parameter is required.'; -- 22004
    ELSE
        LOOP
            BEGIN
                INSERT INTO ks.shares ( slugbase, slugindex, slug, userid)
                SELECT Vslugbase, Vslugidx, Pslug, Puserid
                RETURNING id, modified, slug INTO Vnewshare;
                -- Exit when no exception
                EXIT;
            EXCEPTION WHEN unique_violation THEN
                -- add index to slug and loop to try the INSERT again
                Vslugidx := nextval('ks.shares_slugindex_seq');
                -- build from the base so suffixes do not pile up on repeated conflicts
                Pslug := Vslugbase || '_' || Vslugidx::text;
            END;
        END LOOP;

    --- Do some inserts into other tables using Vnewshare
        RETURN json_build_object (
            -- Return the new record
        );
    END CASE;
END;
$$ LANGUAGE plpgsql;

I chose to try the insert and catch the exception rather than first checking whether the proposed slug already exists and appending the index if it does. A SELECT followed by an INSERT would put me back in the same race condition I was trying to solve.

Thanks for your help!

I am keeping the slugbase and slugindex columns in case I get told to put it back in a month's time! — Russell Ormes, Dec 12 '18 at 12:21
