I need to add a new BIGSERIAL column to a huge table (~3 billion records). This question is similar to what I need to do, and the accepted answer has helped me somewhat, but I'm still wondering about something.

In my case, the table already has a BIGSERIAL column which is the primary key, but many rows have been deleted, so there are now gaps. (The table has subsequently been fully vacuumed.) I need to regenerate the values so that they are sequential again. Here are 5 example rows of what I want to achieve, taken from the range where new_value > 1000:
+-----------+-----------+
| old_value | new_value |
+-----------+-----------+
|      1026 |      1001 |
|      1027 |      1002 |
|      1030 |      1003 |
|      1032 |      1004 |
|      1039 |      1005 |
+-----------+-----------+
I have successfully implemented the alternative approach mentioned in the referenced answer (CREATE TABLE ... and then INSERT INTO new_table SELECT * FROM ...), but I would also like to attempt, and benchmark against, the initial suggestion. The problem, however, is that I don't know whether new_value will be generated in the same order as old_value, which is a requirement.
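For reference, the copy-into-a-new-table pattern has roughly this shape (a sketch only: the real column list of existing_table is omitted, new_table is a placeholder name, and the ORDER BY is there so that the BIGSERIAL default is drawn in old_value order):

-- Clone the column definitions (no data, no indexes), then append the new column last.
CREATE TABLE new_table (LIKE existing_table);
ALTER TABLE new_table ADD COLUMN new_value BIGSERIAL;

-- Copy the rows sorted by old_value; new_value is filled from its sequence default,
-- so the generated values come out gap-free, following the old ordering.
INSERT INTO new_table
SELECT *
FROM existing_table
ORDER BY old_value;

(Indexes and constraints would be recreated on new_table afterwards, which is also where the unique index mentioned in the comments below comes in.)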
How can I ensure the order of the new_value column follows/tracks the order of the old_value column when the new_value column is added using a statement like this:
ALTER TABLE existing_table ADD COLUMN new_value BIGSERIAL;
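Whichever way the column ends up being filled, a sanity check along these lines can confirm afterwards that the two orderings agree; it returns 0 when every row has the same rank by old_value as by new_value (a sketch using only the two columns shown above):

SELECT count(*) AS out_of_order
FROM (
    SELECT old_value
         , new_value
         , row_number() OVER (ORDER BY old_value) AS rank_by_old
         , row_number() OVER (ORDER BY new_value) AS rank_by_new
    FROM existing_table
) AS ranked
WHERE rank_by_old <> rank_by_new;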
A different approach
I also attempted the following, which works quite well on a small table, but is much slower than the alternative suggestion from the referenced answer on very large tables:
ALTER TABLE existing_table ADD COLUMN new_value BIGINT NOT NULL DEFAULT 0;
UPDATE existing_table AS existing
SET new_value = generated.new_id
FROM (
    SELECT original.old_value
         , row_number() OVER (ORDER BY original.old_value) AS new_id
    FROM existing_table AS original
) AS generated
WHERE existing.old_value = generated.old_value;
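One thing to note with this route (in case it matters for the benchmark): after the backfill the column has no sequence behind it, so something like the following is still needed to get BIGSERIAL-equivalent behaviour for future inserts (the sequence name here is my own choice):

-- Attach a sequence that continues from the highest assigned value,
-- and swap it in as the column default in place of the placeholder 0.
CREATE SEQUENCE existing_table_new_value_seq OWNED BY existing_table.new_value;
SELECT setval('existing_table_new_value_seq', (SELECT max(new_value) FROM existing_table));
ALTER TABLE existing_table ALTER COLUMN new_value SET DEFAULT nextval('existing_table_new_value_seq');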
Comments

"… identity columns. But I do agree with Laurenz: this seems rather futile. Gaps are not a problem. The only job of a generated primary key is to be unique. It is completely irrelevant if there are gaps or what the actual value is." – Apr 13 '22 at 07:13

"… domainevententry table used in Axon and it has grown too large. We found that many events/records that are stored are not required (if we make some changes to our code). We are therefore busy with a clean-up project. Axon makes use of tracking tokens which can handle gaps, but it adds additional/unnecessary load on the DB to keep track of the gaps, so it's best if the values follow nice and sequential on each other." – HeatZync Apr 13 '22 at 08:30

"… (aggregateidentifier, sequencenumber), but this index is only created after the clean-up/rebuild of the table is done." – HeatZync Apr 14 '22 at 07:12

"… GapAwareTrackingToken. When there are gaps then Axon could fire queries containing WHERE globalindex IN ( ... list of gaps ... ). When there are no gaps then the query doesn't use WHERE globalindex IN. This might not be a major issue for the live system, because it only cares about new rows that are inserted, but there is also the notion of a replay that's used in other circumstances, which starts at the first row and runs through the entire table." – HeatZync Apr 14 '22 at 10:37