3

Yes, fillfactor again. I spend many hours reading and I can't decide what's best for each case. I don't understand when and how fragmentation happens. I'm migrating a database from MS SQL Server to PostgreSQL 9.2.

Case 1

10-50 inserts / minute in a sequential (serial) PK, 20-50 reads / hour.

CREATE TABLE dev_transactions (
  transaction_id serial NOT NULL,
  transaction_type smallint NOT NULL,
  moment timestamp without time zone NOT NULL,
  gateway integer NOT NULL,
  device integer NOT NULL,
  controler smallint NOT NULL,
  token integer,
  et_mode character(1),
  status smallint NOT NULL,
  CONSTRAINT pk_dev_transactions PRIMARY KEY (transaction_id)
);

Case 2

Similar structure, index for serial PK, writes in blocks (one shot) of ~ 50.000 registers every 2 months, readings 10-50 / minute.

Does a 50% fillfactor mean that each insert generates a new page and moves 50% of existing rows to a newly generated page?

Does a 50% fillfactor mean frees space is allocated between physical rows in new data pages?

A new page is generated only if there is no free space left in existing pages?

As you can see I'm very confused; I would appreciate some help — maybe a good link to read about PostgreSQL and index fillfactor.

Erwin Brandstetter
  • 539,169
  • 125
  • 977
  • 1,137
HMarioD
  • 804
  • 10
  • 18

1 Answers1

13

FILLFACTOR

With only INSERT and SELECT you should use a FILLFACTOR of 100 everywhere. There is no point in leaving wiggle room per memory block if you are not going to "wiggle" with UPDATEs.

The mechanism behind FILLFACTOR is simple. INSERTs only fill data pages (usually 8 kB blocks) up to the percentage declared by the FILLFACTOR setting. Also, whenever you run VACUUM FULL or CLUSTER on the table, the same wiggle room per block is re-established. Ideally, this allows UPDATE to store new row versions in the same data page, which can provide a substantial performance boost when dealing with lots of UPDATEs. Also beneficial in combination with H.O.T. updates. See:

If there are no updates, don't waste space for this and set FILLFACTOR = 100 (which is the default for tables).

Basic source of information: the manual for CREATE TABLE and CREATE INDEX.

Other optimization

But you can do something else - since you seem to be a sucker for optimization ... :)

CREATE TABLE dev_transactions(
  transaction_id serial PRIMARY KEY
, gateway integer NOT NULL
, moment timestamp NOT NULL
, transaction_type smallint NOT NULL
, status smallint NOT NULL
, device integer NOT NULL
, controler smallint NOT NULL
, token integer
, et_mode character(1)
);

This optimizes your table with regard to data alignment and avoids padding for a typical 64 bit server and saves a few bytes, probably just 8 byte on average - you typically can't squeeze out much with "column tetris:

Keep NOT NULL columns at the start of the table for a very small performance bonus.

Your table has 9 columns. This means an extra 8 bytes for the extended NULL bitmap - which would fit into the initial 1-byte NULL bitmap for just 8 columns.
If you define et_mode and token NOT NULL, all columns are NOT NULLand the NULL bitmap is never used, freeing up 8 bytes per row.
This even works per row if you don't declare the columns NOT NULL. If all columns have values, there is no NULL bitmap for the row. In your special case, this leads to the paradox effect that filling in values for et_mode and token can make your storage size smaller or at least stay the same:

Basic source of information: the manual on Database Physical Storage.

Compare the size of rows (filled with values) with your original table to get definitive proof:

SELECT pg_column_size(t) FROM dev_transactions t;

(Plus maybe padding between rows, as the next row starts at a multiple of 8 bytes.)

Erwin Brandstetter
  • 539,169
  • 125
  • 977
  • 1,137
  • @Erwin thank you very much, this information is very useful and yes I'm a fk sucker for optimisation. I will accept your answer. If you have some link for I understand in deep how the fill factor work, I'll appreciate. – HMarioD Jan 06 '13 at 22:58
  • @HMarioD: I added some more explanation and links to my answer. – Erwin Brandstetter Jan 06 '13 at 23:12
  • Thanks, one more question, I will add a index for transaction_id with fill factor of 100. The UNIQUE keyword is not necessary because the field is PK, right? – HMarioD Jan 06 '13 at 23:25
  • @HMarioD: The *whole index* is not necessary because the field is the PK - which is implemented by way of a fully functional unique index in Postgres. You are already done here. ;) And yes, indexes inherit the fillfactor setting of the table (unless you specify otherwise). – Erwin Brandstetter Jan 06 '13 at 23:27
  • In ms SQL Server is different, a pk can't be considered as an index as long I know. Thanks a lot one more time I will give a trip to your others answer. – HMarioD Jan 06 '13 at 23:32