Composite index design where only a subset will be queried

Question

I have a table with columns A (typedate), B1 ... B24 (all boolean).

Generally, A will always be used in every query and only one of B1 ... B24. A large subset of the B columns (more than half) will generally be queried more than the rest, but all of them will be used fairly frequently.

Should I create indexes on 24 two-tuples of (A, B1) ... (A, B24)?
Or 1 huge composite index encompassing all the columns?

If I'm creating a lot of indexes, at what point am I adding too many indexes to the table and what is the approach to working that out? The table is used in a lot of core functionality and will be queried very frequently.

The B columns will only be filtered by true, and generally only one B column will be used in a single query. So the where clause of the query will generally look like:

... WHERE A = '2018-11-22' AND B12

Are you going to combine predicates on multiple B* columns in a single query or is it just (A, B1), (A, B13), etc.? But please disclose your version of Postgres first. — Erwin Brandstetter, Nov 22 '18 at 16:56
Generally only one B column will be used, e.g. WHERE A = foo AND B1 = bar — Isaac, Nov 22 '18 at 16:58
And am I right to assume the predicate will be ... AND B1 = true (or simply ... AND B1)? Meaning your queries will retrieve rows where the BN column is set (true)? — Erwin Brandstetter, Nov 22 '18 at 17:03
Please [edit] to collect the whole information in your question for the benefit of others. — Erwin Brandstetter, Nov 22 '18 at 17:16

Erwin Brandstetter · Accepted Answer · 2018-11-22T17:29:49.550

Consider N partial indexes:

CREATE INDEX tbl_a_b1_idx ON tbl (A) WHERE B1;

All together only occupying slightly more space than a plain btree index on (A). You can have hundreds of indexes like that without problem. (I had a case with ~ 250.) The number is certainly not exhausted by the maximum of possible columns (250 - 1600 depending on column types).

Since each partial index is comparatively small, this will typically benefit performance as long as query patterns match.

Additional indexes may be useful and adding more index expressions may make sense to get index-only scans ....

Using a current version of Postgres is advisable in any case, but for this in particular, since there have been a number of optimizations for partial indexes lately.

To create all DDL statements:

SELECT format ('CREATE INDEX tbl_a_b%1$s_idx ON tbl(a) WHERE b%1$s;', g)
FROM generate_series(1, 24) g;

Returns:

CREATE INDEX tbl_a_b1_idx ON tbl(a) WHERE b1;
CREATE INDEX tbl_a_b2_idx ON tbl(a) WHERE b2;
...

PostgreSQL multi-column unique constraint and NULL values

Composite index design where only a subset will be queried

1 Answers1