Constraint on multiple columns?

Question

I have a table edges, which describes relations in a graph:

CREATE TABLE IF NOT EXISTS edges (
   src INT NOT NULL REFERENCES nodes(id) ON UPDATE CASCADE ON DELETE CASCADE
  ,tgt INT NOT NULL REFERENCES nodes(id) ON UPDATE CASCADE ON DELETE CASCADE
  ,rel TEXT NOT NULL
  ,rel_type INT NOT NULL
  ,PRIMARY KEY (src, tgt, rel)
  ,UNIQUE (src, tgt, rel)
  );

After inserts:

select * from edges;
 src | tgt |    rel    | rel_type 
-----+-----+-----------+----------
   1 |   2 | 5.4.2.2   |        2
   2 |   3 | 5.3.1.9   |        2
   ...
   5 |   6 | 2.7.1.2   |        1
   5 |   6 | 2.7.1.147 |        1
   6 |   2 | 5.3.1.9   |        2
   6 |   3 | 5.3.1.9   |        2
   ...

I am using rel_type to specify edge directionality (0: undirected; 1: source to target; 2: bidirectional).

Hence, inserting (3, 2, '5.3.1.9', 2) is redundant with respect to the second entry above (for example) -- which already expresses the reciprocal relationship 2 --> 3 and 3 --> 2.

How can I add a constraint that prevents the insertion of those redundant, reciprocal relations -- ideally ON CONFLICT DO NOTHING?

Basically, something like (these don't work: first due to syntax; second due to other issues):

ALTER TABLE edges ADD CONSTRAINT duplicate_rel_check CHECK ((src, tgt) <> (tgt, src) WHERE rel_type = 2);

or

CREATE UNIQUE INDEX ON edges ( greatest(src, tgt, rel_type=2), least(tgt, src, rel_type=2) );

Erwin Brandstetter · Accepted Answer · 2019-06-21T02:34:41.330

4

A UNIQUE, multicolumn, partial expression index should do the trick:

CREATE UNIQUE INDEX ON edges (LEAST(src, tgt), GREATEST(src, tgt), rel)
WHERE rel_type = 2;

Works with INSERT ... ON CONFLICT DO NOTHING.

edited Jun 21 '19 at 02:34

answered Jun 21 '19 at 01:18

Erwin Brandstetter

175,982
27
439
600

Awesome; thank you! Re: cidr: these are biomolecular data (5.4.2.2 etc. are enzyme codes, representing the edges: https://www.genome.jp/dbget-bin/www_bget?ec:5.4.2.2 ...). I have another table (nodes), that specifies the source and target nodes, e.g. C00113, d-glucose 1-phoshate, https://www.genome.jp/dbget-bin/www_bget?C00103) ... https://www.genome.jp/kegg-bin/show_pathway?org_name=hsadd&mapno=00010 – Victoria Stuart Jun 21 '19 at 02:08
1

@VictoriaStuart: I see. The sample just looked like IP addresses. :) – Erwin Brandstetter Jun 21 '19 at 02:34
I nonetheless much appreciated the heads-up on that datatype. ;-) – Victoria Stuart Jun 21 '19 at 03:27
@victoria: just curious - would those sorts of data not better be represented and queried as RDF triples ? – Albert Godfrind Jun 22 '19 at 07:21
@AlbertGodfrind: yes; I am storing data in PostgreSQL, and constructing relational property graphs from those data. – Victoria Stuart Jun 22 '19 at 13:57

Constraint on multiple columns?

1 Answers1

Linked