28

I occasionally download publicly available data sets in the form of Postgres dBs. These datasets are updated/modified/expanded over time by the repository host.

Is there a Postgres command or tool (ideally FOSS) that can show the differences between an OLD and NEW Postgres database ? (A working assumption is that 95% of the entries are unchanged and that the tables and relationships also do not change).

Blaž Dakskobler
  • 1,075
  • 9
  • 15
CuriousGorge
  • 291
  • 1
  • 3
  • 4

4 Answers4

16

Have a look at the Stack Overflow Q & A, How to check difference between two databases in PostgreSQL?; in particular Another PostgreSQL Diff Tool (apgdiff) (schema comparison only). Apgdiff is recommended by Alexander Kuznetsov - a published database book author and bit of a guru.

Maybe also look at Liquibase or Compare Database Table Data.

There are also commercial products - e.g. Altova Database Differencing with DiffDog

Vérace
  • 29,825
  • 9
  • 70
  • 84
  • 2
    apgdiff doesn't compare table data, metaqa is last released 2008 and is broken. liquibase's database diff is not geared towards the OPs use-case: "Data differences (limited), not checked by default" – nyov Sep 01 '19 at 11:03
10

For schema comparisons, pgadmin4 has a Schema Diff feature (in beta as of 26 Jun 2020). https://www.pgadmin.org/

Rob Bednark
  • 2,163
  • 5
  • 21
  • 22
2

There is also:

1

If you need something fairly quick and dirty to compare schemas, I wrote this gist to help compare a running schema to a "latest" schema -- something maybe hosted in SCM. It's working fairly well for a ~50 tables project:

Assuming a database mydb, and a canonical database file defined in a file tables.sql, here is a way to tell if an operational schema differs from the canonical one:

pg_dump -U postgres -a mydb > mydbdata.sql
psql -U postgres -c "drop database mydbcanonical"
psql -U postgres -c "create database mydbcanonical"
psql -U postgres -d mydbcanonical -f tables.sql
psql -U postgres -d mydbcanonical -f mydbdata.sql

The gist is to see if the operational database's data will load cleanly into the canonical schema. These commands depend on the admin to visually parse the output of the last command where the data is loaded into the test database for errors. If there are no errors of concern, the schema is compliant.

Paul White
  • 83,961
  • 28
  • 402
  • 634
Buddy
  • 11
  • 2
  • Does not compare table data. – nyov Sep 01 '19 at 10:23
  • That was okay for my needs. I only needed to compare schemas - the data inside was not as important. I'm sure it's not far from being able to compare data if you can somehow control the order of the dump to be deterministic across versioins/installs/etc. – Buddy Sep 02 '19 at 13:22
  • 1
    It's okay. I was just frustrated that nobody had a good answer to the actual question sofar (Which specifically states "A working assumption is that 95% of the entries are unchanged and that the tables and relationships also do not change".) – nyov Sep 02 '19 at 13:34