I am using Fedora 15 with PostgreSQL 9.1.4. Fedora crashed recently, and since then an attempt to start the PostgreSQL server:
service postgresql-9.1 start
gives
Starting postgresql-9.1 (via systemctl): Job failed. See system logs and 'systemctl status' for details.
[FAILED]
However, the server does start normally the first time I start it after a system reboot.
But an attempt to use psql gives this error:
psql: could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/tmp/.s.PGSQL.5432"?
The .s.PGSQL.5432 file is not present anywhere on the system; running locate .s.PGSQL.5432 outputs nothing.
The system log has this:
Aug 14 17:31:58 localhost systemd[1]: postgresql-9.1.service: control process exited, code=exited status=1
Aug 14 17:31:58 localhost systemd[1]: Unit postgresql-9.1.service entered failed state.
A
systemctl status postgresql-9.1.service
gives
postgresql-9.1.service - SYSV: PostgreSQL database server.
Loaded: loaded (/etc/rc.d/init.d/postgresql-9.1)
Active: failed since Tue, 14 Aug 2012 17:31:58 +0530; 58s ago
Process: 2811 ExecStop=/etc/rc.d/init.d/postgresql-9.1 stop (code=exited, status=1/FAILURE)
Process: 12423 ExecStart=/etc/rc.d/init.d/postgresql-9.1 start (code=exited, status=1/FAILURE)
Main PID: 2551 (code=exited, status=1/FAILURE)
CGroup: name=systemd:/system/postgresql-9.1.service
I had not changed the default setting of fsync, so I am guessing it was set to on. I am on an HDD. The HDD crashed.
HDD crash
After the HDD crash I had to run a manual fsck from a command prompt (not GUI based), which repaired a gazillion inodes and so on. After that I restarted the system with Ctrl+Alt+Delete.
PostgreSQL's log has this:
LOG: database system was interrupted; last known up at 2012-08-14 17:31:57 IST
LOG: database system was not properly shut down; automatic recovery in progress
LOG: record with zero length at 0/41A4E58
LOG: redo is not required
FATAL: could not access status of transaction 1
DETAIL: Could not open file "pg_multixact/offsets/0000": No such file or directory.
LOG: startup process (PID 13016) exited with exit code 1
LOG: aborting startup due to startup process failure
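Since the FATAL line names a specific file, one quick check is whether that segment really is gone from the data directory or merely unreadable. The following is only a minimal sketch: the function name check_multixact is hypothetical, and the data directory path in the usage comment is an assumption that has to match the actual installation.

```shell
#!/bin/sh
# Hypothetical helper: report whether the multixact offsets segment named
# in the FATAL message exists under a given data directory.
check_multixact() {
    datadir="$1"   # assumed location, e.g. /var/lib/pgsql/9.1/data
    segment="$datadir/pg_multixact/offsets/0000"
    if [ -f "$segment" ]; then
        echo "present: $segment"
        return 0
    else
        echo "missing: $segment"
        return 1
    fi
}

# Usage (as a user that can read the data directory):
# check_multixact /var/lib/pgsql/9.1/data
```

If the file is reported missing after an fsck that repaired many inodes, the file system repair itself may have removed or orphaned it, which would match the error above.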
Update
Trying to start the server after taking a file system level copy of the /var/lib/pgsql directory and running ./pg_resetxlog -f /var/lib/pgsql/9.1/data/ still yields:
LOG: database system was interrupted; last known up at 2012-08-14 18:46:36 IST
LOG: database system was not properly shut down; automatic recovery in progress
LOG: record with zero length at 0/6000078
LOG: redo is not required
FATAL: could not access status of transaction 1
DETAIL: Could not open file "pg_multixact/offsets/0000": No such file or directory.
LOG: startup process (PID 13766) exited with exit code 1
LOG: aborting startup due to startup process failure
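For reference, the file system level copy mentioned in the update could look roughly like the sketch below. It assumes the server is stopped; the function name backup_datadir, the timestamped target name and the paths in the usage comment are all illustrative, not something PostgreSQL provides.

```shell
#!/bin/sh
# Hypothetical helper: copy a stopped cluster's directory to a timestamped
# backup before attempting any repair on the original.
backup_datadir() {
    src="$1"    # directory to copy, e.g. /var/lib/pgsql (assumed)
    dest="$2"   # directory that will hold the copy (assumed to have space)
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    target="$dest/pgdata-backup-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$dest" || return 1
    # -a copies recursively and preserves permissions and timestamps
    # (and ownership when run as root).
    cp -a "$src" "$target" || return 1
    echo "$target"   # print where the copy ended up
}

# Usage (as root, with the server stopped):
# backup_datadir /var/lib/pgsql /root/pg-backups
```

Only once such a copy exists does it seem safe to run pg_resetxlog -f against the original, since that tool rewrites control information in place.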
pg_resetxlog didn't do any good, so you're into fun territory. Do you have a backup of this database from before the crash? – Craig Ringer Aug 14 '12 at 13:25
[...] pg_multixact/offsets/0000 that Pg would accept... – Craig Ringer Aug 14 '12 at 13:32
[...] sudo -u postgres postgres --single -D /var/lib/pgsql/data postgres ... though I don't rate the chances. – Craig Ringer Aug 14 '12 at 13:35
[...] backend> sql prompt. It's possible to do limited recovery and repair. If it can start with the postgres database (last argument) try it again with your database of interest, see if you can select anything out of any tables. – Craig Ringer Aug 14 '12 at 13:39