Restore bug - Need help in Production

Versions installed

  1. While restoring, I got an error saying /var/lib/pgbackrest//backup_info.txt was not found. Please note the // before backup_info.txt. So I opened the pgbackrest.conf file and removed the extra / from the path so that it points to /var/lib/pgbackrest.
  2. Also, the backup had created backup.info, whereas restore looks for backup_info.txt. So I renamed backup.info to backup_info.txt.
  3. Now I get an error even though I have made sure that I copied the backed-up folders into /var/lib/pgbackrest. Please note this is a live database, so all services and everything else are working fine. Any help?

Please ignore the earlier message; I had forgotten to copy backup_info.txt, which has the timestamp. So backup_info.txt and backup.info are two different files in two different folders. But when I restore my backup, Postgres doesn't start.

Does it give any error, either on the console or in the logs?

No error on console or in log files. Last few lines of the log file:
2017-09-06 13:03:14.225 P01 INFO: local process 1 stop for backup-1
2017-09-06 13:03:14.230 P00 INFO: write /var/lib/pgsql/9.2/data/recovery.conf
2017-09-06 13:03:14.518 P00 INFO: restore global/pg_control (copied last to ensure aborted restores cannot be started)
2017-09-06 13:03:14.528 P00 INFO: restore command end: completed successfully

Which log file are these lines from? Also, how do you know that Postgres doesn't start?

These are from /var/log/pgbackrest/bahmni-postgres-restore. Towards the end of the restore, when it starts Postgres, it shows failed but ignoring. When I start it manually after the restore completes, I get

Additionally, the Postgres log file /var/lib/pgsql/9.2/data/pg_log/postgresql-Wed.log shows the following:
LOG: could not open file "pg_xlog/00000001000000000000001B" (log file 0, segment 27): No such file or directory
LOG: invalid checkpoint record
FATAL: could not locate required checkpoint record
HINT: If you are not restoring from a backup, try removing the file "/var/lib/pgsql/9.2/data/backup_label".
LOG: startup process (PID 28933) exited with exit code 1
LOG: aborting startup due to startup process failure
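For anyone reading the log above: the file name in the first LOG line encodes the same numbers that Postgres prints in parentheses. A minimal Python sketch, assuming the usual 24-hex-digit WAL naming (8 digits each for timeline, log file, and segment); the function name parse_wal_name is only illustrative:

```python
def parse_wal_name(name: str) -> dict:
    """Split a 24-hex-digit WAL segment file name into its three fields."""
    if len(name) != 24:
        raise ValueError("expected a 24-character WAL file name, got %r" % name)
    # Each field is 8 hexadecimal digits; int(..., 16) rejects non-hex input.
    return {
        "timeline": int(name[0:8], 16),
        "log": int(name[8:16], 16),
        "segment": int(name[16:24], 16),
    }


print(parse_wal_name("00000001000000000000001B"))
# → {'timeline': 1, 'log': 0, 'segment': 27}
```

This matches the "(log file 0, segment 27)" part of the message, so the segment Postgres wants at startup is the one that is missing from pg_xlog after the restore.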

@nawazshaikh did you solve this issue, and how did you proceed?

Please check the daily pgsql log in /var/lib/pgsql//data/pg_log/ and see whether it contains a meaningful error, such as a missing xlog file.
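If the log is long, a small script can pull out the relevant lines. This is only a sketch: it assumes the message has the form "could not open file "pg_xlog/<24 hex digits>"" as in the log posted earlier in this thread, and the name find_missing_wal is made up for illustration. Read the text from whichever daily log file applies to your installation.

```python
import re

# Accepts straight or typographic quotes around the path, since pasted logs vary.
MISSING_WAL = re.compile(
    r'could not open file ["\u201c]?(pg_xlog/[0-9A-Fa-f]{24})["\u201d]?'
)


def find_missing_wal(log_text: str) -> list:
    """Return the pg_xlog paths that Postgres reported it could not open."""
    return MISSING_WAL.findall(log_text)


sample = (
    'LOG: could not open file "pg_xlog/00000001000000000000001B" '
    "(log file 0, segment 27): No such file or directory\n"
    "LOG: invalid checkpoint record\n"
)
print(find_missing_wal(sample))
# → ['pg_xlog/00000001000000000000001B']
```

An empty list means no missing-segment message was found in that text, in which case the cause is probably something else.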

@ramashish I think the problem is somewhere else. On the 4th of Feb. I had two folders in var/lib/pgbackrest/backup/bahmni-postgres named

/20210202 and /20210203

Then I ran a new backup, the system made changes, and the backup folders are now

/20210203 and /20210204

As I read in the documentation, this can happen because of the retention_limit = 2 parameter in the bahmni-backrest.conf file.
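To illustrate what I mean, here is a simplified Python model of a "keep the newest N" rule applied to the folder names above. It is not pgBackRest's actual expiry logic (which as far as I understand also deals with dependent backups and archived WAL); apply_retention and its parameters are hypothetical names.

```python
def apply_retention(existing, new_backup, keep):
    """Return (kept, expired) folder names after adding new_backup.

    Simplified model: folder names sort chronologically and only the
    newest `keep` folders survive.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    backups = sorted(set(existing) | {new_backup})
    if len(backups) <= keep:
        return backups, []
    return backups[-keep:], backups[:-keep]


print(apply_retention(["20210202", "20210203"], "20210204", keep=2))
# → (['20210203', '20210204'], ['20210202'])
```

With a limit of 2, adding 20210204 pushes 20210202 out, which is exactly the change I see in the backup directory.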

How can I roll back my backups and get back the old folder 20210202?

Also, can the information under /var/lib/pgbackrest/archives/ help to restore old data?

I'm running Bahmni 0.92 on CentOS 7.6 with PostgreSQL 9.6.