Postgresql - What is the best strategy to safely move postgresql backup?

Requirement: PostgreSQL backups should be copied to another server on a daily basis, using a script, to keep them safe.

Problem Statement:

  1. Identify the name of the backup folder created on a daily basis
  2. Identify the associated WAL segments

Why are WAL segments required? Without the proper WAL segments, PostgreSQL cannot be started from the restored backup and fails with errors like:

INFO: archive-get command end: aborted with exception [119]
LOG:  could not open file "pg_xlog/###" (log file #, segment ##): No such file or directory

Solutions:

Option 1 - More coding to create the copy script

  1. Get the name of the daily backup folder from the last line of /var/lib/pgbackrest/backup/bahmni-postgres/backup.info, then copy this folder.

  2. Getting the names of the WAL segments to copy (to keep the backup consistent) is tricky, because all WAL segments are archived in the same archive folder /var/lib/pgbackrest/archive/. To identify them, open /var/lib/pgbackrest/backup/bahmni-postgres/<name of the folder identified in step 1 above>/backup.manifest and read:

    [backup]
    backup-archive-start="<start of WAL>"
    backup-archive-stop="<end of WAL>"
    

Then copy all WAL files from <start of WAL> to <end of WAL> out of /var/lib/pgbackrest/archive/.

  3. The backup folder from step 1 plus the WAL segment files from step 2 together keep the backup consistent, so it can be restored on another machine.
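The two steps above might be sketched like this. This is an untested helper, not the Bahmni tooling: the repository path, stanza name `bahmni-postgres`, and manifest keys come from this thread, while the label pattern for backup.info and the destination path are assumptions.

```shell
#!/bin/sh
# Sketch of Option 1 (hypothetical helper; layout assumed from this thread).
copy_latest_backup() {
    repo=$1; stanza=$2; dest=$3

    # Step 1: newest backup label in backup.info (labels look like 20230101-120000F).
    label=$(grep -oE '^[0-9]{8}-[0-9]{6}F[^=]*' "$repo/backup/$stanza/backup.info" | tail -n 1)

    # Step 2: WAL range recorded in that backup's manifest.
    manifest="$repo/backup/$stanza/$label/backup.manifest"
    wal_start=$(sed -n 's/^backup-archive-start="\(.*\)"$/\1/p' "$manifest")
    wal_stop=$(sed -n 's/^backup-archive-stop="\(.*\)"$/\1/p' "$manifest")

    # Copy the backup folder itself.
    cp -a "$repo/backup/$stanza/$label" "$dest/"

    # Copy every archived segment whose 24-char name lies in [start, stop];
    # WAL names are fixed-width hex, so lexicographic order is chronological.
    find "$repo/archive/$stanza" -type f | while read -r f; do
        seg=$(basename "$f" | cut -c1-24)
        if printf '%s\n' "$wal_start" "$seg" "$wal_stop" | sort -c 2>/dev/null; then
            cp -a "$f" "$dest/"
        fi
    done
}

# Example (paths from this thread, destination hypothetical):
# copy_latest_backup /var/lib/pgbackrest bahmni-postgres /mnt/backup-store
```

The `sort -c` trick keeps the range check POSIX-portable instead of relying on bash-only string comparison operators.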

Option 2 - Moderate coding to create copy script

Copy the WAL segments, which are stored in the /var/lib/pgsql/9.6/data/pg_xlog/ folder, into the backup folders using the --archive-copy option, as mentioned here
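With pgbackrest, --archive-copy stores the WAL segments needed for consistency inside the backup folder itself, so copying that single folder is enough. A sketch of the configuration, assuming the stanza name used elsewhere in this thread:

```ini
# /etc/pgbackrest.conf (fragment)
[bahmni-postgres]
# copy the required WAL segments into each backup folder
archive-copy=y
```

Or pass it directly on the command line when taking the backup: `pgbackrest --stanza=bahmni-postgres --archive-copy backup`.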

Option 3 - Easy coding to create copy script

While copying a specific backup folder, also copy the entire /var/lib/pgbackrest/archive/ folder. Size is the main issue with this option: the archive folder stores all previous WAL segments too.
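Option 3 then reduces to mirroring the whole repository. A minimal sketch (the destination in the usage comment is hypothetical):

```shell
#!/bin/sh
# Mirror the whole pgbackrest repository (both backup/ and archive/).
# -a preserves permissions/timestamps; --delete makes the copy an exact mirror.
mirror_repo() {
    rsync -a --delete "$1" "$2"
}

# Example with this thread's path and a hypothetical backup host:
# mirror_repo /var/lib/pgbackrest/ backupuser@backup-host:/backups/pgbackrest/
```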


The safer option for a PostgreSQL backup seems to be the pg_dump command, which generates SQL for all data across all tables in a database and can then be restored into ANY version of a PostgreSQL DB.

This seems a less geeky way of doing things than a file-system backup plus WAL segments. Filesystem/WAL backups seem to be recommended when the DB is large and we want continuous-replay backups with zero downtime. Also, they can get complex to restore if the PostgreSQL versions are different. Isn’t it?

Can you provide more info on why you recommend this approach over pg_dump?

Absolutely right! pg_dump is the safest, for the following reasons:

  1. No worries about WAL segments
  2. Safe migration from lower versions to higher versions, e.g. Bahmni 0.91 uses PostgreSQL 9.2 but Bahmni 0.92 onwards uses PostgreSQL 9.6, and the only option to migrate in such a case is pg_dump.
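For reference, the pg_dump route is just a couple of commands. This is a fragment that assumes a running server; the database name, user, and file path are placeholders, not taken from Bahmni docs:

```shell
# On the old server: dump one database to plain SQL.
pg_dump -U postgres -d mydb -f /tmp/mydb.sql

# On the new server (possibly a newer PostgreSQL): create the DB and replay the SQL.
createdb -U postgres mydb
psql -U postgres -d mydb -f /tmp/mydb.sql
```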

So why were the options above proposed?

  1. In most cases, production backups happen at night and are copied to another backup store. The proposed options are for safely copying backups generated using Bahmni commands to a backup store.
  2. Bahmni uses pgbackrest for backup, which is different from pg_dump.
  3. Bahmni-generated backups are stored in /var/lib/pgbackrest/. It has two folders, archive and backup. One may tend to ignore the archive folder (going by its name) and copy only the backup folder, which will not work in the absence of the WAL segments that live in the archive folder (unless the --archive-copy option was used with pgbackrest).
  4. The Bahmni Wiki does not specify how to safely move a backup generated using Bahmni commands from a production server to a backup store.

Thanks for explaining your rationale @ramashish

I guess Bahmni Wiki needs to have some better documentation (and tooling) on making it easier to do daily/weekly backups and restore – within same or different Postgres versions.

I will keep this in mind. Thanks!