From 409b3a186f0f267dafbff04e7a74e262d6e212ad Mon Sep 17 00:00:00 2001
From: Radko Krkos <krkos@cesnet.cz>
Date: Mon, 28 Feb 2022 08:33:23 +0100
Subject: [PATCH] Update the PostgreSQL upgrade documentation

* Upgrade steps to PostgreSQL v14 from v13 are added.
* Upgrade steps for older versions are removed, a warning directing
  the user to older Mentat versions for them is added.

(Redmine issue: #7555)
---
 doc/sphinx/_doclib/upgrading.rst | 428 ++++++++-----------------------
 1 file changed, 110 insertions(+), 318 deletions(-)

diff --git a/doc/sphinx/_doclib/upgrading.rst b/doc/sphinx/_doclib/upgrading.rst
index e096d82e..b7ec2da7 100644
--- a/doc/sphinx/_doclib/upgrading.rst
+++ b/doc/sphinx/_doclib/upgrading.rst
@@ -115,223 +115,27 @@ Upgrading to Mentat 2.8
 or also proceed with the merge.
 
 
-.. _section-upgrading-postgresql-10:
+.. _section-upgrading-postgresql-13:
 
-Upgrading PostgreSQL from 10.x to 11.x
+Upgrading PostgreSQL from 13.x to 14.x
 --------------------------------------------------------------------------------
 
 Following checklist describes the steps necessary to upgrade the PostgreSQL database
-from version ``10.x`` to ``11.x``.
+from version ``13.x`` to ``14.x``.
 
 .. warning::
 
-    Please be aware, that the database upgrade is NOT a straightforward operation.
-    It can take a lot of time depending on the size of the current database,
-    because the data files need to be converted to new format.
-
-.. code-block:: shell
-
-    # Launch tmux or screen.
-    tmux
-
-    # Step 0: Activate maintenance mode:
-    # First update timestamps of maintenance start and maintenance end:
-    $ vim /etc/mentat/apache/maintenance/.htaccess
-    # Now bring the Mentat system web interface down and maintenance site up:
-    $ a2enmod substitute
-    $ a2dissite site_mentat-ng.conf
-    $ a2ensite site_maintenance.conf
-    $ systemctl restart apache2
-
-    # Step 1: Stop all processes touching the PostgreSQL database:
-    $ systemctl stop apache2
-    $ mentat-controller.py --command stop
-    $ mentat-controller.py --command disable
-    $ systemctl stop postgresql
-
-    # Step 2: Install PostgreSQL 11:
-    $ aptitude update
-    $ aptitude install postgresql-11 postgresql-11-ip4r postgresql-server-dev-11 postgresql-client-11
-
-    # Step 3: Verify the installation success (output included):
-    $ pg_lsclusters
-    :Ver Cluster Port Status Owner    Data directory              Log file
-    :10  main    5432 online postgres /var/lib/postgresql/10/main /var/log/postgresql/postgresql-10-main.log
-    :11  main    5433 online postgres /var/lib/postgresql/11/main /var/log/postgresql/postgresql-11-main.log
-
-    # Step 4: PostgreSQL was started during installation, stop it again:
-    $ systemctl stop postgresql
-
-    # Step 5: Drop the default PostgreSQL 11 cluster created during installation:
-    $ pg_dropcluster 11 main
-
-    # Step 6: Verify the clusters (output included):
-    $ pg_lsclusters
-    :Ver Cluster Port Status Owner    Data directory              Log file
-    :10  main    5432 down   postgres /var/lib/postgresql/10/main /var/log/postgresql/postgresql-10-main.log
-
-    # Step 7: Perform the data migration (slow to complete):
-    $ pg_upgradecluster --method=upgrade 10 main
-
-    # Step 8: Drop the PostgreSQL 10 data as there are two copies (10+11):
-    $ pg_dropcluster --stop 10 main
-
-    # Step 9: Remove the old PostgreSQL version:
-    $ aptitude purge postgresql-10 postgresql-10-ip4r postgresql-server-dev-10 postgresql-client-10
-
-    # Step 10: Start the DB (maintenance still required, not ready for system uptime):
-    $ systemctl start postgresql
-
-    # Step 11: From the PostgreSQL shell (psql):
-    # The CLUSTER is optional, it takes time but can shrink the DB size considerably if not done recently
-    VACUUM VERBOSE;
-    -- CLUSTER VERBOSE events;
-    ANALYZE VERBOSE;
-
-    # Step 12: This is a good time for restart (optional). New kernel? Long uptime & non-ECC RAM?
-    $ reboot
-
-    # Step 13: Now the system is ready for production, start it up
-    $ systemctl restart postgresql
-    $ systemctl start apache2
-    $ mentat-controller.py --command enable
-    $ mentat-controller.py --command start
-
-    # Step 14: Restart the web server that is serving web interface:
-    $ a2dismod substitute
-    $ a2dissite site_maintenance.conf
-    $ a2ensite site_mentat-ng.conf
-    $ systemctl restart apache2
-
-After these steps it is necessary to update following configuration files:
-
-``/etc/mentat/mentat-cleanup.py.conf``
-    Change configuration ``db_path`` to point to correct filesystem location. In default
-    Debian installations it should look something like this:
-
-    ``"db_path": "/var/lib/postgresql/11/main",``
-
-
-.. _section-upgrading-postgresql-11:
-
-Upgrading PostgreSQL from 11.x to 12.x
---------------------------------------------------------------------------------
-
-Following checklist describes the steps necessary to upgrade the PostgreSQL database
-from version ``11.x`` to ``12.x``.
-
-.. warning::
-
-    Please be aware, that the database upgrade is NOT a straightforward operation.
-    It can take a lot of time depending on the size of the current database,
-    because the data files need to be converted to new format.
-
-    Upgrade to the latest version of Mentat prior to upgrading PostgreSQL.
-
-.. code-block:: shell
-
-    # Launch tmux or screen.
-    tmux
-
-    # Step 0: Activate maintenance mode:
-    # First update timestamps of maintenance start and maintenance end:
-    $ vim /etc/mentat/apache/maintenance/.htaccess
-    # Now bring the Mentat system web interface down and maintenance site up:
-    $ a2enmod substitute
-    $ a2dissite site_mentat-ng.conf
-    $ a2ensite site_maintenance.conf
-    $ systemctl restart apache2
-
-    # Step 1: Stop all processes touching the PostgreSQL database:
-    $ sudo systemctl stop warden_filer_cesnet_receiver.service
-    $ sudo systemctl disable warden_filer_cesnet_receiver.service
-    $ sudo mentat-controller.py --command stop
-    $ sudo mentat-controller.py --command disable
-    $ systemctl restart postgresql
-
-    ### There can be no DB writes beyond this point as we are about to drop indices to ensure data integrity!
-
-    # Step 2: Connect to current database:
-    $ psql mentat_events
-    DROP INDEX events_detecttime_idx;
-    DROP INDEX events_combined_idx;
-    DROP INDEX events_storagetime_idx;
-    DROP INDEX events_eventseverity_idx;
-    ALTER TABLE events DROP CONSTRAINT events_pkey;
-    VACUUM FREEZE VERBOSE;
-    CHECKPOINT;
-
-    # Step 3: Stop PostgreSQL:
-    $ sudo systemctl stop postgresql
-
-    # Step 4: Install PostgreSQL 12:
-    $ sudo apt-get update
-    $ sudo apt-get install postgresql-12 postgresql-12-ip4r postgresql-server-dev-12 postgresql-client-12
-
-    # Step 5: Migration:
-    $ sudo pg_lsclusters
-    Ver Cluster Port Status Owner    Data directory              Log file
-    11  main    5432 online postgres /var/lib/postgresql/11/main /var/log/postgresql/postgresql-11-main.log
-    12  main    5433 online postgres /var/lib/postgresql/12/main /var/log/postgresql/postgresql-12-main.log
-
-    $ sudo systemctl stop postgresql
-
-    $ sudo pg_dropcluster 12 main
-
-    $ sudo pg_lsclusters
-    Ver Cluster Port Status Owner    Data directory              Log file
-    11  main    5432 down   postgres /var/lib/postgresql/11/main /var/log/postgresql/postgresql-11-main.log
-
-    # This will require *temporarily* setting wal_level to 'logical' (in postgresql.conf) - it is set to 'minimal' if you followed configuration advice from docs
-    # Alternatively one can ommit the --link parameter, but that requires free space for a 1:1 copy and of course also takes much longer
-    $ sudo pg_upgradecluster --method=upgrade --link 11 main
-
-    $ sudo pg_dropcluster 11 main
-
-    # Step 6: Remove PostgreSQL 11 and all prior versions:
-    $ sudo apt-get remove --purge postgresql-11 postgresql-client-11 postgresql-server-dev-11 postgresql-11-ip4r postgresql-9.4 postgresql-9.5 postgresql-9.6 postgresql-10
-
-    # Step 7: Start PostgreSQL:
-    $ sudo systemctl start postgresql
-
-    # Step 8: Recreate indices:
-    REINDEX DATABASE mentat_events;
-    ALTER TABLE events ADD PRIMARY KEY (id);
-    CREATE INDEX IF NOT EXISTS events_detecttime_idx ON events USING BTREE (detecttime);
-    CREATE INDEX IF NOT EXISTS events_storagetime_idx ON events USING BTREE (storagetime);
-    CREATE INDEX IF NOT EXISTS events_eventseverity_idx ON events USING BTREE (eventseverity) WHERE eventseverity IS NOT NULL;
-    CREATE INDEX IF NOT EXISTS events_combined_idx ON events USING GIN (category, node_name, protocol, source_port, target_port, source_type, target_type, node_type, resolvedabuses, inspectionerrors);
-    CHECKPOINT;
-    ANALYZE VERBOSE;
-
-    # Step 9: This is a good time for restart (optional). New kernel? Long uptime & non-ECC RAM?
-    $ reboot
-
-    # Step 10: Start Mentat and all other services:
-    $ systemctl restart postgresql
-    $ sudo mentat-controller.py --command enable
-    $ sudo mentat-controller.py --command start
-    $ sudo systemctl start warden_filer_cesnet_receiver.service
-    $ sudo systemctl enable warden_filer_cesnet_receiver.service
-
-    # Step 11: Restart the web server that is serving web interface:
-    $ a2dismod substitute
-    $ a2dissite site_maintenance.conf
-    $ a2ensite site_mentat-ng.conf
-    $ systemctl restart apache2
-
+    Please be aware that running Mentat is supported only on a specific major
+    version of PostgreSQL (usually the latest), as Mentat's database access code
+    is tuned to it and sometimes also requires the new features.
 
-.. _section-upgrading-postgresql-12:
+    For upgrading PostgreSQL from older versions, please refer to the documentation
+    of previous versions of Mentat.
 
-Upgrading PostgreSQL from 12.x to 13.x
---------------------------------------------------------------------------------
-
-Following checklist describes the steps necessary to upgrade the PostgreSQL database
-from version ``12.x`` to ``13.x``.
 
 .. warning::
 
-    Please be aware, that the database upgrade is NOT a straightforward operation.
+    Please be aware that the database upgrade is NOT a straightforward operation.
     It can take a lot of time depending on the size of the current database,
     because the data files need to be converted to new format.
 
@@ -339,119 +143,107 @@ from version ``12.x`` to ``13.x``.
 
 .. code-block:: shell
 
-    # Launch tmux or screen.
-    tmux
-
-    # Step 0: Activate maintenance mode:
-    # First update timestamps of maintenance start and maintenance end:
-    $ vim /etc/mentat/apache/maintenance/.htaccess
-    # Now bring the Mentat system web interface down and maintenance site up:
-    $ a2enmod substitute
-    $ a2dissite site_mentat-ng.conf
-    $ a2ensite site_maintenance.conf
-    $ systemctl restart apache2
-
-    # Step 1: Stop all processes touching the PostgreSQL database:
-    $ sudo systemctl stop warden_filer_cesnet_receiver.service
-    $ sudo systemctl disable warden_filer_cesnet_receiver.service
-    $ sudo mentat-controller.py --command stop
-    $ sudo mentat-controller.py --command disable
-    # Make sure there are no open or stale transactions or maintenance running
-    $ systemctl restart postgresql
-
-    ### There must be no DB writes beyond this point as we are about to drop indices to ensure data integrity!
-
-    # Step 2: Connect to current database:
-    $ psql mentat_events
-    DROP INDEX events_detecttime_idx;
-    DROP INDEX events_combined_idx;
-    DROP INDEX events_storagetime_idx;
-    DROP INDEX events_eventseverity_idx;
-    ALTER TABLE events_json DROP CONSTRAINT events_json_id_fkey;
-    ALTER TABLE events_json DROP CONSTRAINT events_json_pkey;
-    ALTER TABLE events DROP CONSTRAINT events_pkey;
-    VACUUM FREEZE;
-    CHECKPOINT;
-
-    # Step 3: Stop PostgreSQL:
-    $ sudo systemctl stop postgresql
-
-    # Step 4: Install PostgreSQL 13:
-    $ sudo apt-get update
-    $ sudo apt-get install postgresql-13 postgresql-13-ip4r postgresql-server-dev-13 postgresql-client-13
-
-    # Step 5: Back up the default PostgreSQL v13 configuration file
-    # This is used later in step 9.
-    $ cp /etc/postgresql/13/main/postgresql.conf ~/postgresql_13_default.conf
-
-    # Step 6: Migration:
-    $ sudo pg_lsclusters
-    Ver Cluster Port Status Owner    Data directory              Log file
-    12  main    5432 online postgres /var/lib/postgresql/12/main /var/log/postgresql/postgresql-12-main.log
-    13  main    5433 online postgres /var/lib/postgresql/13/main /var/log/postgresql/postgresql-13-main.log
-
-    $ sudo systemctl stop postgresql
-
-    $ sudo pg_dropcluster 13 main
-
-    $ sudo pg_lsclusters
-    $ sudo pg_lsclusters
-    Ver Cluster Port Status Owner    Data directory              Log file
-    12  main    5432 down   postgres /var/lib/postgresql/12/main /var/log/postgresql/postgresql-12-main.log
-
-    # Change wal_level to 'logical' (in postgresql.conf) if it is se to 'minimal' (which should, if you followed configuration advice from docs).
-    # This is *temporary* change for migration.
-    # Alternatively one can ommit the --link argument, but that requires free space for a 1:1 copy and of course also takes much longer
-    $ sudo pg_upgradecluster --method=upgrade --link 12 main
-
-    $ sudo pg_dropcluster 12 main
-
-    # Step 7: Remove PostgreSQL 12 and all prior versions:
-    $ $ sudo apt-get remove --purge postgresql-12 postgresql-client-12 postgresql-server-dev-12 postgresql-12-ip4r postgresql-11 postgresql-client-11 postgresql-server-dev-11 postgresql-11-ip4r postgresql-10 postgresql-9.4 postgresql-9.5 postgresql-9.6
-
-    # Step 8: Update the configuration file
-    # This is the most laborous step, which I have found no way of automating. Also, rarely the options are just reordered, which complicates the merge process.
-    $ sudo vimdiff /etc/postgresql/13/main/postgresql.conf ~/postgresql_13_default.conf
-
-    # Change the following options in /etc/postgresql/13/main/postgresql.conf:
-    autovacuum_vacuum_insert_threshold = -1
-
-    # Change the setting for wal_level back to 'minimal' if it was changed in step 7.
-
-    # Step 9a: Reboot the system:
-    # OPTIONAL: This is a good time to reboot the machine if desired (kernel update, long uptime & non-ECC RAM). Alternatively, just follow with 9b.
-    $ sudo reboot
-
-    # Step 9b: Start PostgreSQL:
-    # Only if 9a was skipped.
-    $ sudo systemctl start postgresql
-
-    # Step 10: Recreate indices:
-    # psql mentat_events
-    ANALYZE;
-    REINDEX DATABASE mentat_events;
-    ALTER TABLE events ADD PRIMARY KEY (id);
-    ALTER TABLE events_json ADD PRIMARY KEY (id);
-    ALTER TABLE events_json ADD FOREIGN KEY (id) REFERENCES events(id) ON DELETE CASCADE;
-    CREATE INDEX IF NOT EXISTS events_detecttime_idx ON events USING BTREE (detecttime);
-    CREATE INDEX IF NOT EXISTS events_storagetime_idx ON events USING BTREE (storagetime);
-    CREATE INDEX IF NOT EXISTS events_eventseverity_idx ON events USING BTREE (eventseverity) WHERE eventseverity IS NOT NULL;
-    CREATE INDEX IF NOT EXISTS events_combined_idx ON events USING GIN (category, node_name, protocol, source_port, target_port, source_type, target_type, node_type, resolvedabuses, inspectionerrors);
-    CREATE INDEX IF NOT EXISTS events_ip_aggr_idx ON events USING GIST (source_ip_aggr_ip4, target_ip_aggr_ip4, source_ip_aggr_ip6, target_ip_aggr_ip6);
-    CHECKPOINT;
-
-    # Step 11: Start Mentat and all other services:
-    $ systemctl restart postgresql
-    $ sudo mentat-controller.py --command enable
-    $ sudo mentat-controller.py --command start
-    $ sudo systemctl start warden_filer_cesnet_receiver.service
-    $ sudo systemctl enable warden_filer_cesnet_receiver.service
-
-    # Step 12: Deactivate maintenance mode and restart the web server that is serving web interface:
-    $ a2dismod substitute
-    $ a2dissite site_maintenance.conf
-    $ a2ensite site_mentat-ng.conf
-    $ systemctl restart apache2
+    # Step 0. Launch tmux or screen
+    $ tmux
+
+    # Step 1. Activate the maintenance mode website
+    # First update timestamps of maintenance start and maintenance end:
+    $ sudo vim /etc/mentat/apache/maintenance/.htaccess
+    # Now bring the Mentat system web interface down and the maintenance site up:
+    $ sudo a2enmod substitute
+    $ sudo a2dissite site_mentat-ng.conf
+    $ sudo a2ensite site_maintenance.conf
+    $ sudo systemctl restart apache2
+
+    # Step 2. Shut down Mentat's import pipeline and stabilize the DB
+    $ sudo systemctl stop warden_filer_cesnet_receiver.service
+    $ sudo systemctl disable warden_filer_cesnet_receiver.service
+    $ sudo mentat-controller.py --command stop
+    $ sudo mentat-controller.py --command disable
+    $ sudo systemctl restart postgresql
+
+    # Step 3. Vacuum the database
+    # Typically, peer authentication is set up for user postgres in the DB
+    $ sudo -u postgres vacuumdb -F -j 16 -v -a
+
+    # Step 4. Stop PostgreSQL
+    $ sudo systemctl stop postgresql
+
+    # Step 5. Install PostgreSQL 14
+    $ sudo apt-get update
+    $ sudo apt-get install postgresql-14 postgresql-14-ip4r postgresql-server-dev-14 postgresql-client-14
+
+    # Step 6. Back up the default PostgreSQL v14 configuration file
+    # This is used later in step 9.
+    $ cp /etc/postgresql/14/main/postgresql.conf ~/postgresql_14_default.conf
+
+    # Step 7. Migration
+    $ sudo pg_lsclusters
+    Ver Cluster Port Status Owner    Data directory              Log file
+    13  main    5432 online postgres /var/lib/postgresql/13/main /var/log/postgresql/postgresql-13-main.log
+    14  main    5433 online postgres /var/lib/postgresql/14/main /var/log/postgresql/postgresql-14-main.log
+
+    $ sudo systemctl stop postgresql
+
+    $ sudo pg_dropcluster 14 main
+
+    $ sudo pg_lsclusters
+    Ver Cluster Port Status Owner    Data directory              Log file
+    13  main    5432 down   postgres /var/lib/postgresql/13/main /var/log/postgresql/postgresql-13-main.log
+
+    # This will require *temporarily* setting wal_level to 'logical' (in postgresql.conf), as
+    # it is set to 'minimal' if you followed configuration advice from the docs.
+    # Alternatively one can omit the --link parameter, but that requires free space for
+    # a 1:1 copy and of course also takes much longer.
+    $ sudo sed -i -E 's/^(wal_level\s*=\s*)[a-z]+/\1logical/' /etc/postgresql/13/main/postgresql.conf
+    $ sudo pg_upgradecluster --method=upgrade --link 13 main
+
+    $ sudo pg_dropcluster 13 main
+
+    # Step 8. Remove PostgreSQL 13 and potential leftovers from previous versions
+    $ sudo apt-get remove --purge postgresql-13 postgresql-client-13 postgresql-server-dev-13 postgresql-13-ip4r postgresql-12 postgresql-client-12 postgresql-server-dev-12 postgresql-12-ip4r postgresql-11 postgresql-client-11 postgresql-server-dev-11 postgresql-11-ip4r postgresql-10 postgresql-9.6 postgresql-9.5 postgresql-9.4
+
+    # Step 9. Update the configuration file
+    # Related to #6480. This is the most laborious step, which is not yet automated.
+    # Also, sometimes the options are just reordered, which complicates the merge process.
+    $ sudo vimdiff /etc/postgresql/14/main/postgresql.conf ~/postgresql_14_default.conf
+
+    # Change the setting for wal_level back to minimal if it was changed in step 7.
+    $ sudo sed -i -E 's/^(wal_level\s*=\s*)[a-z]+/\1minimal/' /etc/postgresql/14/main/postgresql.conf
+
+    # Step 10a. Reboot the system
+    # OPTIONAL: This is a good time to reboot the machine if desired (kernel update,
+    # long runtime & non-ECC RAM). Alternatively, just follow with 10b.
+    $ sudo reboot
+
+    # Step 10b. Start PostgreSQL
+    # Only if step 10a was skipped.
+    $ sudo systemctl start postgresql
+
+    # Step 11a. Cleanup & optimization
+    # As a PostgreSQL upgrade is done roughly once a year (the cadence of major version
+    # releases), this is a good point to run VACUUM FULL. Alternatively, if downtime has to be
+    # minimized at all costs, continue with step 11b.
+    # Skipping it saves about 10 minutes, but that is not recommended.
+    $ sudo -u postgres vacuumdb -f -j 16 -v -a -z
+
+    # Step 11b. The mandatory ANALYZE
+    # Only if step 11a was skipped.
+    # At least an ANALYZE run is required as the statistics are not carried over
+    # during the upgrade.
+    $ sudo -u postgres vacuumdb -Z -j 16 -a
+
+    # Step 12. Start Mentat
+    $ sudo mentat-controller.py --command enable
+    $ sudo mentat-controller.py --command start
+    $ sudo systemctl enable warden_filer_cesnet_receiver.service
+    $ sudo systemctl start warden_filer_cesnet_receiver.service
+
+    # Step 13. Deactivate the maintenance mode website
+    $ sudo a2dismod substitute
+    $ sudo a2dissite site_maintenance.conf
+    $ sudo a2ensite site_mentat-ng.conf
+    $ sudo systemctl restart apache2
 
 
 .. _section-upgrading-geoip:
-- 
GitLab
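
Review note on the wal_level toggle in steps 7 and 9 of the new checklist: the sed expression only rewrites an active, unquoted wal_level line. The sketch below applies the same substitution to sample input instead of the live postgresql.conf, so the behaviour can be checked before the migration. The helper name set_wal_level and the sample lines are illustrative assumptions and are not part of Mentat or PostgreSQL; [[:space:]] is used in place of \s only for portability across sed implementations.

```shell
#!/bin/sh
# Hypothetical helper (not part of Mentat): read configuration text on stdin,
# replace the value of an active wal_level setting with $1, print the result.
set_wal_level() {
    sed -E "s/^(wal_level[[:space:]]*=[[:space:]]*)[a-z]+/\\1$1/"
}

# An active, unquoted setting is rewritten.
printf 'wal_level = minimal\n' | set_wal_level logical

# A commented-out setting does not match and passes through unchanged.
printf '#wal_level = replica\n' | set_wal_level logical

# A quoted value does not match either and passes through unchanged.
printf "wal_level = 'minimal'\n" | set_wal_level logical
```

If the setting in the real file is commented out or quoted, the sed commands in the checklist would therefore leave it as it is, and wal_level would have to be edited by hand before running pg_upgradecluster with --link.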