From 409b3a186f0f267dafbff04e7a74e262d6e212ad Mon Sep 17 00:00:00 2001
From: Radko Krkos <krkos@cesnet.cz>
Date: Mon, 28 Feb 2022 08:33:23 +0100
Subject: [PATCH] Update the PostgreSQL upgrade documentation

* Upgrade steps to PostgreSQL v14 from v13 are added.
* Upgrade steps for older versions are removed, a warning directing
  the user to older Mentat versions for them is added.

(Redmine issue: #7555)
---
 doc/sphinx/_doclib/upgrading.rst | 428 ++++++++-----------------------
 1 file changed, 110 insertions(+), 318 deletions(-)

diff --git a/doc/sphinx/_doclib/upgrading.rst b/doc/sphinx/_doclib/upgrading.rst
index e096d82e..b7ec2da7 100644
--- a/doc/sphinx/_doclib/upgrading.rst
+++ b/doc/sphinx/_doclib/upgrading.rst
@@ -115,223 +115,27 @@ Upgrading to Mentat 2.8
   or also proceed with the merge.
 
 
-.. _section-upgrading-postgresql-10:
+.. _section-upgrading-postgresql-13:
 
-Upgrading PostgreSQL from 10.x to 11.x
+Upgrading PostgreSQL from 13.x to 14.x
 --------------------------------------------------------------------------------
 
 Following checklist describes the steps necessary to upgrade the PostgreSQL database
-from version ``10.x`` to ``11.x``.
+from version ``13.x`` to ``14.x``.
 
 .. warning::
 
-    Please be aware, that the database upgrade is NOT a straightforward operation.
-    It can take a lot of time depending on the size of the current database,
-    because the data files need to be converted to new format.
-
-.. code-block:: shell
-
-    # Launch tmux or screen.
-    tmux
-
-    # Step 0: Activate maintenance mode:
-    # First update timestamps of maintenance start and maintenance end:
-    $ vim /etc/mentat/apache/maintenance/.htaccess
-    # Now bring the Mentat system web interface down and maintenance site up:
-    $ a2enmod substitute
-    $ a2dissite site_mentat-ng.conf
-    $ a2ensite site_maintenance.conf
-    $ systemctl restart apache2
-
-    # Step 1: Stop all processes touching the PostgreSQL database:
-    $ systemctl stop apache2
-    $ mentat-controller.py --command stop
-    $ mentat-controller.py --command disable
-    $ systemctl stop postgresql
-
-    # Step 2: Install PostgreSQL 11:
-    $ aptitude update
-    $ aptitude install postgresql-11 postgresql-11-ip4r postgresql-server-dev-11 postgresql-client-11
-
-    # Step 3: Verify the installation success (output included):
-    $ pg_lsclusters
-    :Ver Cluster Port Status Owner    Data directory              Log file
-    :10  main    5432 online postgres /var/lib/postgresql/10/main /var/log/postgresql/postgresql-10-main.log
-    :11  main    5433 online postgres /var/lib/postgresql/11/main /var/log/postgresql/postgresql-11-main.log
-
-    # Step 4: PostgreSQL was started during installation, stop it again:
-    $ systemctl stop postgresql
-
-    # Step 5: Drop the default PostgreSQL 11 cluster created during installation:
-    $ pg_dropcluster 11 main
-
-    # Step 6: Verify the clusters (output included):
-    $ pg_lsclusters
-    :Ver Cluster Port Status Owner    Data directory              Log file
-    :10  main    5432 down   postgres /var/lib/postgresql/10/main /var/log/postgresql/postgresql-10-main.log
-
-    # Step 7: Perform the data migration (slow to complete):
-    $ pg_upgradecluster --method=upgrade 10 main
-
-    # Step 8: Drop the PostgreSQL 10 data as there are two copies (10+11):
-    $ pg_dropcluster --stop 10 main
-
-    # Step 9: Remove the old PostgreSQL version:
-    $ aptitude purge postgresql-10 postgresql-10-ip4r postgresql-server-dev-10 postgresql-client-10
-
-    # Step 10: Start the DB (maintenance still required, not ready for system uptime):
-    $ systemctl start postgresql
-
-    # Step 11: From the PostgreSQL shell (psql):
-    # The CLUSTER is optional, it takes time but can shrink the DB size considerably if not done recently
-    VACUUM VERBOSE;
-    -- CLUSTER VERBOSE events;
-    ANALYZE VERBOSE;
-
-    # Step 12: This is a good time for restart (optional). New kernel? Long uptime & non-ECC RAM?
-    $ reboot
-
-    # Step 13: Now the system is ready for production, start it up
-    $ systemctl restart postgresql
-    $ systemctl start apache2
-    $ mentat-controller.py --command enable
-    $ mentat-controller.py --command start
-
-    # Step 14: Restart the web server that is serving web interface:
-    $ a2dismod substitute
-    $ a2dissite site_maintenance.conf
-    $ a2ensite site_mentat-ng.conf
-    $ systemctl restart apache2
-
-After these steps it is necessary to update following configuration files:
-
-``/etc/mentat/mentat-cleanup.py.conf``
-    Change configuration ``db_path`` to point to correct filesystem location. In default
-    Debian installations it should look something like this:
-
-    ``"db_path": "/var/lib/postgresql/11/main",``
-
-
-.. _section-upgrading-postgresql-11:
-
-Upgrading PostgreSQL from 11.x to 12.x
---------------------------------------------------------------------------------
-
-Following checklist describes the steps necessary to upgrade the PostgreSQL database
-from version ``11.x`` to ``12.x``.
-
-.. warning::
-
-    Please be aware, that the database upgrade is NOT a straightforward operation.
-    It can take a lot of time depending on the size of the current database,
-    because the data files need to be converted to new format.
-
-    Upgrade to the latest version of Mentat prior to upgrading PostgreSQL.
-
-.. code-block:: shell
-
-    # Launch tmux or screen.
-    tmux
-
-    # Step 0: Activate maintenance mode:
-    # First update timestamps of maintenance start and maintenance end:
-    $ vim /etc/mentat/apache/maintenance/.htaccess
-    # Now bring the Mentat system web interface down and maintenance site up:
-    $ a2enmod substitute
-    $ a2dissite site_mentat-ng.conf
-    $ a2ensite site_maintenance.conf
-    $ systemctl restart apache2
-
-    # Step 1: Stop all processes touching the PostgreSQL database:
-    $ sudo systemctl stop warden_filer_cesnet_receiver.service
-    $ sudo systemctl disable warden_filer_cesnet_receiver.service
-    $ sudo mentat-controller.py --command stop
-    $ sudo mentat-controller.py --command disable
-    $ systemctl restart postgresql
-
-    ### There can be no DB writes beyond this point as we are about to drop indices to ensure data integrity!
-
-    # Step 2: Connect to current database:
-    $ psql mentat_events
-    DROP INDEX events_detecttime_idx;
-    DROP INDEX events_combined_idx;
-    DROP INDEX events_storagetime_idx;
-    DROP INDEX events_eventseverity_idx;
-    ALTER TABLE events DROP CONSTRAINT events_pkey;
-    VACUUM FREEZE VERBOSE;
-    CHECKPOINT;
-
-    # Step 3: Stop PostgreSQL:
-    $ sudo systemctl stop postgresql
-
-    # Step 4: Install PostgreSQL 12:
-    $ sudo apt-get update
-    $ sudo apt-get install postgresql-12 postgresql-12-ip4r postgresql-server-dev-12 postgresql-client-12
-
-    # Step 5: Migration:
-    $ sudo pg_lsclusters
-    Ver Cluster Port Status Owner    Data directory              Log file
-    11  main    5432 online postgres /var/lib/postgresql/11/main /var/log/postgresql/postgresql-11-main.log
-    12  main    5433 online postgres /var/lib/postgresql/12/main /var/log/postgresql/postgresql-12-main.log
-
-    $ sudo systemctl stop postgresql
-
-    $ sudo pg_dropcluster 12 main
-
-    $ sudo pg_lsclusters
-    Ver Cluster Port Status Owner    Data directory              Log file
-    11  main    5432 down   postgres /var/lib/postgresql/11/main /var/log/postgresql/postgresql-11-main.log
-
-    # This will require *temporarily* setting wal_level to 'logical' (in postgresql.conf) - it is set to 'minimal' if you followed configuration advice from docs
-    # Alternatively one can ommit the --link parameter, but that requires free space for a 1:1 copy and of course also takes much longer
-    $ sudo pg_upgradecluster --method=upgrade --link 11 main
-
-    $ sudo pg_dropcluster 11 main
-
-    # Step 6: Remove PostgreSQL 11 and all prior versions:
-    $ sudo apt-get remove --purge postgresql-11 postgresql-client-11 postgresql-server-dev-11 postgresql-11-ip4r postgresql-9.4 postgresql-9.5 postgresql-9.6 postgresql-10
-
-    # Step 7: Start PostgreSQL:
-    $ sudo systemctl start postgresql
-
-    # Step 8: Recreate indices:
-    REINDEX DATABASE mentat_events;
-    ALTER TABLE events ADD PRIMARY KEY (id);
-    CREATE INDEX IF NOT EXISTS events_detecttime_idx ON events USING BTREE (detecttime);
-    CREATE INDEX IF NOT EXISTS events_storagetime_idx ON events USING BTREE (storagetime);
-    CREATE INDEX IF NOT EXISTS events_eventseverity_idx ON events USING BTREE (eventseverity) WHERE eventseverity IS NOT NULL;
-    CREATE INDEX IF NOT EXISTS events_combined_idx ON events USING GIN (category, node_name, protocol, source_port, target_port, source_type, target_type, node_type, resolvedabuses, inspectionerrors);
-    CHECKPOINT;
-    ANALYZE VERBOSE;
-
-    # Step 9: This is a good time for restart (optional). New kernel? Long uptime & non-ECC RAM?
-    $ reboot
-
-    # Step 10: Start Mentat and all other services:
-    $ systemctl restart postgresql
-    $ sudo mentat-controller.py --command enable
-    $ sudo mentat-controller.py --command start
-    $ sudo systemctl start warden_filer_cesnet_receiver.service
-    $ sudo systemctl enable warden_filer_cesnet_receiver.service
-
-    # Step 11: Restart the web server that is serving web interface:
-    $ a2dismod substitute
-    $ a2dissite site_maintenance.conf
-    $ a2ensite site_mentat-ng.conf
-    $ systemctl restart apache2
-
+   Please be aware that running Mentat is supported only on a specific major
+   version of PostgreSQL (usually the latest), as Mentat's database access code
+   is tuned to it and sometimes also requires its new features.
 
-.. _section-upgrading-postgresql-12:
+   For upgrading PostgreSQL from older versions, please refer to the documentation
+   of previous versions of Mentat.
 
-Upgrading PostgreSQL from 12.x to 13.x
---------------------------------------------------------------------------------
-
-Following checklist describes the steps necessary to upgrade the PostgreSQL database
-from version ``12.x`` to ``13.x``.
 
 .. warning::
 
-    Please be aware, that the database upgrade is NOT a straightforward operation.
+    Please be aware that the database upgrade is NOT a straightforward operation.
     It can take a lot of time depending on the size of the current database,
     because the data files need to be converted to new format.
 
@@ -339,119 +143,107 @@ from version ``12.x`` to ``13.x``.
 
 .. code-block:: shell
 
-    # Launch tmux or screen.
-    tmux
-
-    # Step 0: Activate maintenance mode:
-    # First update timestamps of maintenance start and maintenance end:
-    $ vim /etc/mentat/apache/maintenance/.htaccess
-    # Now bring the Mentat system web interface down and maintenance site up:
-    $ a2enmod substitute
-    $ a2dissite site_mentat-ng.conf
-    $ a2ensite site_maintenance.conf
-    $ systemctl restart apache2
-
-    # Step 1: Stop all processes touching the PostgreSQL database:
-    $ sudo systemctl stop warden_filer_cesnet_receiver.service
-    $ sudo systemctl disable warden_filer_cesnet_receiver.service
-    $ sudo mentat-controller.py --command stop
-    $ sudo mentat-controller.py --command disable
-    # Make sure there are no open or stale transactions or maintenance running
-    $ systemctl restart postgresql
-
-    ### There must be no DB writes beyond this point as we are about to drop indices to ensure data integrity!
-
-    # Step 2: Connect to current database:
-    $ psql mentat_events
-    DROP INDEX events_detecttime_idx;
-    DROP INDEX events_combined_idx;
-    DROP INDEX events_storagetime_idx;
-    DROP INDEX events_eventseverity_idx;
-    ALTER TABLE events_json DROP CONSTRAINT events_json_id_fkey;
-    ALTER TABLE events_json DROP CONSTRAINT events_json_pkey;
-    ALTER TABLE events DROP CONSTRAINT events_pkey;
-    VACUUM FREEZE;
-    CHECKPOINT;
-
-    # Step 3: Stop PostgreSQL:
-    $ sudo systemctl stop postgresql
-
-    # Step 4: Install PostgreSQL 13:
-    $ sudo apt-get update
-    $ sudo apt-get install postgresql-13 postgresql-13-ip4r postgresql-server-dev-13 postgresql-client-13
-
-    # Step 5: Back up the default PostgreSQL v13 configuration file
-    # This is used later in step 9.
-    $ cp /etc/postgresql/13/main/postgresql.conf ~/postgresql_13_default.conf
-
-    # Step 6: Migration:
-    $ sudo pg_lsclusters
-    Ver Cluster Port Status Owner    Data directory              Log file
-    12  main    5432 online postgres /var/lib/postgresql/12/main /var/log/postgresql/postgresql-12-main.log
-    13  main    5433 online postgres /var/lib/postgresql/13/main /var/log/postgresql/postgresql-13-main.log
-
-    $ sudo systemctl stop postgresql
-
-    $ sudo pg_dropcluster 13 main
-
-    $ sudo pg_lsclusters
-    $ sudo pg_lsclusters
-    Ver Cluster Port Status Owner    Data directory              Log file
-    12  main    5432 down   postgres /var/lib/postgresql/12/main /var/log/postgresql/postgresql-12-main.log
-
-    # Change wal_level to 'logical' (in postgresql.conf) if it is se to 'minimal' (which should, if you followed configuration advice from docs).
-    # This is *temporary* change for migration.
-    # Alternatively one can ommit the --link argument, but that requires free space for a 1:1 copy and of course also takes much longer
-    $ sudo pg_upgradecluster --method=upgrade --link 12 main
-
-    $ sudo pg_dropcluster 12 main
-
-    # Step 7: Remove PostgreSQL 12 and all prior versions:
-    $ $ sudo apt-get remove --purge postgresql-12 postgresql-client-12 postgresql-server-dev-12 postgresql-12-ip4r postgresql-11 postgresql-client-11 postgresql-server-dev-11 postgresql-11-ip4r postgresql-10 postgresql-9.4 postgresql-9.5 postgresql-9.6
-
-    # Step 8: Update the configuration file
-    # This is the most laborous step, which I have found no way of automating. Also, rarely the options are just reordered, which complicates the merge process.
-    $ sudo vimdiff /etc/postgresql/13/main/postgresql.conf ~/postgresql_13_default.conf
-
-    # Change the following options in /etc/postgresql/13/main/postgresql.conf:
-    autovacuum_vacuum_insert_threshold = -1
-
-    # Change the setting for wal_level back to 'minimal' if it was changed in step 7.
-
-    # Step 9a: Reboot the system:
-    # OPTIONAL: This is a good time to reboot the machine if desired (kernel update, long uptime & non-ECC RAM). Alternatively, just follow with 9b.
-    $ sudo reboot
-
-    # Step 9b: Start PostgreSQL:
-    # Only if 9a was skipped.
-    $ sudo systemctl start postgresql
-
-    # Step 10: Recreate indices:
-    # psql mentat_events
-    ANALYZE;
-    REINDEX DATABASE mentat_events;
-    ALTER TABLE events ADD PRIMARY KEY (id);
-    ALTER TABLE events_json ADD PRIMARY KEY (id);
-    ALTER TABLE events_json ADD FOREIGN KEY (id) REFERENCES events(id) ON DELETE CASCADE;
-    CREATE INDEX IF NOT EXISTS events_detecttime_idx ON events USING BTREE (detecttime);
-    CREATE INDEX IF NOT EXISTS events_storagetime_idx ON events USING BTREE (storagetime);
-    CREATE INDEX IF NOT EXISTS events_eventseverity_idx ON events USING BTREE (eventseverity) WHERE eventseverity IS NOT NULL;
-    CREATE INDEX IF NOT EXISTS events_combined_idx ON events USING GIN (category, node_name, protocol, source_port, target_port, source_type, target_type, node_type, resolvedabuses, inspectionerrors);
-    CREATE INDEX IF NOT EXISTS events_ip_aggr_idx ON events USING GIST (source_ip_aggr_ip4, target_ip_aggr_ip4, source_ip_aggr_ip6, target_ip_aggr_ip6);
-    CHECKPOINT;
-
-    # Step 11: Start Mentat and all other services:
-    $ systemctl restart postgresql
-    $ sudo mentat-controller.py --command enable
-    $ sudo mentat-controller.py --command start
-    $ sudo systemctl start warden_filer_cesnet_receiver.service
-    $ sudo systemctl enable warden_filer_cesnet_receiver.service
-
-    # Step 12: Deactivate maintenance mode and restart the web server that is serving web interface:
-    $ a2dismod substitute
-    $ a2dissite site_maintenance.conf
-    $ a2ensite site_mentat-ng.conf
-    $ systemctl restart apache2
+    # Step 0. Launch tmux or screen
+        $ tmux
+
+    # Step 1. Activate the maintenance mode website
+        # First update timestamps of maintenance start and maintenance end:
+        $ sudo vim /etc/mentat/apache/maintenance/.htaccess
+        # Now bring the Mentat system web interface down and the maintenance site up:
+        $ sudo a2enmod substitute
+        $ sudo a2dissite site_mentat-ng.conf
+        $ sudo a2ensite site_maintenance.conf
+        $ sudo systemctl restart apache2
+
+    # Step 2. Shut down Mentat's import pipeline and stabilize the DB
+        $ sudo systemctl stop warden_filer_cesnet_receiver.service
+        $ sudo systemctl disable warden_filer_cesnet_receiver.service
+        $ sudo mentat-controller.py --command stop
+        $ sudo mentat-controller.py --command disable
+        $ sudo systemctl restart postgresql
+
+    # Step 3. Vacuum the database
+        # Typically, peer authentication is set up for user postgres in the DB
+        $ sudo -u postgres vacuumdb -F -j 16 -v -a
+
+    # Step 4. Stop PostgreSQL
+        $ sudo systemctl stop postgresql
+
+    # Step 5. Install PostgreSQL 14
+        $ sudo apt-get update
+        $ sudo apt-get install postgresql-14 postgresql-14-ip4r postgresql-server-dev-14 postgresql-client-14
+
+    # Step 6. Back up the default PostgreSQL v14 configuration file
+        # This is used later in step 9.
+        $ cp /etc/postgresql/14/main/postgresql.conf ~/postgresql_14_default.conf
+
+    # Step 7. Migration
+        $ sudo pg_lsclusters
+        Ver Cluster Port Status Owner    Data directory              Log file
+        13  main    5432 online postgres /var/lib/postgresql/13/main /var/log/postgresql/postgresql-13-main.log
+        14  main    5433 online postgres /var/lib/postgresql/14/main /var/log/postgresql/postgresql-14-main.log
+
+        $ sudo systemctl stop postgresql
+
+        $ sudo pg_dropcluster 14 main
+
+        $ sudo pg_lsclusters
+        Ver Cluster Port Status Owner    Data directory              Log file
+        13  main    5432 down   postgres /var/lib/postgresql/13/main /var/log/postgresql/postgresql-13-main.log
+
+        # This will require *temporarily* setting wal_level to 'logical' (in postgresql.conf), as
+        # it is set to 'minimal' if you followed configuration advice from the docs.
+        # Alternatively one can omit the --link parameter, but that requires free space for
+        # a 1:1 copy and of course also takes much longer.
+        $ sudo sed -i -E 's/^(wal_level\s*=\s*)[a-z]+/\1logical/' /etc/postgresql/13/main/postgresql.conf
+        $ sudo pg_upgradecluster --method=upgrade --link 13 main
+
+        $ sudo pg_dropcluster 13 main
+
+    # Step 8. Remove PostgreSQL 13 and potential leftovers from previous versions
+        $ sudo apt-get remove --purge postgresql-13 postgresql-client-13 postgresql-server-dev-13 postgresql-13-ip4r postgresql-12 postgresql-client-12 postgresql-server-dev-12 postgresql-12-ip4r postgresql-11 postgresql-client-11 postgresql-server-dev-11 postgresql-11-ip4r postgresql-10 postgresql-9.6 postgresql-9.5 postgresql-9.4
+
+    # Step 9. Update the configuration file
+        # Related to #6480. This is the most laborious step, which is not yet automated.
+        # Also, sometimes the options are just reordered, which complicates the merge process.
+        $ sudo vimdiff /etc/postgresql/14/main/postgresql.conf ~/postgresql_14_default.conf
+
+        # Change the setting for wal_level back to minimal if it was changed in step 7.
+        $ sudo sed -i -E 's/^(wal_level\s*=\s*)[a-z]+/\1minimal/' /etc/postgresql/14/main/postgresql.conf
+
+    # Step 10a. Reboot the system
+        # OPTIONAL: This is a good time to reboot the machine if desired (kernel update,
+        # long runtime & non-ECC RAM). Alternatively, continue with step 10b.
+        $ sudo reboot
+
+    # Step 10b. Start PostgreSQL
+        # Only if step 10a was skipped.
+        $ sudo systemctl start postgresql
+
+    # Step 11a. Cleanup & optimization
+        # As a PostgreSQL upgrade is done roughly once a year (the cadence of major version
+        # releases), it is a good point to run VACUUM FULL. Alternatively, if downtime has to be
+        # minimized at all costs, skip to step 11b.
+        # Skipping this step saves about 10 minutes, but it is not recommended.
+        $ sudo -u postgres vacuumdb -f -j 16 -v -a -z
+
+    # Step 11b. The mandatory ANALYZE
+        # Only if step 11a was skipped.
+        # At least an ANALYZE run is required as the statistics are not carried over
+        # during the upgrade.
+        $ sudo -u postgres vacuumdb -Z -j 16 -a
+
+    # Step 12. Start Mentat
+        $ sudo mentat-controller.py --command enable
+        $ sudo mentat-controller.py --command start
+        $ sudo systemctl enable warden_filer_cesnet_receiver.service
+        $ sudo systemctl start warden_filer_cesnet_receiver.service
+
+    # Step 13. Deactivate the maintenance mode website
+        $ sudo a2dismod substitute
+        $ sudo a2dissite site_maintenance.conf
+        $ sudo a2ensite site_mentat-ng.conf
+        $ sudo systemctl restart apache2
 
 
 .. _section-upgrading-geoip:
-- 
GitLab