
PostgreSQL Test AQMS System

Documents

PostgreSQL Database

- Installing PostgreSQL 11:

  • Install the repository RPM:
[root@rumble ~]# yum install https://download.postgresql.org/pub/repos/yum/11/redhat/rhel-7-x86_64/pgdg-redhat11-11-2.noarch.rpm
...
Installed:
  pgdg-redhat11.noarch 0:11-2
  
Complete!
[root@rumble ~]#
  • Install the client packages:
[root@rumble ~]# yum install postgresql11
...
Installed:
  postgresql11.x86_64 0:11.1-1PGDG.rhel7

Dependency Installed:
  postgresql11-libs.x86_64 0:11.1-1PGDG.rhel7

Complete!
[root@rumble ~]#
  • Install the server packages:
[root@rumble ~]# yum install postgresql11-server
...
Installed:
  postgresql11-server.x86_64 0:11.1-1PGDG.rhel7

Complete!
[root@rumble ~]#
  • Initialize the database and enable automatic start:
[root@rumble ~]# /usr/pgsql-11/bin/postgresql-11-setup initdb
Initializing database ... OK

[root@rumble ~]# systemctl enable postgresql-11
Created symlink from /etc/systemd/system/multi-user.target.wants/postgresql-11.service to /usr/lib/systemd/system/postgresql-11.service.
[root@rumble ~]# systemctl start postgresql-11
[root@rumble ~]# systemctl status postgresql-11
● postgresql-11.service - PostgreSQL 11 database server
   Loaded: loaded (/usr/lib/systemd/system/postgresql-11.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2018-12-12 10:29:32 PST; 8s ago
     Docs: https://www.postgresql.org/docs/11/static/
  Process: 27661 ExecStartPre=/usr/pgsql-11/bin/postgresql-11-check-db-dir ${PGDATA} (code=exited, status=0/SUCCESS)
 Main PID: 27667 (postmaster)
    Tasks: 8
   CGroup: /system.slice/postgresql-11.service
           ├─27667 /usr/pgsql-11/bin/postmaster -D /var/lib/pgsql/11/data/
           ├─27669 postgres: logger  
           ├─27671 postgres: checkpointer  
           ├─27672 postgres: background writer  
           ├─27673 postgres: walwriter  
           ├─27674 postgres: autovacuum launcher  
           ├─27675 postgres: stats collector  
           └─27676 postgres: logical replication launcher  

Dec 12 10:29:31 rumble.geo.berkeley.edu systemd[1]: Starting PostgreSQL 11 database server...
Dec 12 10:29:31 rumble.geo.berkeley.edu postmaster[27667]: 2018-12-12 10:29:31.989 PST [27667] LOG:  listening on IPv4 address "127.0.0.1", port 5432
Dec 12 10:29:32 rumble.geo.berkeley.edu postmaster[27667]: 2018-12-12 10:29:32.005 PST [27667] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
Dec 12 10:29:32 rumble.geo.berkeley.edu postmaster[27667]: 2018-12-12 10:29:32.041 PST [27667] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
Dec 12 10:29:32 rumble.geo.berkeley.edu postmaster[27667]: 2018-12-12 10:29:32.075 PST [27667] LOG:  redirecting log output to logging collector process
Dec 12 10:29:32 rumble.geo.berkeley.edu postmaster[27667]: 2018-12-12 10:29:32.075 PST [27667] HINT:  Future log output will appear in directory "log".
Dec 12 10:29:32 rumble.geo.berkeley.edu systemd[1]: Started PostgreSQL 11 database server.
[root@rumble ~]#

- Use of temporary tables:

CREATE TABLE takes a fairly heavyweight lock in PostgreSQL, and even after a DROP TABLE the
dropped table's catalog entries are not actually cleaned up until VACUUM runs.
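
If temporary tables must be used, ON COMMIT DROP at least bounds their lifetime, and the resulting
catalog churn can be watched and vacuumed directly. A minimal sketch (table and column names are
hypothetical):

	= CREATE TEMPORARY TABLE tmp_amp (ampid bigint) ON COMMIT DROP;  -- hypothetical scratch table
	= SELECT n_dead_tup FROM pg_stat_sys_tables WHERE relname = 'pg_class';  -- dead catalog rows
	= VACUUM pg_catalog.pg_class;  -- superuser only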

- AQMS DB Instance Creation:

  • Install the following extensions:
                                            List of installed extensions
          Name          | Version |   Schema   |                             Description
------------------------+---------+------------+---------------------------------------------------------------------
 dblink                 | 1.2     | public     | connect to other PostgreSQL databases from within a database
 fuzzystrmatch          | 1.1     | public     | determine similarities and distance between strings
 pglogical              | 2.4.2   | pglogical  | PostgreSQL Logical Replication
 pljava                 | 1.5.8   | sqlj       | PL/Java procedural language (https://tada.github.io/pljava/)
 plpgsql                | 1.0     | pg_catalog | PL/pgSQL procedural language
 postgis                | 2.5.3   | public     | PostGIS geometry, geography, and raster spatial types and functions
 postgis_tiger_geocoder | 2.5.3   | tiger      | PostGIS tiger geocoder and reverse geocoder
 postgis_topology       | 2.5.3   | topology   | PostGIS topology spatial types and functions
(8 rows)
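
Missing extensions can be added with CREATE EXTENSION as a superuser once the corresponding
packages are installed (pglogical and PL/Java have their own setup sections below), e.g.:

	= CREATE EXTENSION IF NOT EXISTS dblink;
	= CREATE EXTENSION IF NOT EXISTS fuzzystrmatch;
	= CREATE EXTENSION IF NOT EXISTS postgis;
	= CREATE EXTENSION IF NOT EXISTS postgis_topology;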
  • Edit the pg_hba.conf file accordingly and restart the database server.
  • Retrieve and extract DB package DBpg_NC.tar.gz.
  • Install libmseed & AQMSpg_ext.so:
> cd DBpg/create/postgresql-extension
> make
> sudo make install
  • Create database objects (Look at DBpg_NC/create/README):
>  cd DBpg/create

>  vi pg.env
	- Update PSQLDIR.
	- Update PGDATABASE.
	- Update passwords.

> ./generate_sql_scripts.sql
	- Create password file.
	- Update templates (db, users, sequences).

> ./run_as_postgres.sql
	- Create DB.
	- Activate extensions.
	- Create roles.
	- Create users.

> ./run_sql_scripts.sql
	- Create sequences.
	- Grant privileges on sequences.
	- Create Waveform schema.
	- Create Parametric Information schema.
	- Create Instrument Responses schema.
	- Create Hardware Tracking schema.
	- Create Application schema.
	- Grant privileges on all tables.
	- Create views.
	- Grant privileges on views.
	- Create stored procedures.
	- Create indexes.
	- Create missing objects.
	- Populate specific tables.
	- Remove specific constraints for export/import.
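
A quick sanity check after the scripts complete (assuming, as the replication setup below does,
that the AQMS tables live in the trinetdb schema):

	= \dn
	= SELECT count(*) FROM pg_tables WHERE schemaname = 'trinetdb';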

- Replication Setup:

  • PostgreSQL Server Replication Settings:
1.- Settings in ~/10/data/pg_hba.conf:

 * pp1:
	host    replication  repadmin        127.0.0.1/32          md5 # Localhost
	host    archdb1      repadmin        127.0.0.1/32          md5 # Localhost

	host    replication  repadmin        169.229.197.23/24     md5 # Public pp1&rt1
	host    archdb1      repadmin        169.229.197.23/24     md5 # Public pp1&rt1

	host    replication  repadmin        130.118.43.106/24     md5 # Public pp2&rt2
	host    archdb1      repadmin        130.118.43.106/24     md5 # Public pp2&rt2

	host    replication  repadmin        130.118.119.106/24    md5 # Private pp2&rt2
	host    archdb1      repadmin        130.118.119.106/24    md5 # Private pp2&rt2

 * pp2:
	host    replication  repadmin        127.0.0.1/32          md5 # Local host
	host    archdb1      repadmin        127.0.0.1/32          md5 # Local host

	host    replication  repadmin        169.229.197.23/24     md5 # Public pp1&rt1
	host    archdb1      repadmin        169.229.197.23/24     md5 # Public pp1&rt1

	host    replication  repadmin        130.118.43.106/24     md5 # Public pp2&rt2
	host    archdb1      repadmin        130.118.43.106/24     md5 # Public pp2&rt2

	host    replication  repadmin        130.118.119.106/24    md5 # Private pp2&rt2
	host    archdb1      repadmin        130.118.119.106/24    md5 # Private pp2&rt2

 * rt1:
	host    replication  repadmin        127.0.0.1/32          md5 # Localhost
	host    rtdb1        repadmin        127.0.0.1/32          md5 # Localhost

	host    rtdb1        repadmin        169.229.197.23/24     md5 # Public pp1&rt1
	host    rtdb1        repadmin        130.118.43.106/24     md5 # Public pp2&rt2
	host    rtdb1        repadmin        130.118.119.106/24    md5 # Private pp2&rt2

 * rt2:
	host    replication  repadmin        127.0.0.1/32          md5 # Localhost
	host    rtdb2        repadmin        127.0.0.1/32          md5 # Localhost

	host    rtdb2        repadmin        169.229.197.23/24     md5 # Public pp1&rt1
	host    rtdb2        repadmin        130.118.43.106/24     md5 # Public pp2&rt2
	host    rtdb2        repadmin        130.118.119.106/24    md5 # Private pp2&rt2
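
On PostgreSQL 10+ the parsed pg_hba.conf rules can be checked from psql, and a reload applied,
without restarting the server:

	= SELECT line_number, type, database, user_name, address, auth_method, error FROM pg_hba_file_rules;
	= SELECT pg_reload_conf();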


2.- Settings in ~/10/data/postgresql.conf:

 * pp1:
	listen_addresses = '*'	    # what IP address(es) to listen on;
	wal_level = 'logical'       # pglogical
	max_worker_processes = 10   # pglogical one per database needed on provider node
                            	    # pglogical one per node needed on subscriber node
	max_replication_slots = 80  # pglogical one per node needed on provider node (pp should be twice rt setting)
	max_wal_senders = 3         # pglogical one per node needed on provider node (pp =3)
	shared_preload_libraries = 'replica' # pglogical (replica for pp machines)
	track_commit_timestamp = on # pglogical needed for last/first update wins conflict resolution
                            	    # pglogical property available in PostgreSQL 9.5+
	wal_keep_segments = 512     # (Only pp) in logfile segments, 16MB each; 0 disables
	hot_standby = off           # (Only standby pp) "off" disallows queries during recovery

 * pp2:
	listen_addresses = '*'	    # what IP address(es) to listen on;
	wal_level = 'logical'       # pglogical
	max_worker_processes = 10   # pglogical one per database needed on provider node
                            	    # pglogical one per node needed on subscriber node
	max_replication_slots = 80  # pglogical one per node needed on provider node (pp should be twice rt setting)
	max_wal_senders = 3         # pglogical one per node needed on provider node (pp =3)
	shared_preload_libraries = 'replica' # pglogical (replica for pp machines)
	track_commit_timestamp = on # pglogical needed for last/first update wins conflict resolution
                            	    # pglogical property available in PostgreSQL 9.5+
	wal_keep_segments = 512     # (Only pp) in logfile segments, 16MB each; 0 disables
	hot_standby = on            # (Only standby pp) "off" disallows queries during recovery

 * rt1:
	listen_addresses = '*'	    # what IP address(es) to listen on;
	wal_level = 'logical'       # pglogical
	max_worker_processes = 10   # pglogical one per database needed on provider node
                            	    # pglogical one per node needed on subscriber node
	max_replication_slots = 40  # pglogical one per node needed on provider node (pp should be twice rt setting)
	max_wal_senders = 10        # pglogical one per node needed on provider node (pp =3)
	shared_preload_libraries = 'pglogical' # pglogical (replica for pp machines)
	track_commit_timestamp = on # pglogical needed for last/first update wins conflict resolution
                               	    # pglogical property available in PostgreSQL 9.5+

 * rt2:
	listen_addresses = '*'	    # what IP address(es) to listen on;
	wal_level = 'logical'       # pglogical
	max_worker_processes = 10   # pglogical one per database needed on provider node
                                    # pglogical one per node needed on subscriber node
	max_replication_slots = 40  # pglogical one per node needed on provider node (pp should be twice rt setting)
	max_wal_senders = 10        # pglogical one per node needed on provider node (pp =3)
	shared_preload_libraries = 'pglogical' # pglogical (replica for pp machines)
	track_commit_timestamp = on # pglogical needed for last/first update wins conflict resolution
                            	    # pglogical property available in PostgreSQL 9.5+
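
After editing postgresql.conf and restarting where required, the effective values (and whether a
restart is still pending) can be verified from psql:

	= SELECT name, setting, pending_restart FROM pg_settings
	  WHERE name IN ('wal_level','max_worker_processes','max_replication_slots',
	                 'max_wal_senders','shared_preload_libraries','track_commit_timestamp');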
  • PGLogical Replication From RT1/RT2 To PP1:
1.- Install 2ndQuadrant pglogical extension on rtdb1, rtdb2 & archdb1:

 * As root:
        curl https://access.2ndquadrant.com/api/repository/dl/default/release/10/rpm | bash
        yum install postgresql10-pglogical
        
 * As postgres:
	$ sudo systemctl restart postgresql-10
	$ psql
	= alter system set shared_preload_libraries = 'pglogical';
	= \c archdb1
	= CREATE EXTENSION IF NOT EXISTS pglogical;
	= set pglogical.conflict_resolution = 'last_update_wins';
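
 To confirm that the library is loaded and the extension is present before creating any nodes:

	= SHOW shared_preload_libraries;
	= SELECT extname, extversion FROM pg_extension WHERE extname = 'pglogical';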


2.- Create provider & subscriber nodes:

 * rt1:
	= \c rtdb1
	= SELECT pglogical.drop_node('rtnode1',false);
	= SELECT pglogical.create_node(node_name := 'rtnode1',dsn:='host=rt1 port=5432 dbname=rtdb1 user=repadmin password=repadmin_pass');
	= GRANT CREATE, USAGE ON SCHEMA pglogical TO repadmin;

 * rt2:
	= \c rtdb2
	= SELECT pglogical.drop_node('rtnode2',false);
	= SELECT pglogical.create_node(node_name := 'rtnode2',dsn:='host=rt2 port=5432 dbname=rtdb2 user=repadmin password=repadmin_pass');
        = GRANT CREATE, USAGE ON SCHEMA pglogical TO repadmin;

 * pp1:
	= \c archdb1
	= GRANT CREATE, USAGE ON SCHEMA pglogical TO repadmin;
	= SELECT pglogical.drop_node('ppnode1',false);
        = SELECT pglogical.create_node(node_name := 'ppnode1',dsn:='host=pp1 port=5432 dbname=archdb1 user=repadmin password=repadmin_pass');

 To check the results:
	= SELECT * FROM pglogical.node_interface;


3.- Create replication sets on rt1 & rt2:

	= \c rtdb1
	= SELECT pglogical.drop_replication_set(set_name:='rtrepset');
	= SELECT pglogical.create_replication_set(set_name:='rtrepset', replicate_insert:=TRUE,replicate_update:=TRUE, replicate_delete:=FALSE, replicate_truncate:=FALSE);
	= DO $$
	DECLARE
	count INTEGER := 0;
	reptables varchar[] := ARRAY['amp','ampset',
	'ampsettypes','arrival','assoc_region_group',
	'assocamm','assocamo','assocaro','assoccom','assoccoo',
	'assocevampset','assocnte','assoctypecat','assocwae','autoposter',
	'coda','credit','credit_alias','epochtimebase',
	'event','eventcategory','eventmatchpassingscore','eventprefmag',
	'eventprefmec','eventprefor','eventpriorityparams','eventtype',
	'filename','gazetteer_region','gazetteer_region_group',
	'gazetteerbigtown','gazetteerpt','gazetteerquake','gazetteertown',
	'gazetteertype','jasieventlock','magprefpriority','mec','mecchannel',
	'mecdata','mecfreq','mecfreqdata','mecobject','messagetext','netmag',
	'nettrig','orgprefpriority','origin','origin_error','pathname',
	'remark','request_card','subdir','swarm_events','swarm_state',
	'trig_channel','unassocamp','wavefileroots','waveform','waveroots'];
	tbl varchar;
	BEGIN
	FOREACH tbl IN ARRAY reptables LOOP
	PERFORM
	pglogical.replication_set_add_table(set_name:='rtrepset', relation:='trinetdb.'||tbl,synchronize_data:=TRUE);
	count := count+1;
	END LOOP;
	-- RETURN count ;
	END;
	$$;

 Tables can be removed from a replication set with:
	= SELECT pglogical.replication_set_remove_table('rtrepset', 'trinetdb.<table_name>');
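
 The current contents of a replication set can be inspected through the pglogical catalog tables
 (column names may differ slightly between pglogical versions):

	= SELECT set_name, replicate_insert, replicate_update, replicate_delete, replicate_truncate
	  FROM pglogical.replication_set;
	= SELECT set_reloid FROM pglogical.replication_set_table;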

4.- Create subscriptions on pp1:

	= \c archdb1
	= SELECT pglogical.create_subscription(
	subscription_name := 'rt1subscr',
	provider_dsn := 'host=rt1 port=5432 user=repadmin
	password=repadmin_pass_ucb dbname=rtdb1',
	replication_sets := ARRAY['rtrepset'],
	synchronize_structure := 'FALSE',
	synchronize_data := FALSE
	);

	= SELECT pglogical.create_subscription(
	subscription_name := 'rt2subscr',
	provider_dsn := 'host=rt2 port=5432 user=repadmin
	password=repadmin_pass_wr dbname=rtdb2',
	replication_sets := ARRAY['rtrepset'],
	synchronize_structure := 'FALSE',
	synchronize_data := FALSE
	);

 In case of permission errors you may need to grant repadmin privileges to the schema, e.g.:
	= \c rtdb1
	= GRANT CREATE, USAGE ON SCHEMA pglogical TO repadmin;

 Subscriptions can be dropped with:
	= SELECT pglogical.drop_subscription( subscription_name := 'rt1subscr');
        = SELECT pglogical.drop_subscription( subscription_name := 'rt2subscr');

 To check the replication:
	= \c archdb1
	= SELECT subscription_name,status,provider_node,replication_sets FROM pglogical.show_subscription_status();
 	 subscription_name |   status    | provider_node | replication_sets 
	-------------------+-------------+---------------+------------------
 	 rt1subscr         | replicating | rtnode1       | {rtrepset}
 	 rt2subscr         | replicating | rtnode2       | {rtrepset}

	 if status is not 'replicating', do:
		SELECT pglogical.alter_subscription_synchronize(subscription_name := 'rt1subscr',truncate:=FALSE);
		and/or
		SELECT pglogical.alter_subscription_synchronize(subscription_name := 'rt2subscr',truncate:=FALSE);

 To temporarily disable the subscription:
	= \c archdb1
	= SELECT pglogical.alter_subscription_disable(subscription_name := 'rt1subscr', immediate := TRUE);
	= SELECT pglogical.alter_subscription_disable(subscription_name := 'rt2subscr', immediate := TRUE);

 To re-enable the subscription:
	= \c archdb1
	= SELECT pglogical.alter_subscription_enable(subscription_name := 'rt1subscr', immediate := TRUE);
	= SELECT pglogical.alter_subscription_enable(subscription_name := 'rt2subscr', immediate := TRUE);

 Notes:
	Replication sets provide a mechanism to control which tables in the database will be
	replicated and which actions on those tables will be replicated.

	Each replicated set can specify individually if INSERTs, UPDATEs, DELETEs and TRUNCATEs
	on the set are replicated. Every table can be in multiple replication sets and every
	subscriber can subscribe to multiple replication sets as well. The resulting set of tables
	and actions replicated is the union of the sets the table is in. The tables are not
	replicated until they are added into a replication set.

	There are three pre-existing replication sets named “default”, “default_insert_only” and
	“ddl_sql”. The “default” replication set is defined to replicate all changes to tables in it.
	The “default_insert_only” set only replicates INSERTs and is meant for tables that don’t have a
	primary key. The “ddl_sql” replication set is defined to replicate schema changes specified
	by pglogical.replicate_ddl_command.

	UPDATEs and DELETEs cannot be replicated for tables that lack a PRIMARY KEY or other valid
	replica identity such as a UNIQUE constraint. Replication has no way to find the tuple that
	should be updated/deleted since there is no unique identifier.
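
	If a table without a primary key must replicate UPDATEs and DELETEs, a unique, non-partial
	index over NOT NULL columns can be declared as its replica identity. An illustrative example
	(table and index names are hypothetical):

	= ALTER TABLE trinetdb.sometable REPLICA IDENTITY USING INDEX sometable_uniq_idx;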
  • Streaming Replication Between PP Instances:
1.- Reload or restart the PostgreSQL server for the settings to take effect (parameters such as
    listen_addresses, wal_level and shared_preload_libraries only change on a full restart):
	$ systemctl reload postgresql-10


2.- Create a DB on the standby machine:
	$ /usr/pgsql-10/bin/postgresql-10-setup initdb
	Initializing database ... OK


3.- Copy an image from active DB:
  * Stop server:
	$ systemctl stop postgresql-10

  * Copy an image of active db on active server to local directory as user postgres@pp2:
	$ /usr/pgsql-10/bin/pg_basebackup -h pp1 -R -D /var/lib/pgsql/10/data -U repadmin -v -P

	where:
		 '-h pp1' specifies the active host.
		 '-R' writes a minimal recovery.conf for replication.
                       When you pass -R, it automatically creates a recovery.conf file that
                       contains the role of the DB instance and the details of its master.
                       It is mandatory to create the recovery.conf file on the slave in order
                       to set up a streaming replication.
		 '-D ...' specifies where the image should be put.
		 '-U repadmin' is the replication user name.
		 '-v' specifies verbose mode.
		 '-P' shows progress information during the copy.

 * Start server:
	$ systemctl start postgresql-10.service
        $ pg_ctl -D $PGDATA start


4.- Replication monitoring:

 * Processes:
        On Master
        ==========
        $ ps -eaf | grep sender

        On Slave
        ==========
        $ ps -eaf | grep receiver
        $ ps -eaf | grep startup

        You should see the sender process running on the master, and the receiver and startup
        processes running on the slave.

 * SELECT * FROM pg_stat_replication;
	will return useful information on active DB and empty on standby DB.

 * SELECT * FROM pg_stat_wal_receiver;
 	will return useful information on standby DB but empty on active DB.

 * SELECT pg_is_in_recovery();
	will return TRUE on standby DB and FALSE on other nodes.
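
 * Replication lag can also be estimated on the active node by comparing LSNs (PostgreSQL 10
   function and column names):

	= SELECT application_name, state,
	         pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
	  FROM pg_stat_replication;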

5.- Remarks:

 * Archive mode:
        To run in archive mode:
          1.- Set the following parameters in postgresql.conf:
              - archive_mode : Must be set to ON to enable archiving of WALs.
              - archive_command : This parameter takes a shell command or external
                programs. It can be a simple copy command to copy the WAL segments
                to another location or a script that has the logic to archive the
                WALs to S3 or a remote backup server.
          2.- Set the parameter 'restore_command' appropriately in recovery.conf.
              This parameter takes a shell command (or a script) that can be used to
              fetch the WAL needed by a standby if the WAL is not available on the active
              node.
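
          These parameters can also be set from psql with ALTER SYSTEM; a sketch (the archive
          path below is hypothetical):

	    = ALTER SYSTEM SET archive_mode = on;     -- takes effect after a server restart
	    = ALTER SYSTEM SET archive_command = 'cp %p /work/wal_archive/%f';  -- hypothetical location
	    = SELECT pg_reload_conf();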

        You may not need archive_mode, archive_command, or restore_command for
        streaming replication. However, if the standby falls behind (because of
        network lag or any other reason) and the WAL segments in pg_xlog/pg_wal
        are recycled before the standby has received them, the standby can never
        get back in sync with the active node unless the WALs were archived.
        
        If you use streaming replication without file-based continuous archiving,
        the server might recycle old WAL segments before the standby has received
        them. If this occurs, the standby will need to be reinitialized from a new
        base backup. You can avoid this by setting wal_keep_segments to a value
        large enough to ensure that WAL segments are not recycled too early, or by
        configuring a replication slot for the standby.
        
        Replication slots provide an automated way to ensure that the active node does not
        remove WAL segments until they have been received by all standbys, and that the
        active node does not remove rows which could cause a recovery conflict even when the
        standby is disconnected.
        In lieu of using replication slots, it is possible to prevent the removal of old WAL
        segments using wal_keep_size, or by storing the segments in an archive using
        archive_command. However, these methods often result in retaining more WAL segments
        than required, whereas replication slots retain only the number of segments known to
        be needed. On the other hand, replication slots can retain so many WAL segments that
        they fill up the space allocated for pg_wal; max_slot_wal_keep_size limits the size of
        WAL files retained by replication slots.
        
        You can create a replication slot like this:

          = SELECT * FROM pg_create_physical_replication_slot('node_a_slot');
            slot_name  | lsn
          -------------+-----
           node_a_slot |

          = SELECT slot_name, slot_type, active FROM pg_replication_slots;
            slot_name  | slot_type | active 
          -------------+-----------+--------
           node_a_slot | physical  | f
          (1 row)
          
        To configure the standby to use this slot, primary_slot_name should be configured on the 
        standby. Here is a simple example (in recovery.conf):

          primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass'
          primary_slot_name = 'node_a_slot'
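
        The amount of WAL being held back for a slot can be monitored on the active node:

	  = SELECT slot_name, active,
	           pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_bytes
	    FROM pg_replication_slots;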

 * Failover:
        If the primary server fails then the standby server should begin failover procedures.

        If the standby server fails then no failover need take place. If the standby server can
        be restarted, even some time later, then the recovery process can also be restarted
        immediately, taking advantage of restartable recovery. If the standby server cannot be
        restarted, then a full new standby server instance should be created.

        If the primary server fails and the standby server becomes the new primary, and then the
        old primary restarts, you must have a mechanism for informing the old primary that it is
        no longer the primary. This is sometimes known as STONITH (Shoot The Other Node In The Head),
        which is necessary to avoid situations where both systems think they are the primary, which
        will lead to confusion and ultimately data loss.

        Many failover systems use just two systems, the primary and the standby, connected by some
        kind of heartbeat mechanism to continually verify the connectivity between the two and the
        viability of the primary. It is also possible to use a third system (called a witness server)
        to prevent some cases of inappropriate failover, but the additional complexity might not be
        worthwhile unless it is set up with sufficient care and rigorous testing.

        PostgreSQL does not provide the system software required to identify a failure on the primary
        and notify the standby database server. Many such tools exist and are well integrated with
        the operating system facilities required for successful failover, such as IP address migration.

        Once failover to the standby occurs, there is only a single server in operation. This is known
        as a degenerate state. The former standby is now the primary, but the former primary is down
        and might stay down. To return to normal operation, a standby server must be recreated, either
        on the former primary system when it comes up, or on a third, possibly new, system. The
        pg_rewind utility can be used to speed up this process on large clusters. Once complete,
        the primary and standby can be considered to have switched roles. Some people choose to use a
        third server to provide backup for the new primary until the new standby server is recreated,
        though clearly this complicates the system configuration and operational processes.

        So, switching from primary to standby server can be fast but requires some time to re-prepare
        the failover cluster. Regular switching from primary to standby is useful, since it allows
        regular downtime on each system for maintenance. This also serves as a test of the failover
        mechanism to ensure that it will really work when you need it. Written administration
        procedures are advised.

        To trigger failover of a log-shipping standby server, run pg_ctl promote, call pg_promote(),
        or create a trigger file with the file name and path specified by the promote_trigger_file.
        If you're planning to use pg_ctl promote or to call pg_promote() to fail over, promote_trigger_file
        is not required. If you're setting up the reporting servers that are only used to offload
        read-only queries from the primary, not for high availability purposes, you don't need to
        promote it.
        
        pg_rewind — Synchronize a PostgreSQL data directory with another data directory that was
                    forked from it.
                    
        repmgr    — Open-source tool suite for managing replication and failover in a cluster of PostgreSQL
                    servers. It enhances PostgreSQL's built-in hot-standby capabilities with tools to set
                    up standby servers, monitor replication, and perform administrative tasks such as
                    failover or manual switchover operations.
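
        On PostgreSQL 12 and later the promotion mentioned above can also be triggered from SQL;
        on PostgreSQL 10/11 use pg_ctl promote or the trigger file instead:

	  = SELECT pg_promote(wait := true, wait_seconds := 60);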
  • DDL and replication:
        - DDL commands are NOT supported in the pglogical replication but pglogical provides the
          pglogical.replicate_ddl_command function to allow DDL to be run on the provider and
          subscriber at a consistent point.
          
          pglogical.replicate_ddl_command(command text, replication_sets text[])
              Execute locally and then send the specified command to the replication queue for
              execution on subscribers which are subscribed to one of the specified
              replication_sets.
              Parameters:
                   command – DDL query to execute
                   replication_sets – array of replication sets which this command should be
                                      associated with, default “{ddl_sql}”
              Examples:
                   SELECT pglogical.replicate_ddl_command('alter table trinetdb.swarm_state add testfield bigint', '{rtrepset}');
                   SELECT pglogical.replicate_ddl_command('alter table trinetdb.swarm_state drop column testfield', '{rtrepset}');
              Notes:
                   To prevent the following error:
                        ERROR:  permission denied for schema pglogical
                   Grant the following permission:
                        GRANT USAGE ON SCHEMA pglogical TO <user>;
                                      
        - Streaming replication reads changes at the block level from the write-ahead logs, so
          the databases are kept physically the same. Since DDL is, just like DML, recorded in the
          WAL, DDL and DML are both transparently replicated.

- REPMGR setup:

REPMGR SETUP ON PP1/PP2:
------------------------

1.- Install 2ndQuadrant repmgr extension on archdb1:

 * As root on the primary server (pp1):
	% curl https://dl.2ndquadrant.com/default/release/get/10/rpm | bash
	% yum install repmgr10

 * The repmgr extension can only be installed by a superuser. If the repmgr user is a superuser, repmgr will create the extension automatically.
   Alternatively, the extension can be created manually by a superuser (with "CREATE EXTENSION repmgr") before executing repmgr primary register.

2.- Update postgresql.conf file on pp1 & pp2:

	max_wal_senders = 10
	max_replication_slots = 10
	wal_level = 'replica'
	hot_standby = on
	archive_mode = on
	archive_command = '/bin/true'

3.- Create a repmgr database on primary server (pp1):
	% createdb repmgr -O repadmin

4.- Update pg_hba.conf file on pp1 & pp2:

  * On pp1:
	local   replication        repadmin                                trust
	local   repmgr             repadmin                                trust

	host   replication         repadmin    127.0.0.1/32                trust #localhost
	host   repmgr              repadmin    127.0.0.1/32                trust #localhost

	host    replication        repadmin    169.229.197.23/32           trust #public pp1
	host    repmgr             repadmin    169.229.197.23/32           trust #public pp1

	host    replication        repadmin    130.118.43.106/32           trust #public pp2
	host    repmgr             repadmin    130.118.43.106/32           trust #public pp2

	host    archdb1            repadmin    169.229.197.29/32           trust #public rt1
	host    archdb1            repadmin    130.118.43.105/32           trust #public rt2

  * On pp2:
	local   replication         repadmin                               trust
	local   repmgr              repadmin                               trust

	host   replication         repadmin    127.0.0.1/32                trust #localhost
	host   repmgr              repadmin    127.0.0.1/32                trust #localhost

	host   replication         repadmin    130.118.43.106/32           trust #public pp2
	host   repmgr              repadmin    130.118.43.106/32           trust #public pp2

	host   replication         repadmin    169.229.197.23/32           trust #public pp1
	host   repmgr              repadmin    169.229.197.23/32           trust #public pp1

	host   archdb1             repadmin    169.229.197.29/32           trust #public rt1
	host   archdb1             repadmin    130.118.43.105/32           trust #public rt2

5.- Prepare the standby (pp2):

  * On the standby, check that the primary database is reachable:
	% psql 'host=pp1 user=repadmin dbname=repmgr connect_timeout=2'

6.- Create repmgr configuration file on pp1 & pp2:

  * On the primary server (pp1):
	% cat /etc/repmgr/10/repmgr.conf
		node_id=1
		node_name=pp1
		conninfo='host=pp1 user=repadmin dbname=repmgr connect_timeout=2 application_name=pp1'
		data_directory =  '/home/pgsql/data'
		service_start_command = 'sudo /bin/systemctl start postgresql-10'
		service_stop_command = 'sudo /bin/systemctl stop postgresql-10'
		service_restart_command = 'sudo /bin/systemctl restart postgresql-10'
		service_reload_command = 'sudo /bin/systemctl reload postgresql-10'

  * On the standby server (pp2):
	% cat /etc/repmgr/10/repmgr.conf
		node_id=2
		node_name=walreceiver
		conninfo='host=pp2 user=repadmin dbname=repmgr connect_timeout=2 application_name=walreceiver'
		data_directory = '/home/pgsql/data'
		service_start_command = 'sudo /bin/systemctl start postgresql-10'
		service_stop_command = 'sudo /bin/systemctl stop postgresql-10'
		service_restart_command = 'sudo /bin/systemctl restart postgresql-10'
		service_reload_command = 'sudo /bin/systemctl reload postgresql-10'

7.- Register the servers:

  * On the primary server (pp1):
	% repmgr -f /etc/repmgr/10/repmgr.conf primary register
	% repmgr -f /etc/repmgr/10/repmgr.conf cluster show

  * On the standby server (pp2):
	% repmgr -f /etc/repmgr/10/repmgr.conf standby register
	% repmgr -f /etc/repmgr/10/repmgr.conf cluster show
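
  * Once both nodes are registered, the repmgr metadata can also be inspected directly in the
    repmgr database, e.g.:

	= SELECT node_id, node_name, type, active, conninfo FROM repmgr.nodes;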

- BARMAN setup:

BARMAN setup on pp1 - Target DB (archdb1@pp1) - Using continuously streaming backups with WALs:


* Install PostgreSQL.

* Install Barman:

  $ curl https://dl.2ndquadrant.com/default/release/get/deb | sudo bash
  $ sudo apt-get update
  $ sudo apt-get install barman

* Configure Barman: (as barman)

  - Update /etc/barman/barman.conf:
	[barman]
	barman_user = barman
	configuration_files_directory = /etc/barman/conf.d
	barman_home = /var/lib/barman
	log_file = /var/log/barman/barman.log
	log_level = INFO
	retention_policy = RECOVERY WINDOW OF 1 WEEKS

* Configure server connection: (as barman)

  - Create /etc/barman/conf.d/pg.conf:
	[pg]
	description =  "archdb1 Database (Streaming-Only)"
	conninfo = host=pp1 user=barman dbname=archdb1
	streaming_conninfo = host=pp1 user=barman
	backup_method = postgres
	streaming_archiver = on
	slot_name = barman

* Set up storage directory: (as barman)

  $ ll -l /var/lib/barman/
  total 0
  lrwxrwxrwx 1 barman barman 19 Feb  1 19:31 pg -> /work/pp1/barman/pg
  $ 

* Add 'export PATH=$PATH:/usr/pgsql-10/bin/' to ~barman/.bash_profile (as barman)

* Create barman user for PostgreSQL: (as postgres)

  $ createuser -s -P barman

* Test psql connection: (as barman)

  - Create ~barman/.pgpass:
	pp1:5432:*:barman:barman_pass_ucb

  - Test connection:
	psql -c 'SELECT version()' -U barman -h pp1 archdb1

* Configure PostgreSQL: (as postgres)

  - Update pg_hba.conf to include:
	# barman
	host	archdb1	    barman	    169.229.197.23/32	     md5 #public pp1
	host	replication barman	    169.229.197.23/32        md5 #public pp1

* Test replication connection: (as barman)

  $ psql -U barman -h pp1 -c 'IDENTIFY_SYSTEM' replication=1

* Set up SSH for restores: (as barman)

  - Run: $ ssh-keygen -t rsa
  - Save the key to ~/.ssh/pg
  - Copy the value of the public key in ~/.ssh/pg.pub
  - On the PostgreSQL server, open or create ~postgres/.ssh/authorized_keys and add the public key

* Create replication slot: (as barman)

  $ barman receive-wal --create-slot pg
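
  The slot can then be verified from psql on pp1 (slot_name = barman, as configured in pg.conf):

	= SELECT slot_name, slot_type, active FROM pg_replication_slots WHERE slot_name = 'barman';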

* Run maintenance tasks: (as barman)

  $ barman cron
  Starting WAL archiving for server pg
  Starting streaming archiver for server pg
  $

* Check the server configuration: (as barman)

  $ barman check pg
  Server pg:
	PostgreSQL: OK
	superuser or standard user with backup privileges: OK
	PostgreSQL streaming: OK
	wal_level: OK
	replication slot: OK
	directories: OK
	retention policy settings: OK
	backup maximum age: OK (no last_backup_maximum_age provided)
	backup minimum size: OK (0 B)
	wal maximum age: OK (no last_wal_maximum_age provided)
	wal size: OK (0 B)
	compression settings: OK
	failed backups: OK (there are 0 failed backups)
	minimum redundancy requirements: OK (have 0 backups, expected at least 0)
	pg_basebackup: OK
	pg_basebackup compatible: OK
	pg_basebackup supports tablespaces mapping: OK
	systemid coherence: OK (no system Id stored on disk)
	pg_receivexlog: OK
	pg_receivexlog compatible: OK
	receive-wal running: OK
	archiver errors: OK
$ 

* Perform a full backup: (as barman)

  $ barman backup pg
  Starting backup using postgres method for server pg in /var/lib/barman/pg/base/20220201T200921
  Backup start at LSN: CE/8CBB0E90
  Starting backup copy via pg_basebackup for 20220201T200921
  Copy done (time: 1 minute, 39 seconds)
  Finalising the backup.
  This is the first backup for server pg
  Backup size: 7.9 GiB
  Backup end at LSN: CE/8CCF8438 (0000000C000000CE0000008C, 00CF8438)
  Backup completed (start time: 2022-02-01 20:09:21.534511, elapsed time: 1 minute, 40 seconds)
  WARNING: IMPORTANT: this backup is classified as WAITING_FOR_WALS, meaning that Barman has not received yet all the required WAL files for the backup consistency.
  This is a common behaviour in concurrent backup scenarios, and Barman automatically set the backup as DONE once all the required WAL files have been archived.
  Hint: execute the backup command with '--wait'
  $ 

* List available backups: (as barman)

  $ barman list-backup pg
  pg 20220201T200921 - Tue Feb  1 20:11:01 2022 - Size: 7.9 GiB - WAL Size: 0 B
  $ 
  
* Delete a backup: (as barman)

  $ barman delete pg 20220203T162634
  Deleting backup 20220203T162634 for server pg
  Delete associated WAL segments:
	0000000C000000CF0000008B
	0000000C000000CF0000008C
	0000000C000000CF0000008D
	0000000C000000CF0000008E
	0000000C000000CF0000008F
  Deleted backup 20220203T162634 (start time: Sat Feb  5 09:10:45 2022, elapsed time: less than one second)
  $ 

* Check the status of the PostgreSQL server: (as barman)

  $ barman status pg
  Server pg:
	Description: archdb1 Database (Streaming-Only)
	Active: True
	Disabled: False
	PostgreSQL version: 10.10
	Cluster state: in archive recovery
	pgespresso extension: Not available
	Current data size: 6.4 GiB
	PostgreSQL Data directory: /var/lib/pgsql/10/data
	Passive node: False
	Retention policies: enforced (mode: auto, retention: RECOVERY WINDOW OF 1 WEEKS, WAL retention: MAIN)
	No. of available backups: 1
	First available backup: 20220201T200921
	Last available backup: 20220201T200921
	Minimum redundancy requirements: satisfied (1/0)
  $ 

* Check the streaming status: (as barman)

  $ barman replication-status pg
  Status of streaming clients for server 'pg':
  Current LSN on master: CE/92C33B00
  Number of streaming clients: 1

  1. Async WAL streamer
     Application name: barman_receive_wal
     Sync stage      : 3/3 Remote write
     Communication   : TCP/IP
     IP Address      : 169.229.197.23 / Port: 42350 / Host: -
     User name       : barman
     Current state   : streaming (async)
     Replication slot: barman
     WAL sender PID  : 32160
     Started at      : 2022-02-01 20:26:30.779840-08:00
     Sent LSN   : CE/92C33B00 (diff: 0 B)
     Write LSN  : CE/92C33B00 (diff: 0 B)
     Flush LSN  : CE/92000000 (diff: -12.2 MiB)
$ 

* Set up cron jobs: (as barman)

  * * * * * /usr/bin/barman cron
  0 0 * * * export PATH=$PATH:/usr/pgsql-10/bin/;/usr/bin/barman backup pg

* Recovery:

  - Stop the PostgreSQL server.
  - Get a list of possible backup spots: $ barman list-backup pg
  - Get details for the backup you choose: $ barman show-backup pg <backup_id>. Note that <backup_id> is the timestamp of the backup.
  - Run: $ barman recover --remote-ssh-command "ssh -i ~/.ssh/pg -o StrictHostKeyChecking=no postgres@pp1" --target-time="<YYYY-MM-DD HH:MM:SS.SSSSSS+00:00>" pg <backup_id> /var/lib/pgsql/10/data
  - Restart the PostgreSQL server.

NC AQMS Test System

- Topology:

  • 169.229.197.29 - rt1.geo.berkeley.edu (192.168.116.29 - rt1.ncss-ucb.cisn.org)
  • 169.229.197.23 - pp1.geo.berkeley.edu (192.168.116.23 - pp1.ncss-ucb.cisn.org)
  • 130.118.43.105 - rt2.wr.usgs.gov (130.118.119.105 - private)
  • 130.118.43.106 - pp2.wr.usgs.gov (130.118.119.106 - private)

- Replicated tables:

  • All tables are replicated between pp1 (archdb1) and pp2 (archdb1 [same DB name on both PP]).
  • A subset of tables (from the PI, WF & AP schema) are replicated from rt1 (rtdb1) and rt2 (rtdb2) to pp1 or pp2 (tables in bold were removed from the replication set on rtdb1 & rtdb2):
    • amp
    • ampset
    • ampsettypes (loaded via DDL script)
    • arrival
    • assoc_region_group (loaded via DDL script - Not used in NC)
    • assocamm
    • assocamo
    • assocaro
    • assoccom
    • assoccoo
    • assocevampset
    • assocnte
    • assoctypecat (loaded via DDL script)
    • assocwae
    • autoposter (Populated manually)
    • coda
    • credit
    • credit_alias (Populated manually)
    • epochtimebase (loaded via DDL script)
    • event
    • eventcategory (loaded via DDL script)
    • eventmatchpassingscore (Populated manually)
    • eventprefmag
    • eventprefmec
    • eventprefor
    • eventpriorityparams (Populated manually)
    • eventtype (loaded via DDL script NOT COMPLETE)
    • filename
    • gazetteer_region (loaded via regionDB)
    • gazetteer_region_group (loaded via DDL script)
    • gazetteerbigtown (loaded via placesDB)
    • gazetteerpt (loaded via placesDB & quarryDB)
    • gazetteerquake (loaded via placesDB)
    • gazetteertown (loaded via placesDB)
    • gazetteertype (Populated manually)
    • jasieventlock
    • magprefpriority (loaded via magprefDB)
    • mec
    • mecchannel
    • mecdata
    • mecfreq
    • mecfreqdata
    • mecobject
    • messagetext (loaded via DDL script)
    • netmag
    • nettrig
    • orgprefpriority (Populated manually)
    • origin
    • origin_error
    • pathname
    • remark
    • request_card
    • subdir
    • swarm_events
    • swarm_state
    • trig_channel
    • unassocamp
    • wavefileroots (Not used in NC)
    • waveform
    • waveroots (Populated manually)

- Non replicated tables:

  • AP Schema:
    • alarm_action
    • appchannels (updated via StationUI)
    • applications (updated via StationUI)
    • assocevents
    • assocwfrc
    • channelmap_ampparms (updated via codaparamsDB)
    • channelmap_codaparms (updated via codaparamsDB)
    • config_channel (?)
    • leap_seconds (loaded via DDL script)
    • pcs_signal
    • pcs_state
    • pcs_transition
    • peer_system_status
    • program (?)
    • rt_role
    • stacorrections (updated via codaparamsDB (Md) & loadCISNmagcorr (Ml))
    • subnet (used only at Caltech to populate EW carlsubtrig file)
    • subnet_channel (used only at Caltech to populate EW carlsubtrig file)
    • system_status
  • IR Schema: (updated via stnxml2aqms)
    • channel_comment
    • channel_data
    • coefficients
    • d_abbreviation
    • d_comment
    • d_format
    • d_format_data
    • d_unit
    • dc
    • dc_data
    • decimation
    • dm
    • pn
    • pn_data
    • poles_zeros
    • polynomial
    • pz
    • pz_data
    • sensitivity
    • simple_response
    • station_comment
    • station_data
  • HT Schema: (not used)
    • datalogger
    • datalogger_board
    • datalogger_module
    • filamp
    • filamp_pchannel
    • filter
    • filter_fir
    • filter_fir_data
    • filter_sequence
    • filter_sequence_data
    • response
    • response_hp
    • response_lp
    • response_pn
    • response_pn_data
    • response_pz
    • sensor
    • sensor_component
    • station
    • station_datalogger
    • station_datalogger_lchannel
    • station_datalogger_pchannel
    • station_digitizer
    • station_digitizer_pchannel
    • station_filamp
    • station_filamp_pchannel
    • station_sensor
    • station_sensor_component

- Views:

  • active_channels_view (updated via AppChan.sql)
  • all_channels_view (updated via AppChan.sql)
  • jasi_ampparms_view
  • jasi_channel_view
  • jasi_config_view
  • jasi_response_view
  • jasi_station_view
  • md_magparms_view
  • ml_magparms_view

- DB objects created manually:

  • pdl_product (table - RT/PP - replicated)
  • pdl_ids (table - RT/PP - replicated)
  • dbsetseq (sequence - RT/PP)
  • gazetteerline (table - RT/PP - not replicated) (Not used in NC)
  • gazetteerquarry (table - RT/PP - not replicated) (loaded via quarryDB)
  • realarm (table - PP)
  • logger_info (table - PP) (updated via ?)
  • CS schema: (tables - RT/PP - not replicated) (loaded via DDL script except for c_channeldata)
    • c_channeldata (updated via stnxml2aqms)
    • c_auth
    • c_datum
    • c_fltrtyp
    • c_lgrtyp
    • c_loggerdata
    • c_net
    • c_ornt
    • c_orntabbrv
    • c_phystyp
    • c_rectyp
    • c_sensordata
    • c_site
    • c_snstyp
    • c_timeclk
    • c_units

- Database Sequences:

Values ending in:	are generated on:
	1,6		RTDB1
	2,7		RTDB2
	3,8		ARCHDB1
(A sketch of how these interleaved sequences can be defined follows the list below.)
  • PI Schema:
    • evseq ⇒ Event.evid
    • orseq ⇒ Origin.orid
    • magseq ⇒ Netmag.magid
    • arseq ⇒ Arrival.arid
    • ampseq ⇒ Amp.ampid
    • commseq ⇒ Remark.commid
    • mecseq ⇒ Mec.mecid
    • mecdataseq ⇒ MecData.mecdataid
    • mecfreqseq ⇒ MecFreq.mecfreqid
    • coseq ⇒ Coda.coid
    • ampsetseq ⇒ AmpSet.ampsetid
    • catseq ⇒ EventCategory.catid
  • WF Schema:
    • waseq ⇒ Waveform.wfid
    • fiseq ⇒ Filename.fileid
    • sdseq ⇒ SubDir.subdirid
  • AP Schema:
    • unassocseq ⇒ UnAssocAmp.ampid
    • sigseq ⇒ PCS_Signal.sigid
    • reqseq ⇒ Request_Card.rcid
    • dbsetseq ⇒ PDL_Ids.dbsetid
    • ntseq ⇒ NetTrig.ntid
    • trigseq ⇒ Trig_Channel.trigid
    • gazseq ⇒ GazetteerPt.gazid (Not used in NC)
  • IR Schema:
    • dcseq ⇒ DC.key
    • dmseq ⇒ DM.key
    • abbseq ⇒ D_Abbreviation.id
    • uniseq ⇒ D_Unit.id
    • comseq ⇒ D_Comment.id
    • forseq ⇒ D_Format.id
    • poseq ⇒ PN.key
    • pzseq ⇒ PZ.key
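
To get the interleaved endings shown above, each sequence is presumably created with an increment
of 5 and a per-database start value, so that ids generated on the three databases never collide.
A sketch (the actual definitions live in create_sequences.sql):

	* rtdb1:	= CREATE SEQUENCE evseq INCREMENT BY 5 START WITH 1;   -- 1, 6, 11, ...
	* rtdb2:	= CREATE SEQUENCE evseq INCREMENT BY 5 START WITH 2;   -- 2, 7, 12, ...
	* archdb1:	= CREATE SEQUENCE evseq INCREMENT BY 5 START WITH 3;   -- 3, 8, 13, ...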

- Database Users:

User			Description
browser			Guest user (read only)
code			Stored procedures owner
dcarchiver		Waveform archiver user
operator		Jiggle user
postgres		Super user
repadmin		Replication user
rtem			Real time user
tpp			Post processing user
trinetdb		Schema owner
trinetdb_execute	Role (execute stored procedures)
trinetdb_read		Role (read access)
trinetdb_write		Role (write access)
waveform_read		Role (read waveforms via DB (jiggle))
                                                          List of roles
    Role name     |                         Attributes                         |                    Member of                    
------------------+------------------------------------------------------------+-------------------------------------------------
 browser          |                                                            | {trinetdb_read}
 code             |                                                            | {trinetdb_read,trinetdb_write}
 dcarchiver       |                                                            | {trinetdb_read,trinetdb_write,trinetdb_execute}
 operator         |                                                            | {trinetdb_write,trinetdb_execute}
 postgres         | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
 repadmin         | Replication                                                | {}
 rtem             |                                                            | {trinetdb_read,trinetdb_write,trinetdb_execute}
 tpp              |                                                            | {trinetdb_read,trinetdb_write,trinetdb_execute}
 trinetdb         |                                                            | {trinetdb_execute}
 trinetdb_execute | Cannot login                                               | {}
 trinetdb_read    | Cannot login                                               | {waveform_read}
 trinetdb_write   | Cannot login                                               | {trinetdb_read}
 waveform_read    | Cannot login                                               | {}
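
Role memberships like those shown above are managed with ordinary GRANT statements, for example:

	= GRANT trinetdb_read TO browser;
	= GRANT trinetdb_read, trinetdb_write, trinetdb_execute TO rtem;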

- Ports:

  • PostgreSQL listener: 5432
  • Jiggle solution server: 6600 and 6601
  • Winston wave server: 16023 or 16024
  • Proxy Wave Server: 9321
Setup of export_ack ports for picks, codas, carlstatrigs for testing:
    192.168.200.51  ucbns1.ucb.cisn.org         port: 5207  Module ID: MOD_EXPORT_PKTRIG_UCBNS1_BK3
    192.168.200.52  ucbns2.ucb.cisn.org         port: 5207  Module ID: MOD_EXPORT_PKTRIG_UCBNS2_BK3
    192.168.200.51  ucbns1.ucb.cisn.org         port: 5208  Module ID: MOD_EXPORT_PKTRIG_UCBNS1_BK4
    192.168.200.52  ucbns2.ucb.cisn.org         port: 5208  Module ID: MOD_EXPORT_PKTRIG_UCBNS2_BK4 
    192.168.201.76  mnlons1.mp.cisn.org         port: 16003 Module ID: MOD_EXP_PKTRG2UCB3
    192.168.201.77  mnlons2.mp.cisn.org         port: 16003 Module ID: MOD_EXP_PKTRG2UCB3
    192.168.119.76  mnlons1.ncss-mp.cisn.org    port: 16003 Module ID: MOD_EXP_PKTRG2UCB3
    192.168.119.77  mnlons2.ncss-mp.cisn.org    port: 16003 Module ID: MOD_EXP_PKTRG2UCB3

Setup of ADA feed ports for testing:
    on adadup_master servers (ucbns1, ucbns2, mnlons1, mnlons2)
    SLAVE_IP_ADDRESS  <IPaddress of RT test>
    SLAVE_IP_PORT     45101 (for mnlons1)
    SLAVE_IP_PORT     45102 (for mnlons2)
    SLAVE_IP_PORT     45121 (for ucbns1)
    SLAVE_IP_PORT     45122 (for ucbns2)

    on adadup_slave (the test RT) setup one processs for each ADA master:
    MASTER_IP_ADDRESS   ucbns1.seismo.berkeley.edu      SLAVE_IP_PORT   45121
    MASTER_IP_ADDRESS   ucbns2.seismo.berkeley.edu      SLAVE_IP_PORT   45122
    MASTER_IP_ADDRESS   mnlons1.ncss-mp.cisn.org        SLAVE_IP_PORT   45101
    MASTER_IP_ADDRESS   mnlons2.ncss-mp.cisn.org        SLAVE_IP_PORT   45102

- Permits needed:

-----------------------
Permits needed at BSL
-----------------------
Source				Destination		Reason

Menlo mnlort1:<random>		BSL ucbns1:5206	        Exporting picks from BSL NS to Menlo RT 
Menlo mnlort1:<random>		BSL ucbns2:5206	        Exporting picks from BSL NS to Menlo RT 
BSL ucbrt:<random>		BSL ucbns1:5203	        Exporting picks from BSL NS to BSL RT   
BSL ucbrt:<random>		BSL ucbns2:5203	        Exporting picks from BSL NS to BSL RT   
BSL VboxRT:<random>             BSL ucbns1:5207         Exporting picks from BSL NS to BSL Postgres test RT
BSL VboxRT:<random>             BSL ucbns2:5207         Exporting picks from BSL NS to BSL Postgres test RT
Menlo mnlons1:<random>		BSL ucbrt:45101		Pushing ada from Menlo to BSL // USGS: need to setup a new adadup_master instance on mnlons1 //
Menlo mnlons2:<random>		BSL ucbrt:45102		Pushing ada from Menlo to BSL // USGS: need to setup a new adadup_master instance on mnlons2 //
BSL ucbns1:<random>		BSL ucbrt:45121		Pushing ada from BSL to BSL // BSL: need to setup a new adadup_master instance on ucbns1
BSL ucbns2:<random>		BSL ucbrt:45122		Pushing ada from BSL to BSL // BSL: need to setup a new adadup_master instance on ucbns2

Menlo mnlort1:<random>		BSL ucbrt:9322		Proxy waveserver for AZ,BK,BP,LB,PB,US nets
Menlo mnlort1:<random>		BSL ucbrt:9323		Proxy waveserver for archived continuous

Menlo mnlort1:<random>		BSL ucbrt:5432		Postgres between RTs
Menlo mnlodb1:<random>		BSL ucbpp:5432		Postgres between PPs

-----------------------
Permits needed at USGS
-----------------------
Source				Destination		Reason

BSL ucbrt:<random>		Menlo mnlons1:16002	Exporting picks from Menlo NS to BSL RT 
BSL ucbrt:<random>		Menlo mnlons2:16002	Exporting picks from Menlo NS to BSL RT 
BSL seiche:<random>             Menlo mnlons1:16001     Exporting picks from Menlo NS to BSL test RT
BSL VboxRT:<random>             Menlo mnlons1:16003     Exporting picks from Menlo NS to BSL Postgres test RT
BSL VboxRT:<random>             Menlo mnlons2:16003     Exporting picks from Menlo NS to BSL Postgres test RT
Menlo mnlort1:<random>		Menlo mnlons1:15999	Exporting picks from Menlo NS to Menlo RT  // spare already existing at 15998 // 
Menlo mnlort1:<random>		Menlo mnlons2:15999	Exporting picks from Menlo NS to Menlo RT 
BSL ucbns1:<random>		Menlo mnlort1:45121	Pushing ada from BSL to Menlo  // BSL: need to setup a new adadup_master instance on ucbns1
BSL ucbns2:<random>		Menlo mnlort1:45122	Pushing ada from BSL to Menlo  // BSL: need to setup a new adadup_master instance on ucbns2
Menlo mnlons1:<random>		Menlo mnlort1:45101	Pushing ada from Menlo to Menlo // USGS: need to setup a new adadup_master instance on mnlons1
Menlo mnlons2:<random>		Menlo mnlort1:45102	Pushing ada from Menlo to Menlo // USGS: need to setup a new adadup_master instance on mnlons2

BSL ucbrt:<random>		Menlo mnlort1:9322	Proxy waveserver for all NCSN
BSL ucbrt:<random>		Menlo mnlort1:9327	Proxy waveserver for CI, NN

BSL ucbrt:<random>		Menlo mnlort:5432	Postgres between RTs
BSL ucbpp:<random>		Menlo mnlodb1:5432	Postgres between PPs

Note: VboxRT still needs an IP address assigned.

- PL/Java setup: GitLab

  • Install PL/Java:
 * As aqms:
        $ cd /home/aqms/PL_Java
        $ wget https://gitlab.com/aqms-swg/aqms-jiggle/-/raw/jc-issue-128-pljava/PL_Java/centOS%207/pljava-pg10.20-amd64-Linux-gpp.jar
        $ wget https://gitlab.com/aqms-swg/aqms-jiggle/-/package_files/34725732/download
        $ export PATH="/usr/pgsql-10/bin:$PATH"
        $ sudo java -jar /home/aqms/PL_Java/pljava-pg10.20-amd64-Linux-gpp.jar

 * Please download the prebuilt PL/Java package that matches your operating system and the
   version of PostgreSQL running on your machine.
  • Install PL/Java PostgreSQL Extension:
 * As postgres:
        $ psql -U postgres rtdb1
        = ALTER DATABASE rtdb1 SET pljava.libjvm_location TO '/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.322.b06-1.el7_9.x86_64/jre/lib/amd64/server/libjvm.so';
        = \q

        $ psql -U postgres rtdb1
        = create schema if not exists db_cleanup authorization code;
        = grant usage on schema db_cleanup to trinetdb_read,trinetdb_execute;
         
        = create schema if not exists eventpriority authorization code;
        = grant usage on schema eventpriority to trinetdb_read,trinetdb_execute;

        = create schema if not exists file_util authorization code;
        = grant usage on schema file_util to trinetdb_read,trinetdb_execute;

        = create schema if not exists formats authorization code;
        = grant usage on schema formats to trinetdb_read,trinetdb_execute;

        = create schema if not exists wavearc authorization code;
        = grant usage on schema wavearc to trinetdb_read,trinetdb_execute;

        = create extension pljava;
        = select sqlj.install_jar('file:/home/aqms/PL_Java/your-jiggle.jar','jigglejar',true);
        = select sqlj.set_classpath('assocamp', 'jigglejar');
        = select sqlj.set_classpath('db_cleanup', 'jigglejar');
        = select sqlj.set_classpath('eventpriority', 'jigglejar');
        = select sqlj.set_classpath('file_util', 'jigglejar');
        = select sqlj.set_classpath('formats', 'jigglejar');
        = select sqlj.set_classpath('wavearc', 'jigglejar');
        = select sqlj.set_classpath('wheres', 'jigglejar');
        = grant usage on schema sqlj to trinetdb_read,trinetdb_execute;
  • Test Installation:
        $ psql -U rtem rtdb1
        = select wheres.where_from_type(35.7, -117.7, 1., 1, 'bigtown');
  • Upgrading to a new version of Jiggle.jar:
        $ psql -U postgres rtdb1
        = select sqlj.replace_jar('file:/home/postgres/jiggle_name.jar', 'jigglejar',true);

- StationUI setup: GitLab

  • Clone the repo:
% git clone https://gitlab.com/aqms-swg/stationui.git
  • Create a virtual environment:
% cd stationui/
% python3 -m venv /home/aqms/dir
  • Activate the environment:
% source /home/aqms/dir/bin/activate
  • Edit and install the requirements:
% vi requirements.txt 
% python3 -m pip install -r requirements.txt
  • Make sure that the Applications table contains the 'description' field:
The applications table is one of the two tables that StationUI updates. Add the column description
with data type varchar(1024) to this table if it doesn't already exist, as shown below.
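
For example:

	= ALTER TABLE applications ADD COLUMN IF NOT EXISTS description varchar(1024);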
  • Configure Django:
% vi django/datacenter/settings.py 
% mkdir /tmp/logs
  • Create and populate StationUI logging tables:
% cd django/stationui/
% cp user_action_tables_POSTGRES.sql user_action_tables_POSTGRES.sql.ORIG
% vi user_action_tables_POSTGRES.sql
	> GRANT ALL on stationui_action_types to rtem;
	> GRANT ALL on stationui_action_types_id_seq to rtem;
	> GRANT ALL on stationui_actions to rtem;
	> GRANT ALL on stationui_actions_id_seq to rtem;
	> 
	> GRANT ALL ON SCHEMA pglogical TO rtem;

% psql -d rtdb1 -U trinetdb
= \i user_action_tables_POSTGRES.sql
  • Create admin user:
% cd ..
% python3 manage.py migrate auth
% python3 manage.py migrate 
% python3 manage.py createsuperuser --username=<username> --email=<email>
  • Configure default rules:
% vi stationui/defaultrules.py
  • Start the server:
source /home/aqms/dir/bin/activate
python3 manage.py runserver <host>:8000 >& stationui.log &

- Data Migration & Switchover (ora2pg): [rt1/pp1 active]

  • Install ora2pg on pp1.
  • Stop RT & PP processes on PostgreSQL systems.
% aqms forcestop (as aqms on rt1, rt2 & pp1)
  • Stop replication from RTs to PP on PostgreSQL systems.
> SELECT pglogical.alter_subscription_disable(subscription_name := 'rt1subscr', immediate := TRUE); (on archdb1)
> SELECT pglogical.alter_subscription_disable(subscription_name := 'rt2subscr', immediate := TRUE); (on archdb1)
  • Delete all rows from all the PostgreSQL databases.
> \i sql/truncate.sql (on rtdb1, rtdb2 & archdb1)
> \i sql/drop_sequences.sql (on rtdb1, rtdb2 & archdb1)
  • Reset the start value of the sequences (on rtdb1, rtdb2, archdb1 with the appropriate intervals) to the highest values (or higher) of the old Oracle databases.
% sqlplus browser@publicdb @get_max_seq.sql 
% vi sql/create_sequences.sql (on rtdb1, rtdb2 & archdb1)
> \i sql/create_sequences.sql (on rtdb1, rtdb2 & archdb1)
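An equivalent manual adjustment for a single sequence would look like this (the restart value is
hypothetical and must exceed the maximum reported by get_max_seq.sql, keeping the per-database
ending digit):
> ALTER SEQUENCE evseq RESTART WITH 75000006; (on rtdb1)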
  • Stop PP processes on Oracle system.
  • Disable the Event table trigger on the PostgreSQL PP system.
> ALTER TABLE event DISABLE TRIGGER post_new_event; (on archdb1)
  • Turn on the PostgreSQL RT systems AND stop replication from Oracle RTs to Oracle PPs at the same time (the goal is that the Oracle PP system is no longer changing).
% aqms start (as aqms on rt1 & rt2)
  • At this point the PostgreSQL RT systems are live and alarming.
  • Turn on the data transfer from Oracle PP to file to PostgreSQL PP (using ora2pg).
* Tables exported/imported:
	AMP
	AMPSET
	ARRIVAL
	ASSOCAMM
	ASSOCAMO
	ASSOCARO
	ASSOCCOM
	ASSOCCOO
	ASSOCEVAMPSET
	ASSOCNTE
	ASSOCWAE
	CODA
	CREDIT
	EVENT
	EVENTPREFMAG
	EVENTPREFMEC
	EVENTPREFOR
	FILENAME
	MEC
	MECCHANNEL
	MECDATA
	MECFREQ
	MECFREQDATA
	MECOBJECT
	NETMAG
	NETTRIG
	ORIGIN
	ORIGIN_ERROR
	PATHNAME
	REMARK
	REQUEST_CARD
	SUBDIR
	SWARM_EVENTS
	SWARM_STATE
	TRIG_CHANNEL
	UNASSOCAMP
	WAVEFORM
	PDL_PRODUCT
	PDL_IDS
* Export tables from Oracle DB: [82G - 4:40h]
	% cd export/
	% source init.ora2pg
	% ./run_export.csh
* Import tables into PostgreSQL DB: [1:35h]
	% cd import/
	% ./run_import.csh
  • Rsync event waveform archive from Oracle (/work/dc22/ncedc/events/) to PostgreSQL system. [18G - 0:41h]
% rsync -av -e ssh --progress ncss@ucbpp.geo.berkeley.edu:/work/dc22/ncedc/events/active/ /home/aqms/data/events/active (as aqms on pp1)
  • Re-enable the Event table trigger on the PostgreSQL PP system.
> ALTER TABLE event ENABLE ALWAYS TRIGGER post_new_event; (on archdb1)
  • Start PP processes on PostgreSQL system.
% aqms start (as aqms on pp1)
  • Turn back on replication from PostgreSQL RTs to PostgreSQL PP.
> SELECT pglogical.alter_subscription_enable(subscription_name := 'rt1subscr', immediate := TRUE); (on archdb1)
> SELECT pglogical.alter_subscription_enable(subscription_name := 'rt2subscr', immediate := TRUE); (on archdb1)
  • Start PP processes on Oracle system.
  • Turn back on replication from Oracle RTs to Oracle PPs.
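  • For the sequence-reset step above, a hedged sketch of what one entry in sql/create_sequences.sql might look like (the sequence name, start value, and increment are placeholders; the real values come from get_max_seq.sql and the site's per-database interval scheme):
> DROP SEQUENCE IF EXISTS evseq;
> CREATE SEQUENCE evseq START WITH 60400000 INCREMENT BY 5; -- start above the Oracle maximum; pick the increment/offset per database per the interval scheme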

- PCS System:

  • Pete mentioned in the Wiki:
    "NCSS originally used the autoposter but found that it entered events into PCS too quickly.
    Events would enter the PCS system before there was sufficient event parametric information
    replicated from the RT database to the DC database. As a result, some of the PCS client
    programs would try to work on new events without sufficient information."
  • On our Oracle AQMS system, “sigswitch” waits for a signal from “trimag” before sending a CMS message to the PP system, where “runner” picks it up and starts the PCS process.
  • On our current PostgreSQL system, we have a database trigger on the event table on the PP database that calls a stored procedure to start the PCS process whenever a new row is inserted into the Event table. This means that as soon as a row is populated on the RT DB and replicated to the PP DB, the PCS process is launched for that event. Presumably this could happen before “trimag” is done on the RT side.
  • We implemented Pete's workaround on the PostgreSQL AQMS system. On the RT side, sigswitch programs have been set up to signal new events/triggers to PP. On the PP side, PCS poster programs have been deployed to receive the CMS messages from the RT system.
  • In the Oracle AQMS system, the PCS system is initiated by a database job. On the PostgreSQL AQMS system, it is initiated by a trigger on the Event table (a hedged sketch of such a trigger function is given at the end of this section):
Triggers firing always:
    post_new_event AFTER INSERT ON event FOR EACH ROW EXECUTE PROCEDURE post_new_event()

*    -- This function is run after a new row is inserted into the Event table.
*    -- It figures out what to do for an event of this type from the Autoposter table.
*    -- It then uses the PCS stored procedure putState (post_id) to insert an 
*    -- entry into the PCS_State table.
  • The 'NewEventPoster' entry in the AutoPoster table was removed on archdb1 so that the Event table trigger is only executed for subnet triggers.
archdb1=> select * from autoposter;
   instancename   | controlgroup | sourcetable |   state    | rank | result | acton | eventtype |  lastid  |         lastlddate         |           lddate           | resulttext 
------------------+--------------+-------------+------------+------+--------+-------+-----------+----------+----------------------------+----------------------------+------------
 NewTriggerPoster | EventStream  | archdb1     | NewTrigger |  100 |      1 | EVENT | st        | 60390551 | 2022-08-11 22:11:29.010458 | 2019-11-22 16:45:52.234776 | 
(1 row)

archdb1=>
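
A hedged sketch of the kind of trigger function described above (illustration only, not the production post_new_event(); the Event column names, the Autoposter lookup, and the putState() signature are assumptions based on the notes above):

    -- Illustration only; not the production function.
    CREATE OR REPLACE FUNCTION post_new_event() RETURNS trigger AS $$
    DECLARE
        ap autoposter%ROWTYPE;
    BEGIN
        -- Look up what (if anything) to post for this event type; column names are
        -- taken from the autoposter listing above, NEW.etype/NEW.evid are assumed
        -- Event columns.
        SELECT * INTO ap FROM autoposter
         WHERE eventtype = NEW.etype AND result = 1
         LIMIT 1;
        IF FOUND THEN
            -- putState() is the PCS stored procedure mentioned above; its exact
            -- signature is an assumption here.
            PERFORM putState(ap.controlgroup, ap.sourcetable, ap.state, ap.rank, NEW.evid);
        END IF;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER post_new_event AFTER INSERT ON event
        FOR EACH ROW EXECUTE PROCEDURE post_new_event();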

- Miscellaneous:

  • The MagPrefPriority table was populated manually instead of using the magprefDB script (which imports mag_rules.csv).
  • The Applications and AppChannels entries were populated using a modified appChan.sql script. The Progid 101 entries were populated manually instead of using the two Java programs (BuildPwsMap & UpdatePwsMap).
  • Notes from Pete:
The old mag algorithm of "pickew/eqcoda" was what NCSS used when we used
pickew to generate codas. When Lynn wrote the modules coda_aav and coda_dur,
we started using them and configured pick_ew to not produce codas. To
accurately reflect that in the DB, we changed the mag algorithm written by
hyps2ps to "aav/dur/eqcoda". That meant that we needed a magprefpriority rule
for the new mag algorithm. However, we wanted the new Mds to have the same
priority as the old Mds. Because of the constraint in magprefpriority, we had
to allow a slightly different start time to keep the DB happy. In reality, a
start time of "1900/01/01 00:00:00" is the same as "1900/01/01 00:00:01" as
far as RT AQMS processing is concerned. And it saved having to argue the case
with the Schema Change Working Group.
 > adadup_master and adadup_slave don't worry about duplicate data. By the design
 > of rad, it always reads from the WDA on 5 second boundaries. This is done in
 > Convert_Waves_to_Amplitudes::_roundtime(). That ensures that for a given WDA
 > time series, the samples in different instances of rad always refer to the
 > same time intervals. The magic of GCDA's dwriter.C takes care of everything
 > else.
 >
 > In RT/branches/linux-dev-CI-branch-64-bit/adadup, I don't see that anything
 > has changed in any of the code for porting to 64-bit. I suspect this
 > still needs to be done, and tested!
 >
 > It is not clear to me that adadup_* could be expected to work in mixed
 > environment: 32-bit on one end and 64-bit on the other. Most of the ADA data
 > types (in lib/rtseis/include/AmplitudeADA.h) include "int" values which will
 > be of different sizes in these two environments.
 >
As soon as I sent this, I realized that it is wrong: ints are 32-bit. It is
longs that could be 32-bit or 64-bit. And this does not explain the problems
you have between MP and UCB.

I did not write adadup. It was done at UCB by a Belarusian fellow. He left
in a huff after a couple of years. The code is horrible as far as I can tell
and has almost no documentation, but it seems to work. When I ported it to
Linux, I compared ADAs on source and destination systems using "adadata" (part
of gcda_utils). I found a bug that skipped sending the last byte of a
collection of ADA values; fixed 2013-01-11.

Over the years, I found that adadup_master does not work properly when
START_TIMES_MODE is set to 1: if a SNCL is used for a while and then
subsequently deleted from the AppChannels table, adadup_master will crash when
restarted. This seems to be a result of reading the START_TIMES_FILE and
finding a start time for a SNCL that no longer is configured for use. Setting
START_TIMES_MODE to 2 avoids this problem. 
  • At all other RSNs the request_card table is no longer populated on the RT systems; NC is the exception. Therefore the ntrcg2 program was ported; it runs on the RT systems at rt1 and rt2, which then replicate the request cards upward to the archdb1 system. The PostgreSQL SQL script that sets up RT to PP replication already contains this table (https://vault.gps.caltech.edu/trac/cisn/browser/DB/branches/uw-dev-branch/DBpg/create/replication/rt-pp-repsetup.sql).
  • The waveform archiver 'wanc' is specific to NC; therefore it was ported. It uses a table AssocWfRC that is not used by the other RSNs. Pete mentioned:
I wrote the NC waveform archiver (wanc) to meet NCEDC (i.e. Doug)
requirements. He wanted to record each whole segment of waveform as a separate
entry; the existing AQMS waveform archiver did not do that. If the NCEDC
requirements have changed, you can use whatever archiver you want.

Another feature of wanc is that it tries repeatedly to fill requests that are
not complete. At the time (2005-2006) the AQMS archiver gave up after it got
any data for a request, complete or not. For wanc, we use the AssocWfRc table
to record partially complete requests. After I did that, Caltech decided they
wanted the same feature but Kalpesh implemented it by adding to the
request_card table. Transferring from one system (wanc with AssocWfRc) to
another (wa with modified request_card table) would be quite difficult on a
running AQMS system. But doing it when moving from an Oracle system to Postgres would
not add any problems to what will already be a challenge.

The last feature I added to wanc is the ability to configure an instance of
wanc to handle a limited list of network codes; the default is to handle them
all. That design fits well with the way telemetry for different networks is
handled. In particular, the BG network is (or was) quite slow; and there are
lots of small earthquakes in the Geysers area. So one instance of wanc is
configured to handle only the BG network. 

Last time I looked, the standard AQMS archiver has multiple threads to handle
the load but spreads requests out to each thread randomly. I don't know how
well that would work for NC.
  • The traditional WA uses the waveroots db table to find the head of the archive and, for any event, uses waveroots/EVID/filename.dfile (where filename is a table entry). For NCSN the file structure is waveroots/subdir/filename.dfile (e.g. /home/aqms/data/events/active/NC.60031002/MCB.NC.HHZ..D.ms); a hedged path-construction sketch is given at the end of this section.
  • Hypomag requires two Caltech-specific tables (AQMS_HOST_ROLE & APP_HOST_ROLE). Note from Ellen:
The idea of these tables was that the pcs client could make a decision about whether to run
a process on an event depending on which server it was running on.  But this logic as far as I
know was supposed to be in the pcs client only -- not in the process.
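  • For the NCSN layout above, a hedged path-construction sketch (the join keys and column names (subdirid, fileid, subdirname, dfile) are assumptions for illustration; check the actual Waveform, Subdir and Filename definitions and substitute the real waveroots head):
    SELECT '/home/aqms/data/events/active/' || s.subdirname || '/' || f.dfile AS fullpath
      FROM waveform w
      JOIN subdir s ON s.subdirid = w.subdirid
      JOIN filename f ON f.fileid = w.fileid
     WHERE w.evid = 60031002;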

- ADA slave and Postgres system:

The ADA dup system is unique to NCSS and not used by any other RSN, so this was the first time ISTI had encountered this configuration and operation. It seems very hacked, as per my comments to the SWG, since it dumps endless logs and doesn't quit if there is a problem. Logs build so quickly that they will choke a filesystem in days if not deleted; example sizes are 300M log files on the rt1 server (look in /home/aqms/logs). For this reason, Paul Milligan has had to manually start/stop the master procs to get this working … or the master logs fill up too.

To keep the ADA consistent with the other modules that are set up in the new way with pg AQMS, we put all of the configuration files in /home/aqms/configs/adadup_slave and we control the start and stop of the various modules with adadup_ctl.sh (found in the aqms user's PATH). This script starts, stops, and reports status for the 4 adadup_slave processes via the directives start, stop, and status. See the example below:

[aqms@rt1 adadup_slave]$ adadup_ctl.sh status
adadup_slave /home/aqms/configs/adadup_slave/adadup_rt1_to_menlo1.cfg NOT RUNNING
adadup_slave /home/aqms/configs/adadup_slave/adadup_rt1_to_menlo2.cfg NOT RUNNING
adadup_slave /home/aqms/configs/adadup_slave/adadup_rt1_to_ucb1.cfg NOT RUNNING
adadup_slave /home/aqms/configs/adadup_slave/adadup_rt1_to_ucb2.cfg NOT RUNNING
[aqms@rt1 adadup_slave]$ adadup_ctl.sh start
Program: conlog 2016-06-03 (Built Nov 19 2019 06:50:46 by aqms@rt1)
Arguments: /home/aqms/logs/adadup_rt1_to_menlo1_console
Log file: /home/aqms/logs/adadup_rt1_to_menlo1_console
Time: 2021/03/18,14:27:42.7752
adadup_slave /home/aqms/configs/adadup_slave/adadup_rt1_to_menlo1.cfg running on pid 14434
Program: conlog 2016-06-03 (Built Nov 19 2019 06:50:46 by aqms@rt1)
Arguments: /home/aqms/logs/adadup_rt1_to_menlo2_console
Log file: /home/aqms/logs/adadup_rt1_to_menlo2_console
Time: 2021/03/18,14:27:42.7893
adadup_slave /home/aqms/configs/adadup_slave/adadup_rt1_to_menlo2.cfg running on pid 14442
Program: conlog 2016-06-03 (Built Nov 19 2019 06:50:46 by aqms@rt1)
Arguments: /home/aqms/logs/adadup_rt1_to_ucb1_console
Log file: /home/aqms/logs/adadup_rt1_to_ucb1_console
Time: 2021/03/18,14:27:42.8049
adadup_slave /home/aqms/configs/adadup_slave/adadup_rt1_to_ucb1.cfg running on pid 14450
Program: conlog 2016-06-03 (Built Nov 19 2019 06:50:46 by aqms@rt1)
Arguments: /home/aqms/logs/adadup_rt1_to_ucb2_console
Log file: /home/aqms/logs/adadup_rt1_to_ucb2_console
Time: 2021/03/18,14:27:42.8347
adadup_slave /home/aqms/configs/adadup_slave/adadup_rt1_to_ucb2.cfg running on pid 14458
[aqms@rt1 adadup_slave]$ adadup_ctl.sh status
adadup_slave /home/aqms/configs/adadup_slave/adadup_rt1_to_menlo1.cfg running on pid 14434
adadup_slave /home/aqms/configs/adadup_slave/adadup_rt1_to_menlo2.cfg running on pid 14442
adadup_slave /home/aqms/configs/adadup_slave/adadup_rt1_to_ucb1.cfg running on pid 14450
adadup_slave /home/aqms/configs/adadup_slave/adadup_rt1_to_ucb2.cfg running on pid 14458
[aqms@rt1 adadup_slave]$ 

- NCSS DRP:

In the database:

psql -U rtem -d archdb1 -h 127.0.0.1
select * from pcs_transition;
  groupold   | sourceold |  stateold  | resultold |  groupnew   | sourcenew | statenew | ranknew | resultnew | auth | subsource |           lddate
-------------+-----------+------------+-----------+-------------+-----------+----------+---------+-----------+------+-----------+----------------------------
 EventStream | archdb1   | NewEvent   |         1 | EventStream | archdb1   | rcg_rt   |     100 |           |      |           | 2019-11-22 08:28:39.23686
 EventStream | archdb1   | rcg_rt     |         1 | null        | null      | null     |         |           |      |           | 2019-11-22 08:28:39.320322
 EventStream | archdb1   | NewTrigger |         1 | EventStream | archdb1   | rcg_trig |     100 |           |      |           | 2019-11-22 08:28:39.403891
 EventStream | archdb1   | rcg_trig   |         1 | null        | null      | null     |         |           |      |           | 2019-11-22 08:28:39.488333
(4 rows)

Per standard install:

  • [as root] PHP and other modules already installed
  • [as root] Httpd enabled and started
  • [as aqms] export dbasename=archdb1
  • puttrans for MakeGif and MakeTrigGif
  • mkdir ~/data/waveform_gifs
  • ~/www/eventfiles already linked to above directory
  • No SELinux
  • ln -s ~aqms/www /var/www/html/drp
  • wfgifctl_.sh start (works, no problems)
  • drp_pcs.sh and trp_pcs.sh already have archdb1 as db and 7 days age
  • Properties_event.cfg and properties_trigger.cfg
    • set tpp password
    • Set a default velocity model (used binder’s allcal_model.d and POS 1.78 from hypoinverse setup)
  • cgi-bin/phpmods/SiteConf.php
    • $networkName = “NCSN”;
    • $networkCode = “NC”;
    • $inRegionBoundaryPolygon = array('lat1 lon1', 'lat2 lon2', … , 'lat1 lon1') [set to the NC authoritative polygon in ~/config/db/sql/pg_network_regions.sql]
    • Shakemap default for local shakemaps set to “http://www.ncedc.org/ncsn” which is probably incorrect
    • Reporting addresses set to @ncedc.org, but they probably don’t exist
    • $GIFarchive set to this machine’s IP/display.php
  • cgi-bin/phpmods/ppx.php
    • Browser password set
    • Tpp password set
    • Archdb1 host set to pp1
  • [as postgres] edit 10/data/pg_hba.conf
    • Add: host archdb1 browser 0.0.0.0/0 md5
    • bin/pg_ctl reload
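  • As an alternative to bin/pg_ctl reload, the reload can also be done from a superuser psql session; pg_reload_conf() and the pg_hba_file_rules view are standard in PostgreSQL 10+:
    SELECT pg_reload_conf();
    -- confirm the new browser entry is visible to the server
    SELECT line_number, type, database, user_name, address, auth_method
      FROM pg_hba_file_rules
     WHERE 'browser' = ANY(user_name);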

- Notes from UW:

  • Failover:
The shadow is essentially a clone of the primary.

In an orderly role switch, the primary is made shadow and vice versa
in a single command (you could roll your own, but repmgr does all the heavy
lifting for you).  In a failover, the db you bring up must first be
made into a synchronized clone; then you can switch roles.  This is
fairly straightforward to do, but I need to write some documentation.
If everything is working right, the subscriptions on the new primary
come up automatically (new shadow is read only so no subscriptions are
possible).

It is possible to have automatic failover, but I haven't implemented it
yet.  For now it is strictly manual, but a single command (at least the db part).
repmgr can handle this as well, but I didn't want to add complications.
Of course if the failover is due to db failure, the new shadow must be
re-synced before the role can be switched.  

There are several ways to resync depending on what you are working with
in terms of damage and how long the outage was. This will probably
always need to be a manual operation. However, if the wal files are
available then the resync happens automatically.

You also need to make sure the formerly offline db comes up as shadow
(can’t have two masters).

I think the safest recovery method involves pg_rewind.  I have used it, but only
once. This is like a complete rebuild from the master, but it only applies changes
made since the two dbs were in sync.
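
A minimal sketch of the checks that go with a role switch, assuming the hot-standby setup described above (only standard catalog functions and views are used):

    -- on each PP database: false = primary (read-write), true = shadow/standby (read-only)
    SELECT pg_is_in_recovery();
    -- on the primary: confirm the shadow is attached and streaming
    SELECT client_addr, state, sync_state FROM pg_stat_replication;
    -- on the standby: confirm WAL is being received
    SELECT status, received_lsn FROM pg_stat_wal_receiver;
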
  • Waveform Archiving:
The waveform archiver runs on the primary system but its disk is continuously synced with a
matching disk on the shadow machine. A cron job runs every 15 minutes on both machines,
but it only does anything when it is running on the standby system.
  • Q/A session with Victor:
1.- Are you using asynchronous logical streaming replication from RT to PP?

We use the free version of the 2ndQuadrant pglogical extension.  Once again, there is a paid version,
and the documentation could be better, but once you get the hang of it this works very well. 2ndQuadrant
is a very good open-source player, both contributing heavily to postgres and maintaining open-source versions
of all their software.  I would not commit to this otherwise (we have been bitten hard by vendor lock-in).

2.- What about replication from PP master to PP standby? Is it the same?

No, here we use the built-in streaming replication.  Lots of details on how this was arrived at.
We configure as hot-standby which means that the standby database can be queried read-only.  This has
turned out to be very useful as some science users have done queries that really tax the db and we can
point them to the read-only db without taxing the primary.  This can be easily cascaded to research
databases, which is a plan for the future.  We use the 2ndQuadrant repmgr extension to manage streaming
replication and role swaps, though this could be done with custom scripts.  LOTS of details here, but
this is the gist.

3.- Do you know the differences between 2nd Quadrant pglogical and the core postgreSQL logical replication?
    From what I understand, you chose to use the former. Any benefits to that?

2ndQuadrant software is developed in cooperation with the main postgres development team and they pledge
to fold appropriate aspects of their infrastructure into the postgres trunk over time.  The logical
replication features in pg10 are ports of the pglogical package.  The syntax is somewhat different, and
in some ways better than the pglogical syntax.  It was not an obvious decision whether to switch to the
built-in package when it came out in v10.  We decided to stay with pglogical because:

     1.  They will be continuing development.
     2.  They pledge to continue folding features into the core postgres version. 
     3.  There are some convenient but not critical features we make use of.
     4.  The v10 version is new and rapidly evolving.

I intend to do some experimenting with the core package when I get time, but there is no rush.

4.- When you do a role switch, is there any chance that some transactions will  get lost in the process?
    Like if new transactions were to happen while the switch process is taking place. If so, how do you
    handle those?

There is a distinction here between RT->PP logical and PP->PP streaming, but in both cases there is lots
of protection and semaphore locks to prevent data loss in various failure scenarios.  Except in the case
of sudden hardware failure, if you can write it, it will be propagated.

5.- What about if PP master crashes. Can you do a role switch and redirect any pending transactions
    that were bound to PP master to the new PP master or are those lost?

We do not have the PP role switch set to automatic, though this is possible.  In practice, there is
negligible delay between a commit on the primary and shadow.  Streaming is set up so that the target
will always be in a consistent state, for example a non-committed transaction will be rolled back.
Data loss is possible in a sudden hardware crash regardless of whether streaming is involved. The
difference is that it may be in an inconsistent state on primary (if sufficiently disorderly shutdown)
while it will always be consistent on shadow.  We have an ip address dedicated to the primary database
and part of the role switch is bringing this virtual address up on the new primary.

6.- What if you do a role switch from PP1 to PP2 but some transactions bound to PP2 were still pending on PP1.
    How would you recover from such a scenario?

Oops, was talking about PP->PP above, so will talk about RT->PP here.  Both PP machines subscribe to logical
streaming on both RT machines through distinct subscriptions.  The standby is not writeable, so these
subscriptions are disabled.  When it becomes primary, the subscriptions are automatically enabled and it
will sync with the current state of both RT systems.  This is well tested and works remarkably well.
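
A hedged sketch of checking and enabling the RT subscriptions on a newly promoted PP database (the subscription names follow the rt1subscr/rt2subscr convention used on archdb1; show_subscription_status() is part of the pglogical extension):

    SELECT subscription_name, status, provider_node
      FROM pglogical.show_subscription_status();
    SELECT pglogical.alter_subscription_enable(subscription_name := 'rt1subscr', immediate := TRUE);
    SELECT pglogical.alter_subscription_enable(subscription_name := 'rt2subscr', immediate := TRUE);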