PostgreSQL Production AQMS System


PostgreSQL Database

- PostgreSQL 14 Installation: (Client/Server/Devel - done by sys admins)

- AQMS DB Instance Creation:

  • Install the following extensions: (Software installed by sys admins)
                                         List of installed extensions
          Name          | Version |   Schema   |                         Description                          
------------------------+---------+------------+--------------------------------------------------------------
 dblink                 | 1.2     | public     | connect to other PostgreSQL databases from within a database
 fuzzystrmatch          | 1.1     | public     | determine similarities and distance between strings
 pglogical              | 2.4.3   | pglogical  | PostgreSQL Logical Replication
 pljava                 | 1.5.8   | sqlj       | PL/Java procedural language (https://tada.github.io/pljava/)
 plpgsql                | 1.0     | pg_catalog | PL/pgSQL procedural language
 postgis                | 3.3.2   | public     | PostGIS geometry and geography spatial types and functions
 postgis_tiger_geocoder | 3.3.2   | tiger      | PostGIS tiger geocoder and reverse geocoder
 postgis_topology       | 3.3.2   | topology   | PostGIS topology spatial types and functions
(8 rows)
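  • To double-check later, the list above is the output of psql's \dx meta-command:
> psql -U postgres -c '\dx'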
  • Edit the pg_hba.conf & postgresql.conf files accordingly and restart the database server (see Replication section).
> vi ~/data/pg_hba.conf
> vi ~/data/postgresql.conf
> sudo systemctl restart postgresql-14
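  Note: edits that only touch pg_hba.conf can be applied with a reload instead of a full restart:
> sudo systemctl reload postgresql-14
  or, from psql as a superuser:
> psql -c 'SELECT pg_reload_conf();'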
  • Retrieve and extract DB package DBpg_NC.tar.gz.
  • Make sure LD_LIBRARY_PATH is set, either at the system level:
% sudo ldconfig /usr/local/lib

    or at the service level (systemd unit file):
[Service]
Environment=LD_LIBRARY_PATH=/usr/local/lib:/usr/local/lib64:/share/apps/lib:/share/apps/lib64
> cd DBpg/create/postgresql-extension
> vi Makefile
> make
> sudo make install
  • Create database objects (Look at DBpg_NC/create/README):
>  cd DBpg/create

>  vi pg.env
	- Update PSQLDIR.
	- Update PGDATABASE.
	- Update passwords.
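	For illustration, the two named variables might look like this (placeholder values, not the
	production settings):
	PSQLDIR=/usr/pgsql-14/bin
	PGDATABASE=rtdb1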

> ./generate_sql_scripts.sql
	- Create password file.
	- Update templates (db, users, sequences).

> ./run_as_postgres.sql
	- Create DB.
	- Activate extensions.
	- Create roles.
	- Create users.

> ./run_sql_scripts.sql
	- Create sequences.
	- Grant privileges on sequences.
	- Create Waveform schema.
	- Create Parametric Information schema.
	- Create Instrument Responses schema.
	- Create Hardware Tracking schema.
	- Create Application schema.
	- Grant privileges on all tables.
	- Create views.
	- Grant privileges on views.
	- Create stored procedures.
	- Create indexes.
	- Create missing objects.
	- Populate specific tables.
	- Remove specific constraints for export/import.
  • If the database is a real-time one, drop the trigger on the event table and insert a row into the RT_ROLE & PEER_SYSTEM_STATUS tables:
[postgres@aqms-bk-rt ~]$ psql -d rtdb1 -U trinetdb
psql (14.6)
Type "help" for help.

rtdb1=> DROP TRIGGER post_new_event ON EVENT;
DROP TRIGGER
rtdb1=> \q
[postgres@aqms-bk-rt ~]$ 
[postgres@aqms-bk-rt ~]$ psql -U rtem rtdb1
psql (14.6)
Type "help" for help.

rtdb1=> select * from rt_role;
 primary_system |     modification_time      
----------------+----------------------------
 true           | 2023-03-07 00:45:20.992049
(1 row)

rtdb1=> select * from peer_system_status;
   status    |      mod_time       
-------------+---------------------
 OPERATIONAL | 2020-01-01 00:00:00
(1 row)

rtdb1=> \q
[postgres@aqms-bk-rt ~]$ 
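 If the RT_ROLE & PEER_SYSTEM_STATUS rows are missing, inserts matching the output above would look
 like this (a sketch assuming the column lists shown by the SELECTs; adjust to the actual schema):
rtdb1=> INSERT INTO rt_role (primary_system, modification_time) VALUES (true, now());
rtdb1=> INSERT INTO peer_system_status (status, mod_time) VALUES ('OPERATIONAL', now());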

- Replication Setup:

  • PostgreSQL Server Replication Settings:
- Settings in ~/data/pg_hba.conf:
 * Updated 'scram-sha-256' entries to 'md5'.

 * aqms-bk-pp:
        local   all          postgres                              peer
        local   all          all                                   md5
        
	host    replication  repadmin        127.0.0.1/32          md5 # Localhost
	host    archdb1      repadmin        127.0.0.1/32          md5 # Localhost

	host    replication  repadmin        169.229.197.34/24     md5 # Public  aqms-bk-rt & aqms-bk-pp
	host    archdb1      repadmin        169.229.197.34/24     md5 # Public  aqms-bk-rt & aqms-bk-pp

	host    replication  repadmin        192.168.116.45/24     md5 # Private aqms-bk-rt & aqms-bk-pp
	host    archdb1      repadmin        192.168.116.45/24     md5 # Private aqms-bk-rt & aqms-bk-pp

	host    replication  repadmin        137.227.232.160/24    md5 # Public  aqms-nc-rt & aqms-nc-pp
	host    archdb1      repadmin        137.227.232.160/24    md5 # Public  aqms-nc-rt & aqms-nc-pp

	host    replication  repadmin        192.168.119.114/24    md5 # Private aqms-nc-rt & aqms-nc-pp
	host    archdb1      repadmin        192.168.119.114/24    md5 # Private aqms-nc-rt & aqms-nc-pp

 * aqms-nc-pp:
        [Set up as part of streaming replication]
        
 * aqms-bk-rt:
        local   all          postgres                              peer
        local   all          all                                   md5
        
	host    replication  repadmin        127.0.0.1/32          md5 # Localhost
	host    rtdb1        repadmin        127.0.0.1/32          md5 # Localhost

	host    rtdb1        repadmin        169.229.197.34/24     md5 # Public  aqms-bk-rt & aqms-bk-pp
	host    rtdb1        repadmin        192.168.116.45/24     md5 # Private aqms-bk-rt & aqms-bk-pp

	host    rtdb1        repadmin        137.227.232.160/24    md5 # Public  aqms-nc-rt & aqms-nc-pp
	host    rtdb1        repadmin        192.168.119.114/24    md5 # Private aqms-nc-rt & aqms-nc-pp

 * aqms-nc-rt:
        local   all          postgres                              peer
        local   all          all                                   md5
        
	host    replication  repadmin        127.0.0.1/32          md5 # Localhost
	host    rtdb2        repadmin        127.0.0.1/32          md5 # Localhost

	host    rtdb2        repadmin        169.229.197.34/24     md5 # Public  aqms-bk-rt & aqms-bk-pp
	host    rtdb2        repadmin        192.168.116.45/24     md5 # Private aqms-bk-rt & aqms-bk-pp

	host    rtdb2        repadmin        137.227.232.160/24    md5 # Public  aqms-nc-rt & aqms-nc-pp
	host    rtdb2        repadmin        192.168.119.114/24    md5 # Private aqms-nc-rt & aqms-nc-pp


- Settings in ~/data/postgresql.conf:

 * aqms-bk-pp:
	listen_addresses = '*'	    # what IP address(es) to listen on;
	wal_level = 'logical'       # pglogical
	max_worker_processes = 10   # pglogical one per database needed on provider node
                            	    # pglogical one per node needed on subscriber node
	max_replication_slots = 80  # pglogical one per node needed on provider node (pp should be twice rt setting)
	max_wal_senders = 10        # pglogical one per node needed on provider node
	shared_preload_libraries = 'pglogical' # pglogical
	track_commit_timestamp = on # pglogical needed for last/first update wins conflict resolution
                            	    # pglogical property available in PostgreSQL 9.5+
	wal_keep_size = 8192        # (Only pp) minimum WAL kept for standbys, in MB; 0 disables
	hot_standby = on            # "off" disallows queries during recovery
        hot_standby_feedback = on   # send info from standby to prevent query conflicts

 * aqms-nc-pp:
	[Set up as part of streaming replication]

 * aqms-bk-rt:
	listen_addresses = '*'	    # what IP address(es) to listen on;
	wal_level = 'logical'       # pglogical
	max_worker_processes = 10   # pglogical one per database needed on provider node
                            	    # pglogical one per node needed on subscriber node
	max_replication_slots = 40  # pglogical one per node needed on provider node (pp should be twice rt setting)
	max_wal_senders = 10        # pglogical one per node needed on provider node
	shared_preload_libraries = 'pglogical' # pglogical
	track_commit_timestamp = on # pglogical needed for last/first update wins conflict resolution

 * aqms-nc-rt:
	listen_addresses = '*'	    # what IP address(es) to listen on;
	wal_level = 'logical'       # pglogical
	max_worker_processes = 10   # pglogical one per database needed on provider node
                                    # pglogical one per node needed on subscriber node
	max_replication_slots = 40  # pglogical one per node needed on provider node (pp should be twice rt setting)
	max_wal_senders = 10        # pglogical one per node needed on provider node
	shared_preload_libraries = 'pglogical' # pglogical
	track_commit_timestamp = on # pglogical needed for last/first update wins conflict resolution
  • PGLogical Replication From rtdb1/rtdb2 To archdb1:
1.- Install 2ndQuadrant pglogical extension on rtdb1, rtdb2 & archdb1:

 * Download and install 2ndQuadrant extension (done by sys admins).
        
 * As postgres: (if extension was installed after the database was created)
	$ sudo systemctl restart postgresql-14
	$ psql
	= alter system set shared_preload_libraries = 'pglogical';
	= \c archdb1
	= CREATE EXTENSION IF NOT EXISTS pglogical;

 * As postgres:
	= set pglogical.conflict_resolution = 'last_update_wins';
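 To make the setting persist across restarts, it can also be set cluster-wide with the standard
 PostgreSQL mechanism (not pglogical-specific; reload or restart for it to take effect):
	= ALTER SYSTEM SET pglogical.conflict_resolution = 'last_update_wins';
	= SELECT pg_reload_conf();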

2.- Create provider & subscriber nodes:

 * aqms-bk-rt:
	= \c rtdb1
	= SELECT pglogical.drop_node('rtnode1',false);
	= SELECT pglogical.create_node(node_name := 'rtnode1',dsn:='host=aqms-bk-rt.ncss-ucb.cisn.org port=5432 dbname=rtdb1 user=repadmin password=repadmin_pass');
	= GRANT CREATE, USAGE ON SCHEMA pglogical TO repadmin;

 * aqms-nc-rt:
	= \c rtdb2
	= SELECT pglogical.drop_node('rtnode2',false);
	= SELECT pglogical.create_node(node_name := 'rtnode2',dsn:='host=aqms-nc-rt.ncss-mp.cisn.org port=5432 dbname=rtdb2 user=repadmin password=repadmin_pass');
        = GRANT CREATE, USAGE ON SCHEMA pglogical TO repadmin;

 * aqms-bk-pp:
	= \c archdb1
	= GRANT CREATE, USAGE ON SCHEMA pglogical TO repadmin;
	= SELECT pglogical.drop_node('ppnode1',false);
        = SELECT pglogical.create_node(node_name := 'ppnode1',dsn:='host=aqms-bk-pp.ncss-ucb.cisn.org port=5432 dbname=archdb1 user=repadmin password=repadmin_pass');

 To check the results:
	= SELECT * FROM pglogical.node_interface;

3.- Create replication sets on aqms-bk-rt & aqms-nc-rt:

	= \c rtdb1
	= SELECT pglogical.drop_replication_set(set_name:='rtrepset');
	= SELECT pglogical.create_replication_set(set_name:='rtrepset', replicate_insert:=TRUE,replicate_update:=TRUE, replicate_delete:=FALSE, replicate_truncate:=FALSE);
	= DO $$
	DECLARE
	count INTEGER := 0;
	reptables varchar[] := ARRAY['amp','ampset','arrival','assocamm',
	'assocamo','assocaro','assoccom','assoccoo','assocevampset',
	'assocnte','assocwae','coda','credit','event',
	'eventprefmag','eventprefmec','eventprefor','filename','mec',
	'mecchannel','mecdata','mecfreq','mecfreqdata','mecobject',
	'netmag','nettrig','origin','origin_error','pathname',
        'pdl_ids','pdl_product','remark','request_card','subdir',
        'swarm_events','swarm_state','trig_channel','unassocamp',
        'wavefileroots','waveform'];
	tbl varchar;
	BEGIN
	FOREACH tbl IN ARRAY reptables LOOP
	PERFORM
	pglogical.replication_set_add_table(set_name:='rtrepset', relation:='trinetdb.'||tbl,synchronize_data:=TRUE);
	count := count+1;
	END LOOP;
	-- RETURN count ;
	END;
	$$;

	= \c rtdb2
	= SELECT pglogical.drop_replication_set(set_name:='rtrepset');
	= SELECT pglogical.create_replication_set(set_name:='rtrepset', replicate_insert:=TRUE,replicate_update:=TRUE, replicate_delete:=FALSE, replicate_truncate:=FALSE);
	= DO $$
	DECLARE
	count INTEGER := 0;
	reptables varchar[] := ARRAY['amp','ampset','arrival','assocamm',
	'assocamo','assocaro','assoccom','assoccoo','assocevampset',
	'assocnte','assocwae','coda','credit','event',
	'eventprefmag','eventprefmec','eventprefor','filename','mec',
	'mecchannel','mecdata','mecfreq','mecfreqdata','mecobject',
	'netmag','nettrig','origin','origin_error','pathname',
        'pdl_ids','pdl_product','remark','request_card','subdir',
        'swarm_events','swarm_state','trig_channel','unassocamp',
        'wavefileroots','waveform'];
	tbl varchar;
	BEGIN
	FOREACH tbl IN ARRAY reptables LOOP
	PERFORM
	pglogical.replication_set_add_table(set_name:='rtrepset', relation:='trinetdb.'||tbl,synchronize_data:=TRUE);
	count := count+1;
	END LOOP;
	-- RETURN count ;
	END;
	$$;

 Tables can be removed from a replication set with:
	= SELECT pglogical.replication_set_remove_table('rtrepset', 'trinetdb.<table_name>');

 Tables belonging to a replication set can be listed with:
        = SELECT relname FROM pglogical.tables WHERE set_name='rtrepset' order by relname;

4.- Create subscriptions on aqms-bk-pp:

	= \c archdb1
	= SELECT pglogical.create_subscription(
	subscription_name := 'rt1subscr',
	provider_dsn := 'host=aqms-bk-rt.ncss-ucb.cisn.org port=5432 user=repadmin
	password=repadmin_pass_ucb dbname=rtdb1',
	replication_sets := ARRAY['rtrepset'],
	synchronize_structure := 'FALSE',
	synchronize_data := FALSE
	);

	= SELECT pglogical.create_subscription(
	subscription_name := 'rt2subscr',
	provider_dsn := 'host=aqms-nc-rt.ncss-mp.cisn.org port=5432 user=repadmin
	password=repadmin_pass_wr dbname=rtdb2',
	replication_sets := ARRAY['rtrepset'],
	synchronize_structure := 'FALSE',
	synchronize_data := FALSE
	);

 In case of permission errors you may need to grant repadmin privileges to the schema, e.g.:
	= \c rtdb1
	= GRANT CREATE, USAGE ON SCHEMA pglogical TO repadmin;

 Subscriptions can be dropped with:
	= SELECT pglogical.drop_subscription( subscription_name := 'rt1subscr');
        = SELECT pglogical.drop_subscription( subscription_name := 'rt2subscr');

 To check the replication:
	= \c archdb1
	= SELECT subscription_name,status,provider_node,replication_sets FROM pglogical.show_subscription_status();
 	 subscription_name |   status    | provider_node | replication_sets 
	-------------------+-------------+---------------+------------------
 	 rt1subscr         | replicating | rtnode1       | {rtrepset}
 	 rt2subscr         | replicating | rtnode2       | {rtrepset}

	 If status is not 'replicating', run:
		SELECT pglogical.alter_subscription_synchronize(subscription_name := 'rt1subscr',truncate:=FALSE);
		and/or
		SELECT pglogical.alter_subscription_synchronize(subscription_name := 'rt2subscr',truncate:=FALSE);

 To temporarily disable the subscription:
	= \c archdb1
	= SELECT pglogical.alter_subscription_disable(subscription_name := 'rt1subscr', immediate := TRUE);
	= SELECT pglogical.alter_subscription_disable(subscription_name := 'rt2subscr', immediate := TRUE);

 To re-enable the subscription:
	= \c archdb1
	= SELECT pglogical.alter_subscription_enable(subscription_name := 'rt1subscr', immediate := TRUE);
	= SELECT pglogical.alter_subscription_enable(subscription_name := 'rt2subscr', immediate := TRUE);
  • Streaming Replication Between PP Instances:
1.- Create a cluster on the standby machine:
	$ /usr/pgsql-14/bin/postgresql-14-setup initdb
	Initializing database ... OK

2.- Copy an image from the active DB:
  * Stop server:
	$ sudo systemctl stop postgresql-14

  * Copy an image of the active DB from the active server to the local data directory, as user postgres@aqms-nc-pp:
	$ /usr/pgsql-14/bin/pg_basebackup -h aqms-bk-pp.ncss-ucb.cisn.org -R -D /var/lib/pgsql/14/data -U repadmin -v -P

	where:
		 '-h ...' specifies the active host.
		 '-R'     writes configuration for replication.
		 '-D ...' specifies where the image should be put.
		 '-U ...' is the replication user name.
		 '-v'     specifies verbose mode.
		 '-P'     show progress information.

 * Start server:
	$ sudo systemctl start postgresql-14

 * Create .pgpass file:
	$ vi ~/.pgpass
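	The file uses the standard libpq format, one hostname:port:database:username:password entry per
	line, and is ignored unless its permissions are 0600, e.g. (placeholder password):
	aqms-bk-pp.ncss-ucb.cisn.org:5432:*:repadmin:repadmin_pass
	$ chmod 600 ~/.pgpass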

3.- Replication monitoring:

 * Processes:
        On Master
        ==========
        $ ps -eaf | grep sender

        On Slave
        ==========
        $ ps -eaf | grep receiver
        $ ps -eaf | grep startup

        All three processes must be running: the WAL sender on the master, and the WAL receiver and startup processes on the slave.

 * SELECT * FROM pg_stat_replication;
	returns useful information on the active DB and an empty result on the standby DB.

 * SELECT * FROM pg_stat_wal_receiver;
	returns useful information on the standby DB and an empty result on the active DB.

 * SELECT pg_is_in_recovery();
	returns TRUE on the standby DB and FALSE on all other nodes.

4.- Replication slots:

 * Create replication slot as postgres@archdb1 on aqms-nc-pp:
        # SELECT * FROM pg_create_physical_replication_slot('aqmsbkpp');
 
 * Add "primary_slot_name = 'aqmsbkpp'" in postgresql.auto.conf on aqms-bk-pp.

 * SELECT slot_name, slot_type, active FROM pg_replication_slots;
                 slot_name | slot_type | active
                -----------+-----------+--------
                 barman    | physical  | t
                 aqmsbkpp  | physical  | t
                (2 rows)

 * Create replication slot as postgres@archdb1 on aqms-bk-pp:
        # SELECT * FROM pg_create_physical_replication_slot('aqmsncpp');
        
 * Add "primary_slot_name = 'aqmsncpp'" in postgresql.auto.conf on aqms-nc-pp.

 * SELECT slot_name, slot_type, active FROM pg_replication_slots;
                 slot_name | slot_type | active
                -----------+-----------+--------
                 barman    | physical  | t
                 aqmsncpp  | physical  | t
                (2 rows)

- REPMGR setup:

REPMGR SETUP ON AQMS-BK-PP/AQMS-NC-PP:

1.- Install 2ndQuadrant repmgr extension on aqms-bk-pp (done by sys admins).

 * The repmgr extension can only be installed by a superuser. If the repmgr user is a superuser, repmgr will create the extension automatically.
   Alternatively, the extension can be created manually by a superuser (with "CREATE EXTENSION repmgr") before executing repmgr primary register.

2.- Create a repmgr database on primary server (aqms-bk-pp):
	% createdb repmgr -O repadmin

3.- Update pg_hba.conf file on aqms-bk-pp & aqms-nc-pp (add them before any other custom entries):

	local  replication   repadmin        trust
	local  repmgr        repadmin        trust

	host   replication   repadmin        127.0.0.1/32          trust # Localhost
	host   repmgr        repadmin        127.0.0.1/32          trust # Localhost

	host   replication   repadmin        169.229.197.34/24     trust # Public  aqms-bk-rt & aqms-bk-pp
	host   repmgr        repadmin        169.229.197.34/24     trust # Public  aqms-bk-rt & aqms-bk-pp
	host   archdb1       repadmin        169.229.197.34/24     trust # Public  aqms-bk-rt & aqms-bk-pp

	host   replication   repadmin        192.168.116.45/24     trust # Private aqms-bk-rt & aqms-bk-pp
	host   repmgr        repadmin        192.168.116.45/24     trust # Private aqms-bk-rt & aqms-bk-pp
	host   archdb1       repadmin        192.168.116.45/24     trust # Private aqms-bk-rt & aqms-bk-pp

	host   replication   repadmin        137.227.232.160/24    trust # Public  aqms-nc-rt & aqms-nc-pp
	host   repmgr        repadmin        137.227.232.160/24    trust # Public  aqms-nc-rt & aqms-nc-pp
	host   archdb1       repadmin        137.227.232.160/24    trust # Public  aqms-nc-rt & aqms-nc-pp

	host   replication   repadmin        192.168.119.114/24    trust # Private aqms-nc-rt & aqms-nc-pp
	host   repmgr        repadmin        192.168.119.114/24    trust # Private aqms-nc-rt & aqms-nc-pp
	host   archdb1       repadmin        192.168.119.114/24    trust # Private aqms-nc-rt & aqms-nc-pp

4.- Update postgresql.conf file on aqms-bk-pp & aqms-nc-pp:

        wal_log_hints = on

5.- Prepare the standby (aqms-nc-pp):

  * On the standby, check that the primary database is reachable:
	% psql 'host=aqms-bk-pp.ncss-ucb.cisn.org user=repadmin dbname=repmgr connect_timeout=2'

6.- Create repmgr configuration file on aqms-bk-pp & aqms-nc-pp:

  * On the primary server (aqms-bk-pp):
	% cat /etc/repmgr/14/repmgr.conf
		node_id=1
		node_name=aqmsbkpp
		conninfo='host=aqms-bk-pp.ncss-ucb.cisn.org user=repadmin dbname=repmgr connect_timeout=2 application_name=aqmsbkpp'
		data_directory =  '/home/pgsql/data'
                pg_bindir='/usr/pgsql-14/bin/'
		service_start_command = 'sudo /bin/systemctl start postgresql-14'
		service_stop_command = 'sudo /bin/systemctl stop postgresql-14'
		service_restart_command = 'sudo /bin/systemctl restart postgresql-14'
		service_reload_command = 'sudo /bin/systemctl reload postgresql-14'

  * On the standby server (aqms-nc-pp):
	% cat /etc/repmgr/14/repmgr.conf
		node_id=2
		node_name=walreceiver
		conninfo='host=aqms-nc-pp.ncss-mp.cisn.org user=repadmin dbname=repmgr connect_timeout=2 application_name=walreceiver'
		data_directory = '/var/lib/pgsql/14/data'
                pg_bindir='/usr/pgsql-14/bin/'
		service_start_command = 'sudo /bin/systemctl start postgresql-14'
		service_stop_command = 'sudo /bin/systemctl stop postgresql-14'
		service_restart_command = 'sudo /bin/systemctl restart postgresql-14'
		service_reload_command = 'sudo /bin/systemctl reload postgresql-14'

7.- Register the servers:

  * On the primary server (aqms-bk-pp):
	% repmgr -f /etc/repmgr/14/repmgr.conf primary register
	% repmgr -f /etc/repmgr/14/repmgr.conf cluster show

  * On the standby server (aqms-nc-pp):
	% repmgr -f /etc/repmgr/14/repmgr.conf standby register
	% repmgr -f /etc/repmgr/14/repmgr.conf cluster show

8.- Establish SSH passwordless connection between the servers:

  * On each machine run:
       % ssh-keygen
  * Add contents of ~/.ssh/id_rsa.pub to the other machine's ~/.ssh/authorized_keys file.
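  * Alternatively, ssh-copy-id installs the key in one step (user and hostname here are examples):
       % ssh-copy-id postgres@aqms-nc-pp.ncss-mp.cisn.org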

- BARMAN setup:

BARMAN setup on aqms-bk-aux - Target DB (archdb1@aqms-bk-pp) - Streaming-only backups (base backups and WAL via streaming replication):


* Install Barman (done by sys admins).

* Configure Barman: (as barman)

  - Update /etc/barman/barman.conf:
	[barman]
	barman_user = barman
	configuration_files_directory = /etc/barman/conf.d
	barman_home = /var/lib/barman
	log_file = /var/log/barman/barman.log
	log_level = INFO
	retention_policy = RECOVERY WINDOW OF 2 WEEKS

* Configure server connection: (as barman)

  - Create /etc/barman/conf.d/pg.conf:
	[pg]
	description =  "archdb1 Database (Streaming-Only)"
	conninfo = host=aqms-bk-pp.geo.berkeley.edu user=barman dbname=archdb1
	streaming_conninfo = host=aqms-bk-pp.geo.berkeley.edu user=barman
	backup_method = postgres
	streaming_archiver = on
	slot_name = barman

* Set up storage directory: (as barman)

  $ ll -l /var/lib/barman
  total 0
  lrwxrwxrwx 1 barman barman 28 Feb 10 08:42 pg -> /data/archdb1-dump/barman/pg
  $
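  The symlink shown above can be created with, e.g.:
  $ ln -s /data/archdb1-dump/barman/pg /var/lib/barman/pg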

* Add 'export PATH=$PATH:/usr/pgsql-14/bin/' to ~barman/.bash_profile (as barman)

* Create barman user for PostgreSQL: (as postgres)

  $ createuser -s -P barman

* Test psql connection: (as barman)

  - Create ~barman/.pgpass:
	aqms-bk-pp.geo.berkeley.edu:5432:*:barman:barman_password

  - Test connection:
	psql -c 'SELECT version()' -U barman -h aqms-bk-pp.geo.berkeley.edu archdb1

* Configure PostgreSQL: (as postgres)

  - Update pg_hba.conf to include:
	# barman
	host	archdb1	    barman	    169.229.197.88/32	     md5 # Public aqms-bk-aux
	host	replication barman	    169.229.197.88/32        md5 # Public aqms-bk-aux

* Test replication connection: (as barman)

  $ psql -U barman -h aqms-bk-pp.geo.berkeley.edu -c 'IDENTIFY_SYSTEM' replication=1

* Set up SSH for restores: (as barman)

  - Run: $ ssh-keygen -t rsa
  - Save the key to ~/.ssh/pg
  - Copy the value of the public key in ~/.ssh/pg.pub
  - On the PostgreSQL server, open or create ~postgres/.ssh/authorized_keys and add the public key

* Create replication slot: (as barman)

  $ barman receive-wal --create-slot pg

* Run maintenance tasks: (as barman)

  $ barman cron
  Starting WAL archiving for server pg
  Starting streaming archiver for server pg
  $

* Verify the WAL archiving configuration:

  $ barman switch-xlog --force --archive pg
  
* Check the server configuration: (as barman)

  $ barman check pg
  Server pg:
	PostgreSQL: OK
	superuser or standard user with backup privileges: OK
	PostgreSQL streaming: OK
	wal_level: OK
	replication slot: OK
	directories: OK
	retention policy settings: OK
	backup maximum age: OK (no last_backup_maximum_age provided)
	backup minimum size: OK (0 B)
	wal maximum age: OK (no last_wal_maximum_age provided)
	wal size: OK (0 B)
	compression settings: OK
	failed backups: OK (there are 0 failed backups)
	minimum redundancy requirements: OK (have 0 backups, expected at least 0)
	pg_basebackup: OK
	pg_basebackup compatible: OK
	pg_basebackup supports tablespaces mapping: OK
	systemid coherence: OK (no system Id stored on disk)
	pg_receivexlog: OK
	pg_receivexlog compatible: OK
	receive-wal running: OK
	archiver errors: OK
  $ 

* Perform a full backup: (as barman)

  $ barman backup pg
  Starting backup using postgres method for server pg in /var/lib/barman/pg/base/20230210T103815
  Backup start at LSN: 0/5000060 (000000010000000000000005, 00000060)
  Starting backup copy via pg_basebackup for 20230210T103815
  Copy done (time: 1 second)
  Finalising the backup.
  This is the first backup for server pg
  WAL segments preceding the current backup have been found:
	000000010000000000000004 from server pg has been removed
  Backup size: 51.1 MiB
  Backup end at LSN: 0/7000000 (000000010000000000000006, 00000000)
  Backup completed (start time: 2023-02-10 10:38:15.474479, elapsed time: 1 second)
  Processing xlog segments from streaming for pg
	000000010000000000000005
  WARNING: IMPORTANT: this backup is classified as WAITING_FOR_WALS, meaning that Barman has not received yet 
all the required WAL files for the backup consistency.
This is a common behaviour in concurrent backup scenarios, and Barman automatically set the backup as DONE once all the required WAL files have been archived.
Hint: execute the backup command with '--wait'
  $ 

* List available backups: (as barman)

  $ barman list-backup pg
  pg 20230210T103815 - Fri Feb 10 10:38:16 2023 - Size: 67.1 MiB - WAL Size: 0 B
  $ 
  
* Delete a backup: (as barman)

  $ barman delete pg 20230210T103815
  Deleting backup 20230210T103815 for server pg
  Delete associated WAL segments:
  Deleted backup 20230210T103815 (start time: Fri Feb 10 10:40:05 2023, elapsed time: less than one second)
  $ 

* Check the status of the PostgreSQL server: (as barman)

  $ barman status pg
  Server pg:
	Description: archdb1 Database (Streaming-Only)
	Active: True
	Disabled: False
	PostgreSQL version: 14.6
	Cluster state: in production
	pgespresso extension: Not available
	Current data size: 51.1 MiB
	PostgreSQL Data directory: /home/pgsql/data
	Current WAL segment: 000000010000000000000007
	Passive node: False
	Retention policies: enforced (mode: auto, retention: RECOVERY WINDOW OF 2 WEEKS, WAL retention: MAIN)
	No. of available backups: 0
	First available backup: None
	Last available backup: None
	Minimum redundancy requirements: satisfied (0/0)
  $ 

* Check the streaming status: (as barman)

  $ barman replication-status pg
  Status of streaming clients for server 'pg':
  Current LSN on master: 0/70001B0
  Number of streaming clients: 1

  1. Async WAL streamer
     Application name: barman_receive_wal
     Sync stage      : 3/3 Remote write
     Communication   : TCP/IP
     IP Address      : 169.229.197.35 / Port: 48662 / Host: -
     User name       : barman
     Current state   : streaming (async)
     Replication slot: barman
     WAL sender PID  : 698902
     Started at      : 2023-02-10 10:22:31.835987-08:00
     Sent LSN   : 0/70001B0 (diff: 0 B)
     Write LSN  : 0/70001B0 (diff: 0 B)
     Flush LSN  : 0/7000000 (diff: -432 B)
  $ 

* Set up cron jobs: (as barman)

  */10 * * * * export PATH=$PATH:/usr/pgsql-14/bin/;/usr/bin/barman cron >& /dev/null
  0 0 * * * export PATH=$PATH:/usr/pgsql-14/bin/;/usr/bin/barman backup pg --wait

* Recovery:

  - Stop the PostgreSQL server.
  - Get a list of possible backup spots: $ barman list-backup pg
  - Get details for the backup you choose: $ barman show-backup pg <backup_id>. Note that <backup_id> is the timestamp of the backup.
  - Run: $ barman recover --remote-ssh-command "ssh -i ~/.ssh/pg -o StrictHostKeyChecking=no postgres@aqms-bk-pp" --target-time="<YYYY-MM-DD HH:MM:SS.SSSSSS+00:00>" pg <backup_id> /var/lib/pgsql/14/data
  - Restart the PostgreSQL server.

- Replication Monitoring Setup:

* Provide replication statistics access to repadmin user:

$ psql -U postgres rtdb2
rtdb2=# CREATE FUNCTION func_stat_replication() RETURNS SETOF pg_stat_replication as
rtdb2-# $$ select * from pg_stat_replication; $$
rtdb2-# LANGUAGE sql SECURITY DEFINER;
CREATE FUNCTION
rtdb2=# REVOKE EXECUTE ON FUNCTION func_stat_replication() FROM public;
REVOKE
rtdb2=# GRANT EXECUTE ON FUNCTION func_stat_replication() to repadmin;
GRANT
rtdb2=# \q
$ psql -U repadmin rtdb2
psql (14.7)
Type "help" for help.

rtdb2=> SELECT * FROM func_stat_replication();
   pid   | usesysid | usename  | application_name |  client_addr   | client_hostname | client_port |         backend_start         | backend_xmin |   state   |  sent_lsn  | write_lsn  | flush_lsn  | replay
_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state |          reply_time
---------+----------+----------+------------------+----------------+-----------------+-------------+-------------------------------+--------------+-----------+------------+------------+------------+-------
-----+-----------+-----------+------------+---------------+------------+-------------------------------
 1167789 |    17852 | repadmin | rt2subscr        | 192.168.116.43 |                 |       41830 | 2023-04-03 13:35:32.025688-04 |              | streaming | 3/25476660 | 3/25476660 | 3/25476660 | 3/2547
6660 |           |           |            |             0 | async      | 2023-04-03 14:08:35.289162-04
(1 row)

rtdb2=> \q
$

* Streaming replication delay on standby PP DB (in seconds):

SELECT CASE WHEN pg_last_wal_receive_lsn() = pg_last_wal_replay_lsn()
THEN 0
ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp())
END AS log_delay;

* Logical replication delay on RT DB (in bytes):

select pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)::numeric from pg_stat_replication; 

- RT Role Setup:

* Settings in ~/data/pg_hba.conf on aqms-bk-rt:

host    rtdb1        rtem            169.229.197.34/32          md5 # Public aqms-bk-rt
host    rtdb1        rtem            192.168.116.45/32          md5 # Private aqms-bk-rt
host    rtdb1        rtem            137.227.232.160/32         md5 # Public aqms-nc-rt
host    rtdb1        rtem            192.168.119.114/32         md5 # Private aqms-nc-rt

* Settings in ~/data/pg_hba.conf on aqms-nc-rt:

host    rtdb2        rtem            169.229.197.34/32          md5 # Public aqms-bk-rt
host    rtdb2        rtem            192.168.116.45/32          md5 # Private aqms-bk-rt
host    rtdb2        rtem            137.227.232.160/32         md5 # Public aqms-nc-rt
host    rtdb2        rtem            192.168.119.114/32         md5 # Private aqms-nc-rt

- Memory & Log Configuration: PGTune

* Settings in ~/data/postgresql.auto.conf on aqms-bk-rt: [128GB - 64 processors]

max_connections = 40
shared_buffers = 32GB
effective_cache_size = 96GB
maintenance_work_mem = 2GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 500
random_page_cost = 1.1
effective_io_concurrency = 300
work_mem = 13107kB
huge_pages = try
min_wal_size = 4GB
max_wal_size = 16GB
max_worker_processes = 64
max_parallel_workers_per_gather = 32
max_parallel_workers = 64
max_parallel_maintenance_workers = 4

log_connections = ON
log_line_prefix = '%m user=%u db=%d pid=%p '
log_filename = 'postgresql-%Y-%m-%d.log'

* Settings in ~/data/postgresql.auto.conf on aqms-bk-pp: [128GB - 64 processors]

max_connections = 40
shared_buffers = 32GB
effective_cache_size = 96GB
maintenance_work_mem = 2GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 500
random_page_cost = 1.1
effective_io_concurrency = 300
work_mem = 41943kB
huge_pages = try
min_wal_size = 4GB
max_wal_size = 16GB
max_worker_processes = 20
max_parallel_workers_per_gather = 10
max_parallel_workers = 20
max_parallel_maintenance_workers = 4

log_connections = ON
log_line_prefix = '%m user=%u db=%d pid=%p '
log_filename = 'postgresql-%Y-%m-%d.log'

* Settings in ~/data/postgresql.auto.conf on aqms-nc-rt: [64GB - 20 processors]

max_connections = 40
shared_buffers = 16GB
effective_cache_size = 48GB
maintenance_work_mem = 2GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 500
random_page_cost = 1.1
effective_io_concurrency = 300
work_mem = 20971kB
huge_pages = try
min_wal_size = 4GB
max_wal_size = 16GB
max_worker_processes = 20
max_parallel_workers_per_gather = 10
max_parallel_workers = 20
max_parallel_maintenance_workers = 4

log_connections = ON
log_line_prefix = '%m user=%u db=%d pid=%p '
log_filename = 'postgresql-%Y-%m-%d.log'

* Settings in ~/data/postgresql.auto.conf on aqms-nc-pp: [128GB - 20 processors]

max_connections = 40
shared_buffers = 32GB
effective_cache_size = 96GB
maintenance_work_mem = 2GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 500
random_page_cost = 1.1
effective_io_concurrency = 300
work_mem = 41943kB
huge_pages = try
min_wal_size = 4GB
max_wal_size = 16GB
max_worker_processes = 20
max_parallel_workers_per_gather = 10
max_parallel_workers = 20
max_parallel_maintenance_workers = 4

log_connections = ON
log_line_prefix = '%m user=%u db=%d pid=%p '
log_filename = 'postgresql-%Y-%m-%d.log'

- Enabling Database Encryption:

* Settings in ~/data/postgresql.conf:

ssl = on
ssl_cert_file = 'server.crt'
ssl_key_file = 'server.key'

* Settings in ~/data/pg_hba.conf:

hostssl    archdb1      tpp             169.229.197.88/32     md5 # Public aqms-bk-aux

* Certificates creation:

% cd $PGDATA
% openssl req -new -x509 -days 3650 -nodes -text -out server.crt -keyout server.key -subj "/CN=aqms-bk-pp.ncss-ucb.cisn.org"
% chmod og-rwx server.key
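To inspect the resulting certificate (subject and validity window):
% openssl x509 -in server.crt -noout -subject -dates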

NC AQMS Production System

- Topology:

PUBLIC                                       NCSS                                           CISN
169.229.197.34/aqms-bk-rt.geo.berkeley.edu   192.168.116.45/aqms-bk-rt.ncss-ucb.cisn.org    192.168.200.45/aqms-bk-rt.ucb.cisn.org
169.229.197.35/aqms-bk-pp.geo.berkeley.edu   192.168.116.43/aqms-bk-pp.ncss-ucb.cisn.org    192.168.200.43/aqms-bk-pp.ucb.cisn.org
137.227.232.160/aqms-nc-rt.wr.usgs.gov       192.168.119.114/aqms-nc-rt.ncss-mp.cisn.org    192.168.201.114/aqms-nc-rt.mp.cisn.org
137.227.232.161/aqms-nc-pp.wr.usgs.gov       192.168.119.115/aqms-nc-pp.ncss-mp.cisn.org    192.168.201.115/aqms-nc-pp.mp.cisn.org
169.229.197.88/aqms-bk-aux.geo.berkeley.edu  192.168.116.41/aqms-bk-aux.ncss-ucb.cisn.org   192.168.200.41/aqms-bk-aux.ucb.cisn.org
137.227.232.162/aqms-nc-aux.wr.usgs.gov      192.168.119.116/aqms-nc-aux.ncss-mp.cisn.org   192.168.201.116/aqms-nc-aux.mp.cisn.org

* All tables are replicated between aqms-bk-pp (archdb1) and aqms-nc-pp (archdb1 [same DB name on both PP]).

  • A subset of tables (from the PI, WF & AP schemas) is replicated from aqms-bk-rt (rtdb1) and aqms-nc-rt (rtdb2) to aqms-bk-pp or aqms-nc-pp:
    • amp
    • ampset
    • arrival
    • assocamm
    • assocamo
    • assocaro
    • assoccom
    • assoccoo
    • assocevampset
    • assocnte
    • assocwae
    • coda
    • credit
    • event
    • eventprefmag
    • eventprefmec
    • eventprefor
    • filename
    • mec
    • mecchannel
    • mecdata
    • mecfreq
    • mecfreqdata
    • mecobject
    • netmag
    • nettrig
    • origin
    • origin_error
    • pathname (replaced by waveroots)
    • pdl_product
    • pdl_ids
    • remark
    • request_card
    • subdir
    • swarm_events
    • swarm_state
    • trig_channel
    • unassocamp
    • wavefileroots (Not used in NC)
    • waveform

- Non replicated tables:

  • PI Schema:
    • ampsettypes (loaded via DDL script)
    • assoctypecat (loaded via DDL script)
    • eventcategory (loaded via DDL script)
    • eventtype (loaded via DDL script)
  • WF Schema:
    • waveroots (Populated manually)
  • AP Schema:
    • alarm_action
    • appchannels (updated via StationUI & AppChan.sql)
    • applications (updated via StationUI & AppChan.sql)
    • assocevents (used by pdl2aqms)
    • assoc_region_group (loaded via DDL script - Not used in NC)
    • autoposter (Populated manually)
    • assocwfrc (used only in NC)
    • channelmap_ampparms (updated via codaparamsDB)
    • channelmap_codaparms (updated via codaparamsDB)
    • config_channel (updated via StationUI & AppChan.sql)
    • credit_alias (Populated manually)
    • epochtimebase (loaded via DDL script)
    • eventmatchpassingscore (Populated manually)
    • eventpriorityparams (Populated manually)
    • gazetteer_region (loaded via regionDB)
    • gazetteer_region_group (loaded via DDL script)
    • gazetteerbigtown (loaded via placesDB)
    • gazetteerpt (loaded via placesDB & quarryDB)
    • gazetteerquake (loaded via placesDB)
    • gazetteertown (loaded via placesDB)
    • gazetteertype (Populated manually)
    • gazetteerline (Not used in NC)
    • gazetteerquarry (loaded via quarryDB)
    • jasieventlock (used by Jiggle)
    • leap_seconds (loaded via DDL script)
    • logger_info (used only in NC - updated via getLoggerInfo.pl)
    • magprefpriority (loaded via magprefDB)
    • messagetext (loaded via DDL script)
    • orgprefpriority (Populated manually)
    • pcs_signal
    • pcs_state
    • pcs_transition
    • peer_system_status
    • program (updated via StationUI & AppChan.sql)
    • realarm (used only in NC)
    • rt_role
    • stacorrections (updated via codaparamsDB (Md) & loadCISNmagcorr (Ml))
    • subnet (used only in SC to populate EW carlsubtrig file)
    • subnet_channel (used only in SC to populate EW carlsubtrig file)
    • system_status
  • IR Schema: (updated via stnxml2aqms)
    • channel_comment
    • channel_data
    • coefficients
    • d_abbreviation
    • d_comment
    • d_format
    • d_format_data
    • d_unit
    • dc
    • dc_data
    • decimation
    • dm
    • pn
    • pn_data
    • poles_zeros
    • polynomial
    • pz
    • pz_data
    • sensitivity
    • simple_response
    • station_comment
    • station_data
  • HT Schema: (not used)
    • datalogger
    • datalogger_board
    • datalogger_module
    • filamp
    • filamp_pchannel
    • filter
    • filter_fir
    • filter_fir_data
    • filter_sequence
    • filter_sequence_data
    • response
    • response_hp
    • response_lp
    • response_pn
    • response_pn_data
    • response_pz
    • sensor
    • sensor_component
    • station
    • station_datalogger
    • station_datalogger_lchannel
    • station_datalogger_pchannel
    • station_digitizer
    • station_digitizer_pchannel
    • station_filamp
    • station_filamp_pchannel
    • station_sensor
    • station_sensor_component
  • CS schema: (loaded via DDL script except for c_channeldata)
    • c_channeldata (updated via stnxml2aqms)
    • c_auth
    • c_datum
    • c_fltrtyp
    • c_lgrtyp
    • c_loggerdata
    • c_net
    • c_ornt
    • c_orntabbrv
    • c_phystyp
    • c_rectyp
    • c_sensordata
    • c_site
    • c_snstyp
    • c_timeclk
    • c_units

- Views:

  • active_channels_view (updated via AppChan.sql)
  • all_channels_view (updated via AppChan.sql)
  • jasi_ampparms_view
  • jasi_channel_view
  • jasi_config_view
  • jasi_response_view
  • jasi_station_view
  • md_magparms_view
  • ml_magparms_view

- Database Sequences:

Values ending in:  are generated on:
1,6                RTDB1
2,7                RTDB2
3,8                ARCHDB1
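  One way to realize this last-digit scheme is an increment of 5 with a per-node start value (a sketch;
  the actual definitions come from the DBpg create scripts):
	CREATE SEQUENCE evseq START WITH 1 INCREMENT BY 5;  -- RTDB1:   1, 6, 11, 16, ...
	CREATE SEQUENCE evseq START WITH 2 INCREMENT BY 5;  -- RTDB2:   2, 7, 12, 17, ...
	CREATE SEQUENCE evseq START WITH 3 INCREMENT BY 5;  -- ARCHDB1: 3, 8, 13, 18, ...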
  • PI Schema:
    • evseq ⇒ Event.evid
    • orseq ⇒ Origin.orid
    • magseq ⇒ Netmag.magid
    • arseq ⇒ Arrival.arid
    • ampseq ⇒ Amp.ampid
    • commseq ⇒ Remark.commid
    • mecseq ⇒ Mec.mecid
    • mecdataseq ⇒ MecData.mecdataid
    • mecfreqseq ⇒ MecFreq.mecfreqid
    • coseq ⇒ Coda.coid
    • ampsetseq ⇒ AmpSet.ampsetid
    • catseq ⇒ EventCategory.catid
  • WF Schema:
    • waseq ⇒ Waveform.wfid
    • fiseq ⇒ Filename.fileid
    • sdseq ⇒ SubDir.subdirid
  • AP Schema:
    • unassocseq ⇒ UnAssocAmp.ampid
    • sigseq ⇒ PCS_Signal.sigid
    • reqseq ⇒ Request_Card.rcid
    • dbsetseq ⇒ PDL_Ids.dbsetid
    • ntseq ⇒ NetTrig.ntid
    • trigseq ⇒ Trig_Channel.trigid
    • gazseq ⇒ GazetteerPt.gazid (Not used in NC)
  • IR Schema:
    • dcseq ⇒ DC.key
    • dmseq ⇒ DM.key
    • abbseq ⇒ D_Abbreviation.id
    • uniseq ⇒ D_Unit.id
    • comseq ⇒ D_Comment.id
    • forseq ⇒ D_Format.id
    • poseq ⇒ PN.key
    • pzseq ⇒ PZ.key

- Database Users:

User              Description
browser           Guest user (read only)
code              Stored procedures owner
dcarchiver        Waveform archiver user
operator          Jiggle user
postgres          Super user
repadmin          Replication user
rtem              Real time user
tpp               Post processing user
trinetdb          Schema owner
trinetdb_execute  Role (execute stored procedures)
trinetdb_read     Role (read access)
trinetdb_write    Role (write access)
waveform_read     Role (read waveforms via DB (Jiggle))

- PL/Java setup: GitLab

  • Install PL/Java:
 * As aqms:
        $ cd /home/aqms/PL_Java
        $ wget https://gitlab.com/aqms-swg/aqms-jiggle/-/blob/master/PL_Java/Redhat%208/pljava-pg14.2-amd64-Linux-gpp.jar
        $ wget https://gitlab.com/api/v4/projects/16872152/packages/generic/jiggle/1.0.305/jiggle_p8g-d71520d.jar
        $ mv jiggle_p8g-d71520d.jar jiggle.jar
        $ export PATH="/usr/pgsql-14/bin:$PATH"
 * As root:
        $ java -jar /home/aqms/PL_Java/pljava-pg14.2-amd64-Linux-gpp.jar
  • Install PL/Java PostgreSQL Extension:
 * As postgres:
        $ psql -U postgres rtdb1
        = ALTER DATABASE rtdb1 SET pljava.libjvm_location TO '/usr/lib/jvm/jre/lib/amd64/server/libjvm.so';
        = \q

        $ psql -U postgres rtdb1
        = create schema if not exists db_cleanup authorization code;
        = grant usage on schema db_cleanup to trinetdb_read,trinetdb_execute;
         
        = create schema if not exists eventpriority authorization code;
        = grant usage on schema eventpriority to trinetdb_read,trinetdb_execute;

        = create schema if not exists file_util authorization code;
        = grant usage on schema file_util to trinetdb_read,trinetdb_execute;

        = create schema if not exists formats authorization code;
        = grant usage on schema formats to trinetdb_read,trinetdb_execute;

        = create schema if not exists wavearc authorization code;
        = grant usage on schema wavearc to trinetdb_read,trinetdb_execute;

        = create extension pljava;
        = select sqlj.install_jar('file:/home/aqms/PL_Java/jiggle.jar','jigglejar',true);
        = select sqlj.set_classpath('assocamp', 'jigglejar');
        = select sqlj.set_classpath('db_cleanup', 'jigglejar');
        = select sqlj.set_classpath('eventpriority', 'jigglejar');
        = select sqlj.set_classpath('file_util', 'jigglejar');
        = select sqlj.set_classpath('formats', 'jigglejar');
        = select sqlj.set_classpath('wavearc', 'jigglejar');
        = select sqlj.set_classpath('wheres', 'jigglejar');
        = grant usage on schema sqlj to trinetdb_read,trinetdb_execute;
  • Test Installation:
        $ psql -U rtem rtdb1
        = select wheres.where_from_type(35.7, -117.7, 1., 1, 'bigtown');
  • Upgrading to a new version of Jiggle.jar:
        $ psql -U postgres rtdb1
        = select sqlj.replace_jar('file:/home/postgres/jiggle_name.jar', 'jigglejar',true);

- StationUI setup: GitLab

  • Clone the repo:
% git clone https://gitlab.com/aqms-swg/stationui.git
  • Create a virtual environment:
% cd stationui/
% python3 -m venv /home/aqms/StationUI/venv
  • Activate the environment:
% source /home/aqms/StationUI/venv/bin/activate
  • Edit and install the requirements:
% vi requirements.txt 
% python3 -m pip install -r requirements.txt
  • Make sure that the Applications table contains the 'description' field:
The applications table is one of the two tables that StationUI updates. If the column description
doesn't already exist, add it to this table with a data type of varchar(1024):

% psql -d archdb1 -U trinetdb -h aqms-bk-pp
= ALTER TABLE Applications ADD description VARCHAR(1024);
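PostgreSQL also accepts a guarded form that is safe to re-run when the column may already exist:
= ALTER TABLE Applications ADD COLUMN IF NOT EXISTS description VARCHAR(1024);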
  • Configure Django:
% vi django/datacenter/settings.py 

	(venv) [aqms@aqms-bk-aux datacenter]$ diff TEMPLATE-settings.py settings.py 
	40c40
	< ALLOWED_HOSTS = []
	---
	> ALLOWED_HOSTS = ['aqms-bk-aux.geo.berkeley.edu','aqms-bk-pp.geo.berkeley.edu','moment.geo.berkeley.edu']
	42c42,43
	< ADMINS=[('John', 'john@example.com'), ('Mary','mary@example.com')]
	---
	> ADMINS=[('Stephane', 'stephane@berkeley.edu')]
	> STATIONUI_ADMINS=[('Stephane', 'stephane@berkeley.edu')]
	91c92
	<     # Oracle
	---
	>     # PostgreSQL
	93,98c94,99
	<         'ENGINE': 'django.db.backends.oracle',
	<         'NAME': 'db service name',
	<         'USER': 'db user',
	<         'PASSWORD': 'db password',
	<         'HOST': 'db hostname',
	<         'PORT': 'db port',
	---
	>         'ENGINE': 'django.db.backends.postgresql',
	>         'NAME': 'archdb1',
	>         'USER': 'tpp',
	>         'PASSWORD': '****',
	>         'HOST': 'aqms-bk-pp.geo.berkeley.edu',
	>         'PORT': '5432',
	197c198
	< ORG = 'SCSN'
	---
	> ORG = 'NCSS'
	203c204
	< LOCAL_NETWORKS = ['CI','ZY']
	---
	> LOCAL_NETWORKS = ['BK','BP','NC']
	206c207
	< #CHANNEL_LIST_SOURCE = [('db','from AQMS database')]
	---
	> CHANNEL_LIST_SOURCE = [('db','from AQMS database')]
	208c209
	< CHANNEL_LIST_SOURCE = [('cwb','from CWB'),('db','from AQMS database')]
	---
	> #CHANNEL_LIST_SOURCE = [('cwb','from CWB'),('db','from AQMS database')]
	(venv) [aqms@aqms-bk-aux datacenter]$ 

% mkdir /tmp/logs
  • Create and populate StationUI logging tables:
% cd django/stationui/
% cp user_action_tables_POSTGRES.sql user_action_tables_POSTGRES.sql.ORIG
% vi user_action_tables_POSTGRES.sql

	(venv) [aqms@aqms-bk-aux stationui]$ diff user_action_tables_POSTGRES.sql.ORIG user_action_tables_POSTGRES.sql
	1d0
	< # postgres
	31a31,48
	> 
	> create table channel_rules (
	>     progid integer NOT NULL,
	>     name varchar(32),
	>     copy_from varchar(128), -- will store list of progids
	>     channel varchar(3), -- same as channel_data
	>     location varchar(2), -- same as channel_data
	>     config varchar(64), -- same as appchannels
	>     created timestamp with time zone not null default now(),
	>     last_mofified timestamp with time zone not null default now(),
	>     CONSTRAINT unique_rule UNIQUE (progid, name, copy_from, channel, location, config)
	> );
	> 
	> GRANT ALL on stationui_action_types to tpp;
	> GRANT ALL on stationui_action_types_id_seq to tpp;
	> GRANT ALL on stationui_actions to tpp;
	> GRANT ALL on stationui_actions_id_seq to tpp;
	> GRANT ALL on channel_rules to tpp;
	(venv) [aqms@aqms-bk-aux stationui]$ 

% psql -d archdb1 -U trinetdb -h aqms-bk-pp
= \i user_action_tables_POSTGRES.sql

% psql -U postgres -d archdb1 -h aqms-bk-pp
= GRANT ALL ON SCHEMA pglogical TO tpp;
  • Create admin user:
% cd ..
% python3 manage.py migrate auth
% python3 manage.py migrate 
% python3 manage.py createsuperuser --username=<username> --email=<email>
  • Configure default rules:
% vi stationui/defaultrules.py
  • Start the server:
source /home/aqms/StationUI/venv/bin/activate
python3 manage.py runserver aqms-bk-aux.geo.berkeley.edu:8000 >& stationui.log &

- Metadata setup:

  • Install Anaconda:
% wget https://repo.anaconda.com/archive/Anaconda3-2022.10-Linux-x86_64.sh
% sh Anaconda3-2022.10-Linux-x86_64.sh 
  • Install AQMS-PDL & AQMS-MKV0:
% pip uninstall aqms-pdl
% pip uninstall aqms-mkv0
% git clone https://gitlab.com/aqms-swg/aqms-pdl.git
% cd aqms-pdl/
% pip install .
% cd ..
% git clone https://gitlab.com/aqms-swg/aqms-mkv0.git
% cd aqms-mkv0/
% pip install .
  • Create configuration file:
% cat config.yml 
DB_HOST: xxxx
DB_NAME: xxxx
DB_PORT: 5432
DB_USER: xxxx
DB_PASSWORD: xxxx
LOG_DIR: './log'
LOG_LEVEL: 'INFO'
APPLICATION_NAME: PWS-NEW

mkv0:
  # minAcc: acc. threshold (percent g) for making files for station's SNCLs
  MinACC:    0.1
  # Velocity model file for travel-time calculations
  vmodelFile:   ./allcal_model.d
  # PreEventTime: seconds before estimated P arrival to start waveform
  PreEventTime:   30
  # SecsPermag: waveform duration should be this many seconds per magnitude
  SecsPerMag:     30
  # minComplete: the minimum level of waveform completeness (0 to 1) required
  # to create V0 files..
  minComplete:    0.95

%
  • Create velocity model file:
% cat allcal_model.d 
# Compromise Crustal model for Northern & Southern California
#
#     depth   velocity
lay   0.0     4.8
lay   4.5     6.1
lay   15.5    6.8
lay   25.0    7.3 
lay   32.0    7.8 
%
  • Import Cosmos parameters from an extStaXML file:
% wget https://files.anss-sis.scsn.org/production/Extended-XML/BK/BK_BRAD_ext.xml
% stnxml2aqms -n BK -s BRAD -d BK_BRAD_ext.xml
  • Import IR parameters from FDSNStaXML file:
% wget https://files.anss-sis.scsn.org/production/FDSNStationXML1.1/BK/BK_POTR.xml
% stnxml2aqms -n BK -s POTR -i -x --sr BK_POTR.xml
  • Generate FDSNStaXML file for a given site:
% aqms2stnxml -r -n BK -s POTR BK.POTR.isti.xml
  • Generate V0 files for an event:
% mkv0 --logconsole 60425711
  • Update pg_hba.conf on aqms-bk-pp & aqms-nc-pp to include:
host    archdb1      tpp             169.229.197.35/24     md5 # Public aqms-bk-pp & aqms-bk-rt
host    archdb1      tpp             192.168.116.43/24     md5 # Private aqms-bk-pp & aqms-bk-rt
host    archdb1      tpp             137.227.232.161/24    md5 # Public aqms-nc-pp & aqms-nc-rt
host    archdb1      tpp             192.168.119.115/24    md5 # Private aqms-nc-pp & aqms-nc-rt
  • Update pg_hba.conf on aqms-bk-rt to include:
host    rtdb1        rtem            169.229.197.35/32          md5 # Public aqms-bk-pp
host    rtdb1        rtem            192.168.116.43/32          md5 # Private aqms-bk-pp
host    rtdb1        rtem            137.227.232.161/32         md5 # Public aqms-nc-pp
host    rtdb1        rtem            192.168.119.115/32         md5 # Private aqms-nc-pp
  • Update pg_hba.conf on aqms-nc-rt to include:
host    rtdb2        rtem            169.229.197.35/32          md5 # Public aqms-bk-pp
host    rtdb2        rtem            192.168.116.43/32          md5 # Private aqms-bk-pp
host    rtdb2        rtem            137.227.232.161/32         md5 # Public aqms-nc-pp
host    rtdb2        rtem            192.168.119.115/32         md5 # Private aqms-nc-pp

- Hypocenter & Magnitude Comparison between the Oracle & PostgreSQL Systems:

  • Edit comparison parameters:
[aqms@aqms-bk-pp ~]$ cd /home/aqms/compare
[aqms@aqms-bk-pp compare]$ vi run_compare_events.sh
# Define start and end times for this comparison
starttime="2023-04-18-00:00:00"
endtime="2023-04-25-00:00:00"

# Databases to compare
# ORACLE:     publicdb rtmp2 rtucb2
# POSTGRESQL: archdb1  rtdb2 rtdb1
hostarray1=( rtucb2 )
hostarray2=( rtdb1 )
  • Run comparison script:
[aqms@aqms-bk-pp compare]$ ./run_compare_events.sh 
  • A text file and an image are created in the output directory, e.g.:
[aqms@aqms-bk-pp compare]$ ll output/
total 228
-rw-r--r-- 1 aqms users 157399 2023-04-26 10:34 compare_rtucb2-rtdb1_2023-04-18-00:00:00_2023-04-25-00:00:00.png
-rw-r--r-- 1 aqms users  70740 2023-04-26 10:34 compare_rtucb2-rtdb1_2023-04-18-00:00:00_2023-04-25-00:00:00.txt
[aqms@aqms-bk-pp compare]$ 

- Data Migration & Switchover (ora2pg): [aqms-bk-rt/aqms-bk-pp active]

  • Install ora2pg on aqms-bk-pp.
  • Stop RT & PP processes on PostgreSQL systems.
% aqms forcestop (as aqms on aqms-bk-rt, aqms-nc-rt & aqms-bk-pp)
  • Stop replication from RTs to PP on PostgreSQL systems.
> SELECT pglogical.alter_subscription_disable(subscription_name := 'rt1subscr', immediate := TRUE); (on archdb1 as postgres)
> SELECT pglogical.alter_subscription_disable(subscription_name := 'rt2subscr', immediate := TRUE); (on archdb1 as postgres)
  • Delete all rows from all the PostgreSQL databases.
> \i sql/truncate.sql (on rtdb1, rtdb2 & archdb1 as trinetdb)
> \i sql/drop_sequences.sql (on rtdb1, rtdb2 & archdb1 as trinetdb)
  • Reset the start values of the sequences (on rtdb1, rtdb2 & archdb1, with the appropriate intervals) to values at or above the highest values in the old Oracle databases.
% sqlplus browser@publicdb @get_max_seq.sql 
% vi sql/create_sequences.sql (on rtdb1, rtdb2 & archdb1 as trinetdb)
> \i sql/create_sequences.sql (on rtdb1, rtdb2 & archdb1 as trinetdb)
  • Stop PP processes on Oracle system.
  • Disable the Event table trigger on the PostgreSQL PP system.
> ALTER TABLE event DISABLE TRIGGER post_new_event; (on archdb1 as trinetdb)
  • Turn on the PostgreSQL RT systems AND stop replication from the Oracle RTs to the Oracle PPs at the same time (goal: the Oracle PP system is no longer changing).
% aqms start (as aqms on aqms-bk-rt & aqms-nc-rt)
  • At this point the PostgreSQL RT systems are live and alarming.
  • Turn on the data transfer from the Oracle PP, via dump files, to the PostgreSQL PP (using ora2pg).
* Tables exported/imported:
	AMP
	AMPSET
	ARRIVAL
	ASSOCAMM
	ASSOCAMO
	ASSOCARO
	ASSOCCOM
	ASSOCCOO
	ASSOCEVAMPSET
	ASSOCNTE
	ASSOCWAE
        ASSOCWFRC
	CODA
	CREDIT
	EVENT
	EVENTPREFMAG
	EVENTPREFMEC
	EVENTPREFOR
	FILENAME
	MEC
	MECCHANNEL
	MECDATA
	MECFREQ
	MECFREQDATA
	MECOBJECT
	NETMAG
	NETTRIG
	ORIGIN
	ORIGIN_ERROR
	PATHNAME
	REMARK
	REQUEST_CARD
	SUBDIR
	SWARM_EVENTS
	SWARM_STATE
	TRIG_CHANNEL
	UNASSOCAMP
	WAVEFORM
	PDL_PRODUCT
	PDL_IDS
* Export tables from Oracle DB:
	% cd export/
	% source init.ora2pg
	% ./run_export.csh
* Import tables into PostgreSQL DB:
	% cd import/
	% ./run_import.csh
  • While the export is running, rsync the event waveform archive from the Oracle system (/work/dc22/ncedc/events/) to the PostgreSQL system (on the active system first, then the standby one; disable the cron entry on the standby system).
% rsync -av -e ssh --progress ncss@ucbpp.geo.berkeley.edu:/work/dc22/ncedc/events/active/ /home/aqms/data/events/active (as aqms on aqms-bk-pp)
  • Re-enable the Event table trigger on the PostgreSQL PP system.
> ALTER TABLE event ENABLE ALWAYS TRIGGER post_new_event; (on archdb1 as trinetdb)
  • Start PP processes on PostgreSQL system.
% aqms start (as aqms on aqms-bk-pp)
  • Turn back on replication from PostgreSQL RTs to PostgreSQL PP.
> SELECT pglogical.alter_subscription_enable(subscription_name := 'rt1subscr', immediate := TRUE); (on archdb1 as postgres)
> SELECT pglogical.alter_subscription_enable(subscription_name := 'rt2subscr', immediate := TRUE); (on archdb1 as postgres)
  • Start PP processes on Oracle system.
  • Turn back on replication from Oracle RTs to Oracle PPs.

- Event Data Archiving to the NCEDC: (running on strike)

  • Event waveforms directory structure:
[aqms@aqms-bk-pp events]$ pwd
/home/aqms/data/events
[aqms@aqms-bk-pp events]$ ll
total 0
lrwxrwxrwx 1 aqms users 28 2023-03-27 16:20 active -> /work/aqms-bk-pp/aqms/active/
lrwxrwxrwx 1 aqms users 24 2023-04-19 13:25 NCEVT -> /data/ncedc/events/NCEVT/
[aqms@aqms-bk-pp events]$
  • Update pg_hba.conf on aqms-bk-pp & aqms-nc-pp to include:
host    archdb1      browser         169.229.197.12/32     md5 # Public strike