Table of Contents
PostgreSQL Production AQMS System
Documents
- Jiggle Configuration File: pg_jiggle_ncsn.tar.gz
PostgreSQL Database
- PostgreSQL 14 Installation: (Client/Server/Devel - done by sys admins)
- AQMS DB Instance Creation:
- Install the following extensions: (Software installed by sys admins)
                                         List of installed extensions
          Name           | Version |   Schema   |                         Description
-------------------------+---------+------------+--------------------------------------------------------------
 dblink                  | 1.2     | public     | connect to other PostgreSQL databases from within a database
 fuzzystrmatch           | 1.1     | public     | determine similarities and distance between strings
 pglogical               | 2.4.3   | pglogical  | PostgreSQL Logical Replication
 pljava                  | 1.5.8   | sqlj       | PL/Java procedural language (https://tada.github.io/pljava/)
 plpgsql                 | 1.0     | pg_catalog | PL/pgSQL procedural language
 postgis                 | 3.3.2   | public     | PostGIS geometry and geography spatial types and functions
 postgis_tiger_geocoder  | 3.3.2   | tiger      | PostGIS tiger geocoder and reverse geocoder
 postgis_topology        | 3.3.2   | topology   | PostGIS topology spatial types and functions
(8 rows)
- Edit the pg_hba.conf & postgresql.conf files accordingly and restart the database server (see Replication section).
> vi ~/data/pg_hba.conf
> vi ~/data/postgresql.conf
> sudo systemctl restart postgresql-14
- Retrieve and extract DB package DBpg_NC.tar.gz.
- Make sure LD_LIBRARY_PATH is set either at the system or service level:
% sudo ldconfig /usr/local/lib

[Service]
Environment=LD_LIBRARY_PATH=/usr/local/lib:/usr/local/lib64:/share/apps/lib:/share/apps/lib64
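As a quick sanity check of the service environment, a small script can confirm the required directories are present in a colon-separated path list. This is a sketch; `path_contains` and `check_ld_path` are hypothetical helpers, not part of the AQMS package:

```shell
#!/bin/sh
# Return 0 if the colon-separated list $1 contains the directory $2.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report which of the directories needed by AQMSpg_ext.so are missing
# from the supplied path list.
check_ld_path() {
    missing=0
    for dir in /usr/local/lib /usr/local/lib64; do
        if ! path_contains "$1" "$dir"; then
            echo "missing: $dir"
            missing=1
        fi
    done
    [ "$missing" -eq 0 ] && echo "LD_LIBRARY_PATH OK"
}

check_ld_path "/usr/local/lib:/usr/local/lib64:/share/apps/lib:/share/apps/lib64"
```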
- Install AQMSpg_ext.so (requires libmseed.so & libmseed.h - https://github.com/iris-edu/libmseed/releases):
> cd DBpg/create/postgresql-extension
> vi Makefile
> make
> sudo make install
- Create database objects (see DBpg_NC/create/README):
> cd DBpg/create
> vi pg.env
  - Update PSQLDIR.
  - Update PGDATABASE.
  - Update passwords.
> ./generate_sql_scripts.sql
  - Create password file.
  - Update templates (db, users, sequences).
> ./run_as_postgres.sql
  - Create DB.
  - Activate extensions.
  - Create roles.
  - Create users.
> ./run_sql_scripts.sql
  - Create sequences.
  - Grant privileges on sequences.
  - Create Waveform schema.
  - Create Parametric Information schema.
  - Create Instrument Responses schema.
  - Create Hardware Tracking schema.
  - Create Application schema.
  - Grant privileges on all tables.
  - Create views.
  - Grant privileges on views.
  - Create stored procedures.
  - Create indexes.
  - Create missing objects.
  - Populate specific tables.
  - Remove specific constraints for export/import.
- If the database is a real-time one, drop the trigger on the EVENT table and insert a row into the RT_ROLE & PEER_SYSTEM_STATUS tables:
[postgres@aqms-bk-rt ~]$ psql -d rtdb1 -U trinetdb
psql (14.6)
Type "help" for help.

rtdb1=> DROP TRIGGER post_new_event ON EVENT;
DROP TRIGGER
rtdb1=> \q

[postgres@aqms-bk-rt ~]$ psql -U rtem rtdb1
psql (14.6)
Type "help" for help.

rtdb1=> select * from rt_role;
 primary_system |     modification_time
----------------+----------------------------
 true           | 2023-03-07 00:45:20.992049
(1 row)

rtdb1=> select * from peer_system_status;
   status    |      mod_time
-------------+---------------------
 OPERATIONAL | 2020-01-01 00:00:00
(1 row)

rtdb1=> \q
[postgres@aqms-bk-rt ~]$
- Replication Setup:
- PostgreSQL Server Replication Settings:
- Settings in ~/data/pg_hba.conf:
  * Updated 'scram-sha-256' entries to 'md5'.
  * aqms-bk-pp:
      local   all          postgres                        peer
      local   all          all                             md5
      host    replication  repadmin   127.0.0.1/32         md5   # Localhost
      host    archdb1      repadmin   127.0.0.1/32         md5   # Localhost
      host    replication  repadmin   169.229.197.34/24    md5   # Public aqms-bk-rt & aqms-bk-pp
      host    archdb1      repadmin   169.229.197.34/24    md5   # Public aqms-bk-rt & aqms-bk-pp
      host    replication  repadmin   192.168.116.45/24    md5   # Private aqms-bk-rt & aqms-bk-pp
      host    archdb1      repadmin   192.168.116.45/24    md5   # Private aqms-bk-rt & aqms-bk-pp
      host    replication  repadmin   137.227.232.160/24   md5   # Public aqms-nc-rt & aqms-nc-pp
      host    archdb1      repadmin   137.227.232.160/24   md5   # Public aqms-nc-rt & aqms-nc-pp
      host    replication  repadmin   192.168.119.114/24   md5   # Private aqms-nc-rt & aqms-nc-pp
      host    archdb1      repadmin   192.168.119.114/24   md5   # Private aqms-nc-rt & aqms-nc-pp
  * aqms-nc-pp: [Set up as part of streaming replication]
  * aqms-bk-rt:
      local   all          postgres                        peer
      local   all          all                             md5
      host    replication  repadmin   127.0.0.1/32         md5   # Localhost
      host    rtdb1        repadmin   127.0.0.1/32         md5   # Localhost
      host    rtdb1        repadmin   169.229.197.34/24    md5   # Public aqms-bk-rt & aqms-bk-pp
      host    rtdb1        repadmin   192.168.116.45/24    md5   # Private aqms-bk-rt & aqms-bk-pp
      host    rtdb1        repadmin   137.227.232.160/24   md5   # Public aqms-nc-rt & aqms-nc-pp
      host    rtdb1        repadmin   192.168.119.114/24   md5   # Private aqms-nc-rt & aqms-nc-pp
  * aqms-nc-rt:
      local   all          postgres                        peer
      local   all          all                             md5
      host    replication  repadmin   127.0.0.1/32         md5   # Localhost
      host    rtdb2        repadmin   127.0.0.1/32         md5   # Localhost
      host    rtdb2        repadmin   169.229.197.34/24    md5   # Public aqms-bk-rt & aqms-bk-pp
      host    rtdb2        repadmin   192.168.116.45/24    md5   # Private aqms-bk-rt & aqms-bk-pp
      host    rtdb2        repadmin   137.227.232.160/24   md5   # Public aqms-nc-rt & aqms-nc-pp
      host    rtdb2        repadmin   192.168.119.114/24   md5   # Private aqms-nc-rt & aqms-nc-pp
- Settings in ~/data/postgresql.conf:
  * aqms-bk-pp:
      listen_addresses = '*'                  # what IP address(es) to listen on
      wal_level = 'logical'                   # pglogical
      max_worker_processes = 10               # pglogical: one per database needed on provider node;
                                              # one per node needed on subscriber node
      max_replication_slots = 80              # pglogical: one per node needed on provider node (pp should be twice the rt setting)
      max_wal_senders = 10                    # pglogical: one per node needed on provider node
      shared_preload_libraries = 'pglogical'  # pglogical
      track_commit_timestamp = on             # pglogical: needed for last/first-update-wins conflict resolution
                                              # (property available in PostgreSQL 9.5+)
      wal_keep_size = 8192                    # (pp only) in logfile segments*16, 16MB each; 0 disables
      hot_standby = on                        # "off" disallows queries during recovery
      hot_standby_feedback = on               # send info from standby to prevent query conflicts
  * aqms-nc-pp: [Set up as part of streaming replication]
  * aqms-bk-rt:
      listen_addresses = '*'                  # what IP address(es) to listen on
      wal_level = 'logical'                   # pglogical
      max_worker_processes = 10               # pglogical: one per database needed on provider node;
                                              # one per node needed on subscriber node
      max_replication_slots = 40              # pglogical: one per node needed on provider node (pp should be twice the rt setting)
      max_wal_senders = 10                    # pglogical: one per node needed on provider node
      shared_preload_libraries = 'pglogical'  # pglogical
      track_commit_timestamp = on             # pglogical: needed for last/first-update-wins conflict resolution
  * aqms-nc-rt:
      listen_addresses = '*'                  # what IP address(es) to listen on
      wal_level = 'logical'                   # pglogical
      max_worker_processes = 10               # pglogical: one per database needed on provider node;
                                              # one per node needed on subscriber node
      max_replication_slots = 40              # pglogical: one per node needed on provider node (pp should be twice the rt setting)
      max_wal_senders = 10                    # pglogical: one per node needed on provider node
      shared_preload_libraries = 'pglogical'  # pglogical
      track_commit_timestamp = on             # pglogical: needed for last/first-update-wins conflict resolution
- PGLogical Replication From rtdb1/rtdb2 To archdb1:
1.- Install the 2ndQuadrant pglogical extension on rtdb1, rtdb2 & archdb1:
    * Download and install the 2ndQuadrant extension (done by sys admins).
    * As postgres (if the extension was installed after the database was created):
      $ sudo systemctl restart postgresql-14
      $ psql
      = alter system set shared_preload_libraries = 'pglogical';
      = \c archdb1
      = CREATE EXTENSION IF NOT EXISTS pglogical;
    * As postgres:
      = set pglogical.conflict_resolution = 'last_update_wins';
2.- Create provider & subscriber nodes:
    * aqms-bk-rt:
      = \c rtdb1
      = SELECT pglogical.drop_node('rtnode1', false);
      = SELECT pglogical.create_node(node_name := 'rtnode1',
          dsn := 'host=aqms-bk-rt.ncss-ucb.cisn.org port=5432 dbname=rtdb1 user=repadmin password=repadmin_pass');
      = GRANT CREATE, USAGE ON SCHEMA pglogical TO repadmin;
    * aqms-nc-rt:
      = \c rtdb2
      = SELECT pglogical.drop_node('rtnode2', false);
      = SELECT pglogical.create_node(node_name := 'rtnode2',
          dsn := 'host=aqms-nc-rt.ncss-mp.cisn.org port=5432 dbname=rtdb2 user=repadmin password=repadmin_pass');
      = GRANT CREATE, USAGE ON SCHEMA pglogical TO repadmin;
    * aqms-bk-pp:
      = \c archdb1
      = GRANT CREATE, USAGE ON SCHEMA pglogical TO repadmin;
      = SELECT pglogical.drop_node('ppnode1', false);
      = SELECT pglogical.create_node(node_name := 'ppnode1',
          dsn := 'host=aqms-bk-pp.ncss-ucb.cisn.org port=5432 dbname=archdb1 user=repadmin password=repadmin_pass');
    To check the results:
      = SELECT * FROM pglogical.node_interface;
3.- Create replication sets on aqms-bk-rt & aqms-nc-rt:
    = \c rtdb1
    = SELECT pglogical.drop_replication_set(set_name:='rtrepset');
    = SELECT pglogical.create_replication_set(set_name:='rtrepset',
        replicate_insert:=TRUE, replicate_update:=TRUE,
        replicate_delete:=FALSE, replicate_truncate:=FALSE);
    = DO $$
      DECLARE
        count INTEGER := 0;
        reptables varchar[] := ARRAY['amp','ampset','arrival','assocamm',
          'assocamo','assocaro','assoccom','assoccoo','assocevampset',
          'assocnte','assocwae','coda','credit','event',
          'eventprefmag','eventprefmec','eventprefor','filename','mec',
          'mecchannel','mecdata','mecfreq','mecfreqdata','mecobject',
          'netmag','nettrig','origin','origin_error','pathname',
          'pdl_ids','pdl_product','remark','request_card','subdir',
          'swarm_events','swarm_state','trig_channel','unassocamp',
          'wavefileroots','waveform'];
        tbl varchar;
      BEGIN
        FOREACH tbl IN ARRAY reptables LOOP
          PERFORM pglogical.replication_set_add_table(set_name:='rtrepset',
            relation:='trinetdb.'||tbl, synchronize_data:=TRUE);
          count := count+1;
        END LOOP;
        -- RETURN count;
      END;
      $$;
    = \c rtdb2
    [Repeat the same drop_replication_set, create_replication_set, and DO block on rtdb2.]
    Tables can be removed from a replication set with:
      = SELECT pglogical.replication_set_remove_table('rtrepset', 'trinetdb.<table_name>');
    Tables belonging to a replication set can be listed with:
      = SELECT relname FROM pglogical.tables WHERE set_name='rtrepset' ORDER BY relname;
4.- Create subscriptions on aqms-bk-pp:
    = \c archdb1
    = SELECT pglogical.create_subscription(
        subscription_name := 'rt1subscr',
        provider_dsn := 'host=aqms-bk-rt.ncss-ucb.cisn.org port=5432 user=repadmin password=repadmin_pass_ucb dbname=rtdb1',
        replication_sets := ARRAY['rtrepset'],
        synchronize_structure := 'FALSE',
        synchronize_data := FALSE);
    = SELECT pglogical.create_subscription(
        subscription_name := 'rt2subscr',
        provider_dsn := 'host=aqms-nc-rt.ncss-mp.cisn.org port=5432 user=repadmin password=repadmin_pass_wr dbname=rtdb2',
        replication_sets := ARRAY['rtrepset'],
        synchronize_structure := 'FALSE',
        synchronize_data := FALSE);
    In case of permission errors, you may need to grant repadmin privileges on the schema, e.g.:
      = \c rtdb1
      = GRANT CREATE, USAGE ON SCHEMA pglogical TO repadmin;
    Subscriptions can be dropped with:
      = SELECT pglogical.drop_subscription(subscription_name := 'rt1subscr');
      = SELECT pglogical.drop_subscription(subscription_name := 'rt2subscr');
    To check the replication:
      = \c archdb1
      = SELECT subscription_name,status,provider_node,replication_sets FROM pglogical.show_subscription_status();
       subscription_name |   status    | provider_node | replication_sets
      -------------------+-------------+---------------+------------------
       rt1subscr         | replicating | rtnode1       | {rtrepset}
       rt2subscr         | replicating | rtnode2       | {rtrepset}
    If the status is not 'replicating', run:
      = SELECT pglogical.alter_subscription_synchronize(subscription_name := 'rt1subscr', truncate:=FALSE);
    and/or
      = SELECT pglogical.alter_subscription_synchronize(subscription_name := 'rt2subscr', truncate:=FALSE);
    To temporarily disable the subscriptions:
      = \c archdb1
      = SELECT pglogical.alter_subscription_disable(subscription_name := 'rt1subscr', immediate := TRUE);
      = SELECT pglogical.alter_subscription_disable(subscription_name := 'rt2subscr', immediate := TRUE);
    To re-enable the subscriptions:
      = \c archdb1
      = SELECT pglogical.alter_subscription_enable(subscription_name := 'rt1subscr', immediate := TRUE);
      = SELECT pglogical.alter_subscription_enable(subscription_name := 'rt2subscr', immediate := TRUE);
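A small watchdog can flag subscriptions that have left the 'replicating' state. The sketch below (`check_subscriptions` is a hypothetical helper, not part of AQMS) parses the kind of `subscription_name|status` rows that `psql -At` would emit for the status query above; here it runs against a canned sample instead of a live connection:

```shell
#!/bin/sh
# Flag pglogical subscriptions whose status is not 'replicating'.
# In production the input would come from something like:
#   psql -At -d archdb1 -c "SELECT subscription_name,status
#                           FROM pglogical.show_subscription_status();"
check_subscriptions() {
    awk -F'|' '
        $2 != "replicating" { print $1 " needs attention (" $2 ")"; bad = 1 }
        END { if (!bad) print "all subscriptions replicating" }'
}

# Canned sample standing in for live psql output.
printf '%s\n' 'rt1subscr|replicating' 'rt2subscr|down' | check_subscriptions
```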
- Streaming Replication Between PP Instances:
1.- Create a cluster on the standby machine:
    $ /usr/pgsql-14/bin/postgresql-14-setup initdb
    Initializing database ... OK
2.- Copy an image from the active DB:
    * Stop the server:
      $ sudo systemctl stop postgresql-14
    * Copy an image of the active DB to the local data directory, as user postgres@aqms-nc-pp:
      $ /usr/pgsql-14/bin/pg_basebackup -h aqms-bk-pp.ncss-ucb.cisn.org -R -D /var/lib/pgsql/14/data -U repadmin -v -P
      where:
        '-h ...' specifies the active host.
        '-R'     writes the configuration for replication.
        '-D ...' specifies where the image should be put.
        '-U ...' is the replication user name.
        '-v'     specifies verbose mode.
        '-P'     shows progress information.
    * Start the server:
      $ sudo systemctl start postgresql-14
    * Create the .pgpass file:
      $ vi ~/.pgpass
3.- Replication monitoring:
    * Processes:
      On the master:
        $ ps -eaf | grep sender
      On the slave:
        $ ps -eaf | grep receiver
        $ ps -eaf | grep startup
      All three processes must be running: the WAL sender on the master; the WAL receiver and startup processes on the slave.
    * SELECT * FROM pg_stat_replication; returns useful information on the active DB and nothing on the standby DB.
    * SELECT * FROM pg_stat_wal_receiver; returns useful information on the standby DB but nothing on the active DB.
    * SELECT pg_is_in_recovery(); returns TRUE on the standby DB and FALSE on other nodes.
4.- Replication slots:
    * Create a replication slot as postgres@archdb1 on aqms-nc-pp:
      # SELECT * FROM pg_create_physical_replication_slot('aqmsbkpp');
    * Add "primary_slot_name = 'aqmsbkpp'" in postgresql.auto.conf on aqms-bk-pp.
    * # SELECT slot_name, slot_type, active FROM pg_replication_slots;
       slot_name | slot_type | active
      -----------+-----------+--------
       barman    | physical  | t
       aqmsbkpp  | physical  | t
      (2 rows)
    * Create a replication slot as postgres@archdb1 on aqms-bk-pp:
      # SELECT * FROM pg_create_physical_replication_slot('aqmsncpp');
    * Add "primary_slot_name = 'aqmsncpp'" in postgresql.auto.conf on aqms-nc-pp.
    * # SELECT slot_name, slot_type, active FROM pg_replication_slots;
       slot_name | slot_type | active
      -----------+-----------+--------
       barman    | physical  | t
       aqmsncpp  | physical  | t
      (2 rows)
- REPMGR setup:
REPMGR SETUP ON AQMS-BK-PP/AQMS-NC-PP:
1.- Install the 2ndQuadrant repmgr extension on aqms-bk-pp (done by sys admins).
    * The repmgr extension can only be installed by a superuser. If the repmgr user is a superuser,
      repmgr will create the extension automatically. Alternatively, the extension can be created
      manually by a superuser (with "CREATE EXTENSION repmgr") before executing repmgr primary register.
2.- Create a repmgr database on the primary server (aqms-bk-pp):
    % createdb repmgr -O repadmin
3.- Update the pg_hba.conf file on aqms-bk-pp & aqms-nc-pp (add these before any other custom entries):
    local   replication  repadmin                        trust
    local   repmgr       repadmin                        trust
    host    replication  repadmin   127.0.0.1/32         trust   # Localhost
    host    repmgr       repadmin   127.0.0.1/32         trust   # Localhost
    host    replication  repadmin   169.229.197.34/24    trust   # Public aqms-bk-rt & aqms-bk-pp
    host    repmgr       repadmin   169.229.197.34/24    trust   # Public aqms-bk-rt & aqms-bk-pp
    host    archdb1      repadmin   169.229.197.34/24    trust   # Public aqms-bk-rt & aqms-bk-pp
    host    replication  repadmin   192.168.116.45/24    trust   # Private aqms-bk-rt & aqms-bk-pp
    host    repmgr       repadmin   192.168.116.45/24    trust   # Private aqms-bk-rt & aqms-bk-pp
    host    archdb1      repadmin   192.168.116.45/24    trust   # Private aqms-bk-rt & aqms-bk-pp
    host    replication  repadmin   137.227.232.160/24   trust   # Public aqms-nc-rt & aqms-nc-pp
    host    repmgr       repadmin   137.227.232.160/24   trust   # Public aqms-nc-rt & aqms-nc-pp
    host    archdb1      repadmin   137.227.232.160/24   trust   # Public aqms-nc-rt & aqms-nc-pp
    host    replication  repadmin   192.168.119.114/24   trust   # Private aqms-nc-rt & aqms-nc-pp
    host    repmgr       repadmin   192.168.119.114/24   trust   # Private aqms-nc-rt & aqms-nc-pp
    host    archdb1      repadmin   192.168.119.114/24   trust   # Private aqms-nc-rt & aqms-nc-pp
4.- Update the postgresql.conf file on aqms-bk-pp & aqms-nc-pp:
    wal_log_hints = on
5.- Prepare the standby (aqms-nc-pp):
    * On the standby, check that the primary database is reachable:
      % psql 'host=aqms-bk-pp.ncss-ucb.cisn.org user=repadmin dbname=repmgr connect_timeout=2'
6.- Create the repmgr configuration file on aqms-bk-pp & aqms-nc-pp:
    * On the primary server (aqms-bk-pp):
      % cat /etc/repmgr/14/repmgr.conf
      node_id=1
      node_name=aqmsbkpp
      conninfo='host=aqms-bk-pp.ncss-ucb.cisn.org user=repadmin dbname=repmgr connect_timeout=2 application_name=aqmsbkpp'
      data_directory = '/home/pgsql/data'
      pg_bindir='/usr/pgsql-14/bin/'
      service_start_command = 'sudo /bin/systemctl start postgresql-14'
      service_stop_command = 'sudo /bin/systemctl stop postgresql-14'
      service_restart_command = 'sudo /bin/systemctl restart postgresql-14'
      service_reload_command = 'sudo /bin/systemctl reload postgresql-14'
    * On the standby server (aqms-nc-pp):
      % cat /etc/repmgr/14/repmgr.conf
      node_id=2
      node_name=walreceiver
      conninfo='host=aqms-nc-pp.ncss-mp.cisn.org user=repadmin dbname=repmgr connect_timeout=2 application_name=walreceiver'
      data_directory = '/var/lib/pgsql/14/data'
      pg_bindir='/usr/pgsql-14/bin/'
      service_start_command = 'sudo /bin/systemctl start postgresql-14'
      service_stop_command = 'sudo /bin/systemctl stop postgresql-14'
      service_restart_command = 'sudo /bin/systemctl restart postgresql-14'
      service_reload_command = 'sudo /bin/systemctl reload postgresql-14'
7.- Register the servers:
    * On the primary server (aqms-bk-pp):
      % repmgr -f /etc/repmgr/14/repmgr.conf primary register
      % repmgr -f /etc/repmgr/14/repmgr.conf cluster show
    * On the standby server (aqms-nc-pp):
      % repmgr -f /etc/repmgr/14/repmgr.conf standby register
      % repmgr -f /etc/repmgr/14/repmgr.conf cluster show
8.- Establish passwordless SSH connections between the servers:
    * On each machine run:
      % ssh-keygen
    * Add the contents of ~/.ssh/id_rsa.pub to the other machine's ~/.ssh/authorized_keys file.
- BARMAN setup:
BARMAN setup on aqms-bk-aux - Target DB (archdb1@aqms-bk-pp) - Using continuously streaming backups with WALs:
* Install Barman (done by sys admins).
* Configure Barman: (as barman)
  - Update /etc/barman/barman.conf:
    [barman]
    barman_user = barman
    configuration_files_directory = /etc/barman/conf.d
    barman_home = /var/lib/barman
    log_file = /var/log/barman/barman.log
    log_level = INFO
    retention_policy = RECOVERY WINDOW OF 2 WEEKS
* Configure the server connection: (as barman)
  - Create /etc/barman/conf.d/pg.conf:
    [pg]
    description = "archdb1 Database (Streaming-Only)"
    conninfo = host=aqms-bk-pp.geo.berkeley.edu user=barman dbname=archdb1
    streaming_conninfo = host=aqms-bk-pp.geo.berkeley.edu user=barman
    backup_method = postgres
    streaming_archiver = on
    slot_name = barman
* Set up the storage directory: (as barman)
    $ ls -l /var/lib/barman
    total 0
    lrwxrwxrwx 1 barman barman 28 Feb 10 08:42 pg -> /data/archdb1-dump/barman/pg
* Add 'export PATH=$PATH:/usr/pgsql-14/bin/' to ~barman/.bash_profile (as barman).
* Create the barman user for PostgreSQL: (as postgres)
    $ createuser -s -P barman
* Test the psql connection: (as barman)
  - Create ~barman/.pgpass:
    aqms-bk-pp.geo.berkeley.edu:5432:*:barman:barman_password
  - Test the connection:
    $ psql -c 'SELECT version()' -U barman -h aqms-bk-pp.geo.berkeley.edu archdb1
* Configure PostgreSQL: (as postgres)
  - Update pg_hba.conf to include:
    # barman
    host    archdb1      barman   169.229.197.88/32   md5   # Public aqms-bk-aux
    host    replication  barman   169.229.197.88/32   md5   # Public aqms-bk-aux
* Test the replication connection: (as barman)
    $ psql -U barman -h aqms-bk-pp.geo.berkeley.edu -c 'IDENTIFY_SYSTEM' replication=1
* Set up SSH for restores: (as barman)
  - Run:
    $ ssh-keygen -t rsa
  - Save the key to ~/.ssh/pg.
  - Copy the value of the public key in ~/.ssh/pg.pub.
  - On the PostgreSQL server, open or create ~postgres/.ssh/authorized_keys and add the public key.
* Create the replication slot: (as barman)
    $ barman receive-wal --create-slot pg
* Run maintenance tasks: (as barman)
    $ barman cron
    Starting WAL archiving for server pg
    Starting streaming archiver for server pg
* Verify the WAL archiving configuration:
    $ barman switch-xlog --force --archive pg
* Check the server configuration: (as barman)
    $ barman check pg
    Server pg:
        PostgreSQL: OK
        superuser or standard user with backup privileges: OK
        PostgreSQL streaming: OK
        wal_level: OK
        replication slot: OK
        directories: OK
        retention policy settings: OK
        backup maximum age: OK (no last_backup_maximum_age provided)
        backup minimum size: OK (0 B)
        wal maximum age: OK (no last_wal_maximum_age provided)
        wal size: OK (0 B)
        compression settings: OK
        failed backups: OK (there are 0 failed backups)
        minimum redundancy requirements: OK (have 0 backups, expected at least 0)
        pg_basebackup: OK
        pg_basebackup compatible: OK
        pg_basebackup supports tablespaces mapping: OK
        systemid coherence: OK (no system Id stored on disk)
        pg_receivexlog: OK
        pg_receivexlog compatible: OK
        receive-wal running: OK
        archiver errors: OK
* Perform a full backup: (as barman)
    $ barman backup pg
    Starting backup using postgres method for server pg in /var/lib/barman/pg/base/20230210T103815
    Backup start at LSN: 0/5000060 (000000010000000000000005, 00000060)
    Starting backup copy via pg_basebackup for 20230210T103815
    Copy done (time: 1 second)
    Finalising the backup.
    This is the first backup for server pg
    WAL segments preceding the current backup have been found:
        000000010000000000000004 from server pg has been removed
    Backup size: 51.1 MiB
    Backup end at LSN: 0/7000000 (000000010000000000000006, 00000000)
    Backup completed (start time: 2023-02-10 10:38:15.474479, elapsed time: 1 second)
    Processing xlog segments from streaming for pg
        000000010000000000000005
    WARNING: IMPORTANT: this backup is classified as WAITING_FOR_WALS, meaning
        that Barman has not yet received all the WAL files required for backup
        consistency. This is a common behaviour in concurrent backup scenarios,
        and Barman automatically sets the backup as DONE once all the required
        WAL files have been archived.
        Hint: execute the backup command with '--wait'
* List available backups: (as barman)
    $ barman list-backup pg
    pg 20230210T103815 - Fri Feb 10 10:38:16 2023 - Size: 67.1 MiB - WAL Size: 0 B
* Delete a backup: (as barman)
    $ barman delete pg 20230210T103815
    Deleting backup 20230210T103815 for server pg
    Delete associated WAL segments:
    Deleted backup 20230210T103815 (start time: Fri Feb 10 10:40:05 2023, elapsed time: less than one second)
* Check the status of the PostgreSQL server: (as barman)
    $ barman status pg
    Server pg:
        Description: archdb1 Database (Streaming-Only)
        Active: True
        Disabled: False
        PostgreSQL version: 14.6
        Cluster state: in production
        pgespresso extension: Not available
        Current data size: 51.1 MiB
        PostgreSQL Data directory: /home/pgsql/data
        Current WAL segment: 000000010000000000000007
        Passive node: False
        Retention policies: enforced (mode: auto, retention: RECOVERY WINDOW OF 2 WEEKS, WAL retention: MAIN)
        No. of available backups: 0
        First available backup: None
        Last available backup: None
        Minimum redundancy requirements: satisfied (0/0)
* Check the streaming status: (as barman)
    $ barman replication-status pg
    Status of streaming clients for server 'pg':
    Current LSN on master: 0/70001B0
    Number of streaming clients: 1
    1. Async WAL streamer
       Application name: barman_receive_wal
       Sync stage      : 3/3 Remote write
       Communication   : TCP/IP
       IP Address      : 169.229.197.35 / Port: 48662 / Host: -
       User name       : barman
       Current state   : streaming (async)
       Replication slot: barman
       WAL sender PID  : 698902
       Started at      : 2023-02-10 10:22:31.835987-08:00
       Sent LSN   : 0/70001B0 (diff: 0 B)
       Write LSN  : 0/70001B0 (diff: 0 B)
       Flush LSN  : 0/7000000 (diff: -432 B)
* Set up cron jobs: (as barman)
    */10 * * * * export PATH=$PATH:/usr/pgsql-14/bin/;/usr/bin/barman cron >& /dev/null
    0 0 * * * export PATH=$PATH:/usr/pgsql-14/bin/;/usr/bin/barman backup pg --wait
* Recovery:
  - Stop the PostgreSQL server.
  - Get a list of possible backup spots:
    $ barman list-backup pg
  - Get details for the backup you choose (note that <backup_id> is the timestamp of the backup):
    $ barman show-backup pg <backup_id>
  - Run:
    $ barman recover --remote-ssh-command "ssh -i ~/.ssh/pg -o StrictHostKeyChecking=no postgres@aqms-bk-pp" --target-time="<YYYY-MM-DD HH:MM:SS.SSSSSS+00:00>" pg <backup_id> /var/lib/pgsql/14/data
  - Restart the PostgreSQL server.
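Because Barman backup IDs are plain timestamps (YYYYMMDDTHHMMSS), the newest backup can be picked out of `barman list-backup` output with a lexical sort. A sketch (`latest_backup_id` is a hypothetical helper, not a Barman command), run here against canned sample lines in the format shown above:

```shell
#!/bin/sh
# Print the most recent backup_id from 'barman list-backup <server>' output.
# Backup IDs sort lexically in chronological order.
latest_backup_id() {
    awk '{ print $2 }' | sort -r | head -n 1
}

# Canned sample; in production: barman list-backup pg | latest_backup_id
printf '%s\n' \
  'pg 20230210T103815 - Fri Feb 10 10:38:16 2023 - Size: 67.1 MiB - WAL Size: 0 B' \
  'pg 20230211T000003 - Sat Feb 11 00:00:04 2023 - Size: 67.3 MiB - WAL Size: 2 B' |
  latest_backup_id
```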
- Replication Monitoring Setup:
* Provide replication statistics access to repadmin user:
$ psql -U postgres rtdb2
rtdb2=# CREATE FUNCTION func_stat_replication() RETURNS SETOF pg_stat_replication AS
rtdb2-# $$ select * from pg_stat_replication; $$
rtdb2-# LANGUAGE sql SECURITY DEFINER;
CREATE FUNCTION
rtdb2=# REVOKE EXECUTE ON FUNCTION func_stat_replication() FROM public;
REVOKE
rtdb2=# GRANT EXECUTE ON FUNCTION func_stat_replication() TO repadmin;
GRANT
rtdb2=# \q

$ psql -U repadmin rtdb2
psql (14.7)
Type "help" for help.

rtdb2=> SELECT * FROM func_stat_replication();
   pid   | usesysid | usename  | application_name |  client_addr   | client_hostname | client_port |         backend_start         | backend_xmin |   state   |  sent_lsn  | write_lsn  | flush_lsn  | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state |          reply_time
---------+----------+----------+------------------+----------------+-----------------+-------------+-------------------------------+--------------+-----------+------------+------------+------------+------------+-----------+-----------+------------+---------------+------------+-------------------------------
 1167789 |    17852 | repadmin | rt2subscr        | 192.168.116.43 |                 |       41830 | 2023-04-03 13:35:32.025688-04 |              | streaming | 3/25476660 | 3/25476660 | 3/25476660 | 3/25476660 |           |           |            |             0 | async      | 2023-04-03 14:08:35.289162-04
(1 row)

rtdb2=> \q
* Streaming replication delay on standby PP DB (in seconds):
SELECT CASE
         WHEN pg_last_wal_receive_lsn() = pg_last_wal_replay_lsn() THEN 0
         ELSE EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp())
       END AS log_delay;
* Logical replication delay on RT DB (in bytes):
SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)::numeric
FROM pg_stat_replication;
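Both queries reduce to a single number, which makes them easy to wrap in a threshold check for monitoring. A sketch (`alert_if_lagging` is a hypothetical helper; the commented psql call shows how the value would be fetched in production):

```shell
#!/bin/sh
# Compare a replication delay against a threshold and print OK/WARNING.
# e.g. delay=$(psql -At -U repadmin archdb1 -c "<one of the delay queries above>")
alert_if_lagging() {
    delay=$1; threshold=$2; label=$3
    # awk handles the floating-point comparison portably
    if awk -v d="$delay" -v t="$threshold" 'BEGIN { exit !(d > t) }'; then
        echo "WARNING: $label delay $delay exceeds $threshold"
    else
        echo "OK: $label delay $delay"
    fi
}

alert_if_lagging 2.5 60 "streaming (s)"
alert_if_lagging 150000 65536 "logical (bytes)"
```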
- RT Role Setup:
* Settings in ~/data/pg_hba.conf on aqms-bk-rt:
host    rtdb1   rtem   169.229.197.34/32    md5   # Public aqms-bk-rt
host    rtdb1   rtem   192.168.116.45/32    md5   # Private aqms-bk-rt
host    rtdb1   rtem   137.227.232.160/32   md5   # Public aqms-nc-rt
host    rtdb1   rtem   192.168.119.114/32   md5   # Private aqms-nc-rt
* Settings in ~/data/pg_hba.conf on aqms-nc-rt:
host    rtdb2   rtem   169.229.197.34/32    md5   # Public aqms-bk-rt
host    rtdb2   rtem   192.168.116.45/32    md5   # Private aqms-bk-rt
host    rtdb2   rtem   137.227.232.160/32   md5   # Public aqms-nc-rt
host    rtdb2   rtem   192.168.119.114/32   md5   # Private aqms-nc-rt
- Memory & Log Configuration: PGTune
* Settings in ~/data/postgresql.auto.conf on aqms-bk-rt: [128GB - 64 processors]
max_connections = 40
shared_buffers = 32GB
effective_cache_size = 96GB
maintenance_work_mem = 2GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 500
random_page_cost = 1.1
effective_io_concurrency = 300
work_mem = 13107kB
huge_pages = try
min_wal_size = 4GB
max_wal_size = 16GB
max_worker_processes = 64
max_parallel_workers_per_gather = 32
max_parallel_workers = 64
max_parallel_maintenance_workers = 4
log_connections = ON
log_line_prefix = '%m user=%u db=%d pid=%p '
log_filename = 'postgresql-%Y-%m-%d.log'
* Settings in ~/data/postgresql.auto.conf on aqms-bk-pp: [128GB - 64 processors]
max_connections = 40
shared_buffers = 32GB
effective_cache_size = 96GB
maintenance_work_mem = 2GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 500
random_page_cost = 1.1
effective_io_concurrency = 300
work_mem = 41943kB
huge_pages = try
min_wal_size = 4GB
max_wal_size = 16GB
max_worker_processes = 20
max_parallel_workers_per_gather = 10
max_parallel_workers = 20
max_parallel_maintenance_workers = 4
log_connections = ON
log_line_prefix = '%m user=%u db=%d pid=%p '
log_filename = 'postgresql-%Y-%m-%d.log'
* Settings in ~/data/postgresql.auto.conf on aqms-nc-rt: [64GB - 20 processors]
max_connections = 40
shared_buffers = 16GB
effective_cache_size = 48GB
maintenance_work_mem = 2GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 500
random_page_cost = 1.1
effective_io_concurrency = 300
work_mem = 20971kB
huge_pages = try
min_wal_size = 4GB
max_wal_size = 16GB
max_worker_processes = 20
max_parallel_workers_per_gather = 10
max_parallel_workers = 20
max_parallel_maintenance_workers = 4
log_connections = ON
log_line_prefix = '%m user=%u db=%d pid=%p '
log_filename = 'postgresql-%Y-%m-%d.log'
* Settings in ~/data/postgresql.auto.conf on aqms-nc-pp: [128GB - 20 processors]
max_connections = 40
shared_buffers = 32GB
effective_cache_size = 96GB
maintenance_work_mem = 2GB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 500
random_page_cost = 1.1
effective_io_concurrency = 300
work_mem = 41943kB
huge_pages = try
min_wal_size = 4GB
max_wal_size = 16GB
max_worker_processes = 20
max_parallel_workers_per_gather = 10
max_parallel_workers = 20
max_parallel_maintenance_workers = 4
log_connections = ON
log_line_prefix = '%m user=%u db=%d pid=%p '
log_filename = 'postgresql-%Y-%m-%d.log'
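The memory figures above follow the usual PGTune rules of thumb (shared_buffers ~ RAM/4, effective_cache_size ~ 3*RAM/4). A minimal sketch of that arithmetic, assuming whole-GB inputs (not the PGTune tool itself; `tune_memory` is a hypothetical helper):

```shell
#!/bin/sh
# PGTune-style rules of thumb for the two memory settings that scale
# directly with total RAM.  Input: RAM in GB.
tune_memory() {
    ram_gb=$1
    echo "shared_buffers = $((ram_gb / 4))GB"
    echo "effective_cache_size = $((ram_gb * 3 / 4))GB"
}

tune_memory 128   # the 128GB hosts above: 32GB / 96GB
tune_memory 64    # aqms-nc-rt (64GB): 16GB / 48GB
```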
- Enabling Database Encryption:
* Settings in ~/data/postgresql.conf:
ssl = on
ssl_cert_file = 'server.crt'
ssl_key_file = 'server.key'
* Settings in ~/data/pg_hba.conf:
hostssl archdb1 tpp 169.229.197.88/32 md5 # Public aqms-bk-aux
* Certificate creation:
% cd $PGDATA
% openssl req -new -x509 -days 3650 -nodes -text -out server.crt -keyout server.key -subj "/CN=aqms-bk-pp.ncss-ucb.cisn.org"
% chmod og-rwx server.key
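Before restarting PostgreSQL, the same openssl commands can be rehearsed in a scratch directory. This sketch generates a throwaway key pair (in a temp dir, not $PGDATA) and prints the certificate subject so the CN can be checked against the host name clients will use:

```shell
#!/bin/sh
# Rehearse certificate creation in a scratch directory and inspect the CN.
tmp=$(mktemp -d)
openssl req -new -x509 -days 3650 -nodes \
    -out "$tmp/server.crt" -keyout "$tmp/server.key" \
    -subj "/CN=aqms-bk-pp.ncss-ucb.cisn.org" 2>/dev/null
chmod og-rwx "$tmp/server.key"

# The CN must match the host name used in client conninfo strings.
openssl x509 -in "$tmp/server.crt" -noout -subject
rm -rf "$tmp"
```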
NC AQMS Production System
- Topology:
| PUBLIC | NCSS | CISN |
|---|---|---|
169.229.197.34/aqms-bk-rt.geo.berkeley.edu | 192.168.116.45/aqms-bk-rt.ncss-ucb.cisn.org | 192.168.200.45/aqms-bk-rt.ucb.cisn.org |
169.229.197.35/aqms-bk-pp.geo.berkeley.edu | 192.168.116.43/aqms-bk-pp.ncss-ucb.cisn.org | 192.168.200.43/aqms-bk-pp.ucb.cisn.org |
137.227.232.160/aqms-nc-rt.wr.usgs.gov | 192.168.119.114/aqms-nc-rt.ncss-mp.cisn.org | 192.168.201.114/aqms-nc-rt.mp.cisn.org |
137.227.232.161/aqms-nc-pp.wr.usgs.gov | 192.168.119.115/aqms-nc-pp.ncss-mp.cisn.org | 192.168.201.115/aqms-nc-pp.mp.cisn.org |
169.229.197.88/aqms-bk-aux.geo.berkeley.edu | 192.168.116.41/aqms-bk-aux.ncss-ucb.cisn.org | 192.168.200.41/aqms-bk-aux.ucb.cisn.org |
137.227.232.162/aqms-nc-aux.wr.usgs.gov | 192.168.119.116/aqms-nc-aux.ncss-mp.cisn.org | 192.168.201.116/aqms-nc-aux.mp.cisn.org |
* All tables are replicated between aqms-bk-pp (archdb1) and aqms-nc-pp (archdb1; the same DB name is used on both PP instances).
- A subset of tables (from the PI, WF & AP schemas) is replicated from aqms-bk-rt (rtdb1) and aqms-nc-rt (rtdb2) to aqms-bk-pp or aqms-nc-pp:
- amp
- ampset
- arrival
- assocamm
- assocamo
- assocaro
- assoccom
- assoccoo
- assocevampset
- assocnte
- assocwae
- coda
- credit
- event
- eventprefmag
- eventprefmec
- eventprefor
- filename
- mec
- mecchannel
- mecdata
- mecfreq
- mecfreqdata
- mecobject
- netmag
- nettrig
- origin
- origin_error
- pathname (replaced by waveroots)
- pdl_product
- pdl_ids
- remark
- request_card
- subdir
- swarm_events
- swarm_state
- trig_channel
- unassocamp
- wavefileroots (Not used in NC)
- waveform
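To verify that the pglogical replication set on an RT database actually covers this list, the configured tables can be diffed against it. A self-contained sketch; the configured set here is a stand-in (on a live system it would come from a query such as `SELECT set_reloid FROM pglogical.replication_set_table;`):

```python
# Consistency check for the replicated-table list above.
# "expected" is copied from this document; "configured" is a stand-in for
# what pglogical would report on a live RT database.
expected = {
    "amp", "ampset", "arrival", "assocamm", "assocamo", "assocaro",
    "assoccom", "assoccoo", "assocevampset", "assocnte", "assocwae",
    "coda", "credit", "event", "eventprefmag", "eventprefmec",
    "eventprefor", "filename", "mec", "mecchannel", "mecdata",
    "mecfreq", "mecfreqdata", "mecobject", "netmag", "nettrig",
    "origin", "origin_error", "pathname", "pdl_product", "pdl_ids",
    "remark", "request_card", "subdir", "swarm_events", "swarm_state",
    "trig_channel", "unassocamp", "wavefileroots", "waveform",
}
configured = expected - {"swarm_state"}  # pretend one table was missed
missing = sorted(expected - configured)
print(missing)  # ['swarm_state']
```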
- Non-replicated tables:
- PI Schema:
- ampsettypes (loaded via DDL script)
- assoctypecat (loaded via DDL script)
- eventcategory (loaded via DDL script)
- eventtype (loaded via DDL script)
- WF Schema:
- waveroots (Populated manually)
- AP Schema:
- alarm_action
- appchannels (updated via StationUI & AppChan.sql)
- applications (updated via StationUI & AppChan.sql)
- assocevents (used by pdl2aqms)
- assoc_region_group (loaded via DDL script - Not used in NC)
- autoposter (Populated manually)
- assocwfrc (used only in NC)
- channelmap_ampparms (updated via codaparamsDB)
- channelmap_codaparms (updated via codaparamsDB)
- config_channel (updated via StationUI & AppChan.sql)
- credit_alias (Populated manually)
- epochtimebase (loaded via DDL script)
- eventmatchpassingscore (Populated manually)
- eventpriorityparams (Populated manually)
- gazetteer_region (loaded via regionDB)
- gazetteer_region_group (loaded via DDL script)
- gazetteerbigtown (loaded via placesDB)
- gazetteerpt (loaded via placesDB & quarryDB)
- gazetteerquake (loaded via placesDB)
- gazetteertown (loaded via placesDB)
- gazetteertype (Populated manually)
- gazetteerline (Not used in NC)
- gazetteerquarry (loaded via quarryDB)
- jasieventlock (used by Jiggle)
- leap_seconds (loaded via DDL script)
- logger_info (used only in NC - updated via getLoggerInfo.pl)
- magprefpriority (loaded via magprefDB)
- messagetext (loaded via DDL script)
- orgprefpriority (Populated manually)
- pcs_signal
- pcs_state
- pcs_transition
- peer_system_status
- program (updated via StationUI & AppChan.sql)
- realarm (used only in NC)
- rt_role
- stacorrections (updated via codaparamsDB (Md) & loadCISNmagcorr (Ml))
- subnet (used only in SC to populate EW carlsubtrig file)
- subnet_channel (used only in SC to populate EW carlsubtrig file)
- system_status
- IR Schema: (updated via stnxml2aqms)
- channel_comment
- channel_data
- coefficients
- d_abbreviation
- d_comment
- d_format
- d_format_data
- d_unit
- dc
- dc_data
- decimation
- dm
- pn
- pn_data
- poles_zeros
- polynomial
- pz
- pz_data
- sensitivity
- simple_response
- station_comment
- station_data
- HT Schema: (not used)
- datalogger
- datalogger_board
- datalogger_module
- filamp
- filamp_pchannel
- filter
- filter_fir
- filter_fir_data
- filter_sequence
- filter_sequence_data
- response
- response_hp
- response_lp
- response_pn
- response_pn_data
- response_pz
- sensor
- sensor_component
- station
- station_datalogger
- station_datalogger_lchannel
- station_datalogger_pchannel
- station_digitizer
- station_digitizer_pchannel
- station_filamp
- station_filamp_pchannel
- station_sensor
- station_sensor_component
- CS schema: (loaded via DDL script except for c_channeldata)
- c_channeldata (updated via stnxml2aqms)
- c_auth
- c_datum
- c_fltrtyp
- c_lgrtyp
- c_loggerdata
- c_net
- c_ornt
- c_orntabbrv
- c_phystyp
- c_rectyp
- c_sensordata
- c_site
- c_snstyp
- c_timeclk
- c_units
- Views:
- active_channels_view (updated via AppChan.sql)
- all_channels_view (updated via AppChan.sql)
- jasi_ampparms_view
- jasi_channel_view
- jasi_config_view
- jasi_response_view
- jasi_station_view
- md_magparms_view
- ml_magparms_view
- Database Sequences:
Values ending in: | are generated on: |
---|---|
1,6 | RTDB1 |
2,7 | RTDB2 |
3,8 | ARCHDB1 |
- PI Schema:
- evseq ⇒ Event.evid
- orseq ⇒ Origin.orid
- magseq ⇒ Netmag.magid
- arseq ⇒ Arrival.arid
- ampseq ⇒ Amp.ampid
- commseq ⇒ Remark.commid
- mecseq ⇒ Mec.mecid
- mecdataseq ⇒ MecData.mecdataid
- mecfreqseq ⇒ MecFreq.mecfreqid
- coseq ⇒ Coda.coid
- ampsetseq ⇒ AmpSet.ampsetid
- catseq ⇒ EventCategory.catid
- WF Schema:
- waseq ⇒ Waveform.wfid
- fiseq ⇒ Filename.fileid
- sdseq ⇒ SubDir.subdirid
- AP Schema:
- unassocseq ⇒ UnAssocAmp.ampid
- sigseq ⇒ PCS_Signal.sigid
- reqseq ⇒ Request_Card.rcid
- dbsetseq ⇒ PDL_Ids.dbsetid
- ntseq ⇒ NetTrig.ntid
- trigseq ⇒ Trig_Channel.trigid
- gazseq ⇒ GazetteerPt.gazid (Not used in NC)
- IR Schema:
- dcseq ⇒ DC.key
- dmseq ⇒ DM.key
- abbseq ⇒ D_Abbreviation.id
- uniseq ⇒ D_Unit.id
- comseq ⇒ D_Comment.id
- forseq ⇒ D_Format.id
- poseq ⇒ PN.key
- pzseq ⇒ PZ.key
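The interval scheme in the table above is equivalent to each database drawing from sequences with INCREMENT BY 5 and a distinct start, so the three id streams can never collide. A small sketch (the CREATE SEQUENCE parameters are inferred from the table, not copied from the DDL):

```python
# Each database's sequences step by 5 from a distinct start, so RTDB1
# yields values ending in 1/6, RTDB2 in 2/7, ARCHDB1 in 3/8, and the
# streams are disjoint by construction.
import itertools

def seq(start, increment=5):
    v = start
    while True:
        yield v
        v += increment

rtdb1   = list(itertools.islice(seq(1), 8))  # 1, 6, 11, 16, ...
rtdb2   = list(itertools.islice(seq(2), 8))  # 2, 7, 12, 17, ...
archdb1 = list(itertools.islice(seq(3), 8))  # 3, 8, 13, 18, ...

print(sorted({v % 10 for v in rtdb1}))    # [1, 6]
print(sorted({v % 10 for v in archdb1}))  # [3, 8]
```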
- Database Users:
User | Description |
---|---|
browser | Guest user (read only) |
code | Stored procedures owner |
dcarchiver | Waveform archiver user |
operator | Jiggle user |
postgres | Super user |
repadmin | Replication user |
rtem | Real time user |
tpp | Post processing user |
trinetdb | Schema owner |
trinetdb_execute | Role (execute stored procedures) |
trinetdb_read | Role (read access) |
trinetdb_write | Role (write access) |
waveform_read | Role (read waveforms via DB (jiggle)) |
- PL/Java setup: GitLab
- Install PL/Java:
* As aqms (note: use the GitLab raw URL, not the /blob/ page URL, so wget fetches the jar itself):
$ cd /home/aqms/PL_Java
$ wget https://gitlab.com/aqms-swg/aqms-jiggle/-/raw/master/PL_Java/Redhat%208/pljava-pg14.2-amd64-Linux-gpp.jar
$ wget https://gitlab.com/api/v4/projects/16872152/packages/generic/jiggle/1.0.305/jiggle_p8g-d71520d.jar
$ mv jiggle_p8g-d71520d.jar jiggle.jar
$ export PATH="/usr/pgsql-14/bin:$PATH"
* As root:
$ java -jar /home/aqms/PL_Java/pljava-pg14.2-amd64-Linux-gpp.jar
- Install PL/Java PostgreSQL Extension:
* As postgres:
$ psql -U postgres rtdb1
= ALTER DATABASE rtdb1 SET pljava.libjvm_location TO '/usr/lib/jvm/jre/lib/amd64/server/libjvm.so';
= \q
$ psql -U postgres rtdb1
= create schema if not exists db_cleanup authorization code;
= grant usage on schema db_cleanup to trinetdb_read,trinetdb_execute;
= create schema if not exists eventpriority authorization code;
= grant usage on schema eventpriority to trinetdb_read,trinetdb_execute;
= create schema if not exists file_util authorization code;
= grant usage on schema file_util to trinetdb_read,trinetdb_execute;
= create schema if not exists formats authorization code;
= grant usage on schema formats to trinetdb_read,trinetdb_execute;
= create schema if not exists wavearc authorization code;
= grant usage on schema wavearc to trinetdb_read,trinetdb_execute;
= create extension pljava;
= select sqlj.install_jar('file:/home/aqms/PL_Java/jiggle.jar','jigglejar',true);
= select sqlj.set_classpath('assocamp', 'jigglejar');
= select sqlj.set_classpath('db_cleanup', 'jigglejar');
= select sqlj.set_classpath('eventpriority', 'jigglejar');
= select sqlj.set_classpath('file_util', 'jigglejar');
= select sqlj.set_classpath('formats', 'jigglejar');
= select sqlj.set_classpath('wavearc', 'jigglejar');
= select sqlj.set_classpath('wheres', 'jigglejar');
= grant usage on schema sqlj to trinetdb_read,trinetdb_execute;
- Test Installation:
$ psql -U rtem rtdb1
= select wheres.where_from_type(35.7, -117.7, 1., 1, 'bigtown');
- Upgrading to a new version of Jiggle.jar:
$ psql -U postgres rtdb1
= select sqlj.replace_jar('file:/home/postgres/jiggle_name.jar', 'jigglejar', true);
- StationUI setup: GitLab
- Clone the repo:
% git clone https://gitlab.com/aqms-swg/stationui.git
- Create a virtual environment:
% cd stationui/
% python3 -m venv /home/aqms/StationUI/venv
- Activate the environment:
% source /home/aqms/StationUI/venv/bin/activate
- Edit and install the requirements:
% vi requirements.txt
% python3 -m pip install -r requirements.txt
- Make sure that the Applications table contains the 'description' field:
The applications table is one of the two tables that StationUI updates. If the description column doesn't already exist, add it to this table with a data type of varchar(1024):
% psql -d archdb1 -U trinetdb -h aqms-bk-pp
= ALTER TABLE Applications ADD description VARCHAR(1024);
- Configure Django:
% vi django/datacenter/settings.py
(venv) [aqms@aqms-bk-aux datacenter]$ diff TEMPLATE-settings.py settings.py
40c40
< ALLOWED_HOSTS = []
---
> ALLOWED_HOSTS = ['aqms-bk-aux.geo.berkeley.edu','aqms-bk-pp.geo.berkeley.edu','moment.geo.berkeley.edu']
42c42,43
< ADMINS=[('John', 'john@example.com'), ('Mary','mary@example.com')]
---
> ADMINS=[('Stephane', 'stephane@berkeley.edu')]
> STATIONUI_ADMINS=[('Stephane', 'stephane@berkeley.edu')]
91c92
< # Oracle
---
> # PostgreSQL
93,98c94,99
< 'ENGINE': 'django.db.backends.oracle',
< 'NAME': 'db service name',
< 'USER': 'db user',
< 'PASSWORD': 'db password',
< 'HOST': 'db hostname',
< 'PORT': 'db port',
---
> 'ENGINE': 'django.db.backends.postgresql',
> 'NAME': 'archdb1',
> 'USER': 'tpp',
> 'PASSWORD': '****',
> 'HOST': 'aqms-bk-pp.geo.berkeley.edu',
> 'PORT': '5432',
197c198
< ORG = 'SCSN'
---
> ORG = 'NCSS'
203c204
< LOCAL_NETWORKS = ['CI','ZY']
---
> LOCAL_NETWORKS = ['BK','BP','NC']
206c207
< #CHANNEL_LIST_SOURCE = [('db','from AQMS database')]
---
> CHANNEL_LIST_SOURCE = [('db','from AQMS database')]
208c209
< CHANNEL_LIST_SOURCE = [('cwb','from CWB'),('db','from AQMS database')]
---
> #CHANNEL_LIST_SOURCE = [('cwb','from CWB'),('db','from AQMS database')]
(venv) [aqms@aqms-bk-aux datacenter]$
% mkdir /tmp/logs
- Create and populate StationUI logging tables:
% cd django/stationui/
% cp user_action_tables_POSTGRES.sql user_action_tables_POSTGRES.sql.ORIG
% vi user_action_tables_POSTGRES.sql
(venv) [aqms@aqms-bk-aux stationui]$ diff user_action_tables_POSTGRES.sql.ORIG user_action_tables_POSTGRES.sql
1d0
< # postgres
31a31,48
>
> create table channel_rules (
>   progid integer NOT NULL,
>   name varchar(32),
>   copy_from varchar(128), -- will store list of progids
>   channel varchar(3), -- same as channel_data
>   location varchar(2), -- same as channel_data
>   config varchar(64), -- same as appchannels
>   created timestamp with time zone not null default now(),
>   last_mofified timestamp with time zone not null default now(),
>   CONSTRAINT unique_rule UNIQUE (progid, name, copy_from, channel, location, config)
> );
>
> GRANT ALL on stationui_action_types to tpp;
> GRANT ALL on stationui_action_types_id_seq to tpp;
> GRANT ALL on stationui_actions to tpp;
> GRANT ALL on stationui_actions_id_seq to tpp;
> GRANT ALL on channel_rules to tpp;
(venv) [aqms@aqms-bk-aux stationui]$
% psql -d archdb1 -U trinetdb -h aqms-bk-pp
= \i user_action_tables_POSTGRES.sql
% psql -U postgres -d archdb1 -h aqms-bk-pp
= GRANT ALL ON SCHEMA pglogical TO tpp;
- Create admin user:
% cd ..
% python3 manage.py migrate auth
% python3 manage.py migrate
% python3 manage.py createsuperuser --username=<username> --email=<email>
- Configure default rules:
% vi stationui/defaultrules.py
- Start the server:
source /home/aqms/StationUI/venv/bin/activate
python3 manage.py runserver aqms-bk-aux.geo.berkeley.edu:8000 >& stationui.log &
- Metadata setup:
- Install Anaconda:
% wget https://repo.anaconda.com/archive/Anaconda3-2022.10-Linux-x86_64.sh
% sh Anaconda3-2022.10-Linux-x86_64.sh
- Install AQMS-PDL & AQMS-MKV0:
% pip uninstall aqms-pdl
% pip uninstall aqms-mkv0
% git clone https://gitlab.com/aqms-swg/aqms-pdl.git
% cd aqms-pdl/
% pip install .
% cd ..
% git clone https://gitlab.com/aqms-swg/aqms-mkv0.git
% cd aqms-mkv0/
% pip install .
- Create configuration file:
% cat config.yml
DB_HOST: xxxx
DB_NAME: xxxx
DB_PORT: 5432
DB_USER: xxxx
DB_PASSWORD: xxxx
LOG_DIR: './log'
LOG_LEVEL: 'INFO'
APPLICATION_NAME: PWS-NEW
mkv0:
  # MinACC: acc. threshold (percent g) for making files for station's SNCLs
  MinACC: 0.1
  # Velocity model file for travel-time calculations
  vmodelFile: ./allcal_model.d
  # PreEventTime: seconds before estimated P arrival to start waveform
  PreEventTime: 30
  # SecsPerMag: waveform duration should be this many seconds per magnitude
  SecsPerMag: 30
  # minComplete: the minimum level of waveform completeness (0 to 1) required
  # to create V0 files.
  minComplete: 0.95
%
- Create velocity model file:
% cat allcal_model.d
# Compromise Crustal model for Northern & Southern California
#
# depth velocity
lay 0.0 4.8
lay 4.5 6.1
lay 15.5 6.8
lay 25.0 7.3
lay 32.0 7.8
%
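The model file is a list of "lay <top-depth-km> <velocity-km/s>" lines. A minimal parser and piecewise-constant velocity lookup, as a sketch (the helper names are illustrative, and the layer interpretation is an assumption based on the file's comment header):

```python
# Parse "lay <depth> <velocity>" lines, skipping comments/blank lines,
# then look up the velocity of the layer containing a given depth
# (piecewise-constant: each "lay" line gives the layer's top depth).
def parse_model(text):
    layers = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        _, depth, vel = line.split()
        layers.append((float(depth), float(vel)))
    return layers

def velocity_at(layers, depth_km):
    v = layers[0][1]
    for top, vel in layers:
        if depth_km >= top:
            v = vel
    return v

model = parse_model("""
# depth velocity
lay 0.0 4.8
lay 4.5 6.1
lay 15.5 6.8
lay 25.0 7.3
lay 32.0 7.8
""")
print(velocity_at(model, 10.0))  # 6.1 (layer from 4.5 to 15.5 km)
```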
- Import Cosmos parameters from extStaXML file:
% wget https://files.anss-sis.scsn.org/production/Extended-XML/BK/BK_BRAD_ext.xml % stnxml2aqms -n BK -s BRAD -d BK_BRAD_ext.xml
- Import IR parameters from FDSNStaXML file:
% wget https://files.anss-sis.scsn.org/production/FDSNStationXML1.1/BK/BK_POTR.xml % stnxml2aqms -n BK -s POTR -i -x --sr BK_POTR.xml
- Generate FDSNStaXML file for a given site:
% aqms2stnxml -r -n BK -s POTR BK.POTR.isti.xml
- Generate V0 files for an event:
% mkv0 --logconsole 60425711
- Update pg_hba.conf on aqms-bk-pp & aqms-nc-pp to include:
host archdb1 tpp 169.229.197.35/24 md5 # Public aqms-bk-pp & aqms-bk-rt
host archdb1 tpp 192.168.116.43/24 md5 # Private aqms-bk-pp & aqms-bk-rt
host archdb1 tpp 137.227.232.161/24 md5 # Public aqms-nc-pp & aqms-nc-rt
host archdb1 tpp 192.168.119.115/24 md5 # Private aqms-nc-pp & aqms-nc-rt
- Update pg_hba.conf on aqms-bk-rt to include:
host rtdb1 rtem 169.229.197.35/32 md5 # Public aqms-bk-pp
host rtdb1 rtem 192.168.116.43/32 md5 # Private aqms-bk-pp
host rtdb1 rtem 137.227.232.161/32 md5 # Public aqms-nc-pp
host rtdb1 rtem 192.168.119.115/32 md5 # Private aqms-nc-pp
- Update pg_hba.conf on aqms-nc-rt to include:
host rtdb2 rtem 169.229.197.35/32 md5 # Public aqms-bk-pp
host rtdb2 rtem 192.168.116.43/32 md5 # Private aqms-bk-pp
host rtdb2 rtem 137.227.232.161/32 md5 # Public aqms-nc-pp
host rtdb2 rtem 192.168.119.115/32 md5 # Private aqms-nc-pp
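A quick way to sanity-check which client addresses a pg_hba.conf CIDR entry admits, using Python's stdlib ipaddress module. With strict=False the host bits are masked off, mirroring how an entry written with a host address and a /24 mask is typically interpreted as covering the whole subnet:

```python
# Check whether a client IP falls inside a pg_hba.conf address/mask entry.
# strict=False masks host bits: "169.229.197.35/24" -> 169.229.197.0/24.
import ipaddress

def allowed(entry_cidr, client_ip):
    net = ipaddress.ip_network(entry_cidr, strict=False)
    return ipaddress.ip_address(client_ip) in net

print(allowed("169.229.197.35/24", "169.229.197.34"))  # True: covers aqms-bk-rt too
print(allowed("169.229.197.35/32", "169.229.197.34"))  # False: /32 is host-only
```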
- Hypocenter & Magnitude Comparison between the Oracle & PostgreSQL Systems:
- Edit comparison parameters:
[aqms@aqms-bk-pp ~]$ cd /home/aqms/compare
[aqms@aqms-bk-pp compare]$ vi run_compare_events.sh
# Define start and end times for this comparison
starttime="2023-04-18-00:00:00"
endtime="2023-04-25-00:00:00"
# Databases to compare
# ORACLE: publicdb rtmp2 rtucb2
# POSTGRESQL: archdb1 rtdb2 rtdb1
hostarray1=( rtucb2 )
hostarray2=( rtdb1 )
- Run comparison script:
[aqms@aqms-bk-pp compare]$ ./run_compare_events.sh
- A text file and an image are created in the output directory, e.g.:
[aqms@aqms-bk-pp compare]$ ll output/
total 228
-rw-r--r-- 1 aqms users 157399 2023-04-26 10:34 compare_rtucb2-rtdb1_2023-04-18-00:00:00_2023-04-25-00:00:00.png
-rw-r--r-- 1 aqms users  70740 2023-04-26 10:34 compare_rtucb2-rtdb1_2023-04-18-00:00:00_2023-04-25-00:00:00.txt
[aqms@aqms-bk-pp compare]$
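For reference, the comparison amounts to joining events between the two databases by id and reporting location, depth, and magnitude deltas. A self-contained sketch of that idea; the function, field layout, and sample event are illustrative assumptions, not taken from run_compare_events.sh:

```python
# Illustrative event comparator: match events by id across two catalogs
# and report epicentral distance (km), depth delta, and magnitude delta.
# The input layout and sample values are made up for the example.
import math

def compare_events(db1, db2):
    """db1/db2: dicts of evid -> (lat, lon, depth_km, mag)."""
    rows = []
    for evid in sorted(db1.keys() & db2.keys()):
        lat1, lon1, z1, m1 = db1[evid]
        lat2, lon2, z2, m2 = db2[evid]
        # crude epicentral distance (small-angle, 111.19 km per degree)
        dx = (lon1 - lon2) * 111.19 * math.cos(math.radians(lat1))
        dy = (lat1 - lat2) * 111.19
        rows.append((evid, math.hypot(dx, dy), z1 - z2, m1 - m2))
    return rows

oracle_db = {73584926: (37.86, -122.24, 8.0, 3.4)}  # hypothetical event
pg_db     = {73584926: (37.86, -122.24, 8.2, 3.5)}
for evid, dist, dz, dm in compare_events(oracle_db, pg_db):
    print(evid, round(dist, 3), round(dz, 1), round(dm, 1))
```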
- Data Migration & Switchover (ora2pg): [aqms-bk-rt/aqms-bk-pp active]
- Install ora2pg on aqms-bk-pp.
- Retrieve and extract DB package ora2pg_NC.tar.gz.
- Stop RT & PP processes on PostgreSQL systems.
% aqms forcestop (as aqms on aqms-bk-rt, aqms-nc-rt & aqms-bk-pp)
- Stop replication from RTs to PP on PostgreSQL systems.
> SELECT pglogical.alter_subscription_disable(subscription_name := 'rt1subscr', immediate := TRUE); (on archdb1 as postgres)
> SELECT pglogical.alter_subscription_disable(subscription_name := 'rt2subscr', immediate := TRUE); (on archdb1 as postgres)
- Delete all rows from all the PostgreSQL databases.
> \i sql/truncate.sql (on rtdb1, rtdb2 & archdb1 as trinetdb)
> \i sql/drop_sequences.sql (on rtdb1, rtdb2 & archdb1 as trinetdb)
- Reset the start value of the sequences (on rtdb1, rtdb2, archdb1 with the appropriate intervals) to the highest values (or higher) of the old Oracle databases.
% sqlplus browser@publicdb @get_max_seq.sql
% vi sql/create_sequences.sql (on rtdb1, rtdb2 & archdb1 as trinetdb)
> \i sql/create_sequences.sql (on rtdb1, rtdb2 & archdb1 as trinetdb)
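When editing create_sequences.sql, each new START WITH value must both exceed the Oracle maximum and land on the right residue for its database (increment 5, offsets 1/2/3 per the Database Sequences section). A small helper for computing that value; it is illustrative, not part of the DB package:

```python
# Smallest START WITH value that is greater than the Oracle-side maximum
# and keeps the ending-digit convention (offset 1 for rtdb1, 2 for rtdb2,
# 3 for archdb1, with INCREMENT BY 5).
def next_start(oracle_max, offset, increment=5):
    v = oracle_max + 1
    while v % increment != offset % increment:
        v += 1
    return v

print(next_start(123456789, 1))  # 123456791: first value > max with v % 5 == 1
```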
- Stop PP processes on Oracle system.
- Disable the Event table trigger on the PostgreSQL PP system.
> ALTER TABLE event DISABLE TRIGGER post_new_event; (on archdb1 as trinetdb)
- Turn on the PostgreSQL RT systems AND, at the same time, stop replication from the Oracle RTs to the Oracle PPs (goal: the Oracle PP system is no longer changing).
% aqms start (as aqms on aqms-bk-rt & aqms-nc-rt)
- At this point the PostgreSQL RT systems are live and alarming.
- Turn on the data transfer from Oracle PP to file to PostgreSQL PP (using ora2pg).
* Tables exported/imported:
AMP AMPSET ARRIVAL ASSOCAMM ASSOCAMO ASSOCARO ASSOCCOM ASSOCCOO ASSOCEVAMPSET
ASSOCNTE ASSOCWAE ASSOCWFRC CODA CREDIT EVENT EVENTPREFMAG EVENTPREFMEC
EVENTPREFOR FILENAME MEC MECCHANNEL MECDATA MECFREQ MECFREQDATA MECOBJECT
NETMAG NETTRIG ORIGIN ORIGIN_ERROR PATHNAME REMARK REQUEST_CARD SUBDIR
SWARM_EVENTS SWARM_STATE TRIG_CHANNEL UNASSOCAMP WAVEFORM PDL_PRODUCT PDL_IDS
* Export tables from Oracle DB:
% cd export/
% source init.ora2pg
% ./run_export.csh
* Import tables into PostgreSQL DB:
% cd import/
% ./run_import.csh
- While the export is running, rsync the event waveform archive from the Oracle system (/work/dc22/ncedc/events/) to the PostgreSQL system (active system first, then the standby; disable the cron entry on the standby system).
% rsync -av -e ssh --progress ncss@ucbpp.geo.berkeley.edu:/work/dc22/ncedc/events/active/ /home/aqms/data/events/active (as aqms on aqms-bk-pp)
- Re-enable the Event table trigger on the PostgreSQL PP system.
> ALTER TABLE event ENABLE ALWAYS TRIGGER post_new_event; (on archdb1 as trinetdb)
- Start PP processes on PostgreSQL system.
% aqms start (as aqms on aqms-bk-pp)
- Turn back on replication from PostgreSQL RTs to PostgreSQL PP.
> SELECT pglogical.alter_subscription_enable(subscription_name := 'rt1subscr', immediate := TRUE); (on archdb1 as postgres)
> SELECT pglogical.alter_subscription_enable(subscription_name := 'rt2subscr', immediate := TRUE); (on archdb1 as postgres)
- Start PP processes on Oracle system.
- Turn back on replication from Oracle RTs to Oracle PPs.
- Event Data Archiving to the NCEDC: (running on strike)
- Event waveforms directory structure:
[aqms@aqms-bk-pp events]$ pwd
/home/aqms/data/events
[aqms@aqms-bk-pp events]$ ll
total 0
lrwxrwxrwx 1 aqms users 28 2023-03-27 16:20 active -> /work/aqms-bk-pp/aqms/active/
lrwxrwxrwx 1 aqms users 24 2023-04-19 13:25 NCEVT -> /data/ncedc/events/NCEVT/
[aqms@aqms-bk-pp events]$
- Update pg_hba.conf on aqms-bk-pp & aqms-nc-pp to include:
host archdb1 browser 169.229.197.12/32 md5 # Public strike