PostgreSQL Production AQMS System Operations Manual

AQMS/DB: Commands

  • Start PostgreSQL database:
[postgres@aqms-bk-rt ~]$ sudo systemctl [re]start postgresql-14
  • Stop PostgreSQL database:
[postgres@aqms-bk-rt ~]$ sudo systemctl stop postgresql-14
  • PostgreSQL database status:
[postgres@aqms-bk-rt ~]$ systemctl status postgresql-14
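  • Check that the service is enabled at boot (standard systemctl usage, shown here as a quick sanity check rather than an NCEDC-specific script):
[postgres@aqms-bk-rt ~]$ systemctl is-enabled postgresql-14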
  • Show RT/PP system role:
[aqms@aqms-bk-rt ~]$ showrole
System is shadow
[aqms@aqms-bk-rt ~]$ 

[aqms@aqms-bk-pp events]$ showrole
System is primary
[aqms@aqms-bk-pp events]$ 
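The reported role is presumably derived from the rt_role table shown in the 'AQMS: Real Time Role Switch' section below; a sketch of an equivalent query, assuming the rtdb1 instance and the rtem user used there:
[aqms@aqms-bk-rt ~]$ psql -h aqms-bk-rt.ncss-ucb.cisn.org -d rtdb1 -U rtem -W \
    -c "SELECT CASE WHEN primary_system THEN 'System is primary' ELSE 'System is shadow' END FROM rt_role ORDER BY modification_time DESC LIMIT 1;"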
  • AQMS command:
[aqms@aqms-bk-rt ~]$ aqms -h
AQMS realtime-processing control script can be used to start,stop, restart, show status

 Usage: aqms [start|stop|status|restart|forcestop]

stop does a kill -1 and forcestop does a kill -9, returns 1 when stopped
status exits with 0 if all required processes are running and 1 if not.
restart does a stop 1 or 9 followed by start and status.

[aqms@aqms-bk-rt ~]$ 
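Because 'aqms status' exits 0 only when all required processes are running, it can drive a simple watchdog; a minimal sketch (not an existing NCEDC script):
#!/bin/bash
# Hypothetical watchdog: restart AQMS if any required process is down.
# Relies only on the documented exit codes of 'aqms status' (0 = all running).
if ! aqms status > /dev/null 2>&1; then
    aqms restart
fi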
  • Start all AQMS+EW codes on each system:
[aqms@aqms-bk-rt ~]$ aqms start
  • Status of what is running:
[aqms@aqms-bk-rt ~]$ aqms status
Usage: /home/aqms/utils/init/startStopCMS {start|stop}
adadup_slave /home/aqms/configs/adadup_slave/adadup_mnlons1_to_aqms-bk-rt.cfg running on pid 2068089
adadup_slave /home/aqms/configs/adadup_slave/adadup_mnlons2_to_aqms-bk-rt.cfg running on pid 2068097
adadup_slave /home/aqms/configs/adadup_slave/adadup_ucbns1_to_aqms-bk-rt.cfg running on pid 2068105
adadup_slave /home/aqms/configs/adadup_slave/adadup_ucbns2_to_aqms-bk-rt.cfg running on pid 2068113
ntrcg2 /home/aqms/configs/ntrcg2/ntrcg2.cfg running on pid 2068263
sigswitch /home/aqms/configs/sigswitch/sigswitch_event.cfg running on pid 2068274
sigswitch /home/aqms/configs/sigswitch/sigswitch_trigger.cfg running on pid 2068289
eqrun /home/aqms/configs/eqrun/eqrun_db2nsmp.cfg running on pid 605941
eqrun /home/aqms/configs/eqrun/eqrun_rcg.cfg running on pid 2068242
ec /home/aqms/configs/ec/ec.cfg running on pid 2068149
tc2 /home/aqms/configs/tc2/tc2.cfg running on pid 2068175
trig2db /home/aqms/configs/trig2db/trig2db.cfg running on pid 2068188
trimag /home/aqms/configs/trimag/trimag.cfg running on pid 2068201
ampgen /home/aqms/configs/ampgen/ag_early.cfg running on pid 2068213
ampgen /home/aqms/configs/ampgen/ag_late.cfg running on pid 2068223
alarmdec /home/aqms/configs/alarmdec/alarmdec-ql.cfg running on pid 294066
alarmdec /home/aqms/configs/alarmdec/alarmdec-tm.cfg running on pid 294076
alarmdec /home/aqms/configs/alarmdec/alarmdec-amp-early.cfg running on pid 294086
alarmdec /home/aqms/configs/alarmdec/alarmdec-amp-late.cfg running on pid 294095
alarmdec /home/aqms/configs/alarmdec/alarmdec-mw.cfg running on pid 294106
alarmdec /home/aqms/configs/alarmdec/alarmdec-mech.cfg running on pid 294115
alarmdec /home/aqms/configs/alarmdec/alarmdec-tmts-done.cfg running on pid 294126
alarmact /home/aqms/configs/alarmact/alarmact-rt.cfg running on pid 294137
alarmdist /home/aqms/configs/alarmdist/alarmdist-rt.cfg running on pid 294148
runner /home/aqms/configs/runner/sendcancel_runner.cfg running on pid 294160
telestifle /home/aqms/configs/telestifle/telestifle.cfg running on pid 294172
pws /home/aqms/configs/pws/pws.cfg running on pid 2068255
solserver.pl running on pids 2068319
2068332
2068372
2068410
PDL Exec Rec running (pid=2068648)
EIDS Notification Server running (pid=2068763)
EIDS Server running (pid=2068850)
PDL Hub running (pid=2068956)
PDL LI Poll Sender running (pid=2069106)
[aqms@aqms-bk-rt ~]$ 

[aqms@aqms-bk-rt ~]$ status
using default config file startstop_unix.d
NOTE: If next line reads "ERROR: tport_attach...", Earthworm is not running.
      Sent request for status; waiting for response...

                    EARTHWORM-64 SYSTEM STATUS

        Hostname-OS:            aqms-bk-rt - Linux 4.18.0-425.10.1.el8_7.x86_64
        Start time (UTC):       Wed Mar 22 23:07:49 2023
        Current time (UTC):     Wed Mar 29 22:07:07 2023
        Disk space avail:       508789568 kb
        Ring  1 name/key/size:  HYPO_RING / 1015 / 1024 kb
        Ring  2 name/key/size:  PICK_RING / 1005 / 1024 kb
        Ring  3 name/key/size:  FILTERPICK_RING / 1039 / 1024 kb
        Startstop's Log Dir:    /home/aqms/logs/
        Startstop's Params Dir: /home/aqms/ew/params/
        Startstop's Bin Dir:    /home/aqms/ew/bin
        Startstop Version:      v7.10 2019-08-13 (64 bit)

         Process  Process               Class/    CPU
          Name      Id      Status     Priority   Used    Argument
         -------  -------   ------     --------   ----    --------
       startstop  853366    Alive         ??/ 0 00:00:21  -
      import_ack  853367    Alive         ??/ 0 00:03:21  imp_pktrg_menlo1.d
      import_ack  853368    Alive         ??/ 0 00:03:21  imp_pktrg_menlo2.d
      import_ack  853369    Alive         ??/ 0 00:02:05  imp_pktrg_ucb1.d
      import_ack  853370    Alive         ??/ 0 00:02:36  imp_pktrg_ucb2.d
        pkfilter  853371    Alive         ??/ 0 00:04:08  pkfilter.d
   statrigfilter  853372    Alive         ??/ 0 00:03:39  statrigfilter.d
       binder_ew  853373    Alive         ??/ 0 00:09:16  binder_ew.d
      eqassemble  853374    Alive         ??/ 0 00:03:57  eqassemble.d
         hyps2ps  853375    Alive         ??/ 0 00:00:19  hyps2ps.d
     carlsubtrig  853376    Alive         ??/ 0 00:01:23  carlsubtrig.d
         trig2ps  853377    Alive         ??/ 0 00:00:18  trig2ps.d
         statmgr  853378    Alive         ??/ 0 00:00:22  statmgr.d
      telafeeder  853379    NoExec                     telafeeder.d
         diskmgr  853380    Alive         ??/ 0 00:00:17  diskmgr.d
      copystatus  853381    Alive         ??/ 0 00:00:18  PICK_RING HYPO_RING
      copystatus  853382    Alive         ??/ 0 00:00:16  FILTERPICK_RING <_RING

[aqms@aqms-bk-rt ~]$ 
[aqms@aqms-bk-pp ~]$ aqms status
alarmdec /home/aqms/configs/alarmdec/alarmdec-pp.cfg running on pid 1640591
alarmdec /home/aqms/configs/alarmdec/alarmdec-pp-mech.cfg running on pid 1640600
alarmdec /home/aqms/configs/alarmdec/alarmdec-pp-mw.cfg running on pid 1640610
alarmact /home/aqms/configs/alarmact/alarmact-pp.cfg running on pid 1640623
alarmdist /home/aqms/configs/alarmdist/alarmdist-pp.cfg running on pid 1640636
pws /home/aqms/configs/pws/pws.cfg running on pid 865789
865833
1640697
pws /home/aqms/configs/pws/pws_UCB.cfg running on pid 865827
1640707
pws /home/aqms/configs/pws/pws_CE.cfg running on pid 1640714
pws /home/aqms/configs/pws/pws_dart.cfg running on pid 865829
1640721
pws /home/aqms/configs/pws/pws_NN.cfg running on pid 865828
1640728
aqms      201218  0.3  0.0 342384 50712 ?        S    Mar28   7:25 wanc wanc_AZBKBPLBPBUS.cfg
aqms      201219  0.0  0.0  20276  2216 ?        S    Mar28   0:03 conlog /home/aqms/logs/wanc_AZBKBPLBPBUS_con
aqms      201227  0.2  0.0 305004 13168 ?        S    Mar28   4:24 wanc wanc_BG.cfg
aqms      201228  0.0  0.0  20276  2068 ?        S    Mar28   0:00 conlog /home/aqms/logs/wanc_BG_con
aqms      201243  0.3  0.0 346852 55604 ?        S    Mar28   6:44 wanc wanc_CE.cfg
aqms      201244  0.0  0.0  20276  2056 ?        S    Mar28   0:00 conlog /home/aqms/logs/wanc_CE_con
aqms      201258  0.0  0.0 317196 23496 ?        S    Mar28   1:48 wanc wanc_CINN.cfg
aqms      201259  0.0  0.0  20276  2032 ?        S    Mar28   0:00 conlog /home/aqms/logs/wanc_CINN_con
aqms      201269  0.6  0.0 390836 99372 ?        S    Mar28  11:24 wanc wanc_NC.cfg
aqms      201270  0.0  0.0  20276  3744 ?        S    Mar28   0:04 conlog /home/aqms/logs/wanc_NC_con
aqms      201283  0.5  0.0 342224 50608 ?        S    Mar28  10:04 wanc wanc_GMGSNPPGRESBSFUOUWWR.cfg
aqms      201284  0.0  0.0  20276  2124 ?        S    Mar28   0:00 conlog /home/aqms/logs/wanc_GMGSNPPGRESBSFUOUWWR_con
solserver.pl running on pids 1640802
1640808
1640814
1640820
pcs_poller /home/aqms/configs/waveform_gifs/drp_pcs.cfg running on pid 1640828
pcs_poller /home/aqms/configs/waveform_gifs/trp_pcs.cfg running on pid 1640836
[aqms@aqms-bk-pp ~]$ 

AQMS: Metadata Loading

  • As the 'aqms' user on the active pp (aqms-bk-pp or aqms-nc-pp), cd into /home/aqms/DBmaint/metadata/.
  • To load all stations for all networks:
    • Run 'getNCEDCstaXML' to retrieve all stationXML files from the NCEDC.
    • Run 'loadAllStaXML.sh' to load all stations (3-4h for ~3500 stations).
  • To load a given network:
    • Run 'getNCEDCstaXML <Network>' to retrieve all stationXML files for the given network.
    • Run 'loadAllStaXML.sh' to load all stations for the given network.
  • To load a given station:
    • Run 'getNCEDCstaXML <Network> <Station>' to retrieve the stationXML file for the given station.
    • Run 'loadAllStaXML.sh' to load the station.
  • To load a list of networks and stations: [Preferred Method]
    • Run 'getFromList' to retrieve the stationXML files for the networks and stations specified in 'netsta.lst', e.g.:
[aqms@aqms-bk-pp metadata]$ cat netsta.lst 
SB
BK MERC
NP 1858
UO
[aqms@aqms-bk-pp metadata]$ 
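For reference, the retrieval step can be approximated against the NCEDC FDSN station web service; a hedged sketch only (the endpoint and query parameters are assumptions, and the production scripts remain getNCEDCstaXML/getFromList):
#!/bin/bash
# Illustrative sketch -- not the production getFromList script.
# Assumes the NCEDC FDSN station service endpoint.
while read -r net sta; do
    [ -z "$net" ] && continue                      # skip blank lines
    q="net=${net}&level=response"
    [ -n "$sta" ] && q="${q}&sta=${sta}"           # optional station filter
    curl -s -o "${net}.${sta:-ALL}.xml" \
        "https://service.ncedc.org/fdsnws/station/1/query?${q}"
done < netsta.lst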
  • To refresh the metadata on aqms-bk-rt & aqms-nc-rt (~30 mins):
[aqms@aqms-bk-pp ~]$ cd /home/aqms/DBmaint/metadata/
[aqms@aqms-bk-pp metadata]$ ./refreshMetadata.csh
  • To refresh the PWS channel mappings:
[aqms@aqms-bk-pp ~]$ cd /home/aqms/DBmaint/PWS/
[aqms@aqms-bk-pp PWS]$ ./pws
  • To refresh the PWS channel mappings on aqms-bk-rt & aqms-nc-rt:
[aqms@aqms-bk-pp ~]$ cd /home/aqms/DBmaint/misc_tables/
[aqms@aqms-bk-pp misc_tables]$ ./refreshMiscTables.csh

AQMS: Miscellaneous DB Tables

  • As the 'aqms' user on the active pp (aqms-bk-pp or aqms-nc-pp), use:
Table                  Maintenance Application
-----                  -----------------------
APPCHANNELS            StationUI, AppChan.sql, codaparamsDB (section 111) & pws (section 101)
APPLICATIONS           StationUI, AppChan.sql, codaparamsDB (section 111) & pws (section 101)
CONFIG_CHANNEL         AppChan.sql (derived from APPCHANNELS table)
PROGRAM                AppChan.sql (derived from APPLICATIONS table)
CHANNELMAP_AMPPARMS    codaparamsDB
CHANNELMAP_CODAPARMS   codaparamsDB
CREDIT_ALIAS           psql, by hand
GAZETTEERBIGTOWN       placesDB
GAZETTEERTOWN          placesDB
GAZETTEERQUAKE         placesDB
GAZETTEERQUARRY        quarryDB
GAZETTEERPT            placesDB & quarryDB
GAZETTEER_REGION       regionDB
MAGPREFPRIORITY        magprefDB
STACORRECTIONS         codaparamsDB (Md), loadCISNmagcorr (Ml)
  • To refresh the AppChan.sql tables on the active pp:
[aqms@aqms-bk-pp ~]$ cd /home/aqms/DBmaint/AppCha/
[aqms@aqms-bk-pp AppCha]$ vi appChan_pg.sql
[aqms@aqms-bk-pp AppCha]$ ./refreshAppCha.csh
  • To refresh the miscellaneous tables on aqms-bk-rt & aqms-nc-rt:
[aqms@aqms-bk-pp ~]$ cd /home/aqms/DBmaint/misc_tables/
[aqms@aqms-bk-pp misc_tables]$ ./refreshMiscTables.csh

AQMS: Adding New Stations

AQMS: Real Time Role Switch

NOTES: Make sure no earthquakes are being processed by the RT systems before switching. To check, look at the current eqassemble log file on one of the RT systems and confirm it has not changed in the last few minutes, then run "tail -f" on that log to watch ongoing activity. If a new event starts processing, wait until the RT systems have finished with it.
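For example, a minimal quiescence check (a sketch, assuming eqassemble logs are written under /home/aqms/logs/, the Startstop log directory reported in the Commands section):
[aqms@aqms-bk-rt ~]$ LOG=$(ls -t /home/aqms/logs/eqassemble* 2>/dev/null | head -1)
[aqms@aqms-bk-rt ~]$ stat -c 'last modified: %y' "$LOG"
[aqms@aqms-bk-rt ~]$ tail -f "$LOG"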

The command 'switchrole' can be launched on either aqms-bk-rt or aqms-nc-rt.
The real time switch should happen before the post processing one because currently a PP system
is considered primary at startup if its corresponding RT system is primary (showrole script logic).
In order to do a complete switchover:
     * Disable the 'rsync_archive' cron jobs on aqms-bk-pp & aqms-nc-pp (a crontab sketch follows this list).
     * Perform the steps in the 'AQMS: Real Time Role Switch' section.
     * Perform the steps in the 'AQMS/DB: Post Processing Role Switch' section.
     * Enable the 'rsync_archive' cron jobs on aqms-bk-pp & aqms-nc-pp.
     * Perform the steps in the 'AQMS: Auxiliary Role Switch & Other Items' section.
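One way to toggle the rsync_archive cron jobs, assuming they live in the 'aqms' user's crontab on each PP host (a sketch; the actual crontab entries may be formatted differently):
[aqms@aqms-bk-pp ~]$ crontab -l | sed 's|^\([^#].*rsync_archive\)|#\1|' | crontab -   # disable
[aqms@aqms-bk-pp ~]$ crontab -l | sed 's|^#\(.*rsync_archive\)|\1|' | crontab -       # re-enable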
  • Switchover:
[aqms@aqms-bk-rt ~]$ switchrole
Will switch aqms-bk-rt (rtdb1) from primary to shadow
and aqms-nc-rt(rtdb2) from shadow to primary
OK (Y or N):Y
Will do it.
rtem@rtdb1
System was primary, changing database to shadow
spawn psql -q -h aqms-bk-rt.ncss-ucb.cisn.org -d rtdb1 -U rtem -W
Password: 
rtdb1=> \i /home/aqms/utils/scripts/set_shadow.sql
rtdb1=> \q
rtem@rtdb2
System was shadow, changing database to primary
spawn psql -q -h aqms-nc-rt.ncss-mp.cisn.org -d rtdb2 -U rtem -W
Password: 
rtdb2=> \i /home/aqms/utils/scripts/set_primary.sql
rtdb2=> \q
[aqms@aqms-bk-rt ~]$
  • Standby Status:
[aqms@aqms-bk-rt ~]$ psql -q -h aqms-bk-rt.ncss-ucb.cisn.org -d rtdb1 -U rtem -W
Password: 
rtdb1=> select * from rt_role;
 primary_system |     modification_time      
----------------+----------------------------
 true           | 2023-03-07 00:45:20.992049
 false          | 2023-04-21 17:01:55.751883
 true           | 2023-04-21 17:04:13.129714
 false          | 2023-04-21 17:21:42.727915
 true           | 2023-04-21 17:21:54.800726
 false          | 2023-06-08 21:30:29.775161
 true           | 2023-06-08 21:59:35.722209
 false          | 2023-07-12 17:06:11.835567
 true           | 2023-07-12 17:10:47.257782
 false          | 2023-07-12 17:12:49.347408
(10 rows)

rtdb1=> \q
[aqms@aqms-bk-rt ~]$
  • Primary Status:
[aqms@aqms-bk-rt ~]$ psql -q -h aqms-nc-rt.ncss-mp.cisn.org -d rtdb2 -U rtem -W
Password: 
rtdb2=> select * from rt_role;
 primary_system |     modification_time     
----------------+---------------------------
 false          | 2023-03-27 00:00:00
 true           | 2023-07-12 17:12:49.54559
(2 rows)

rtdb2=> \q
[aqms@aqms-bk-rt ~]$ 

AQMS/DB: Post Processing Role Switch

NOTE:

MAKE SURE that the DB instances are started and stopped with systemctl, not pg_ctl.
          systemctl status postgresql-14 
          sudo systemctl stop postgresql-14 
          sudo systemctl [re]start postgresql-14 
  • Post Processing Failover:
- In a controlled environment (e.g. patching the active PP):

  * On the standby system as 'postgres', do a dry run first:

     % repmgr standby switchover --verbose --force-rewind --dry-run

  * If the dry run reports no errors, shut down AQMS on the active & standby systems as 'aqms':

     % aqms forcestop

  * Check whether anything is still running and kill any remaining processes:

     % aqms status

  * On the standby system as 'aqms', sync the event waveform archive:

     % rsync -av -e ssh --progress aqms@{primary}:/home/aqms/data/events/active/ /home/aqms/data/events/active
  where {primary} = {aqms-bk-pp.ncss-ucb.cisn.org|aqms-nc-pp.ncss-mp.cisn.org}

  * On the standby system as 'postgres', do the switchover:

     % repmgr standby switchover --verbose --force-rewind

  * On aqms-nc-aux as 'ncss', switch the DB for the DDRT process:

     % cp /home/ncss/run/params/dd2db.conf.{new_primary} /home/ncss/run/params/dd2db.conf
       where {new_primary} = {aqms-bk-pp|aqms-nc-pp}

  * On the active & standby systems as 'aqms', start AQMS:

     % aqms start

  * Check to make sure that all processes are running:

     % aqms status


- In an emergency (e.g. active PP has crashed):

  If something bad has happened (e.g. the active PP has crashed) and a clean
  demotion of the old primary cannot be expected, use --always-promote instead
  of --force-rewind; this promotes the standby even if the original primary
  cannot be cleanly demoted.

     % repmgr standby switchover --always-promote

  * Stop AQMS on the old primary if it isn't already stopped, and start it on the new primary (see above).
  • Switchover dry run:
[postgres@aqms-nc-pp ~]$ repmgr standby switchover --verbose --force-rewind --dry-run
INFO: checking for package configuration file "/etc/repmgr/14/repmgr.conf"
INFO: configuration file found at: "/etc/repmgr/14/repmgr.conf"
NOTICE: checking switchover on node "walreceiver" (ID: 2) in --dry-run mode
INFO: searching for primary node
INFO: checking if node 1 is primary
INFO: current primary node is 1
INFO: prerequisites for using pg_rewind are met
INFO: SSH connection to host "aqms-bk-pp.ncss-ucb.cisn.org" succeeded
INFO: able to execute "repmgr" on remote host "aqms-bk-pp.ncss-ucb.cisn.org"
INFO: 1 walsenders required, 2 available
INFO: demotion candidate is able to make replication connection to promotion candidate
INFO: archive mode is "off"
INFO: replication lag on this standby is 0 seconds
NOTICE: attempting to pause repmgrd on 2 nodes
NOTICE: local node "walreceiver" (ID: 2) would be promoted to primary; current primary "aqmsbkpp" (ID: 1) would be demoted to standby
INFO: following shutdown command would be run on node "aqmsbkpp":
  "sudo /bin/systemctl stop postgresql-14"
INFO: parameter "shutdown_check_timeout" is set to 60 seconds
INFO: prerequisites for executing STANDBY SWITCHOVER are met
[postgres@aqms-nc-pp ~]$ 
  • Switchover:
[postgres@aqms-nc-pp ~]$ repmgr standby switchover --verbose --force-rewind
INFO: checking for package configuration file "/etc/repmgr/14/repmgr.conf"
INFO: configuration file found at: "/etc/repmgr/14/repmgr.conf"
NOTICE: executing switchover on node "walreceiver" (ID: 2)
INFO: searching for primary node
INFO: checking if node 1 is primary
INFO: current primary node is 1
INFO: prerequisites for using pg_rewind are met
INFO: SSH connection to host "aqms-bk-pp.ncss-ucb.cisn.org" succeeded
INFO: archive mode is "off"
INFO: replication lag on this standby is 0 seconds
NOTICE: attempting to pause repmgrd on 2 nodes
NOTICE: local node "walreceiver" (ID: 2) will be promoted to primary; current primary "aqmsbkpp" (ID: 1) will be demoted to standby
NOTICE: stopping current primary node "aqmsbkpp" (ID: 1)
NOTICE: issuing CHECKPOINT on node "aqmsbkpp" (ID: 1) 
DETAIL: executing server command "sudo /bin/systemctl stop postgresql-14"
INFO: checking for primary shutdown; 1 of 60 attempts ("shutdown_check_timeout")
NOTICE: current primary has been cleanly shut down at location 12/4ED71AF8
NOTICE: promoting standby to primary
DETAIL: promoting server "walreceiver" (ID: 2) using pg_promote()
NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete
INFO: standby promoted to primary after 1 second(s)
NOTICE: STANDBY PROMOTE successful
DETAIL: server "walreceiver" (ID: 2) was successfully promoted to primary
NOTICE: issuing CHECKPOINT on node "walreceiver" (ID: 2) 
ERROR: unable to execute CHECKPOINT
INFO: node "aqmsbkpp" (ID: 1) is pingable
WARNING: node "aqmsbkpp" attached in state "startup"
INFO: waiting for node "aqmsbkpp" (ID: 1) to connect to new primary; 1 of max 60 attempts (parameter "node_rejoin_timeout")
DETAIL: node "walreceiver" (ID: 1) is currently attached to its upstream node in state "startup"
WARNING: node "aqmsbkpp" attached in state "catchup"
WARNING: node "aqmsbkpp" attached in state "catchup"
WARNING: node "aqmsbkpp" attached in state "catchup"
WARNING: node "aqmsbkpp" attached in state "catchup"
WARNING: node "aqmsbkpp" attached in state "catchup"
INFO: waiting for node "aqmsbkpp" (ID: 1) to connect to new primary; 6 of max 60 attempts (parameter "node_rejoin_timeout")
DETAIL: node "walreceiver" (ID: 1) is currently attached to its upstream node in state "catchup"
WARNING: node "aqmsbkpp" attached in state "catchup"
WARNING: node "aqmsbkpp" attached in state "catchup"
WARNING: node "aqmsbkpp" attached in state "catchup"
WARNING: node "aqmsbkpp" attached in state "catchup"
WARNING: node "aqmsbkpp" attached in state "catchup"
INFO: waiting for node "aqmsbkpp" (ID: 1) to connect to new primary; 11 of max 60 attempts (parameter "node_rejoin_timeout")
DETAIL: node "walreceiver" (ID: 1) is currently attached to its upstream node in state "catchup"
WARNING: node "aqmsbkpp" attached in state "catchup"
WARNING: node "aqmsbkpp" attached in state "catchup"
WARNING: node "aqmsbkpp" attached in state "catchup"
WARNING: node "aqmsbkpp" attached in state "catchup"
WARNING: node "aqmsbkpp" attached in state "catchup"
INFO: waiting for node "aqmsbkpp" (ID: 1) to connect to new primary; 16 of max 60 attempts (parameter "node_rejoin_timeout")
DETAIL: node "walreceiver" (ID: 1) is currently attached to its upstream node in state "catchup"
WARNING: node "aqmsbkpp" attached in state "catchup"
INFO: node "aqmsbkpp" (ID: 1) has attached to its upstream node
NOTICE: node "walreceiver" (ID: 2) promoted to primary, node "aqmsbkpp" (ID: 1) demoted to standby
NOTICE: switchover was successful
DETAIL: node "walreceiver" is now primary and node "aqmsbkpp" is attached as standby
NOTICE: STANDBY SWITCHOVER has completed successfully
[postgres@aqms-nc-pp ~]$
  • Primary Status:
[postgres@aqms-bk-pp ~]$ repmgr node status
Node "aqmsbkpp":
	PostgreSQL version: 14.6
	Total data size: 2088 MB
	Conninfo: host=aqms-bk-pp.ncss-ucb.cisn.org user=repadmin dbname=repmgr connect_timeout=2 application_name=aqmsbkpp
	Role: primary
	WAL archiving: off
	Archive command: (none)
	Replication connections: 2 (of maximal 3)
	Replication slots: 1 physical (of maximal 80; 0 missing)
	Replication lag: n/a

[postgres@aqms-bk-pp ~]$ psql
psql (14.6)
Type "help" for help.

postgres=# \x
Expanded display is on.
postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 944229
usesysid         | 19831
usename          | repadmin
application_name | walreceiver
client_addr      | 192.168.119.115
client_hostname  | 
client_port      | 45236
backend_start    | 2023-04-17 16:43:34.556498-07
backend_xmin     | 
state            | streaming
sent_lsn         | A/5C76A040
write_lsn        | A/5C76A040
flush_lsn        | A/5C76A040
replay_lsn       | A/5C76A040
write_lag        | 
flush_lag        | 
replay_lag       | 
sync_priority    | 0
sync_state       | async
reply_time       | 2023-04-17 16:55:38.587616-07
-[ RECORD 2 ]----+------------------------------
pid              | 946732
usesysid         | 23101
usename          | barman
application_name | barman_receive_wal
client_addr      | 169.229.197.35
client_hostname  | 
client_port      | 48700
backend_start    | 2023-04-17 16:50:01.99932-07
backend_xmin     | 
state            | streaming
sent_lsn         | A/5C76A040
write_lsn        | A/5C76A040
flush_lsn        | A/5C000000
replay_lsn       | 
write_lag        | 00:00:05.659113
flush_lag        | 00:05:32.17972
replay_lag       | 00:05:32.17972
sync_priority    | 0
sync_state       | async
reply_time       | 2023-04-17 16:55:34.204999-07

postgres=# \q
[postgres@aqms-bk-pp ~]$
  • Standby Status:
[postgres@aqms-nc-pp ~]$ repmgr node status
Node "walreceiver":
	PostgreSQL version: 14.7
	Total data size: 2088 MB
	Conninfo: host=aqms-nc-pp.ncss-mp.cisn.org user=repadmin dbname=repmgr connect_timeout=2 application_name=walreceiver
	Role: standby
	WAL archiving: off
	Archive command: (none)
	Replication connections: 1 (of maximal 3)
	Replication slots: 1 physical (of maximal 80; 0 missing)
	Upstream node: aqmsbkpp (ID: 1)
	Replication lag: 0 seconds
	Last received LSN: A/5CD499C8
	Last replayed LSN: A/5CD499C8

[postgres@aqms-nc-pp ~]$ psql
psql (14.7)
Type "help" for help.

postgres=# \x
Expanded display is on.
postgres=# select * from pg_stat_wal_receiver;
-[ RECORD 1 ]---------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
pid                   | 1932766
status                | streaming
receive_start_lsn     | A/5C000000
receive_start_tli     | 1
written_lsn           | A/5CE0DED8
flushed_lsn           | A/5CE0DED8
received_tli          | 1
last_msg_send_time    | 2023-04-17 17:04:42.446616-07
last_msg_receipt_time | 2023-04-17 17:04:42.447379-07
latest_end_lsn        | A/5CE0DED8
latest_end_time       | 2023-04-17 17:04:42.446616-07
slot_name             | 
sender_host           | aqms-bk-pp.ncss-ucb.cisn.org
sender_port           | 5432
conninfo              | user=repadmin password=******** channel_binding=prefer dbname=replication host=aqms-bk-pp.ncss-ucb.cisn.org port=5432 fallback_application_name=walreceiver sslmode=prefer sslcompression=0 sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres target_session_attrs=any

postgres=# \q
[postgres@aqms-nc-pp ~]$ 

AQMS: Auxiliary Role Switch & Other Items

  • TDMT Users Email Notification

On aqms-bk-aux, run this command if Berkeley will be primary:

/home/aqms/bin/update_tmts_webpage.sh Berkeley  

If Menlo will be primary:

/home/aqms/bin/update_tmts_webpage.sh Menlo  
  • ShakeMap Config Changes

On the new primary aux system:

run_shakecontrol stop
cd /home/aqms/shakemap_profiles/default/install/config
\rm shake.conf transfer.conf
ln -s transfer.conf-prod transfer.conf
ln -s shake.conf-prod shake.conf
run_shakecontrol start

On the new shadow aux system:

run_shakecontrol stop
cd /home/aqms/shakemap_profiles/default/install/config
\rm shake.conf transfer.conf
ln -s transfer.conf-back transfer.conf
ln -s shake.conf-back shake.conf
run_shakecontrol start
  • DRP Missing Waveform Snapshots

Run this on the primary RT system. For example, if aqms-bk-rt is primary and evid 75000006 has missing snapshots, run on aqms-bk-rt:

/home/aqms/bin/siggen /signals/RTDB1/event  75000006 aqms-bk-rt_SIGSWITCH_EVENT_AQMS-BK-RT /home/aqms/cms/cms.cfg

Check the sigswitch log on the primary RT system. If no new events are being sent out, restart sigswitch:

sigswitch_ctl.sh forcestop
sigswitch_ctl.sh start

DB: Shadow Post Processing DB Recovery

  • Streaming Replication Status (shadow node aqms-bk-pp is detached):
[postgres@aqms-nc-pp ~]$ repmgr cluster show
WARNING: node "aqms-bk-pp" not found in "pg_stat_replication"
 ID | Name        | Role    | Status    | Upstream      | Location | Priority | Timeline | Connection string
----+-------------+---------+-----------+---------------+----------+----------+----------+--------------------------------------------------------
 1  | aqmsbkpp    | standby |   running | ! walreceiver | default  | 100      | 12       | host=aqms-bk-pp.ncss-ucb.cisn.org user=repadmin dbname=repmgr connect_timeout=2
 2  | walreceiver | primary | * running |               | default  | 100      | 12       | host=aqms-nc-pp.ncss-mp.cisn.org user=repadmin dbname=repmgr connect_timeout=2

WARNING: following issues were detected
  - node "aqms-bk-pp" (ID: 1) is not attached to its upstream node "walreceiver" (ID: 2)

[postgres@aqms-nc-pp ~]$
  • If WAL segments have not yet been recycled:
==> Run the following on the shadow PP system (e.g. aqms-bk-pp):
[postgres@aqms-bk-pp ~]$ repmgr node rejoin --force-rewind --verbose -d 'host= aqms-nc-pp.ncss-mp.cisn.org user=repadmin dbname=repmgr connect_timeout=2'
  • If WAL segments have been recycled:
- If errors are present in log or pg_rewind fails when running 'repmgr node rejoin':

2021-07-01 14:46:12.115 PDT [2491] ERROR:  requested WAL segment 0000000C0000003A000000AA has already been removed

First, on the shadow PP DB, move $PGDATA out of the way (and back it up if desired).
Then run:

/usr/pgsql-14/bin/pg_basebackup -h source.host.com -R -D $PGDATA -U repadmin -v -P 

The "-R" generates a recovery file when doing this copy, so when you start up this instance it will be a streaming standby kept in sync with source.host.com.
  • Recovery example:
[postgres@aqms-bk-pp ~]$ sudo systemctl stop postgresql-14

[postgres@aqms-bk-pp ~]$ pwd
/var/lib/pgsql
[postgres@aqms-bk-pp ~]$ mv 14 14.0
[postgres@aqms-bk-pp ~]$ /usr/pgsql-14/bin/pg_basebackup -h aqms-nc-pp.ncss-mp.cisn.org -R -D $PGDATA -U repadmin -v -P
pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 6D/4F1BE930 on timeline 12
pg_basebackup: starting background WAL receiver
5474940/5474940 kB (100%), 1/1 tablespace
pg_basebackup: write-ahead log end point: 6D/83E8EE50
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: base backup completed
[postgres@aqms-bk-pp ~]$

[postgres@aqms-bk-pp ~]$ cd 14/data
[postgres@aqms-bk-pp data]$ cp ../../14.0/data/pg_hba.conf .

[postgres@aqms-bk-pp ~]$ sudo systemctl start postgresql-14

[postgres@aqms-bk-pp ~]$ repmgr node check
INFO: connection is not a superuser connection, falling back to simple check
HINT: provide a superuser with -S/--superuser, or add the "repadmin" user to role "pg_read_all_settings" or "pg_monitor"
Node "aqmsbkpp":
        Server role: OK (node is standby)
        Replication lag: OK (0 seconds)
        WAL archiving: OK (0 pending archive ready files)
        Upstream connection: OK (node "aqmsbkpp" (ID: 1) is attached to expected upstream node "walreceiver" (ID: 2))
        Downstream servers: OK (this node has no downstream nodes)
        Replication slots: OK (node has no physical replication slots)
        Missing physical replication slots: OK (node has no missing physical replication slots)
        Configured data directory: OK (configured "data_directory" is "/home/pgsql/data")
[postgres@aqms-bk-pp ~]$ repmgr node status
Node "aqmsbkpp":
        PostgreSQL version: 14.6
        Total data size: 3746 MB
        Conninfo: host=aqms-bk-pp.ncss-ucb.cisn.org user=repadmin dbname=repmgr connect_timeout=2
        Role: standby
        WAL archiving: off
        Archive command: (none)
        Replication connections: 0 (of maximal 3)
        Replication slots: 0 physical (of maximal 80; 0 missing)
        Upstream node: walreceiver (ID: 2)
        Replication lag: 0 seconds
        Last received LSN: 6D/869FEEA8
        Last replayed LSN: 6D/869FEEA8

[postgres@aqms-bk-pp ~]$ repmgr cluster show
 ID | Name        | Role    | Status    | Upstream    | Location | Priority | Timeline | Connection string
----+-------------+---------+-----------+-------------+----------+----------+----------+--------------------------------------------------------
 1  | aqmsbkpp    | standby |   running | walreceiver | default  | 100      | 12       | host=aqms-bk-pp.ncss-ucb.cisn.org user=repadmin dbname=repmgr connect_timeout=2
 2  | walreceiver | primary | * running |             | default  | 100      | 12       | host=aqms-nc-pp.ncss-mp.cisn.org user=repadmin dbname=repmgr connect_timeout=2
[postgres@aqms-bk-pp ~]$

[postgres@aqms-nc-pp ~]$ repmgr node check
INFO: connection is not a superuser connection, falling back to simple check
HINT: provide a superuser with -S/--superuser, or add the "repadmin" user to role "pg_read_all_settings" or "pg_monitor"
Node "walreceiver":
        Server role: OK (node is primary)
        Replication lag: OK (N/A - node is primary)
        WAL archiving: OK (0 pending archive ready files)
        Upstream connection: OK (N/A - node is primary)
        Downstream servers: OK (1 of 1 downstream nodes attached)
        Replication slots: OK (node has no physical replication slots)
        Missing physical replication slots: OK (node has no missing physical replication slots)
        Configured data directory: OK (configured "data_directory" is "/home/pgsql/data")
[postgres@aqms-nc-pp ~]$ repmgr node status
Node "walreceiver":
        PostgreSQL version: 10.7
        Total data size: 3746 MB
        Conninfo: host=aqms-nc-pp.ncss-mp.cisn.org user=repadmin dbname=repmgr connect_timeout=2
        Role: primary
        WAL archiving: off
        Archive command: (none)
        Replication connections: 1 (of maximal 3)
        Replication slots: 0 physical (of maximal 80; 0 missing)
        Replication lag: n/a

[postgres@aqms-nc-pp ~]$ repmgr cluster show
 ID | Name        | Role    | Status    | Upstream    | Location | Priority | Timeline | Connection string
----+-------------+---------+-----------+-------------+----------+----------+----------+--------------------------------------------------------
 1  | aqmsbkpp    | standby |   running | walreceiver | default  | 100      | 12       | host=aqms-bk-pp.ncss-ucb.cisn.org user=repadmin dbname=repmgr connect_timeout=2
 2  | walreceiver | primary | * running |             | default  | 100      | 12       | host=aqms-nc-pp.ncss-mp.cisn.org user=repadmin dbname=repmgr connect_timeout=2
[postgres@aqms-nc-pp ~]$

DB: Streaming Replication Status

  • Node Status:
[postgres@aqms-bk-pp ~]$ repmgr node check
Node "aqmsbkpp":
	Server role: OK (node is primary)
	Replication lag: OK (N/A - node is primary)
	WAL archiving: OK (0 pending archive ready files)
	Upstream connection: OK (N/A - node is primary)
	Downstream servers: OK (1 of 1 downstream nodes attached)
	Replication slots: OK (1 of 1 physical replication slots are active)
	Missing physical replication slots: OK (node has no missing physical replication slots)
	Configured data directory: OK (configured "data_directory" is "/home/pgsql/data")
[postgres@aqms-bk-pp ~]$ repmgr node status
Node "aqmsbkpp":
	PostgreSQL version: 14.6
	Total data size: 2088 MB
	Conninfo: host=aqms-bk-pp.ncss-ucb.cisn.org user=repadmin dbname=repmgr connect_timeout=2 application_name=aqmsbkpp
	Role: primary
	WAL archiving: off
	Archive command: (none)
	Replication connections: 2 (of maximal 3)
	Replication slots: 1 physical (of maximal 80; 0 missing)
	Replication lag: n/a

[postgres@aqms-bk-pp ~]$  
[postgres@aqms-nc-pp ~]$ repmgr node check
Node "walreceiver":
	Server role: OK (node is standby)
	Replication lag: OK (0 seconds)
	WAL archiving: OK (0 pending archive ready files)
	Upstream connection: OK (node "walreceiver" (ID: 2) is attached to expected upstream node "aqmsbkpp" (ID: 1))
	Downstream servers: OK (this node has no downstream nodes)
	Replication slots: OK (1 of 1 physical replication slots are active)
	Missing physical replication slots: OK (node has no missing physical replication slots)
	Configured data directory: OK (configured "data_directory" is "/var/lib/pgsql/14/data")
[postgres@aqms-nc-pp ~]$ repmgr node status
Node "walreceiver":
	PostgreSQL version: 14.7
	Total data size: 2088 MB
	Conninfo: host=aqms-nc-pp.ncss-mp.cisn.org user=repadmin dbname=repmgr connect_timeout=2 application_name=walreceiver
	Role: standby
	WAL archiving: off
	Archive command: (none)
	Replication connections: 1 (of maximal 3)
	Replication slots: 1 physical (of maximal 80; 0 missing)
	Upstream node: aqmsbkpp (ID: 1)
	Replication lag: 0 seconds
	Last received LSN: A/5D18B6C0
	Last replayed LSN: A/5D18B6C0

[postgres@aqms-nc-pp ~]$ 
  • Cluster Status:
[postgres@aqms-bk-pp ~]$ repmgr cluster show
 ID | Name        | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                                                                          
----+-------------+---------+-----------+----------+----------+----------+----------+-------------------------------------------------------------------------------------------------------------
 1  | aqmsbkpp    | primary | * running |          | default  | 100      | 1        | host=aqms-bk-pp.ncss-ucb.cisn.org user=repadmin dbname=repmgr connect_timeout=2 application_name=aqmsbkpp  
 2  | walreceiver | standby |   running | aqmsbkpp | default  | 100      | 1        | host=aqms-nc-pp.ncss-mp.cisn.org user=repadmin dbname=repmgr connect_timeout=2 application_name=walreceiver
[postgres@aqms-bk-pp ~]$ repmgr cluster event
 Node ID | Name        | Event            | OK | Timestamp           | Details                                              
---------+-------------+------------------+----+---------------------+-------------------------------------------------------
 2       | walreceiver | standby_register | t  | 2023-04-17 11:19:08 | standby registration succeeded; upstream node ID is 1
 1       | aqmsbkpp    | primary_register | t  | 2023-04-17 11:09:15 |                                                      
 1       | aqmsbkpp    | cluster_created  | t  | 2023-04-17 11:09:15 |                                                      

[postgres@aqms-bk-pp ~]$ 

AQMS/DB: Troubleshooting

  • UCB-EMP connectivity down:
- No failover action required.
- Wait until connectivity comes back up. Replication should resume and catch up automatically.
  • Standby RT host down:
- No failover action required.
- Bring host back up.
  • Standby PP host down:
- No failover action required.
- Bring host back up. Replication should resume and catch up automatically.
- In case of problems, rebuild shadow PP DB.
  • Active RT host down:
- Set the other RT system as active.
- If active PP system is up:
     - Perform a PP role switch.
- If active PP system is unavailable temporarily:
     - Wait for it to come back up.
     - Perform a PP role switch.
- If active PP system is unavailable for an extended period of time:
     - Force promote the other PP DB as active.
     - The old active PP DB will need to be rebuilt as shadow.
- Bring host back up.
  • Active PP host down:
- If active PP system is unavailable temporarily:
     - Wait for it to come back up.
- If active PP system is unavailable for an extended period of time:
     - Perform a RT role switch.
     - Force promote the other PP DB as active (see the sketch after this list).
     - The old active PP DB will need to be rebuilt as shadow.
- Bring host back up.
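The 'force promote' step referenced above can be done with repmgr on the surviving standby PP (aqms-nc-pp shown); a minimal sketch, noting that repmgr refuses to promote while it can still reach a running primary:
[postgres@aqms-nc-pp ~]$ repmgr standby promote --verbose
[postgres@aqms-nc-pp ~]$ repmgr cluster show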

DB: DDL Updates

===== If aqms-bk-rt/aqms-bk-pp are primary =====

0.- Use 'postgres' DB user for all queries.

1.- Alter subscription on aqms-bk-pp:

	= \c archdb1
	= SELECT pglogical.alter_subscription_add_replication_set('rt1subscr', 'ddl_sql');

2.- Create subscription on aqms-nc-rt:

	= \c rtdb2
	= SELECT pglogical.create_subscription(
	subscription_name := 'rtsubscrddl',
	provider_dsn := 'host=aqms-bk-rt.ncss-ucb.cisn.org port=5432 user=repadmin
	password=repadmin_pass_ucb dbname=rtdb1',
	replication_sets := ARRAY['ddl_sql'],
	synchronize_structure := 'FALSE',
	synchronize_data := FALSE
	);

3.- Apply schema change on aqms-bk-rt:

	= \c rtdb1
	= SELECT pglogical.replicate_ddl_command('alter table trinetdb.swarm_state add testfield bigint');

4.- Alter subscription on aqms-bk-pp:

	= \c archdb1
	= SELECT pglogical.alter_subscription_remove_replication_set('rt1subscr', 'ddl_sql');

5.- Drop subscription on aqms-nc-rt:

        = \c rtdb2
        = SELECT pglogical.drop_subscription( subscription_name := 'rtsubscrddl');


===== If aqms-nc-rt/aqms-nc-pp are primary =====

0.- Use 'postgres' DB user for all queries.

1.- Alter subscription on aqms-nc-pp:

        = \c archdb1
	= SELECT pglogical.alter_subscription_add_replication_set('rt2subscr', 'ddl_sql');

2.- Create subscription on aqms-bk-rt:

        = \c rtdb1
        = SELECT pglogical.create_subscription(
        subscription_name := 'rtsubscrddl',
        provider_dsn := 'host=aqms-nc-rt.ncss-mp.cisn.org port=5432 user=repadmin
        password=repadmin_pass_wr dbname=rtdb2',
        replication_sets := ARRAY['ddl_sql'],
        synchronize_structure := 'FALSE',
        synchronize_data := FALSE
        );

3.- Apply schema change on aqms-nc-rt:

        = \c rtdb2
        = SELECT pglogical.replicate_ddl_command('alter table trinetdb.swarm_state add testfield bigint');

4.- Alter subscription on aqms-nc-pp:

        = \c archdb1
	= SELECT pglogical.alter_subscription_remove_replication_set('rt2subscr', 'ddl_sql');

5.- Drop subscription on aqms-bk-rt:

        = \c rtdb1
        = SELECT pglogical.drop_subscription( subscription_name := 'rtsubscrddl');
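To verify the temporary DDL subscription before dropping it, pglogical's status function can be queried (a quick check using the same 'postgres' DB user; shown for rtdb1, use rtdb2 for the first sequence):

	= \c rtdb1
	= SELECT subscription_name, status FROM pglogical.show_subscription_status();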

DB: Adding New Tables

In this example, the PDL_Ids and PDL_Product tables are added.

1.- Create tables and grant permissions on rtdb1, rtdb2 and archdb1 (active PP DB):

[postgres@aqms-bk-rt PDL_Tables]$ more *.sql
::::::::::::::
create_PDL_IDS.sql
::::::::::::::
CREATE TABLE PDL_IDS
(	DBSETID BIGINT      NOT NULL,
	TNAME   VARCHAR(30) NOT NULL,
	ID      BIGINT      NOT NULL,
	LDDATE TIMESTAMP DEFAULT (CURRENT_TIMESTAMP AT TIME ZONE 'UTC'),
	CONSTRAINT PDL_IDS_PK PRIMARY KEY (DBSETID, TNAME)
);

::::::::::::::
create_PDL_PRODUCT.sql
::::::::::::::
CREATE TABLE PDL_PRODUCT
(	EVID     BIGINT      NOT NULL,
	PRODTYPE VARCHAR(30) NOT NULL,
	DEST     VARCHAR(30) NOT NULL,
	SOURCE   VARCHAR(8)  NOT NULL,
	TIME     TIMESTAMP   NOT NULL,
	ACTION   VARCHAR(2)  NOT NULL,
	DBSETID  BIGINT,
	LDDATE TIMESTAMP DEFAULT (CURRENT_TIMESTAMP AT TIME ZONE 'UTC'),
	CONSTRAINT PDL_PRO_PK   PRIMARY KEY (EVID, PRODTYPE, ACTION, DEST, TIME, SOURCE),
	CONSTRAINT PDL_PRO_TYPE CHECK (PRODTYPE IN ('origin','phase-data','moment-tensor','focal-mechanism','nearby-cities','shakemap')),
	CONSTRAINT PDL_PRO_DEST CHECK (DEST IN ('Recent_event_PDL','ComCat_PDL','dev-ComCat_PDL','dev-Recent_event_PDL')),
	CONSTRAINT PDL_PRO_ACT  CHECK (ACTION IN ('U','D','u','d'))
);

::::::::::::::
grant.sql
::::::::::::::

GRANT SELECT ON PDL_IDS TO trinetdb_read, code;
GRANT INSERT, UPDATE, DELETE ON PDL_IDS TO trinetdb_write, code;

GRANT SELECT ON PDL_PRODUCT TO trinetdb_read, code;
GRANT INSERT, UPDATE, DELETE ON PDL_PRODUCT TO trinetdb_write, code;

[postgres@aqms-bk-rt PDL_Tables]$ 

[postgres@aqms-bk-rt PDL_Tables]$ psql -U trinetdb -d rtdb1 < create_PDL_IDS.sql 
CREATE TABLE
[postgres@aqms-bk-rt PDL_Tables]$ psql -U trinetdb -d rtdb1 < create_PDL_PRODUCT.sql 
CREATE TABLE
[postgres@aqms-bk-rt PDL_Tables]$ psql -U trinetdb -d rtdb1 < grant.sql 
GRANT
GRANT
GRANT
GRANT
[postgres@aqms-bk-rt PDL_Tables]$ 

2.- Set up replication on rtdb1, rtdb2 and archdb1:

In this example, aqms-nc-rt/aqms-nc-pp are the active systems.

2.1.- Remove aqms-bk-rt replication set on aqms-nc-pp:

[postgres@aqms-nc-pp ~]$ psql -d archdb1
psql (10.14)
archdb1=# SELECT pglogical.alter_subscription_remove_replication_set('rt1subscr', 'rtrepset');
 alter_subscription_remove_replication_set 
-------------------------------------------
 t
(1 row)

archdb1=# \q
[postgres@aqms-nc-pp ~]$

2.2.- Add tables to replication set on aqms-bk-rt:

[postgres@aqms-bk-rt ~]$ psql -d rtdb1
psql (9.2.24, server 10.10)
Type "help" for help.

rtdb1=# SELECT pglogical.replication_set_add_table(set_name:='rtrepset', relation:='trinetdb.pdl_ids',synchronize_data:=TRUE);
 replication_set_add_table 
---------------------------
 t
(1 row)

rtdb1=# SELECT pglogical.replication_set_add_table(set_name:='rtrepset', relation:='trinetdb.pdl_product',synchronize_data:=TRUE);
 replication_set_add_table 
---------------------------
 t
(1 row)

rtdb1=# \q
[postgres@aqms-bk-rt ~]$ 

2.3.- Add aqms-bk-rt replication set on aqms-nc-pp:

[postgres@aqms-nc-pp ~]$ psql -d archdb1
psql (10.14)
Type "help" for help.

archdb1=# SELECT pglogical.alter_subscription_add_replication_set('rt1subscr', 'rtrepset');
 alter_subscription_add_replication_set 
----------------------------------------
 t
(1 row)

archdb1=# \q
[postgres@aqms-nc-pp ~]$

2.4.- Do an active/standby role switch (aqms-bk-rt/aqms-bk-pp are now active).

2.5.- Remove aqms-nc-rt replication set on aqms-bk-pp:

[postgres@aqms-bk-pp ~]$ psql -d archdb1
psql (10.14)
Type "help" for help.

archdb1=# SELECT pglogical.alter_subscription_remove_replication_set('rt2subscr', 'rtrepset');
 alter_subscription_remove_replication_set 
-------------------------------------------
 t
(1 row)

archdb1=# \q
[postgres@aqms-bk-pp ~]$ 

2.6.- Add tables to replication set on aqms-nc-rt:

[postgres@aqms-nc-rt ~]$ psql -d rtdb2
psql (10.16)
Type "help" for help.

rtdb2=# SELECT pglogical.replication_set_add_table(set_name:='rtrepset', relation:='trinetdb.pdl_ids',synchronize_data:=TRUE);
 replication_set_add_table 
---------------------------
 t
(1 row)

rtdb2=# SELECT pglogical.replication_set_add_table(set_name:='rtrepset', relation:='trinetdb.pdl_product',synchronize_data:=TRUE);
 replication_set_add_table 
---------------------------
 t
(1 row)

rtdb2=# \q
[postgres@aqms-nc-rt ~]$ 

2.7.- Add aqms-nc-rt replication set on aqms-bk-pp:

[postgres@aqms-bk-pp ~]$ psql -d archdb1
psql (10.14)
Type "help" for help.

archdb1=# SELECT pglogical.alter_subscription_add_replication_set('rt2subscr', 'rtrepset');
 alter_subscription_add_replication_set 
----------------------------------------
 t
(1 row)

archdb1=# \q
[postgres@aqms-bk-pp ~]$ 
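To confirm the new tables are now members of the replication set on each RT DB, the pglogical catalog can be listed; a sketch assuming pglogical 2.x internal catalog names:

rtdb1=# SELECT s.set_name, t.set_reloid::regclass AS table_name
        FROM pglogical.replication_set_table t
        JOIN pglogical.replication_set s USING (set_id);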