Process Control System
The Process Control System is a suite of database tables and small programs to manipulate entries in the tables. See SCSN PCS documentation for details about the tables and programs.
For the NCSS, the PCS is currently running on ucbpp.geo.berkeley.edu and mnlodb1.wr.usgs.gov. By running PCS on machines on either side of the Bay, some level of reliability is added. However, careful configuration of these systems is required to ensure that work is not duplicated where duplication would be detrimental. For example, it would be desirable to make images for the Duty Review Page on two systems to feed independent web servers. But the fpfit program should be run on only one system to avoid duplicate results in the database.
NCSS Transition Definitions
The heart of the process control system is the pcs_transition database table. This table defines the actions or states that an event goes through as it traverses the PCS system.
Events are entered or “posted” into the system by several different programs. When an event enters the PCS system, it will be posted with at least a “group”, “table”, “state” and “rank”. The group, table, and state define a step in the processing chain. The rank is used to assign priority when multiple events are ready for processing at the same time. NCSS is not making use of rank differentiation at this time; all rank values are set to 100.
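As an illustration only, the attributes of a posting can be modeled as a small record. The class name `Posting`, the `evid` field, and the selection rule in `next_to_process` are hypothetical. In particular, this sketch assumes that a larger rank means higher priority and that ties go to the lowest event id; the actual ordering is defined by the PCS stored procedures.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_RANK = 100  # NCSS currently posts every event with rank 100


@dataclass(frozen=True)
class Posting:
    """Hypothetical in-memory stand-in for one posted event."""
    evid: int                     # event id (assumed field name)
    group: str                    # e.g. "EventStream" or "TPP"
    table: str                    # e.g. "dcucb", "dcmp2" or "TPP"
    state: str                    # e.g. "NewEvent"
    rank: int = DEFAULT_RANK
    result: Optional[int] = None  # None models a NULL result value


def next_to_process(postings):
    """Pick the pending posting with the largest rank (assumed priority rule).

    Ties are broken by the lowest evid, which is also an assumption.
    Returns None when no posting is waiting for a result.
    """
    pending = [p for p in postings if p.result is None]
    if not pending:
        return None
    return max(pending, key=lambda p: (p.rank, -p.evid))
```

With every rank set to 100, as in the current NCSS configuration, this rule reduces to taking the lowest pending event id.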
If an event is posted to PCS with a positive result value, the PCS system (database stored procedures) will immediately transition the event to the one or more new steps that are defined in the pcs_transition table. This is what happens when the autoposter enters new events into PCS. Other applications that enter events into PCS post them with a NULL result value. Because the result value is absent, it is left to a PCS client program to take some action on these events and apply a result value into PCS. Only then can the event transition to new states. Jiggle is the prime offender here, posting into the FINALIZE or DELETED states with no result value. The client programs pcsFinalizeCont and pcsCancelCont serve only to apply a result value to events in the FINALIZE and DELETED states, respectively.
The various PCS client programs (described below) periodically query the pcs_state table to find the next event in the state for which the program is responsible. A given program will then run the appropriate processing on the event, and report the result of that processing back to the pcs_state table.
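The polling cycle described above can be sketched roughly as follows. This is not the code of any actual PCS client. The three callables `fetch_next`, `process` and `report` are hypothetical placeholders for the database query, the per-state processing and the write-back of the result, and the failure code 2 is an arbitrary choice.

```python
import time


def run_client(fetch_next, process, report, poll_seconds=5.0, max_polls=None):
    """Poll for work, process each event, and report a result value.

    fetch_next() -> event id or None   (hypothetical query of pcs_state)
    process(evid) -> True on success   (hypothetical per-state processing)
    report(evid, result)               (hypothetical write-back of the result)
    max_polls limits the number of polls; None means run forever.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        evid = fetch_next()
        if evid is None:
            time.sleep(poll_seconds)  # nothing ready; wait before asking again
            continue
        try:
            ok = process(evid)
        except Exception:
            ok = False  # a crash in processing is reported as a failure
        # 1 = success; 2 is an arbitrary failure code chosen for this sketch
        report(evid, 1 if ok else 2)
```

Reporting a failure code rather than leaving the result NULL keeps the client from picking up the same broken event on every poll.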
The act of entering the result into PCS will cause the system to transition events to the next steps, if any, according to the pcs_transition table. Eventually each event reaches the end of its configured processing chain or chains and leaves the PCS system. Thus the pcs_state table does not maintain any history of past event processing. If no transitions are defined for a state, the event will remain in that state until someone modifies or deletes it.
In the NCSS, the only general rule for result values is that 1 indicates success; any other positive result value means some sort of failure of the program to handle the event state. The user should review the program's log files to determine the cause of any errors. Once appropriate corrective action is taken, the user can use the result program to apply a result value of 1. That action will allow the event to continue through or out of PCS.
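The transition behavior can be modeled with a toy in-memory version of the two tables. The real logic lives in database stored procedures, so everything here (the dictionary layout, the function names `post` and `apply_result`, and the sample event id) is a simplified assumption meant only to show the flow. The sample transitions are a subset of the dcucb rows listed later in this section.

```python
# Hypothetical model: (group, table, old state, result) -> list of new states.
# An empty list stands for a null new state, i.e. the event leaves PCS.
TRANSITIONS = {
    ("EventStream", "dcucb", "NewEvent", 1): ["MakeDRPGif", "ExportArc"],
    ("EventStream", "dcucb", "MakeDRPGif", 1): ["FPfit"],
    ("EventStream", "dcucb", "FPfit", 1): [],
    ("EventStream", "dcucb", "ExportArc", 1): [],
}


def apply_result(state_rows, evid, group, table, state, result):
    """Record a result and follow any matching transitions.

    state_rows maps (evid, group, table, state) -> result (None models NULL).
    """
    key = (evid, group, table, state)
    new_states = TRANSITIONS.get((group, table, state, result))
    if new_states is None:
        # No transition defined for this result: the row stays where it is.
        state_rows[key] = result
        return
    state_rows.pop(key, None)  # no history is kept for completed steps
    for new_state in new_states:
        state_rows[(evid, group, table, new_state)] = None


def post(state_rows, evid, group, table, state, result=None):
    """Post an event; a non-NULL result triggers transitions immediately."""
    state_rows[(evid, group, table, state)] = None
    if result is not None:
        apply_result(state_rows, evid, group, table, state, result)


if __name__ == "__main__":
    rows = {}
    post(rows, 42, "EventStream", "dcucb", "NewEvent", result=1)
    print(sorted(state for (_, _, _, state) in rows))
```

In this model, applying a result of 2 to the ExportArc row leaves it in place with the failure recorded, and a later result of 1 removes it, which mirrors the manual recovery step of applying a result value of 1 after a failure.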
Below are the current state transitions used by NCSS.
The “EventStream” processing threads are for new events, usually reported by the real-time systems. There will be “EventStream” threads defined for each datacenter database. Each thread is independent and interacts with the datacenter database described in its “table” attribute: “dcmp2” or “dcucb”.
Within the “EventStream” group, there are threads for NewEvent, which handle “binder” events from the RT systems, and threads for NewTrigger, which handle subnet trigger events from the RT systems.
Subnet trigger events from the RT systems are loaded into the NewTrigger state by the autoposter. See ~ncss/ncpp/sbin/autoposter_rows*.sql for the SQL statement to set up the autoposter.
For binder events from the RT system, NCSS originally used the autoposter but found that it entered events into PCS too quickly. Events would enter the PCS system before there was sufficient event parametric information replicated from the RT database to the DC database. As a result, some of the PCS client programs would try to work on new events without sufficient information. Instead, the following system was devised to load binder events into the NewEvent state:
- On each RT system, sigswitch runs configured with sigswitch_event.cfg. This is a CMS client that subscribes to /signals/trimag/magnitude messages, indicating that the RT system has tried to compute Md and Ml magnitudes. Sigswitch immediately emits any messages it receives under the subject /signals/RTUCB/event (on ucbrt) or /signals/RTMP/event (on mnlort1).
- On each post-processing system, there are two instances of runner, configured with event_postit_runner_ucb.cfg and event_postit_runner_mp.cfg.
- When these instances of runner receive messages from their RT system, they run the script /home/ncss/run/bin/event_postit. This causes the event to be posted into the PCS state table in the EventStream NewEvent state with a result of 1. This action causes the event to immediately transition into each of the PCS states for which transitions are defined (below).
For the EventStream AssocTrig state, there is a similar system of sigswitch and runner to feed the associations between subnet triggers and binder events that are published by ntrcg2 on the RT systems. This is experimental work to evaluate the new association scheme used in the tc2 trigger coordinator.
The “TPP” (a hard-coded name from Caltech meaning something like Tomato Paste Processing) thread handles events that are deleted or updated by various tools such as the Duty Review Page and Jiggle.
The list below is broken into groups to indicate the various chains of processing. Since a given state (shown here as the left three columns) may be listed at the top of several groups, that shows the branching that the event takes as it enters that part of PCS.
Note that under the “Backup post-proc” table, several transitions are missing the expected “NewEvent” input state. That means that new events will not transition into those states. But if an event was in that state when that system was reconfigured from “active” system to “standby” system, the event will get the processing it needs. If those transitions were missing from the table, then events would get stranded in the PCS state table of the newly configured backup system.
Active new event post-proc system:
group | table | stateold | result | statenew | rank |
---|---|---|---|---|---|
EventStream | dcucb | NewEvent | 1 | MakeDRPGif | 100 |
EventStream | dcucb | MakeDRPGif | 1 | FPfit | 100 |
EventStream | dcucb | FPfit | 1 | null | NULL |
EventStream | dcucb | NewEvent | 1 | ExportAmps | 100 |
EventStream | dcucb | ExportAmps | 1 | ExportWF | 100 |
EventStream | dcucb | ExportWF | 1 | null | NULL |
EventStream | dcucb | NewEvent | 1 | ExportArc | 100 |
EventStream | dcucb | ExportArc | 1 | null | NULL |
EventStream | dcucb | NewEvent | 1 | ddrtFeed | 100 |
EventStream | dcucb | ddrtFeed | 1 | null | NULL |
EventStream | dcucb | NewEvent | 1 | SwarmAlarm | 100 |
EventStream | dcucb | SwarmAlarm | 1 | null | NULL |
EventStream | dcucb | NewTrigger | 1 | MakeTrigGif | 100 |
EventStream | dcucb | MakeTrigGif | 1 | null | NULL |
EventStream | dcucb | AssocTrig | 1 | TrigCheck | 100 |
EventStream | dcucb | TrigCheck | 1 | null | NULL |
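To make the branching easier to see, the dcucb rows above can be walked as a graph. The sketch below copies the stateold and statenew columns into a list of pairs (with `None` standing for a null new state) and lists every chain that starts at a given entry state. The function name `chains` is hypothetical, and the walk assumes the table contains no cycles.

```python
# (stateold, statenew) pairs copied from the EventStream/dcucb rows above.
ACTIVE_DCUCB = [
    ("NewEvent", "MakeDRPGif"), ("MakeDRPGif", "FPfit"), ("FPfit", None),
    ("NewEvent", "ExportAmps"), ("ExportAmps", "ExportWF"), ("ExportWF", None),
    ("NewEvent", "ExportArc"), ("ExportArc", None),
    ("NewEvent", "ddrtFeed"), ("ddrtFeed", None),
    ("NewEvent", "SwarmAlarm"), ("SwarmAlarm", None),
    ("NewTrigger", "MakeTrigGif"), ("MakeTrigGif", None),
    ("AssocTrig", "TrigCheck"), ("TrigCheck", None),
]


def chains(transitions, start):
    """Return every processing chain that begins at `start`.

    Assumes the transition table is acyclic.
    """
    next_states = {}
    for old, new in transitions:
        next_states.setdefault(old, []).append(new)

    found = []

    def walk(state, path):
        for new in next_states.get(state, []):
            if new is None:
                found.append(path)  # null new state: the chain ends here
            else:
                walk(new, path + [new])

    walk(start, [start])
    return found


if __name__ == "__main__":
    for chain in chains(ACTIVE_DCUCB, "NewEvent"):
        print(" -> ".join(chain))
```

For the NewEvent entry state this yields five independent chains, one per branch of the table.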
Reviewed Event post-proc system:
group | table | stateold | result | statenew | rank |
---|---|---|---|---|---|
TPP | TPP | FINALIZE | 1 | MakeDRPGif | 100 |
TPP | TPP | MakeDRPGif | 1 | null | NULL |
TPP | TPP | FINALIZE | 1 | ALARM | 100 |
TPP | TPP | ALARM | 1 | null | NULL |
TPP | TPP | FINALIZE | 1 | ddrtFeed | 100 |
TPP | TPP | ddrtFeed | 1 | null | NULL |
TPP | TPP | FINALIZE | 1 | ExportArc | 100 |
TPP | TPP | ExportArc | 1 | null | NULL |
TPP | TPP | FINALIZE | 1 | FPfit | 100 |
TPP | TPP | FPfit | 1 | null | NULL |
TPP | TPP | DELETED | 1 | CANCELALARM | 100 |
TPP | TPP | CANCELALARM | 1 | DeleteArc | 100 |
TPP | TPP | DeleteArc | 1 | null | NULL |
TPP | TPP | DELETED | 1 | DeleteDDRT | 100 |
TPP | TPP | DeleteDDRT | 1 | null | NULL |
TPP | TPP | REPOP | 1 | null | NULL |
Back-up new event post-proc system:
group | table | stateold | result | statenew | rank |
---|---|---|---|---|---|
EventStream | dcmp2 | NewEvent | 1 | MakeDRPGif | 100 |
EventStream | dcmp2 | MakeDRPGif | 1 | null | NULL |
EventStream | dcmp2 | NewTrigger | 1 | MakeTrigGif | 100 |
EventStream | dcmp2 | MakeTrigGif | 1 | null | NULL |
EventStream | dcmp2 | AssocTrig | 1 | TrigCheck | 100 |
EventStream | dcmp2 | TrigCheck | 1 | null | NULL |
EventStream | dcmp2 | ExportAmps | 1 | ExportWF | 100 |
EventStream | dcmp2 | ExportWF | 1 | null | NULL |
EventStream | dcmp2 | ExportArc | 1 | null | NULL |
EventStream | dcmp2 | FPfit | 1 | null | NULL |
EventStream | dcmp2 | SwarmAlarm | 1 | null | NULL |
EventStream | dcmp2 | ddrtFeed | 1 | null | NULL |
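One way to check a backup configuration such as the dcmp2 rows above is to list the states that have outgoing transitions but cannot be reached from any entry state. Those are the "drain only" states: new events never enter them, but an event already sitting in one when the system becomes the standby can still finish and leave PCS. The sketch below is illustrative; the function name `unreachable_states` and the choice of entry states are assumptions.

```python
# (stateold, statenew) pairs copied from the EventStream/dcmp2 rows above.
BACKUP_DCMP2 = [
    ("NewEvent", "MakeDRPGif"), ("MakeDRPGif", None),
    ("NewTrigger", "MakeTrigGif"), ("MakeTrigGif", None),
    ("AssocTrig", "TrigCheck"), ("TrigCheck", None),
    ("ExportAmps", "ExportWF"), ("ExportWF", None),
    ("ExportArc", None), ("FPfit", None),
    ("SwarmAlarm", None), ("ddrtFeed", None),
]

# States into which events are posted from outside PCS (assumed list).
ENTRY_STATES = {"NewEvent", "NewTrigger", "AssocTrig"}


def unreachable_states(transitions, entry_states):
    """Return states with transitions defined that no entry state leads to."""
    next_states = {}
    for old, new in transitions:
        next_states.setdefault(old, []).append(new)

    seen = set()
    stack = list(entry_states)
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        stack.extend(n for n in next_states.get(state, []) if n is not None)

    return sorted(set(next_states) - seen)


if __name__ == "__main__":
    print(unreachable_states(BACKUP_DCMP2, ENTRY_STATES))
```

For the rows above, the drain-only states are ExportAmps, ExportArc, ExportWF, FPfit, SwarmAlarm and ddrtFeed, all of which are fed from NewEvent on the active system.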
PCS Tasks and the Programs That Do Them
Currently the PCS system is used:
- For new events detected by binder_ew on RT systems:
Program Name | PCS state name | Description |
---|---|---|
makedrpcont | MakeDRPGif | create the waveform image (snapshot) for the Duty Review Page. |
fp_cont | FPfit | run fpfit to generate fault plane solutions for qualifying events. |
arcExportCont | ExportArc | create hypoinverse archive messages from the database catalog and export them to interested parties. |
exportamps | ExportAmps | generate strong ground motion packets for qualifying events for export to CISN partners. |
makeV0Cont | ExportWF | generate COSMOS V0 files for qualifying events for use by the CISN Engineering Data Center. |
ddrtFeedCont | ddrtFeed | generate hypoinverse archive messages and feed them to the real-time-double-difference system |
swarmon | SwarmAlarm | detect and report earthquake swarms in configured regions. |
- For new events of type “st” (subnet triggers from the RT systems):
Program Name | PCS state name | Description |
---|---|---|
makeTrigCont | MakeTrigGif | Create the waveform image for the Duty Review Page. |
- For new subnet triggers that have been associated with binder events in Trigger Coordinator:
Program Name | PCS state name | Description |
---|---|---|
trigCheckCont | TrigCheck | Experimental code to compare trigger events and binder events. |
- For the post-processing TPP group, there are several points at which events can enter PCS. The main entry points are:
- To the FINALIZE state, caused by a Jiggle user finalizing an event.
- To the DELETED state, caused by Jiggle or Duty Review Page users deleting an event.
- To the ALARM state when the Duty Review Page user accepts an event or when the Web TMTS user publishes a moment tensor solution.
- To the MakeDRPGif state when the Duty Review Page user presses the “Remake GIF” button.
Program Name | PCS state name | Description |
---|---|---|
pcsFinalizeCont | FINALIZE | Jiggle posts this state for events that have been finalized. |
pcsAlarmCont | ALARM | submit the event to the alarm system. |
finalDRPcont | MakeDRPGif | create the waveform image (snapshot) for the Duty Review Page. |
fp_cont | FPfit | run fpfit to generate fault plane solutions for qualifying events. |
arcExportCont | ExportArc | create hypoinverse archive messages from the database catalog and export them to interested parties. |
pcsCancelCont | DELETED | Jiggle posts this state for events that have been deleted from the catalog. |
pcsCancelAlarmCont | CANCELALARM | |
ddrtFeedCont | ddrtFeed | generate hypoinverse archive messages and feed them to the real-time-double-difference system |
ddrtDeleteCont | DeleteDDRT | generate archive delete messages and feed them to the real-time-double-difference system |
arcDeleteCont | DeleteArc | generate archive delete messages and export them to interested parties. |
repopWFcont | REPOP | replace existing event waveforms with new selection based on revised event parameters. |