==== Process Control System ====
The Process Control System is a suite of database tables and small programs to
manipulate entries in the tables. See
[[SCSN>postproc:process_control_system|SCSN PCS documentation]] for
details about the tables and programs.
For the NCSS, the PCS is currently running on ucbpp.geo.berkeley.edu and
mnlodb1.wr.usgs.gov. By
running PCS on machines on either side of the Bay, some level of reliability
is added. However, careful configuration of these systems is required to
ensure that work is not duplicated where duplication would be detrimental. For
example, it would be desirable to make images for the [[postproc:drp|Duty Review Page]]
on two systems to feed independent web servers. But the fpfit program should be run
only on one system to avoid duplicate results in the database.
=== NCSS Transition Definitions ===
The heart of the process control system is the pcs_transition database
table. This table defines the actions or **states** that an event goes through
as it traverses the PCS system.
Events are entered or "posted" into the system by several different
programs. When an event enters the PCS system, it will be posted with at least
a "group", "table", "state" and "rank". The group, table, and state define a
step in the processing chain. The rank is used to assign priority when
multiple events are ready for processing at the same time. NCSS is not making
use of rank differentiation at this time; all rank values are set to 100.
If an event is posted to PCS with a positive result value, the PCS system
(database stored procedures) will immediately transition the event to the one or more
new steps that are defined in the pcs_transition table. This is what happens
when the **autoposter** enters new events into PCS. Other applications that
enter events into PCS post them with a NULL result value. Because the result
value is absent, it is left to a PCS client program to take some action on
these events and apply a result value into PCS. Only then can the event
transition to new states. Jiggle is the prime offender here, posting into the
**FINALIZE** or **DELETED** states with no result value. The client programs
**pcsFinalizeCont** and **pcsCancelCont** serve only to apply a result value
to events in FINALIZE and DELETED states, respectively.
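The difference between posting with a result and posting with a NULL result can be sketched in a few lines of Python. This is only an in-memory illustration, not the actual stored procedures: the names ''TRANSITIONS'', ''state_rows'', ''post'' and ''apply_result'' are hypothetical, and the transition rows are a small subset of the NCSS table.

```python
# Hypothetical in-memory stand-ins for the pcs_transition and pcs_state tables.
# TRANSITIONS maps (group, table, old_state, result) -> list of new states;
# None as a new state means the event leaves PCS along that branch.
TRANSITIONS = {
    ("EventStream", "dcucb", "NewEvent", 1): ["MakeDRPGif", "ExportArc"],
    ("EventStream", "dcucb", "MakeDRPGif", 1): ["FPfit"],
    ("EventStream", "dcucb", "FPfit", 1): [None],
    ("EventStream", "dcucb", "ExportArc", 1): [None],
    ("TPP", "TPP", "FINALIZE", 1): ["ALARM"],
    ("TPP", "TPP", "ALARM", 1): [None],
}

# state_rows maps (evid, group, table, state) -> result; None models a NULL result.
state_rows = {}


def apply_result(evid, group, table, state, result):
    """Record a result and follow any transitions defined for it."""
    targets = TRANSITIONS.get((group, table, state, result))
    if not targets:
        # No transition defined: the row stays put, carrying its result.
        state_rows[(evid, group, table, state)] = result
        return
    state_rows.pop((evid, group, table, state), None)
    for new_state in targets:
        if new_state is not None:
            state_rows[(evid, group, table, new_state)] = None


def post(evid, group, table, state, result=None):
    """Post an event; a NULL result waits for a client, a positive one transitions now."""
    state_rows[(evid, group, table, state)] = None
    if result is not None:
        apply_result(evid, group, table, state, result)


# Autoposter-style posting: result 1, so the event fans out immediately.
post(1001, "EventStream", "dcucb", "NewEvent", result=1)
# Jiggle-style posting: NULL result, so the event waits in FINALIZE ...
post(1002, "TPP", "TPP", "FINALIZE")
# ... until a client such as pcsFinalizeCont applies a result.
apply_result(1002, "TPP", "TPP", "FINALIZE", 1)
```

In this model, event 1001 never rests in NewEvent, while event 1002 sits in FINALIZE until a result is applied.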
The various PCS client programs (described below) periodically query the
pcs_state table to find the next event in the state for which the program is
responsible. A given program will then run the appropriate processing on the
event, and report the result of that processing back to the pcs_state table.
The act of entering the result into PCS will cause the system to transition
events to the next steps, if any, according to the pcs_transition
table. Eventually each event reaches the end of its configured processing
chain or chains and leaves the PCS system. Thus the pcs_state table does not
maintain any history of past event processing. If no transitions are defined
for a state, the event will remain in that state until someone modifies or
deletes it.
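The polling cycle of a client program can be illustrated with a small, self-contained sketch that uses an in-memory SQLite table. Everything here is an assumption made for illustration: the column names (''grp'' and ''tbl'' stand in for group and table, which are SQL keywords), the ordering by rank, and the helper names. The real clients talk to the datacenter database, and the transitions themselves are performed by stored procedures, which this sketch does not model.

```python
import sqlite3


def make_db():
    """Create an in-memory stand-in for pcs_state (hypothetical columns)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE pcs_state (
               evid INTEGER, grp TEXT, tbl TEXT, state TEXT,
               "rank" INTEGER, result INTEGER)"""
    )
    return db


def next_event(db, grp, tbl, state):
    """Return the evid of the next unprocessed event in a state, or None."""
    row = db.execute(
        """SELECT evid FROM pcs_state
           WHERE grp = ? AND tbl = ? AND state = ? AND result IS NULL
           ORDER BY "rank" DESC, evid
           LIMIT 1""",
        (grp, tbl, state),
    ).fetchone()
    return row[0] if row else None


def run_once(db, grp, tbl, state, process):
    """One polling cycle: fetch the next event, process it, record the result."""
    evid = next_event(db, grp, tbl, state)
    if evid is None:
        return None  # nothing waiting in this state
    result = process(evid)  # 1 means success in the NCSS convention
    db.execute(
        """UPDATE pcs_state SET result = ?
           WHERE evid = ? AND grp = ? AND tbl = ? AND state = ?""",
        (result, evid, grp, tbl, state),
    )
    return evid, result
```

A client would call ''run_once'' in a loop with a sleep between passes, handing in its own processing function.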
In the NCSS, the only general rule for result values is that **1** indicates
success; any other positive result value means some sort of failure of the
program to handle the event state. The user should review the program's log
files to determine the cause of any errors. Once appropriate corrective action
is taken, the user can use the **result** program to apply a result value of
1. That action will allow the event to continue through or out of PCS.
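As a quick illustration of the result convention, the sketch below classifies result values and picks out the rows that need operator attention. The function names and the row layout are hypothetical. The text does not say how zero or negative values are treated, so the sketch labels them "unspecified".

```python
def classify_result(result):
    """Interpret a PCS result value under the NCSS convention."""
    if result is None:
        return "pending"      # NULL: no client has reported yet
    if result == 1:
        return "success"
    if result > 1:
        return "failure"      # any other positive value
    return "unspecified"      # zero or negative: not covered by the convention


def needs_attention(rows):
    """Return (evid, state) for rows whose result indicates a failure.

    rows is an iterable of (evid, state, result) tuples (hypothetical layout).
    """
    return [(evid, state) for evid, state, result in rows
            if classify_result(result) == "failure"]
```

Rows returned by ''needs_attention'' are the ones whose program logs should be reviewed before a result of 1 is applied.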
Below are the current state transitions used by NCSS.
The "EventStream" processing threads are for **new** events, usually reported
by the real-time systems. There will be "EventStream" threads defined for each
datacenter database. Each thread is independent and interacts with the
datacenter database described in its "table" attribute: "dcmp2" or "dcucb".
Within the "EventStream" group, there are threads for **NewEvent**, which are
"binder" events from the [[rtem:ovrv|RT systems]], and threads for
**NewTrigger**, which are subnet trigger events from the RT systems.
Subnet trigger events from the RT systems are loaded into the NewTrigger state
by the autoposter. See ~ncss/ncpp/sbin/autoposter_rows*.sql for the SQL
statement to set up the autoposter.
For binder events from the RT system, NCSS originally used the autoposter but
found that it entered events into PCS __too quickly__. Events would enter the
PCS system before there was sufficient event parametric information replicated
from the RT database to the DC database. As a result, some of the PCS client
programs would try to work on new events without sufficient
information. Instead the following system was devised to load binder events
into the NewEvent state:
* On each RT system [[rtem:sigswitch]] runs configured with sigswitch_event.cfg. This is a [[CMS]] client that subscribes to **/signals/trimag/magnitude** messages, indicating that the RT system has tried to compute Md and Ml magnitudes. Sigswitch immediately emits any messages it receives under the subject **/signals/RTUCB/event** (on ucbrt) or **/signals/RTMP/event** (on mnlort1).
* On each post-processing system, there are two instances of [[rtem:runner_eqrun|runner]], configured with **event_postit_runner_ucb.cfg** and **event_postit_runner_mp.cfg**.
* When these instances of **runner** receive messages from their RT system, they run the script **/home/ncss/run/bin/event_postit**. This causes the event to be posted into the PCS state table in the **EventStream NewEvent** state with a result of **1**. This action causes the event to immediately transition into each of the PCS states for which transitions are defined (below).
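The relay chain described in the list above can be sketched as two small functions. This is a simplified model, not the sigswitch or runner code: the function names are hypothetical, the message payload is treated as an opaque event id, and the arguments that event_postit actually takes are not documented here, so the sketch only returns a description of the posting.

```python
# Subjects and host names are taken from the list above.
INPUT_SUBJECT = "/signals/trimag/magnitude"
OUTPUT_SUBJECT = {
    "ucbrt": "/signals/RTUCB/event",
    "mnlort1": "/signals/RTMP/event",
}


def sigswitch_relay(host, subject, payload):
    """Re-emit a trimag magnitude signal under the host's event subject."""
    if subject != INPUT_SUBJECT:
        return None  # not a message sigswitch subscribes to
    return OUTPUT_SUBJECT[host], payload


def runner_handle(subject, payload):
    """Describe the PCS posting made when runner receives an event signal."""
    if subject not in OUTPUT_SUBJECT.values():
        return None
    # Posting with result 1 makes the event transition immediately.
    return {"evid": payload, "group": "EventStream",
            "state": "NewEvent", "result": 1}


# Example: a magnitude signal on ucbrt ends up as a NewEvent posting.
relayed = sigswitch_relay("ucbrt", INPUT_SUBJECT, 71234567)
posting = runner_handle(*relayed)
```

The point of the extra hop is timing: the posting is triggered by the magnitude signal rather than by the first appearance of the event in the database.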
For the EventStream **AssocTrig** state, there is a similar system of
sigswitch and runner to feed subnet trigger - binder event associations that
are published by [[rtem:ntrcg2]] on the RT systems. This is experimental work
to evaluate the new association scheme used in the [[rtem:tc2]] trigger coordinator.
The "TPP" (a hard-coded name from Caltech meaning something like Tomato Paste
Processing :-) ) thread handles events that are deleted or updated by various tools
like the [[PostProc:drp|Duty Review Page]] and [[PostProc:Jiggle]].
The list below is broken into groups to indicate the various chains of
processing. Since a given state (shown here as the left three columns) may be
listed at the top of several groups, that shows the branching that the event
takes as it enters that part of PCS.
Note that under the "Backup post-proc" table, several transitions are missing
the expected "NewEvent" input state. That means that new events will not
transition into those states. But if an event was in that state when that
system was reconfigured from "active" system to "standby" system, the event
will get the processing it needs. If those transitions were missing from the
table, then events would get stranded in the PCS state table of the newly
configured backup system.
==Active new event post-proc system:==
^ group ^ table ^ stateold ^ result ^ statenew ^ rank ^
| EventStream | dcucb | NewEvent | 1 | MakeDRPGif | 100 |
| EventStream | dcucb | MakeDRPGif | 1 | FPfit | 100 |
| EventStream | dcucb | FPfit | 1 | null | NULL |
| EventStream | dcucb | NewEvent | 1 | ExportAmps | 100 |
| EventStream | dcucb | ExportAmps | 1 | ExportWF | 100 |
| EventStream | dcucb | ExportWF | 1 | null | NULL |
| EventStream | dcucb | NewEvent | 1 | ExportArc | 100 |
| EventStream | dcucb | ExportArc | 1 | null | NULL |
| EventStream | dcucb | NewEvent | 1 | ddrtFeed | 100 |
| EventStream | dcucb | ddrtFeed | 1 | null | NULL |
| EventStream | dcucb | NewEvent | 1 | SwarmAlarm | 100 |
| EventStream | dcucb | SwarmAlarm | 1 | null | NULL |
| EventStream | dcucb | NewTrigger | 1 | MakeTrigGif | 100 |
| EventStream | dcucb | MakeTrigGif | 1 | null | NULL |
| EventStream | dcucb | AssocTrig | 1 | TrigCheck | 100 |
| EventStream | dcucb | TrigCheck | 1 | null | NULL |
==Reviewed Event post-proc system:==
^ group ^ table ^ stateold ^ result ^ statenew ^ rank ^
| TPP | TPP | FINALIZE | 1 | MakeDRPGif | 100 |
| TPP | TPP | MakeDRPGif | 1 | null | NULL |
| TPP | TPP | FINALIZE | 1 | ALARM | 100 |
| TPP | TPP | ALARM | 1 | null | NULL |
| TPP | TPP | FINALIZE | 1 | ddrtFeed | 100 |
| TPP | TPP | ddrtFeed | 1 | null | NULL |
| TPP | TPP | FINALIZE | 1 | ExportArc | 100 |
| TPP | TPP | ExportArc | 1 | null | NULL |
| TPP | TPP | FINALIZE | 1 | FPfit | 100 |
| TPP | TPP | FPfit | 1 | null | NULL |
| TPP | TPP | DELETED | 1 | CANCELALARM | 100 |
| TPP | TPP | CANCELALARM | 1 | DeleteArc | 100 |
| TPP | TPP | DeleteArc | 1 | null | NULL |
| TPP | TPP | DELETED | 1 | DeleteDDRT | 100 |
| TPP | TPP | DeleteDDRT | 1 | null | NULL |
| TPP | TPP | REPOP | 1 | null | NULL |
==Back-up new event post-proc system:==
^ group ^ table ^ stateold ^ result ^ statenew ^ rank ^
| EventStream | dcmp2 | NewEvent | 1 | MakeDRPGif | 100 |
| EventStream | dcmp2 | MakeDRPGif | 1 | null | NULL |
| EventStream | dcmp2 | NewTrigger | 1 | MakeTrigGif | 100 |
| EventStream | dcmp2 | MakeTrigGif | 1 | null | NULL |
| EventStream | dcmp2 | AssocTrig | 1 | TrigCheck | 100 |
| EventStream | dcmp2 | TrigCheck | 1 | null | NULL |
| EventStream | dcmp2 | ExportAmps | 1 | ExportWF | 100 |
| EventStream | dcmp2 | ExportWF | 1 | null | NULL |
| EventStream | dcmp2 | ExportArc | 1 | null | NULL |
| EventStream | dcmp2 | FPfit | 1 | null | NULL |
| EventStream | dcmp2 | SwarmAlarm | 1 | null | NULL |
| EventStream | dcmp2 | ddrtFeed | 1 | null | NULL |
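The "drain-only" states on the backup system (states that have an exit transition but no way in) can be found mechanically from the transition rows. The sketch below is only an illustration: the rows are copied from the EventStream tables above, the helper names are hypothetical, and it assumes that NewEvent, NewTrigger and AssocTrig are the only states posted from outside PCS.

```python
# Rows copied from the tables above: group table stateold result statenew rank.
ROWS = """\
EventStream dcucb NewEvent 1 MakeDRPGif 100
EventStream dcucb MakeDRPGif 1 FPfit 100
EventStream dcucb FPfit 1 null NULL
EventStream dcucb NewEvent 1 ExportAmps 100
EventStream dcucb ExportAmps 1 ExportWF 100
EventStream dcucb ExportWF 1 null NULL
EventStream dcucb NewEvent 1 ExportArc 100
EventStream dcucb ExportArc 1 null NULL
EventStream dcucb NewEvent 1 ddrtFeed 100
EventStream dcucb ddrtFeed 1 null NULL
EventStream dcucb NewEvent 1 SwarmAlarm 100
EventStream dcucb SwarmAlarm 1 null NULL
EventStream dcucb NewTrigger 1 MakeTrigGif 100
EventStream dcucb MakeTrigGif 1 null NULL
EventStream dcucb AssocTrig 1 TrigCheck 100
EventStream dcucb TrigCheck 1 null NULL
EventStream dcmp2 NewEvent 1 MakeDRPGif 100
EventStream dcmp2 MakeDRPGif 1 null NULL
EventStream dcmp2 NewTrigger 1 MakeTrigGif 100
EventStream dcmp2 MakeTrigGif 1 null NULL
EventStream dcmp2 AssocTrig 1 TrigCheck 100
EventStream dcmp2 TrigCheck 1 null NULL
EventStream dcmp2 ExportAmps 1 ExportWF 100
EventStream dcmp2 ExportWF 1 null NULL
EventStream dcmp2 ExportArc 1 null NULL
EventStream dcmp2 FPfit 1 null NULL
EventStream dcmp2 SwarmAlarm 1 null NULL
EventStream dcmp2 ddrtFeed 1 null NULL
"""

# Assumption: these are the states that programs outside PCS post into.
ENTRY_STATES = {"NewEvent", "NewTrigger", "AssocTrig"}


def parse_rows(text):
    """Parse whitespace-separated rows into (table, old_state, new_state)."""
    rows = []
    for line in text.strip().splitlines():
        _group, table, old, _result, new, _rank = line.split()
        rows.append((table, old, None if new == "null" else new))
    return rows


def drain_only_states(rows, table):
    """States with an exit transition that no other transition leads into."""
    olds = {old for t, old, _ in rows if t == table}
    news = {new for t, _, new in rows if t == table and new is not None}
    return sorted(olds - news - ENTRY_STATES)


rows = parse_rows(ROWS)
print("dcmp2:", drain_only_states(rows, "dcmp2"))
print("dcucb:", drain_only_states(rows, "dcucb"))
```

With these rows, the dcmp2 list contains ExportAmps, ExportArc, FPfit, SwarmAlarm and ddrtFeed, while the dcucb list is empty. That matches the description above: the backup system can finish work already in those states but does not start new work in them.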
==== PCS Tasks and the Programs That Do Them ====
Currently the PCS system is used:
* For new events (detected by binder_ew on RT systems):
^ Program Name ^ PCS //state// name ^ Description ^
| [[PostProc:makedrpcont]] | MakeDRPGif | create the waveform image (snapshot) for the [[PostProc:drp|Duty Review Page]]. |
| [[PostProc:fp_cont]] | FPfit | run fpfit to generate fault plane solutions for qualifying events. |
| [[PostProc:arcExportCont]] | ExportArc | create hypoinverse archive messages from the database catalog and export them to interested parties. |
| [[PostProc:exportamps]] | ExportAmps | generate strong ground motion packets for qualifying events for export to CISN partners. |
| [[PostProc:makeV0Cont]] | ExportWF | generate COSMOS V0 files for qualifying events for use by the [[http://www.quake.ca.gov/cisn-edc/|CISN Engineering Data Center]]. |
| [[PostProc:ddrtFeedCont]] | ddrtFeed | generate hypoinverse archive messages and feed them to the real-time-double-difference system |
| [[PostProc:swarmon]] | SwarmAlarm | detect and report earthquake swarms in configured regions. |
* For new events of type "st" (subnet triggers from the RT systems):
^ Program Name ^ PCS //state// name ^ Description ^
| [[PostProc:makeTrigCont]] | MakeTrigGif | Create the waveform image for the [[PostProc:drp|Duty Review Page]]. |
* For new subnet triggers that have been associated with binder events in [[RTEM:tc2|Trigger Coordinator]]
^ Program Name ^ PCS //state// name ^ Description ^
| [[PostProc:trigCheckCont]] | TrigCheck | Experimental code to compare trigger events and binder events. |
* For the post-processing //TPP// group, there are several points at which events can enter PCS. The main entry points are:
* To the //FINALIZE// state, caused by a Jiggle user finalizing an event.
* To the //DELETED// state, caused by Jiggle or Duty Review Page users deleting an event.
* To the //ALARM// state when the Duty Review Page user accepts an event or when the [[postproc:tmts|Web TMTS]] user publishes a moment tensor solution.
* To the //MakeDRPGif// state when the Duty Review Page user presses the "Remake GIF" button.
^ Program Name ^ PCS //state// name ^ Description ^
| [[PostProc:pcsFinalizeCont]] | FINALIZE | Jiggle posts this state for events that have been finalized. |
| [[PostProc:pcsAlarmCont]] | ALARM | submit the event to the [[postproc:alarm]] system. |
| [[PostProc:finalDRPcont]] | MakeDRPGif | create the waveform image (snapshot) for the [[PostProc:drp|Duty Review Page]]. |
| [[PostProc:fp_cont]] | FPfit | run fpfit to generate fault plane solutions for qualifying events. |
| [[PostProc:arcExportCont]] | ExportArc | create hypoinverse archive messages from the database catalog and export them to interested parties. |
| [[PostProc:pcsCancelCont]] | DELETED | Jiggle posts this state for events that have been deleted from the catalog. |
| [[PostProc:pcsCancelAlarmCont]] | CANCELALARM | submit the event to the [[RT:alarm|real-time]] and [[postproc:alarm|data center]] alarm systems to cancel any alarm actions. |
| [[PostProc:ddrtFeedCont]] | ddrtFeed | generate hypoinverse archive messages and feed them to the real-time-double-difference system |
| [[PostProc:ddrtDeleteCont]] | DeleteDDRT | generate archive delete messages and feed them to the real-time-double-difference system |
| [[PostProc:arcDeleteCont]] | DeleteArc | generate archive delete messages and export them to interested parties. |
| [[PostProc:repopWFcont]] | REPOP | replace existing event waveforms with new selection based on revised event parameters. |