==== Process Control System ====

The Process Control System (PCS) is a suite of database tables and small programs that manipulate entries in those tables. See [[SCSN>postproc:process_control_system|SCSN PCS documentation]] for details about the tables and programs.

For the NCSS, the PCS is currently running on ucbpp.geo.berkeley.edu and mnlodb1.wr.usgs.gov. Running PCS on machines on either side of the Bay adds some level of reliability. However, these systems must be configured carefully so that work is not duplicated where duplication would be detrimental. For example, it is desirable to make images for the [[postproc:drp|Duty Review Page]] on both systems to feed independent web servers, but the fpfit program should run on only one system to avoid duplicate results in the database.

=== NCSS Transition Definitions ===

The heart of the process control system is the pcs_transition database table. This table defines the actions or **states** that an event goes through as it traverses PCS. Events are entered or "posted" into the system by several different programs. When an event enters PCS, it is posted with at least a "group", "table", "state" and "rank". The group, table, and state define a step in the processing chain. The rank assigns priority when multiple events are ready for processing at the same time. NCSS is not making use of rank differentiation at this time; all rank values are set to 100.

If an event is posted to PCS with a positive result value, the PCS database stored procedures immediately transition it to the one or more new steps that are defined in the pcs_transition table. This is what happens when the **autoposter** enters new events into PCS. Other applications that enter events into PCS post them with a NULL result value. Because the result value is absent, it is left to a PCS client program to take some action on these events and apply a result value in PCS.
Only then can the event transition to new states. Jiggle is the prime offender here, posting into the **FINALIZE** or **DELETED** states with no result value. The client programs **pcsFinalizeCont** and **pcsCancelCont** serve only to apply a result value to events in the FINALIZE and DELETED states, respectively.

The various PCS client programs (described below) periodically query the pcs_state table to find the next event in the state for which the program is responsible. A given program then runs the appropriate processing on the event and reports the result of that processing back to the pcs_state table. Entering the result into PCS causes the system to transition the event to the next steps, if any, according to the pcs_transition table. Eventually each event reaches the end of its configured processing chain or chains and leaves PCS. Thus the pcs_state table does not maintain any history of past event processing. If no transitions are defined for a state, the event remains in that state until someone modifies or deletes it.

In the NCSS, the only general rule for result values is that **1** indicates success; any other positive result value means some sort of failure of the program to handle the event state. The user should review the program's log files to determine the cause of any errors. Once appropriate corrective action is taken, the user can use the **result** program to apply a result value of 1. That action allows the event to continue through or out of PCS.

Below are the current state transitions used by NCSS. The "EventStream" processing threads are for **new** events, usually reported by the real-time systems. There will be "EventStream" threads defined for each datacenter database. Each thread is independent and interacts with the datacenter database described in its "table" attribute: "dcmp2" or "dcucb".
Within the "EventStream" group, there are threads for **NewEvent**, which are "binder" events from the [[rtem:ovrv|RT systems]], and threads for **NewTrigger**, which are subnet trigger events from the RT systems.

Subnet trigger events from the RT systems are loaded into the NewTrigger state by the autoposter. See ~ncss/ncpp/sbin/autoposter_rows*.sql for the SQL statement to set up the autoposter.

For binder events from the RT system, NCSS originally used the autoposter but found that it entered events into PCS __too quickly__. Events would enter PCS before sufficient event parametric information had been replicated from the RT database to the DC database. As a result, some of the PCS client programs would try to work on new events without sufficient information. Instead, the following system was devised to load binder events into the NewEvent state:

  * On each RT system, [[rtem:sigswitch]] runs configured with sigswitch_event.cfg. This is a [[CMS]] client that subscribes to **/signals/trimag/magnitude** messages, which indicate that the RT system has tried to compute Md and Ml magnitudes. Sigswitch immediately re-emits each message it receives under the subject **/signals/RTUCB/event** (on ucbrt) or **/signals/RTMP/event** (on mnlort1).
  * On each post-processing system, there are two instances of [[rtem:runner_eqrun|runner]], configured with **event_postit_runner_ucb.cfg** and **event_postit_runner_mp.cfg**.
  * When these instances of **runner** receive messages from their RT system, they run the script **/home/ncss/run/bin/event_postit**. This posts the event into the PCS state table in the **EventStream NewEvent** state with a result of **1**, which causes the event to immediately transition into each of the PCS states for which transitions are defined (below).
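
The fan-out that follows a post with result **1** can be pictured with a small Python model. This is only a sketch and not the PCS stored procedures: the ''Transition'' tuple and the ''next_states'' function are hypothetical names, and the rows are copied from the active dcucb transitions listed further down this page.

```python
from typing import NamedTuple, Optional


class Transition(NamedTuple):
    """One row of pcs_transition (fields follow the columns of the listing)."""
    group: str
    table: str
    stateold: str
    result: int
    statenew: Optional[str]  # None stands for "null": the event leaves PCS
    rank: Optional[int]


# Rows copied from the active EventStream/dcucb listing (NewEvent branches only).
TRANSITIONS = [
    Transition("EventStream", "dcucb", "NewEvent", 1, "MakeDRPGif", 100),
    Transition("EventStream", "dcucb", "NewEvent", 1, "ExportAmps", 100),
    Transition("EventStream", "dcucb", "NewEvent", 1, "ExportArc", 100),
    Transition("EventStream", "dcucb", "NewEvent", 1, "ddrtFeed", 100),
    Transition("EventStream", "dcucb", "NewEvent", 1, "SwarmAlarm", 100),
    Transition("EventStream", "dcucb", "SwarmAlarm", 1, None, None),
]


def next_states(group, table, state, result):
    """Return the statenew values an event moves to once `result` is posted.

    A NULL (None) result matches no row, so the event waits for a client.
    """
    if result is None:
        return []
    return [
        t.statenew
        for t in TRANSITIONS
        if (t.group, t.table, t.stateold, t.result) == (group, table, state, result)
    ]


# event_postit posts NewEvent with result 1, so the event branches five ways.
print(next_states("EventStream", "dcucb", "NewEvent", 1))
# A post with a NULL result triggers no transition.
print(next_states("EventStream", "dcucb", "NewEvent", None))
```

In this model a failure code such as 2 also matches no row, which mirrors the rule that an event stays in its state until a result of 1 is applied.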
For the EventStream **AssocTrig** state, there is a similar system of sigswitch and runner to feed the subnet trigger to binder event associations that are published by [[rtem:ntrcg2]] on the RT systems. This is experimental work to evaluate the new association scheme used in the [[rtem:tc2]] trigger coordinator.

The "TPP" thread (a hard-coded name from Caltech meaning something like Tomato Paste Processing :-) ) handles events that are deleted or updated by various tools such as the [[PostProc:drp|Duty Review Page]] and [[PostProc:Jiggle]].

The list below is broken into groups to indicate the various chains of processing. A given state (shown here as the left three columns) may be listed at the top of several groups; this shows the branching that the event takes as it enters that part of PCS.

Note that in the back-up new event post-proc table, several transitions are missing the expected "NewEvent" input state. That means that new events will not transition into those states. But if an event was in one of those states when the system was reconfigured from "active" to "standby", the event will still get the processing it needs. If those transitions were missing from the table, events would get stranded in the PCS state table of the newly configured backup system.
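
The point about stranded events can be checked with a short Python sketch. It is an illustration only: ''reachable_from'' and ''can_leave'' are hypothetical helpers, and the rows are the back-up (dcmp2) transitions from the listing below, reduced to the stateold and statenew columns because every row uses result 1.

```python
# Back-up (dcmp2) rows as (stateold, statenew); None stands for "null".
BACKUP_ROWS = [
    ("NewEvent", "MakeDRPGif"), ("MakeDRPGif", None),
    ("NewTrigger", "MakeTrigGif"), ("MakeTrigGif", None),
    ("AssocTrig", "TrigCheck"), ("TrigCheck", None),
    ("ExportAmps", "ExportWF"), ("ExportWF", None),
    ("ExportArc", None), ("FPfit", None),
    ("SwarmAlarm", None), ("ddrtFeed", None),
]


def reachable_from(entry, rows):
    """States an event can reach by following transitions from `entry`."""
    seen, todo = set(), [entry]
    while todo:
        state = todo.pop()
        for old, new in rows:
            if old == state and new is not None and new not in seen:
                seen.add(new)
                todo.append(new)
    return seen


def can_leave(state, rows):
    """True if a row exists for `state`, so a result of 1 moves the event on."""
    return any(old == state for old, _ in rows)


new_event_states = reachable_from("NewEvent", BACKUP_ROWS)
print(sorted(new_event_states))         # states that new events reach on the back-up
print("FPfit" in new_event_states)      # new events never enter FPfit here
print(can_leave("FPfit", BACKUP_ROWS))  # a leftover FPfit event can still exit
```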
== Active new event post-proc system: ==

  group        table  stateold     result  statenew     rank
  EventStream  dcucb  NewEvent     1       MakeDRPGif   100
  EventStream  dcucb  MakeDRPGif   1       FPfit        100
  EventStream  dcucb  FPfit        1       null         NULL

  EventStream  dcucb  NewEvent     1       ExportAmps   100
  EventStream  dcucb  ExportAmps   1       ExportWF     100
  EventStream  dcucb  ExportWF     1       null         NULL

  EventStream  dcucb  NewEvent     1       ExportArc    100
  EventStream  dcucb  ExportArc    1       null         NULL

  EventStream  dcucb  NewEvent     1       ddrtFeed     100
  EventStream  dcucb  ddrtFeed     1       null         NULL

  EventStream  dcucb  NewEvent     1       SwarmAlarm   100
  EventStream  dcucb  SwarmAlarm   1       null         NULL

  EventStream  dcucb  NewTrigger   1       MakeTrigGif  100
  EventStream  dcucb  MakeTrigGif  1       null         NULL

  EventStream  dcucb  AssocTrig    1       TrigCheck    100
  EventStream  dcucb  TrigCheck    1       null         NULL

== Reviewed Event post-proc system: ==

  group  table  stateold     result  statenew     rank
  TPP    TPP    FINALIZE     1       MakeDRPGif   100
  TPP    TPP    MakeDRPGif   1       null         NULL

  TPP    TPP    FINALIZE     1       ALARM        100
  TPP    TPP    ALARM        1       null         NULL

  TPP    TPP    FINALIZE     1       ddrtFeed     100
  TPP    TPP    ddrtFeed     1       null         NULL

  TPP    TPP    FINALIZE     1       ExportArc    100
  TPP    TPP    ExportArc    1       null         NULL

  TPP    TPP    FINALIZE     1       FPfit        100
  TPP    TPP    FPfit        1       null         NULL

  TPP    TPP    DELETED      1       CANCELALARM  100
  TPP    TPP    CANCELALARM  1       DeleteArc    100
  TPP    TPP    DeleteArc    1       null         NULL

  TPP    TPP    DELETED      1       DeleteDDRT   100
  TPP    TPP    DeleteDDRT   1       null         NULL

  TPP    TPP    REPOP        1       null         NULL

== Back-up new event post-proc system: ==

  group        table  stateold     result  statenew     rank
  EventStream  dcmp2  NewEvent     1       MakeDRPGif   100
  EventStream  dcmp2  MakeDRPGif   1       null         NULL

  EventStream  dcmp2  NewTrigger   1       MakeTrigGif  100
  EventStream  dcmp2  MakeTrigGif  1       null         NULL

  EventStream  dcmp2  AssocTrig    1       TrigCheck    100
  EventStream  dcmp2  TrigCheck    1       null         NULL

  EventStream  dcmp2  ExportAmps   1       ExportWF     100
  EventStream  dcmp2  ExportWF     1       null         NULL

  EventStream  dcmp2  ExportArc    1       null         NULL

  EventStream  dcmp2  FPfit        1       null         NULL

  EventStream  dcmp2  SwarmAlarm   1       null         NULL

  EventStream  dcmp2  ddrtFeed     1       null         NULL

==== PCS Tasks and the Programs That Do Them ====

Currently the PCS system is used:

  * For new events (detected by binder_ew on RT systems):

^ Program Name ^ PCS //state// name ^ Description ^
| [[PostProc:makedrpcont]] | MakeDRPGif | Create the waveform image (snapshot) for the [[PostProc:drp|Duty Review Page]]. |
| [[PostProc:fp_cont]] | FPfit | Run fpfit to generate fault plane solutions for qualifying events. |
| [[PostProc:arcExportCont]] | ExportArc | Create hypoinverse archive messages from the database catalog and export them to interested parties. |
| [[PostProc:exportamps]] | ExportAmps | Generate strong ground motion packets for qualifying events for export to CISN partners. |
| [[PostProc:makeV0Cont]] | ExportWF | Generate COSMOS V0 files for qualifying events for use by the [[http://www.quake.ca.gov/cisn-edc/|CISN Engineering Data Center]]. |
| [[PostProc:ddrtFeedCont]] | ddrtFeed | Generate hypoinverse archive messages and feed them to the real-time double-difference system. |
| [[PostProc:swarmon]] | SwarmAlarm | Detect and report earthquake swarms in configured regions. |

  * For new events of type "st" (subnet triggers from the RT systems):

^ Program Name ^ PCS //state// name ^ Description ^
| [[PostProc:makeTrigCont]] | MakeTrigGif | Create the waveform image for the [[PostProc:drp|Duty Review Page]]. |

  * For new subnet triggers that have been associated with binder events in [[RTEM:tc2|Trigger Coordinator]]:

^ Program Name ^ PCS //state// name ^ Description ^
| [[PostProc:trigCheckCont]] | TrigCheck | Experimental code to compare trigger events and binder events. |

  * For the post-processing //TPP// group, there are several points at which events can enter PCS. The main entry points are:
    * To the //FINALIZE// state, caused by a Jiggle user finalizing an event.
    * To the //DELETED// state, caused by Jiggle or Duty Review Page users deleting an event.
    * To the //ALARM// state when the Duty Review Page user accepts an event or when the [[postproc:tmts|Web TMTS]] user publishes a moment tensor solution.
    * To the //MakeDRPGif// state when the Duty Review Page user presses the "Remake GIF" button.

^ Program Name ^ PCS //state// name ^ Description ^
| [[PostProc:pcsFinalizeCont]] | FINALIZE | Jiggle posts this state for events that have been finalized. |
| [[PostProc:pcsAlarmCont]] | ALARM | Submit the event to the [[postproc:alarm]] system. |
| [[PostProc:finalDRPcont]] | MakeDRPGif | Create the waveform image (snapshot) for the [[PostProc:drp|Duty Review Page]]. |
| [[PostProc:fp_cont]] | FPfit | Run fpfit to generate fault plane solutions for qualifying events. |
| [[PostProc:arcExportCont]] | ExportArc | Create hypoinverse archive messages from the database catalog and export them to interested parties. |
| [[PostProc:pcsCancelCont]] | DELETED | Jiggle posts this state for events that have been deleted from the catalog. |
| [[PostProc:pcsCancelAlarmCont]] | CANCELALARM | Submit the event to the [[RT:alarm|real-time]] and [[postproc:alarm|data center]] alarm systems to cancel any alarm actions. |
| [[PostProc:ddrtFeedCont]] | ddrtFeed | Generate hypoinverse archive messages and feed them to the real-time double-difference system. |
| [[PostProc:ddrtDeleteCont]] | DeleteDDRT | Generate archive delete messages and feed them to the real-time double-difference system. |
| [[PostProc:arcDeleteCont]] | DeleteArc | Generate archive delete messages and export them to interested parties. |
| [[PostProc:repopWFcont]] | REPOP | Replace existing event waveforms with a new selection based on revised event parameters. |
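
The client programs in these tables share the polling pattern described under NCSS Transition Definitions: find the next event in the program's state, process it, and report a result. The Python sketch below models that pattern in memory. It is not how the real clients are written: ''StateRow'', ''next_event'' and ''run_once'' are hypothetical names, a real client queries the pcs_state table rather than a list, serving the highest rank first is an assumption (the text above does not say which direction rank sorts), and the failure code 2 is an arbitrary non-1 value.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class StateRow:
    """Hypothetical stand-in for one pcs_state row."""
    evid: int
    state: str
    rank: int
    result: Optional[int] = None  # None until a client reports a result


def next_event(rows: List[StateRow], state: str) -> Optional[StateRow]:
    """Pick a pending row for `state`; highest rank first is an assumption."""
    pending = [r for r in rows if r.state == state and r.result is None]
    return max(pending, key=lambda r: r.rank, default=None)


def run_once(rows: List[StateRow], state: str,
             process: Callable[[int], bool]) -> Optional[StateRow]:
    """One polling pass: process one event, record 1 on success, 2 on failure."""
    row = next_event(rows, state)
    if row is None:
        return None  # nothing waiting in this state
    try:
        ok = process(row.evid)
    except Exception:
        ok = False  # a crash in the processing step counts as a failure
    row.result = 1 if ok else 2
    return row


rows = [StateRow(1001, "FPfit", 100), StateRow(1002, "ExportArc", 100)]
print(run_once(rows, "FPfit", lambda evid: True))
```

A row left with a result other than 1 corresponds to the case where the user reviews the log files and later applies a result of 1 by hand.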