===== Alarm Actions =====

Alarm actions are the goal of the AQMS alarm system: if an event meets certain conditions, //alarmdist// calls one or more alarm actions for the event. Alarm actions are executable commands, usually scripts. They can do essentially anything, but they should not take long to run, because alarmdist currently calls one action at a time. Alarmdist has a configured time limit for each action. If the action runs too long, alarmdist will kill it and report an error.

All alarm actions have the same calling convention:

  action_name EVENT_ID mod_count

The mod_count value can be any number; it is required but no longer used. It is a relic from when the event version was not maintained in the Event table.

Note that every action should also have a //cancel// script to undo the action. The cancel command has the name of the action script with //CANCEL_// prepended. The cancel command has the same calling convention as the action command.

== Errors ==

If an alarm action encounters a problem and returns an abnormal exit code, alarmdist will send an email notice and set the action state to **ERROR** in the alarm_action DB table. To keep track of these errors, we have the script //action_error// (in /home/ncss/run/bin/).

  ncss@ucbrt.geo.berkeley.edu:action_error -h
  action_error version 0.0.2
  action_error - report alarm_actions in ERROR state
  Syntax: action_error [-c config] [-E evid] [-U evid action]
  where:
      -E evid         - query for the action commandline for any alarm
                        actions in ERROR state for event <evid>.
      -U evid action  - update any of event <evid>'s actions of name <action>
                        from ERROR state to ERROR-ACK state.
      -c config       - specify an alternate configuration file. The default
                        config file is /home/ncss/run/params/db.conf
      -h              Help - prints this help message.
  When neither option -E or -U is given, action_error prints any event IDs
  and their actions which are in the ERROR state. This mode is suitable for
  use by monitor.

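The calling convention, the //CANCEL_// naming rule, and the exit-code contract described above can be illustrated with a minimal Python sketch. This is not how any existing action is written; the function names ''cancel_name'' and ''run_action'', the ''work'' callable, and the specific non-zero exit values are assumptions made for illustration only.

```python
import sys
from typing import Callable, Sequence


def cancel_name(action_name: str) -> str:
    """Name of the cancel command: the action name with CANCEL_ prepended."""
    return "CANCEL_" + action_name


def run_action(argv: Sequence[str], work: Callable[[str], None]) -> int:
    """Run one alarm action and return its exit code.

    argv follows the convention: action_name EVENT_ID mod_count.
    `work` is a hypothetical callable that performs the real job.
    Returns 0 on success and non-zero on any problem, so that the
    caller can treat the result as an abnormal exit.
    """
    if len(argv) != 3:
        name = argv[0] if argv else "action_name"
        print(f"usage: {name} EVENT_ID mod_count", file=sys.stderr)
        return 2  # assumed value; any non-zero code signals a problem

    # mod_count is required by the convention but is not used.
    event_id, _mod_count = argv[1], argv[2]

    try:
        work(event_id)
    except Exception as exc:
        # Report the reason on stderr so it can end up in the action log.
        print(f"{argv[0]}: event {event_id}: {exc}", file=sys.stderr)
        return 1
    return 0
```

A real script built this way would end by passing the result to ''sys.exit'', for example ''sys.exit(run_action(sys.argv, work))'', so that a failure is visible to the caller as an abnormal exit code.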
The //monitor// program is configured to call //action_error// with no arguments. This searches the entire alarm_action table for actions in the **ERROR** state, which encourages AQMS operators to investigate these errors.

The user should look in the action logs (/home/ncss/run/alarms/logs) to find the problem. It may be something temporary, such as an scp destination being down. In that case, simply running the action command manually is appropriate to get the action performed.

After any action errors have been investigated, the user should acknowledge the error. Otherwise the action_error script will continue to report it. To acknowledge an error, use the -U option to action_error. For example:

  action_error -U 72282711 MTweb

This will change the action state from **ERROR** to **ERROR-ACK**, which will stop action_error from complaining.

=== BeltPager ===

As the name suggests, this action causes a pager message to be sent. This is used to notify NCSS personnel about a significant earthquake. The script does the following:

  * Generate a message using a database stored procedure.
  * Send the message using the appropriate local command: //pager// for UCB systems, //qpage// for Menlo Park systems.
  * On the real-time systems, add the event to the ReAlarm database table on the archive database. This will be used by the [[postproc:realarm]] system.

=== NCSSmail ===

This action is intended to send email notification to NCSS personnel about interesting events. It is NOT intended for email to the general public; that function is ably provided by [[https://earthquake.usgs.gov/ens/|ENS]]. The NCSSmail action does the following:

  * Generate a message using a database stored procedure.
  * Mail the message to the configured list of NCSS recipients.

=== EVTPRM2PDL ===

This cryptically named action sends event parameters to the [[network:pdl|PDL]] system, making them available to the world.
The following steps are performed:

  * Generate a phase-data QuakeML product using the //qml// program.
  * Submit the phase-data product to PDL using ProductClient.jar configured to use its EIDSInputWedge mode. This automatically generates an origin product with a matching submission time, so that the PDL system can see that the two products report the same event.
  * Record the PDL submission in the pdl_product and pdl_ids database tables. These tables keep a record of the event state when it was submitted.
  * Generate a nearby-cities product using the //townsPublish// script.
  * Submit the nearby-cities product to PDL and update the pdl_product table.

=== ShakeMap ===

This is the action that tells [[postproc:shakemap|ShakeMap]] about earthquakes. ShakeMap's //queue// program listens on a TCP socket. This action connects to the TCP socket and sends a short message:

  shake_alarm EVENT_ID UPDATE

Note that //queue// is configured with a list of host names from which it will accept TCP connections. If the host calling the ShakeMap alarm action is not on queue's list, the connection will be rejected.

The script that performs this action reads a list of ShakeMap hosts: ///home/ncss/run/alarms/actions/Shakemap/shake_hosts_ports//. For each listed host, the action script forks a separate process to handle that host. The parent process waits a configured time (30 seconds) for each child to complete its work. This ensures that even if one ShakeMap server is down, the others will still be notified of the event.

=== MTweb ===

The MTweb action takes care of publishing moment tensor information to the world. All the work is done by ///home/ncss/run/alarms/bin/mtwebpdl//. The following steps are performed:

  * Generate a moment-tensor QuakeML product using //qml//.
  * Generate image files of the moment tensor mechanism, the waveform fit, and possibly the variance reduction vs depth plot.
  * Submit all of the above to PDL using ProductClient.jar configured to use its EIDSInputWedge mode.
  * Record the PDL submission in the pdl_product and pdl_ids database tables. These tables keep a record of the event state when it was submitted.
  * Generate an HTML page to support the above image files on web server(s) outside of PDL.
  * Submit the HTML and image files to the NCEDC web server using scp.
  * Generate a [[network:cube|CUBE]] LI (add-on) message for the above HTML page and submit it to PDL. This is done using ProductClient.jar running as //li_poller//. This is a bit redundant for the ComCat web page, but it is useful for the old [[http://www.ncedc.org/recenteqs/|recenteqs]] system fed from ucbpp.

=== MTlocalmail, MTmail ===

These two alarm actions send email about moment tensors to two different lists of recipients. The local list is intended to be a short list of NCSS people who should get prompt notification. The public list is longer, hence slower. These scripts do the following:

  * Retrieve the email text from the database, where it was placed by either the real-time [[rtem:tmts]] or the [[postproc:tmts|tmts web system]].
  * Send the message to the listed recipients:
    * MTlocalmail uses the //mtlocalmail// list in /home/ncss/run/alarms/config
    * MTmail uses the //mtpublicmail// list
  * Place a copy of the email text on the NCEDC web server.

=== TMTSDone ===

This alarm action notifies Duty Response people when [[rtem:tmts]] has finished with an event on the real-time systems. It indicates that all real-time AQMS processing of the event is complete, so that it is safe to start modifying event information on the archive databases. The following steps are performed:

  * Query the alarm_action table to see if the BeltPage action was called for this event. If not, TMTSDone takes no further action.
  * Generate a text message, and send it via email and pager to the configured recipients.
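
The gating logic in the TMTSDone steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name ''tmts_done'', its parameters, and the message wording are hypothetical, and the database query and the email and pager commands are replaced by values and callables passed in by the caller.

```python
from typing import Callable, Iterable


def tmts_done(
    event_id: str,
    called_actions: Iterable[str],
    gate_action: str,
    senders: Iterable[Callable[[str], None]],
) -> bool:
    """Notify recipients only if the gating action was called for the event.

    called_actions: names of actions already called for this event
                    (in the real system this comes from the alarm_action table).
    gate_action:    name of the action that must have been called.
    senders:        hypothetical callables, e.g. one for email, one for pager.
    Returns True if a notification was sent, False if nothing was done.
    """
    if gate_action not in set(called_actions):
        # The gating action was never called: take no further action.
        return False

    # Assumed wording; the real message text is not given in this page.
    message = f"TMTS has finished with event {event_id} on the real-time systems"
    for send in senders:
        send(message)
    return True
```

Passing the senders in as callables keeps the decision (notify or stay silent) separate from the delivery mechanism, which makes the rule easy to check without sending real mail or pages.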