Seisnetwatch at BSL

Seisnetwatch is a system for monitoring acquisition of seismic data and related equipment. It is a product of ISTI and is recommended by USGS for all ANSS Regional Seismic Networks.

The Seisnetwatch system for UCB consists of NSI (Network Station Information) servers and event channels running on rodgers.geo.berkeley.edu; data collection agents on various BSL and Menlo Park systems; and the seisnetwatch client application for displaying data, which can run anywhere. The following sections discuss these components.

NOTE: (1) In the figure the ports 10009 and 10209 on ucbns1/ucbns2 are incorrectly labeled - the Crossover Socket Agent port is 10009 and the SocketAgent port is 10209. (2) The ISTI documentation mentions a Control Agent (CTA). The NCSS has never used this part of seisnetwatch.

NSI Servers

There are two NSI servers on rodgers.geo.berkeley.edu, running in base directories /home/ncss/snw/ucbns1NSI and /home/ncss/snw/ucbns2NSI. The names reflect the fact that one server holds some data for which ucbns1 is the primary acquisition system, and the other holds some data for which ucbns2 is the primary acquisition system. However, the majority of data on both servers is identical.

The NSI servers are Java applications running /home/ncss/snw/SeisNetWatch/jars/isti.snwserver.jar. They parse several configuration files and receive data from event channels (described below). The servers evaluate the incoming data to make decisions about the performance of configured “stations”, and make all this information available to the seisnetwatch client application (described below). The servers provide two different means for distributing their data: CORBA IOR descriptors, and network services via httpd.

In the previous paragraph, “stations” is quoted because it does not refer specifically to a seismographic station. The term is much more general: a “station” is any entity that has one or more parameters to be monitored. For seismographic stations with more than one data logger, there will be one seisnetwatch “station” for each data logger. We also define seisnetwatch “stations” for telemetry elements such as cell modems and radios, and for many of the smart solar controllers. Also, the seisnetwatch system is not limited to seismic stations: the BSL's GPS stations are also monitored by seisnetwatch.

The NSI servers store data in history files, currently configured to hold 180 days of data. These are flat files that can be viewed with a text editor. They are in directories named for the agent that provided them and the name of each parameter, with one file for each day. For example:

ncss@rodgers:cd snw/ucbns2NSI/history
ncss@rodgers:pwd
/home/ncss/snw/ucbns2NSI/history
ncss@rodgers:cd BK-BKS/Q330_Java_Agent\;UCBNS1_Q330AGENT/
ncss@rodgers:ll
total 120
drwxr-xr-x 2 ncss users 8192 Mar 25 17:02 Days since last reboot
drwxr-xr-x 2 ncss users 8192 Mar 25 17:02 GEN - Minutes since loss
drwxr-xr-x 2 ncss users 8192 Mar 25 17:02 GEN - Overall Clock Quality
drwxr-xr-x 2 ncss users 8192 Mar 25 17:02 GEN - VCO
drwxr-xr-x 2 ncss users 8192 Mar 25 17:02 Mass Position - Channel 1
drwxr-xr-x 2 ncss users 8192 Mar 25 17:02 Mass Position - Channel 2
drwxr-xr-x 2 ncss users 8192 Mar 25 17:02 Mass Position - Channel 3
drwxr-xr-x 2 ncss users 8192 Mar 25 17:02 MonSys - GPS Antenna Curent (ma)
drwxr-xr-x 2 ncss users 8192 Mar 25 17:02 MonSys - System Temp (C)
drwxr-xr-x 2 ncss users 8192 Mar 25 17:02 MonSys - System Voltage
ncss@rodgers:cd GEN\ -\ Minutes\ since\ loss/
ncss@rodgers:ll
total 2896
-rw-r--r-- 1 ncss users 12960 Sep 27 17:00 params_20170927.txt
-rw-r--r-- 1 ncss users 12915 Sep 28 16:59 params_20170928.txt
-rw-r--r-- 1 ncss users 12960 Sep 29 17:00 params_20170929.txt
-rw-r--r-- 1 ncss users 12915 Sep 30 16:55 params_20170930.txt
...
-rw-r--r-- 1 ncss users 12960 Mar 24 16:57 params_20180324.txt
-rw-r--r-- 1 ncss users 12960 Mar 25 16:57 params_20180325.txt
-rw-r--r-- 1 ncss users 12420 Mar 26 15:57 params_20180326.txt

Note that the directory names contain spaces, semicolons, and parentheses, all of which are special to the shell, so the files are awkward to reach with shell commands. Either escape each special character with a backslash, as in the transcript above, or enclose the whole name in double quotes.
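Scripting languages sidestep the quoting problem because paths are passed as ordinary strings rather than parsed by a shell. The following is a minimal Python sketch, assuming only the layout shown in the listing above (station directory, agent directory, parameter directory, then one params_YYYYMMDD.txt file per day). The function name and its arguments are illustrative and not part of seisnetwatch, and the contents of the history files are not interpreted.

```python
from datetime import datetime
from pathlib import Path


def list_history_files(history_root, station, agent, parameter):
    """Return (date, path) pairs for one parameter, oldest first.

    Assumes the layout <root>/<station>/<agent>/<parameter>/params_YYYYMMDD.txt
    seen in the listing above. Names may contain spaces, semicolons, and
    parentheses; no shell is involved, so no escaping is needed.
    """
    param_dir = Path(history_root) / station / agent / parameter
    found = []
    for path in param_dir.glob("params_*.txt"):
        stamp = path.stem[len("params_"):]
        try:
            day = datetime.strptime(stamp, "%Y%m%d").date()
        except ValueError:
            continue  # skip files that do not follow the naming pattern
        found.append((day, path))
    return sorted(found)
```

For the directory shown above, the call would be list_history_files("/home/ncss/snw/ucbns2NSI/history", "BK-BKS", "Q330_Java_Agent;UCBNS1_Q330AGENT", "GEN - Minutes since loss").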

Besides storing the history of data collected from agents, the NSI servers also store a record of station performance (good, fair, poor, etc). The servers neither store a history of individual parameter performance, nor a record of the station “usage” parameter. This means that when the seisnetwatch client is asked to display a history of station parameter performance, the NSI server has to compute each parameter's performance based on the current usage level.

The script for starting each NSI server and its event channels is in each base directory, and named ucbns1NSI and ucbns2NSI, respectively. During system startup and shutdown, these scripts are called by /home/ncss/snw/bin/run_all_snw and /home/ncss/snw/bin/stop_all_snw, respectively.

The NSI script offers several options:

ncss@rodgers:cd /home/ncss/snw/ucbns2NSI/
ncss@rodgers:./ucbns2NSI
Warning NSI missing directive argument
usage: NSI [start|startnsi|stop|restart|stopec]

These options do the following:

NSI Configuration Files

Configuration files for the NSI servers are in conf directories under each of the base directories. The files NSI.conf and orb.conf are simple enough and rarely need to be changed. See the ISTI documentation for details.

The real meat of configuration is in ruleset.ucb.ini and stations_info.v3.ini. There is no special meaning to these file names but the names must match the entries in NSI.conf so that the NSI server can find them.

Any new station added to seisnetwatch requires a new entry in stations_info.v3.ini. New types of stations require several new entries in ruleset.ucb.ini. Because of the variety of stations used at the BSL, we have never automated the construction of these files; instead each file is normally modified by hand in your favorite editor.

When new stations_info.v3.ini or ruleset.ucb.ini files are installed, the NSI server will automatically reread these files. It is generally not necessary to stop and restart the NSI servers for these changes.

However, it is essential that the NSI log file be reviewed after configuration changes are made to make sure no errors were introduced. The log files for each NSI server are in the log directory under each of the base directories. Note that lots of information is included in these NSI log files, so it takes some searching to find entries about revised configuration files. It helps to search the current log for the name of the changed configuration file.
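The search for the changed file name can be scripted. This is a minimal sketch, assuming the NSI log is a plain text file; the function name is illustrative, and the caller supplies the log path, which depends on the local configuration.

```python
def find_config_mentions(log_path, config_name):
    """Return (line_number, text) for each log line that mentions config_name."""
    hits = []
    # errors="replace" keeps the scan going if the log holds odd bytes.
    with open(log_path, "r", encoding="utf-8", errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if config_name in line:
                hits.append((number, line.rstrip("\n")))
    return hits
```

Calling it with "stations_info.v3.ini" or "ruleset.ucb.ini" as config_name narrows the log to the entries worth reading after a configuration change.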

Seisnetwatch Oddness

There are several odd things about the design and implementation of seisnetwatch that may surprise people at first.

Seisnetwatch Event Channels on rodgers.geo.berkeley.edu

There is little documentation about the seisnetwatch event channels. Somehow they provide transport for all the information collected by seisnetwatch agents into the NSI servers. Apparently the TCP ports that the event channel program opens provide both input and output; there are no separate ports for this.

As seen in the NSI start-up script, the event channel is Java code that is part of the isti.snwserver.jar file. One instance of this program is run for each event channel (TCP port) given in the orb.conf file. The seisnetwatch diagram shows the port numbers used by each of these event channels. There is one group of six event channels for ucbns1NSI and another group of six for ucbns2NSI.

Note that the feeder programs do not appear to be able to re-establish a connection with their event channels after a connection is broken. For this reason, the run_all_snw script includes a step to send email notification to the seisnetwatch operator when seisnetwatch is started, so that the operator knows the feeder programs may need to be restarted.

Seisnetwatch on UCB Data Acquisition Systems

The main seismic data collection activities for the UCB seisnetwatch system take place on the data acquisition systems ucbns1 and ucbns2. As described elsewhere, each seismograph station sends its data to one of these two acquisition systems. From there, that station's data is multicast so that the other acquisition system gets a second copy of the data. The acquisition system that receives data pushed directly from the station is said to be the primary system for that station; the other acquisition system is secondary.

The seisnetwatch data collection systems on ucbns1 and ucbns2 are designed to preserve the unique aspects of the acquisition data, while avoiding redundant collection where the information is not unique. For example, data latency for a given station may be different on the two acquisition systems: Latency information for ucbns1 is sent to ucbns1NSI, while latency information for ucbns2 is sent to ucbns2NSI. On the other hand, state-of-health (SOH) information retrieved by an agent from a data logger is not unique to ucbns1 or ucbns2, so that information gets sent to both NSI servers.

This routing of information on ucbns1 and ucbns2 is accomplished by having two collection points (Socket Agents) on each of these systems. One socket agent sends its data directly to the Event Channels for its corresponding NSI server on rodgers. The other socket agent sends its information to a local set of Event Channels. From there the data is copied by ec2ec (Event Channel to Event Channel) programs to the Event Channels for both NSI servers. This second Socket Agent is known as the crossover socket agent.
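The routing rule can be summarized in a few lines. This is a toy model of the description above rather than code from seisnetwatch: the function name and the two labels "host-specific" and "shared" are invented for illustration.

```python
NSI_SERVERS = {"ucbns1": "ucbns1NSI", "ucbns2": "ucbns2NSI"}


def nsi_destinations(acquisition_host, kind):
    """Return the NSI servers that should receive a sample.

    kind is "host-specific" for data that differs between acquisition
    systems (for example latency), sent through the direct socket agent,
    or "shared" for data that is the same on both (for example data logger
    SOH), sent through the crossover socket agent and copied by ec2ec.
    """
    if kind == "host-specific":
        return [NSI_SERVERS[acquisition_host]]
    if kind == "shared":
        return sorted(NSI_SERVERS.values())
    raise ValueError(f"unknown kind: {kind!r}")
```

In this model, latency measured on ucbns2 goes only to ucbns2NSI, while SOH collected on either host reaches both servers.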

These two socket agents on ucbns1 and ucbns2 are the primary entry points for data into seisnetwatch. Each socket agent listens on a configured TCP port for incoming client connections. Once connected, the client sends data to the socket agent, where the data is parsed and sent to the configured event channels using the ORB protocol. Clients typically remain connected to the socket agent for long periods, but that is not required.
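A client of a socket agent needs little more than a TCP connection. The sketch below rests on an assumption that this page does not confirm: that the agent accepts ASCII text with one sample per line, terminated by a newline. The function name is invented for illustration; each line is expected to follow the data submission format described under Data Collection Agents.

```python
import socket


def send_samples(host, port, lines, timeout=10.0):
    """Open a TCP connection to a socket agent and send each line.

    Assumption: the agent accepts ASCII text, one sample per line,
    terminated by a newline. Check the agent source before relying on this.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        for line in lines:
            conn.sendall(line.encode("ascii") + b"\n")
```

This version connects, sends, and disconnects; a long-running agent would more likely keep the connection open between samples, which the socket agent also allows.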

Another part of the data collection system has the uselessly generic name snwclient. This is a C program that polls a directory for files, parses the files, writes their contents to its configured socket agent, and deletes the files when done. The source code for snwclient is part of the Menlo Park (USGS) contribution to Earthworm. This snwclient makes it much easier to write data collection agents: it is simpler to write code to plop a file in a directory than it is to deal with the vagaries of TCP socket connections.
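Dropping a file for snwclient is simple, but a polling reader can in principle pick up a half-written file. One common precaution, sketched below, is to write the file elsewhere on the same filesystem and then rename it into the polled directory, since a rename within one filesystem is atomic on POSIX systems. Whether snwclient needs this precaution is not stated here; the function name, the staging directory, and the file naming are all assumptions.

```python
import os
import tempfile


def drop_sample_file(spool_dir, staging_dir, lines, prefix="snw_"):
    """Write lines to a file in staging_dir, then move it into spool_dir.

    staging_dir must be on the same filesystem as spool_dir so that
    os.replace is an atomic rename. Returns the final path.
    """
    # mkstemp picks a unique name and creates the file readable only by
    # its owner, so the reader must run as the same user.
    fd, tmp_path = tempfile.mkstemp(prefix=prefix, suffix=".txt", dir=staging_dir)
    with os.fdopen(fd, "w", encoding="ascii") as handle:
        for line in lines:
            handle.write(line + "\n")
    final_path = os.path.join(spool_dir, os.path.basename(tmp_path))
    os.replace(tmp_path, final_path)
    return final_path
```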

SNW Startup on ucbns1 and ucbns2

All the seisnetwatch processes on ucbns1 and ucbns2 are controlled by the systemd unit /etc/systemd/system/snw.service (RHEL 7). This unit calls scripts in /home/ncss/snw/bin: run_all_snw for startup, and stop_all_snw for shutdown. These are simple shell scripts that perform the appropriate actions for each of the seisnetwatch components, including the various collection agents described below.

Note that the ec2ec programs do not appear to be able to re-establish a connection with their event channels after a connection is broken. The ec2ec programs need to be restarted to make new connections.

Data Collection Agents

There are a number of data collection agents for the seisnetwatch systems on ucbns1 and ucbns2. Their job is to collect data, write it into the seisnetwatch data submission format, and send it into seisnetwatch. The data transmission is either by file into snwclient, or by TCP socket to a socket agent. Most data collection agents also supply the UsageLevel parameter for each “station” that the agent monitors.

Oddly, ISTI does not document the seisnetwatch data submission format (except as comments within some agent source code). Data is sent in ASCII in the following form, one line per “sample”:

NET-STATION:PARAM_COUNT:key=value;key=value;...

where NET-STATION is the network and “station” name of the entity being monitored, and PARAM_COUNT is the number of key-value pairs being sent. Key-value pairs are separated by semicolons. Any keys or values containing spaces must be surrounded by double quotes. For example:

 RSW-DANT:16:"Time of last poll"="2018/04/18 07:00:20 UTC";"Agent Message"="Communication OK";"Agent Radio Comms"=1;"Signal to Interference plus Noise Ratio"=19.6;"Modem Temperature"=7;"Cell Bytes Sent Rate"=0.00;"Received Signal Code Power"=-53.0;"Service Display"=LTE;"Service Level"=4;"Cell Bytes Received Rate"=225.93;"Reference Signal Received Quality"=-8;"Reference Signal Received Power"=-82;"Error Rate"=0;"Power Supply Voltage"=12.69;"RSSI"=-62;UsageLevel=7

Note that there is no time value associated with this data sample. The time stamp is applied only when the data reaches the NSI server. So it does not help the seisnetwatch system if any part of the data collection system buffers data while waiting for the downstream path to open: that data will be given misleading time stamps when it finally arrives at the NSI server.
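The submission format described above can be produced with a few lines of code. The following is a minimal sketch, not taken from any existing agent: the function names are invented, and it handles only the quoting rule stated above (double quotes around keys or values that contain spaces). How to represent a key or value that itself contains a double quote or a semicolon is not documented here, so the sketch does not attempt it.

```python
def _quote(token):
    """Wrap a key or value in double quotes if it contains a space."""
    text = str(token)
    return f'"{text}"' if " " in text else text


def format_sample(net_station, params):
    """Build one submission line: NET-STATION:PARAM_COUNT:key=value;...

    params is a list of (key, value) pairs, kept in the given order.
    PARAM_COUNT is derived from the list so it cannot drift out of step.
    """
    pairs = ";".join(f"{_quote(k)}={_quote(v)}" for k, v in params)
    return f"{net_station}:{len(params)}:{pairs}"
```

For example, format_sample("RSW-DANT", [("Agent Message", "Communication OK"), ("RSSI", -62), ("UsageLevel", 7)]) returns RSW-DANT:3:"Agent Message"="Communication OK";RSSI=-62;UsageLevel=7.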

csagent

The csagent runs on the data acquisition systems ucbns1 and ucbns2. It provides seisnetwatch information about the state of acquisition for a given station as seen by the local host. It uses the command /home/ncss/config/bin/dpda to obtain information from the comserv shared memory area.

On startup, csagent parses the master comserv configuration file /etc/stations.ini (symbolic link to /home/ncss/config/etc/stations.ini) to find all comserv stations. It also parses each station's station.ini file to learn the “state” of each station. Only stations in active state (“state=A”) are processed. Finally, csagent parses its own configuration file /home/ncss/snw/csagent/csagent.cfg.

After configuration, csagent enters a continuous loop of processing each station, then sleeping for “poll_interval_seconds” seconds. Csagent runs dpda on each station and parses the output. The following parameters are submitted to seisnetwatch:

The UsageLevel is determined by csagent as follows. If dpda sees CS2M or SL2M clients, UsageLevel is 3 (primary); otherwise UsageLevel is 2 (secondary).
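The UsageLevel rule is small enough to state as code. This sketch is not csagent's actual implementation: it assumes the client names have already been extracted from the dpda output (whose format is not shown here) and that they match CS2M and SL2M exactly.

```python
PRIMARY = 3
SECONDARY = 2
PRIMARY_CLIENTS = {"CS2M", "SL2M"}


def usage_level(client_names):
    """Return 3 (primary) if any CS2M or SL2M client is present, else 2."""
    if PRIMARY_CLIENTS.intersection(client_names):
        return PRIMARY
    return SECONDARY
```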

Csagent is controlled by the script /home/ncss/snw/csagent/runCSagent. Whenever stations are added to or removed from the master stations.ini file, csagent needs to be restarted.

dlogagent

The dlogagent runs on the data acquisition systems ucbns1 and ucbns2. It runs the command dlogstat to read various “active” files produced by the comserv client datalog. Note that these are miniSEED data logs, not log files!

The source code for dlogagent (Python) and dlogstat (C++) is in /home/ncss/snw/src/.

On startup, dlogagent parses the master comserv configuration file /etc/stations.ini (symbolic link to /home/ncss/config/etc/stations.ini) to find all comserv stations. It also parses each station's station.ini file to search for any of the following configuration parameters:

Finally, dlogagent reads its own configuration file /home/ncss/snw/dlogagent/dlogagent.cfg for a few parameters.

After configuration, dlogagent enters a continuous loop. It runs dlogstat with the appropriate command-line options to get the configured information from each station, and then sleeps for poll_interval_seconds.

dlogagent is controlled by the script /home/ncss/snw/dlogagent/runDLogAgent. Whenever a station is added to or removed from the comserv system, dlogagent needs to be restarted.
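The configure-once, then poll-and-sleep structure shared by dlogagent and csagent can be sketched as follows. This is not dlogagent's code: collect and submit are hypothetical callables standing in for running dlogstat and sending the result into seisnetwatch, and max_cycles exists only so the loop can be exercised in a test. Because the station list is fixed before the loop starts, the sketch also shows why a restart is needed when stations change.

```python
import time


def poll_forever(stations, collect, submit, poll_interval_seconds,
                 sleep=time.sleep, max_cycles=None):
    """Collect and submit data for each station, then sleep, repeatedly.

    collect(station) returns a sample or None; submit(sample) sends it on.
    Both are supplied by the caller and are hypothetical stand-ins for the
    agent's real work.
    """
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        for station in stations:
            try:
                sample = collect(station)
            except Exception as err:  # one bad station must not stop the loop
                print(f"{station}: collection failed: {err}")
                continue
            if sample is not None:
                submit(sample)
        sleep(poll_interval_seconds)
        cycles += 1
```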

q330agent

The q330agent is Java code that runs on the data acquisition systems ucbns1 and ucbns2. It communicates with each of the Q330-style data loggers for which it is configured to obtain many parameters from the data logger. This information is then transmitted into the seisnetwatch system.

On startup, q330agent parses the appropriate configuration file in /home/ncss/snw/q330: ucbns1_q330.xml or ucbns2_q330.xml, as well as CA_base.conf. The XML file lists each data logger that is to be processed by the agent. The current convention is to process those data loggers for which the host is the primary acquisition system. The XML file also lists the various parameters needed for each data logger. Note that some Q330-style data loggers need to be configured with a non-standard port number.

The q330agent is controlled by the script /home/ncss/snw/q330/runQ330Agent. Whenever a Q330-style data logger is added to or removed from an acquisition system, q330agent needs to be restarted.

basalt SNW agent

The Basalt SNW agent is software that runs on Basalt data loggers. It is the responsibility of the data logger operators to set up and configure this software. The agent sends its parameters to a seisnetwatch socket agent to get this information into the seisnetwatch system. Currently all Basalt agents are sending data to the crossover socket agent on ucbns1 (TCP port 10009).

snmp2snw agent

The snmp2snw agents are Perl scripts that use SNMP to query various devices and submit parameters into seisnetwatch. The goal of this project was to keep the code as generic as possible, so that several different types of device could be handled by a single script. Unfortunately that was not possible: currently snmp2snw.pl can handle FreeWave and Xetawave radios, and CradlePoint cell modems. A slightly different script, snmp2snw-sw.pl, is needed to handle Sierra Wireless cell modems.

For the cell modems, the snmp2snw agents try to set the seisnetwatch Usage level to a value that indicates the cellular service type (e.g. 7 for LTE, 6 for HSPA+, etc.). This turns out to be too much for seisnetwatch: the snw server has no way to save the Usage levels, so it cannot apply the appropriate usage-based rules to display historic information. It is not obvious how to fix this problem.
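The service-type mapping amounts to a lookup table. The sketch below is illustrative only and is not taken from the Perl scripts: it lists just the two values named above, leaves the rest of the real table out rather than guessing at it, and returns None for anything not listed.

```python
# Only the two levels named in the text are listed; the rest of the real
# table is not given here and is deliberately left out.
SERVICE_USAGE_LEVEL = {"LTE": 7, "HSPA+": 6}


def usage_for_service(service_display):
    """Return the usage level for a service type, or None if not listed."""
    return SERVICE_USAGE_LEVEL.get(service_display.strip().upper())
```

With the sample shown earlier, where "Service Display" is LTE, the lookup gives 7, matching the UsageLevel=7 at the end of that line.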

The two snmp2snw agents are controlled by scripts runSNMP2snw and run_sw_SNMP2snw, and parse configuration files snmp2snw.conf and snmp2snw-sw.conf, respectively. All these files are in /home/ncss/snw/snmp2snw on ucbns2. When new devices are added to a configuration file, the relevant agent must be restarted.

The output of these agents is a set of files, one per device. Normally these agents are configured to send their files to the snwclient input, ucbns2:/home/ncss/snw/snwclient/outdir, where the files are parsed and fed into the seisnetwatch system. For testing, it is useful to send the snmp2snw files to a test directory so they can be reviewed by the operator. The configuration file snmp2snw.test_conf can be used for this.

MSmodbus agent

The MSmodbus agent uses the MODBUS protocol to communicate with Morning Star solar controllers. Currently this Perl script supports the ProStar and SunSaver models; other models could be added.

On startup, MSmodbus2snw.pl reads configuration information from /home/ncss/snw/MSmodbus/MSmodbus2snw.conf. Then it enters a continuous loop to query each configured device and then sleep for a configured interval.

MSmodbus2snw is controlled by the script /home/ncss/snw/MSmodbus/runMSModbus2snw. The agent should be restarted whenever the configuration file is changed.

GPS/GNSS agent

The GPS SNW data collection agent is part of customized Perl software that runs as user gps via a cron script run_gps every 10 minutes on the real-time GNSS/GPS data acquisition server tiburon. It pushes SOH data for all of the BSL's GNSS stations to the SNW crossover socket agent on ucbns1 (TCP port 10009). It also pushes SOH data to a time-series database, which serves as a software interface for visualization tools like Grafana to consume SOH data; currently this is a Graphite database on eew2.geo.berkeley.edu (TCP port 2003).

Example SNW message pushed to ucbns1:10009:*

SATS=>11 LOGF=>1 TEMP=>38.00 FREE=>1711.15 LATE=>24.67 CORR=>1 COMP=>100.00 UPTM=>197.83 VOLT=>13.33 LAST=>0.90 CONN=>1 
Sending SNW msg => BARD-BRI2:12:Network Connectivity=1;Secs Since Last Good Data=0.90;% Complete Epochs(last 10 mins)=100.00;Data Latency(ms)=24.67;Uptime(days)=197.83;Board Temperature(C)=38.00;Supply Voltage=13.33;# Satellites tracked=11;Free space on rcvr(Mbytes)=1711.15;On-site logging=1;Terrastar Corrections=1;UsageLevel=3

* SNW parameters and their threshold values (Good/Bad/Fair/Unknown) are specified in the NSI config files for the GpsPolaRx5RuleSet ruleset found on rodgers: /home/ncss/snw/ucbns[12]NSI/conf/ruleset.ucb.ini. There is currently just one ruleset (GpsPolaRx5RuleSet) and one UsageLevel (3) in use. Parameter values that are 'unknown' are given the value -1 or -1.0. See also the Data Collection Agents section above.

Example Graphite DB message(s) pushed to eew2.geo.berkeley.edu:2003:*

BARD.BRI2.COMP 100.00 1667596414
BARD.BRI2.LATE 24.67 1667596414
BARD.BRI2.UPTM 197.83 1667596414
BARD.BRI2.TEMP 38.00 1667596414
BARD.BRI2.VOLT 13.33 1667596414
BARD.BRI2.SATS 11 1667596414
BARD.BRI2.FREE 1711.15 1667596414
BARD.BRI2.LOGF 1 1667596414
BARD.BRI2.CORR 1 1667596414

* Graphite message format is: <network>.<station>.<param> <value> <UNIX_epoch_timestamp> where each <param> description is identical to the SNW message above. Parameter values that are 'unknown' are simply not sent out.
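The Graphite message format can be generated as shown in this minimal sketch. It is not the Perl code that runs on tiburon: the function name is invented, and representing an unknown value as None is an assumption made for the example. Values may be passed as preformatted strings when the number of decimal places matters.

```python
def graphite_lines(network, station, values, epoch_seconds):
    """Format '<network>.<station>.<param> <value> <epoch>' lines.

    values maps a parameter code (for example "VOLT") to a number or a
    preformatted string, or to None when the value is unknown; unknown
    values are skipped rather than sent.
    """
    lines = []
    for param, value in values.items():
        if value is None:
            continue
        lines.append(f"{network}.{station}.{param} {value} {int(epoch_seconds)}")
    return lines
```

Each line is then sent to the Graphite port over TCP, terminated by a newline, as Graphite's plaintext protocol expects.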

Querying the GNSS receivers for SOH data is integral with other receiver operations performed by the Perl software, and it requires device administrator access to the receivers and GNSS expertise. Otherwise it is quite possible to inadvertently mess up the configuration of a receiver or of the entire GNSS network, or to consume too much of the limited data bandwidth; actual real-time data is more important than SOH data. It is not possible to have a user with read-only access to these parameters.

This agent is unlike the snmp2snw agents that query simpler devices such as cell modems in read-only mode, and unlike data loggers that include SOH in miniSEED channels or push SOH data directly to SNW. So just as for data loggers, SOH data for GNSS receivers needs to be pushed to a software interface like a port or database for visualization tools such as SNW or Grafana to consume.

It is the responsibility of the GNSS Network Manager to develop, test, install, configure, and maintain this software to, among other things, provide SOH information for SNW and Grafana. This includes adding/removing stations and SOH parameters, including editing the NSI config files ruleset.ucb.ini and stations_info.v3.ini on rodgers. Adding a new GNSS station is in general a complex set of operations requiring GNSS expertise. However, when a new station is added to the data acquisition system its SOH data should automatically appear in the Graphite DB.

Note: In the figure above, the ports 10009 and 10209 on ucbns1/ucbns2 are incorrectly labeled: the Crossover Socket Agent port is 10009 and the SocketAgent port is 10209. Although the GPS agent could push data to the Crossover SocketAgent on either ucbns1:10009 or ucbns2:10009, it only pushes to the former.