Two templates for a workflow example of using webservices are presented. The first template uses direct HTTP requests explicitly, whereas the second template uses the abstracted ObsPy FDSN Client implementation to manage data requests. Data is requested using FDSNWS-Dataselect, FDSNWS-Station, and FDSNWS-Event. All example code is written in Python.
As setup, native Python modules and libraries need to be imported at the top of the script.
import urllib
import os
import tempfile
import pandas as pd
import numpy as np
from obspy.core import UTCDateTime, read
The query for station information of stations from networks FR and RA with channels HN? (wildcard character "?" matches any character) is stored in variable sta_request
. Python function urlretrieve
of the urllib
module transfers the content of the web request to text file RESIF_station_file
.
RESIF_station_file = 'RESIF_station_file.txt'
# request station metadata at channel level
sta_request = 'http://ws.resif.fr/fdsnws/station/1/query?network=FR,RA&channel=HN?&format=text&level=channel'
urllib.urlretrieve(sta_request, RESIF_station_file)
A few columns from RESIF_station_file
are selected and stored in variable RAP_sta
, with the corresponding column header names in RAP_col
. Function read_sta_file
uses read_csv
from the pandas
package.
RESIF_sta_names = list(['NET', 'STA', 'LOCID', 'CHA', 'LAT', 'LON', 'ELEV',
'DEP', 'AZ', 'DIP', 'INSTRUM', 'SCALE', 'SFREQ',
'SUNIT', 'SAMPRATE','START', 'END'])
RAP_sta, RAP_col = read_sta_file(RESIF_station_file)
def read_sta_file(filename):
pd_sta = pd.read_csv(filename, sep='|', header=0, names=RESIF_sta_names)
short_list = list(['NET', 'STA', 'LOCID', 'LAT', 'LON', 'INSTRUM', 'SAMPRATE', 'START', 'END'])
values = pd_sta[short_list].values
values_unique = np.vstack({tuple(row) for row in values})
nr, nc = values_unique.shape
ista = short_list.index('STA')
for v in values_unique:
v[ista] = v[ista].strip()
return values_unique, short_list
To transfer information between various requests, a Python class called SeismicStation
is created, and station objects are instantiated. Every station object has attributes assigned to it that can be accessed from anywhere in the program, for example self.lat
and self.start_times
in the code below. The word self
refers to the station object. All station objects are gathered in the list sta_dict
.
END_TIME = UTCDateTime(2017, 1, 1) # create station objects using class SeismicStation sta_dict = create_station_objects(RAP_sta, RAP_col) def create_station_objects(X, names_list): ista = names_list.index('STA') stations = X[:, ista] unique_stations = np.unique(stations) nsta = len(unique_stations) stadict = {} for sta in unique_stations: lines = X[X[:, ista]==sta, :] stadict[sta] = SeismicStation(lines, names_list) return stadict class SeismicStation(object): def __init__(self, sta_lines, sta_col_names): nr, nc = sta_lines.shape ista = sta_col_names.index('STA') ilat = sta_col_names.index('LAT') ilon = sta_col_names.index('LON') istart = sta_col_names.index('START') iend = sta_col_names.index('END') self.sta = np.unique(sta_lines[:, ista])[0] lats = np.empty(nr, dtype=float) lons = np.empty(nr, dtype=float) start_times = np.empty(nr, dtype=object) end_times = np.empty(nr, dtype=object) for i in xrange(nr): line = sta_lines[i] lats[i] = float(line[ilat]) lons[i] = float(line[ilon]) start_times[i] = UTCDateTime(line[istart]) try: end_times[i] = UTCDateTime(line[iend]) except TypeError: # case of missing end time end_times[i] = END_TIME if end_times[i] > END_TIME: end_times[i] = END_TIME self.lat = np.mean(lats) self.lon = np.mean(lons) self.n_periods = len(start_times) time_indexes = np.argsort(start_times) self.start_times = start_times[time_indexes] self.end_times = end_times[time_indexes]
A station request at response level yields more detailed metadata than the first station request at channel level. In the URL 'level=channel' is changed to 'level=response'. The networks (FR, RA) and channels (HN?) are the same as before. StationXML file RESIF_response_file
is produced by urlretrieve
.
RESIF_response_file = 'RESIF_response_file.xml' # request station metadata at response level resp_request = 'http://ws.resif.fr/fdsnws/station/1/query?network=FR,RA&channel=HN?&level=response' urllib.urlretrieve(resp_request, RESIF_response_file)
For every station object, data of events within a 50 km radius (event_radius
) of that station are requested.
RENASS_TSV_NAMES = list(['ID', 'OTIME', 'LAT', 'LON', 'MAG', 'MAGTYPE', 'DEPTH', 'DEPTYPE', 'CAT', 'EVAL1', 'NSTA', 'NPHA', 'AZGAP', 'EVAL2', 'TOWN', 'DIST']) event_radius = 50 for sta in sta_dict.values(): # get event metadata sta.get_ReNaSS_events(event_radius) # get event waveform data sta.get_RESIF_data(nmax=None)
In class module get_ReNaSS_events
event metadata is retrieved and processed. A web request is set up, similar to the previous one, requesting event metadata from ReNaSS (Réseau National de Surveillance Sismique), since RESIF does not support FDSNWS-Event. The data format is TSV (tab-separated values), which can be read by read_csv
from the pandas
package. Useful metadata is passed to attributes of the station objects.
#class SeismicStation(object): def get_ReNaSS_events(self, radius_km): # set up web-request request = 'http://renass.unistra.fr/fdsnws/event/1/query?orderby=time&format=tsv' request = request + '&longitude=%.2f' % self.lon request = request + '&latitude=%.2f' % self.lat request = request + '&starttime=%s' % self.start_times[0].isoformat() request = request + '&endtime=%s' % self.end_times[-1].isoformat() request = request + '&maxradius=%.2f' % (radius_km / 111.25) # retrieve catalog data f_, fname = tempfile.mkstemp() urllib.urlretrieve(request, fname) # read data and remove file pd_ev = pd.read_csv(fname, sep='\t', names=RENASS_TSV_NAMES) os.unlink(fname) # get only the info we want to keep short_list = list(['ID', 'OTIME','MAG']) values = pd_ev[short_list].values iid = short_list.index('ID') itime = short_list.index('OTIME') imag = short_list.index('MAG') nr, nc = values.shape # clean data ids = values[:, iid] otimes = np.empty(nr, dtype=object) ev_mags = np.empty(nr, dtype=float) for i in xrange(nr): v = values[i] otimes[i] = UTCDateTime(v[itime]) ev_mags[i] = float(v[imag]) self.n_events = len(otimes) time_indexes = np.argsort(otimes) self.ev_ids = ids[time_indexes] self.otimes = otimes[time_indexes] self.ev_mags = ev_mags[time_indexes] os.close(f_)
For the actual waveform data, a fdsn-dataselect query is constructed in class module get_RESIF_data
. This request gathers miniSEED waveform data from the events identified in the event metadata query. There will not be waveform data available for every event, since the station may not have been active in the requested time frame. The read
module from ObsPy is used here to read the file gained from the web request into a Stream object (for more info, see ObsPy workflow below) and write it to a SAC file. The SAC file can subsequently be plotted and processed in SAC. Writing to other formats, such as miniSEED (format="MSEED"
), is possible.
#class SeismicStation(object): def get_RESIF_data(self, nmax=None): if nmax is not None: n_ev = nmax else: n_ev = self.n_events data = np.empty(n_ev, dtype=object) n_data = 0 # get data from 1 minute before (t1) to 10 minutes after (t2) origin times for otime in self.otimes[0:n_ev]: print otime t1 = otime - 60 t2 = otime + 10*60 request = 'http://ws.resif.fr/fdsnws/dataselect/1/query?station=%s&channel=HN?&starttime=%s&endtime=%s&quality=B' % (self.sta, t1, t2) f_, fname = tempfile.mkstemp() urllib.urlretrieve(request, fname) try: st = read(fname) os.unlink(fname) if len(st) > 0: data[n_data] = st n_data += 1 st.write("data/%s_%s.sac" % (self.sta, otime), format="SAC") print 'Waveform data found!' except TypeError: os.unlink(fname) print 'no RESIF data in this time window' continue os.close(f_) data.resize(n_data) self.data = data
Figure showing raw waveform data of event requested from RESIF FDSNWS-Dataselect.
An alternative way of requesting waveform data and station/event metadata is through ObsPy, a Python framework for seismology. The obspy.clients.fdsn
module is in many cases the best option, because of its large number of data centers and modern data formats. See the ObsPy tutorial for documentation and more information on this module. With this module, the user can obtain station and event metadata, and waveform data, and instantly plot the results. The complete source code for this template can be found here.
Start by initializing a client object specifying either the name of the webservice, e.g. "RESIF", or the base URL, e.g. "http://ws.resif.fr".
from obspy.clients.fdsn import Client import numpy as np from obspy.core import UTCDateTime, read, Stream # specify webservice client = Client("RESIF")
Three data requests that the client module supports are get_stations
, get_events
and get_waveforms
, although not all clients support all three types. As mentioned before, RESIF, among others, does not support get_events
, which is why the client is switched to IRIS for that part of the workflow.
To get station information at channel level, use the get_stations
module. An Inventory object is created, which has a network-station-channel hierarchy. Station codes and coordinates, amongst others, are attributes of the station objects that this request yields.
# request station metadata at channel level station_chan = client.get_stations(network="FR,RA",station="*",channel="HN?",level="channel") # plot stations station_chan.plot() # number of stations for FR network nsta_FR = station_chan[0].selected_number_of_stations # number of stations for RA network nsta_RA = station_chan[1].selected_number_of_stations # get coordinates of every station FR_coords = np.empty([nsta_FR, 2]) RA_coords = np.empty([nsta_RA, 2]) for k in range(nsta_FR): FR_coords[k,0] = station_chan[0][k].latitude FR_coords[k,1] = station_chan[0][k].longitude # get station names FR_sta = ["" for x in range(nsta_FR)] for p in range(nsta_FR): FR_sta[p] = station_chan[0][p].code
To get metadata at response level, simply change the level key in get_stations
from "channel" to "response". Instrument responses can be plotted with plot_response
(see figure below).
# request station metadata at response level client = Client("RESIF") station_resp = client.get_stations(network="FR,RA", station="*", channel="HN?", level="response") # plot responses for network FR, station GRN station_resp.plot_response(min_freq=0.001,network="FR",station="GRN")
The get_events
method in the code below finds events from the first of January until the first of March 2002 and requests their metadata, creating a Catalog object. This information can be written to file in, for example, XML-format. Similar to the previous workflow example, events can be requested within a certain distance of stations using the maxradius
parameter. To reduce computational time, only stations that are active in the requested time window are taken into consideration. For the events that are found, waveform data is requested using get_waveforms
, creating Stream objects containing one or more Trace objects. Common data processing operations, such as removing instrument response, filtering and resampling, can be performed on Stream objects and its traces. For an overview of built-in ObsPy methods, refer to the ObsPy documentation.
# get event metadata of events within radius=0.5 degrees of stations starttime = UTCDateTime("2002-01-01") endtime = UTCDateTime("2002-03-01") event_radius = 0.5 # degrees for p in range(nsta_FR): # create empty Stream object st = Stream() # check if station was active during requested time frame act = station_chan[0][p].is_active(starttime=starttime, endtime=endtime) if act is True: lat = FR_coords[p,0] # station latitude lon = FR_coords[p,1] # station longitude client = Client("IRIS") try: FR_events = client.get_events(starttime=starttime, endtime=endtime, latitude=lat, longitude=lon, maxradius=event_radius) # produces a Catalog object FR_events.write("%s_events.xml" %FR_sta[p], format="QUAKEML") n_events = FR_events.count() # number of events found print '%i event(s) found for station %s' %(n_events,FR_sta[p]) # request waveform data from origin time minus 1 minute to origin time plus 10 minutes: client = Client("RESIF") for j in range(n_events): # event origin time otime = UTCDateTime(FR_events[j].origins[0].time) try: st += client.get_waveforms("FR",FR_sta[p],"*","HN?",otime-60, otime+10*60) print 'Waveform data found for station %s event %i!' %(FR_sta[p],j+1) except Exception: print 'No waveform data found for station %s event %i...' %(FR_sta[p],j+1) continue FR_events.plot(projection="local",title="Event locations %s" %FR_sta[p]) st.plot() # write data to file st.write("%s_data.MSEED" %FR_sta[p], format="MSEED") except Exception: print 'No events found for station %s...' %FR_sta[p] continue else: print 'Station %s was not active in requested time frame' %FR_sta[p]
![]() |
![]() |
![]() Figures showing a) raw waveform data b) earthquake magnitudes and c) instrument frequency response for three components. |