Two templates for an example workflow using webservices are presented. The first template uses direct HTTP requests explicitly, whereas the second template uses the abstracted ObsPy FDSN Client implementation to manage data requests. Data is requested using FDSNWS-Dataselect, FDSNWS-Station, and FDSNWS-Event. All example code is written in Python.
Using Direct HTTP Requests
This example concerns multiple requests for station metadata, event metadata, and raw waveform data using FDSN standards. The finished source code for this example can be downloaded.
As setup, the required Python modules and third-party libraries need to be imported at the top of the script.
import urllib.request
import os
import tempfile
import pandas as pd
import numpy as np
from obspy.core import UTCDateTime, read
I. Station Metadata
The query for station information from networks FR and RA with channels HN? (the wildcard character "?" matches any single character) is stored in the variable sta_request. The Python function urlretrieve of the urllib.request module transfers the content of the web request to the text file RESIF_station_file.
RESIF_station_file = 'RESIF_station_file.txt'
# request station metadata at channel level
sta_request = 'http://ws.resif.fr/fdsnws/station/1/query?network=FR,RA&channel=HN?&format=text&level=channel'
urllib.request.urlretrieve(sta_request, RESIF_station_file)
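The full query string can also be assembled from a dictionary of parameters instead of being hard-coded, which makes it easier to modify; a minimal sketch using urlencode from the standard library (the parameter names follow the FDSNWS-Station specification, and the percent-encoded wildcard and comma are accepted by the service):
from urllib.parse import urlencode
# build the same channel-level station query from a parameter dictionary
params = {'network': 'FR,RA', 'channel': 'HN?', 'format': 'text', 'level': 'channel'}
sta_request = 'http://ws.resif.fr/fdsnws/station/1/query?' + urlencode(params)
urllib.request.urlretrieve(sta_request, RESIF_station_file)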
A few columns from RESIF_station_file are selected and stored in the variable RAP_sta, with the corresponding column header names in RAP_col. The function read_sta_file uses read_csv from the pandas package.
RESIF_sta_names = list(['NET', 'STA', 'LOCID', 'CHA', 'LAT', 'LON', 'ELEV',
'DEP', 'AZ', 'DIP', 'INSTRUM', 'SCALE', 'SFREQ',
'SUNIT', 'SAMPRATE','START', 'END'])
RAP_sta, RAP_col = read_sta_file(RESIF_station_file)
def read_sta_file(filename):
    # read the '|'-separated station file into a pandas DataFrame
    pd_sta = pd.read_csv(filename, sep='|', header=0, names=RESIF_sta_names)
    # keep only the columns of interest
    short_list = list(['NET', 'STA', 'LOCID', 'LAT', 'LON', 'INSTRUM', 'SAMPRATE', 'START', 'END'])
    values = pd_sta[short_list].values
    # remove duplicate rows (convert the set of tuples to a list before stacking)
    values_unique = np.vstack(list({tuple(row) for row in values}))
    # strip whitespace from the station codes
    ista = short_list.index('STA')
    for v in values_unique:
        v[ista] = v[ista].strip()
    return values_unique, short_list
To transfer information between the various requests, a Python class called SeismicStation is created, and station objects are instantiated. Every station object has attributes assigned to it that can be accessed from anywhere in the program, for example self.lat and self.start_times in the code below. The word self refers to the station object. All station objects are gathered in the dictionary sta_dict.
END_TIME = UTCDateTime(2017, 1, 1)
# create station objects using class SeismicStation
sta_dict = create_station_objects(RAP_sta, RAP_col)
def create_station_objects(X, names_list):
    ista = names_list.index('STA')
    stations = X[:, ista]
    unique_stations = np.unique(stations)
    stadict = {}
    # create one SeismicStation object per unique station code
    for sta in unique_stations:
        lines = X[X[:, ista] == sta, :]
        stadict[sta] = SeismicStation(lines, names_list)
    return stadict
class SeismicStation(object):
    def __init__(self, sta_lines, sta_col_names):
        nr, nc = sta_lines.shape
        # column indexes of the metadata we need
        ista = sta_col_names.index('STA')
        ilat = sta_col_names.index('LAT')
        ilon = sta_col_names.index('LON')
        istart = sta_col_names.index('START')
        iend = sta_col_names.index('END')
        self.sta = np.unique(sta_lines[:, ista])[0]
        lats = np.empty(nr, dtype=float)
        lons = np.empty(nr, dtype=float)
        start_times = np.empty(nr, dtype=object)
        end_times = np.empty(nr, dtype=object)
        for i in range(nr):
            line = sta_lines[i]
            lats[i] = float(line[ilat])
            lons[i] = float(line[ilon])
            start_times[i] = UTCDateTime(line[istart])
            try:
                end_times[i] = UTCDateTime(line[iend])
            except TypeError:
                # case of missing end time
                end_times[i] = END_TIME
            if end_times[i] > END_TIME:
                end_times[i] = END_TIME
        self.lat = np.mean(lats)
        self.lon = np.mean(lons)
        # operating periods, sorted by start time
        self.n_periods = len(start_times)
        time_indexes = np.argsort(start_times)
        self.start_times = start_times[time_indexes]
        self.end_times = end_times[time_indexes]
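Once sta_dict has been built, the attributes set in the constructor can be inspected directly on each station object; a minimal sketch:
# print a one-line summary per station object
for code, sta in sorted(sta_dict.items()):
    print(code, sta.lat, sta.lon, sta.n_periods, sta.start_times[0], sta.end_times[-1])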
Station Metadata at Response Level
A station request at response level yields more detailed metadata than the first station request at channel level. In the URL, 'level=channel' is changed to 'level=response'. The networks (FR, RA) and channels (HN?) are the same as before. The StationXML file RESIF_response_file is produced by urlretrieve.
RESIF_response_file = 'RESIF_response_file.xml'
# request station metadata at response level
resp_request = 'http://ws.resif.fr/fdsnws/station/1/query?network=FR,RA&channel=HN?&level=response'
urllib.request.urlretrieve(resp_request, RESIF_response_file)
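The response-level StationXML file is not used further in this template, but it can, for example, be read into an ObsPy Inventory object and later be used to deconvolve instrument responses from waveform data; a minimal sketch (the remove_response call is shown commented out because it needs a Stream st, which is only requested further below):
from obspy import read_inventory
# parse the downloaded StationXML file into an Inventory object
inv = read_inventory(RESIF_response_file)
# example use, once a Stream `st` has been read:
# st.remove_response(inventory=inv, output="ACC")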
II. Event Webservice
For every station object, data of events within a 50 km radius (event_radius) of that station are requested.
RENASS_TSV_NAMES = list(['ID', 'OTIME', 'LAT', 'LON', 'MAG', 'MAGTYPE',
'DEPTH', 'DEPTYPE', 'CAT', 'EVAL1', 'NSTA', 'NPHA',
'AZGAP', 'EVAL2', 'TOWN', 'DIST'])
event_radius = 50
for sta in sta_dict.values():
    # get event metadata
    sta.get_ReNaSS_events(event_radius)
    # get event waveform data
    sta.get_RESIF_data(nmax=None)
Event Metadata
In the class method get_ReNaSS_events, event metadata is retrieved and processed. A web request is set up, similar to the previous one, requesting event metadata from ReNaSS (Réseau National de Surveillance Sismique), since RESIF does not support FDSNWS-Event. The data format is TSV (tab-separated values), which can be read by read_csv from the pandas package. Useful metadata is passed to attributes of the station objects.
#class SeismicStation(object):
def get_ReNaSS_events(self, radius_km):
    # set up web-request
    request = 'http://renass.unistra.fr/fdsnws/event/1/query?orderby=time&format=tsv'
    request = request + '&longitude=%.2f' % self.lon
    request = request + '&latitude=%.2f' % self.lat
    request = request + '&starttime=%s' % self.start_times[0].isoformat()
    request = request + '&endtime=%s' % self.end_times[-1].isoformat()
    request = request + '&maxradius=%.2f' % (radius_km / 111.25)
    # retrieve catalog data into a temporary file
    f_, fname = tempfile.mkstemp()
    os.close(f_)
    urllib.request.urlretrieve(request, fname)
    # read data and remove file
    pd_ev = pd.read_csv(fname, sep='\t', names=RENASS_TSV_NAMES)
    os.unlink(fname)
    # get only the info we want to keep
    short_list = list(['ID', 'OTIME', 'MAG'])
    values = pd_ev[short_list].values
    iid = short_list.index('ID')
    itime = short_list.index('OTIME')
    imag = short_list.index('MAG')
    nr, nc = values.shape
    # clean data
    ids = values[:, iid]
    otimes = np.empty(nr, dtype=object)
    ev_mags = np.empty(nr, dtype=float)
    for i in range(nr):
        v = values[i]
        otimes[i] = UTCDateTime(v[itime])
        ev_mags[i] = float(v[imag])
    # store the events, sorted by origin time, as station attributes
    self.n_events = len(otimes)
    time_indexes = np.argsort(otimes)
    self.ev_ids = ids[time_indexes]
    self.otimes = otimes[time_indexes]
    self.ev_mags = ev_mags[time_indexes]
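After get_ReNaSS_events has been called for a station, the sorted event attributes can be inspected directly on the station object; a small sketch:
# summarize the events found near each station
for sta in sta_dict.values():
    for ev_id, otime, mag in zip(sta.ev_ids, sta.otimes, sta.ev_mags):
        print(sta.sta, ev_id, otime.isoformat(), mag)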
Event Waveform Data
For the actual waveform data, an FDSNWS-Dataselect query is constructed in the class method get_RESIF_data. This request gathers miniSEED waveform data for the events identified in the event metadata query. There will not be waveform data available for every event, since the station may not have been active in the requested time frame. The read function from ObsPy is used here to read the file obtained from the web request into a Stream object (for more info, see the ObsPy workflow below) and write it to a SAC file. The SAC file can subsequently be plotted and processed in SAC. Writing to other formats, such as miniSEED (format="MSEED"), is possible; see the short example after the code below.
#class SeismicStation(object):
def get_RESIF_data(self, nmax=None):
    if nmax is not None:
        n_ev = nmax
    else:
        n_ev = self.n_events
    data = np.empty(n_ev, dtype=object)
    n_data = 0
    # make sure the output directory for the waveform files exists
    if not os.path.isdir('data'):
        os.makedirs('data')
    # get data from 1 minute before (t1) to 10 minutes after (t2) origin times
    for otime in self.otimes[0:n_ev]:
        print(otime)
        t1 = otime - 60
        t2 = otime + 10*60
        request = 'http://ws.resif.fr/fdsnws/dataselect/1/query?station=%s&channel=HN?&starttime=%s&endtime=%s&quality=B' % (self.sta, t1, t2)
        f_, fname = tempfile.mkstemp()
        os.close(f_)
        urllib.request.urlretrieve(request, fname)
        try:
            st = read(fname)
            os.unlink(fname)
            if len(st) > 0:
                data[n_data] = st
                n_data += 1
                st.write("data/%s_%s.sac" % (self.sta, otime), format="SAC")
                print('Waveform data found!')
        except TypeError:
            os.unlink(fname)
            print('no RESIF data in this time window')
            continue
    data.resize(n_data)
    self.data = data
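As noted above, other output formats only require changing the format argument; inside get_RESIF_data the SAC line could, for example, be replaced by its miniSEED equivalent:
# write the Stream as miniSEED instead of SAC
st.write("data/%s_%s.mseed" % (self.sta, otime), format="MSEED")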
Figure showing raw waveform data of an event requested from RESIF FDSNWS-Dataselect.
ObsPy FDSN Client
An alternative way of requesting waveform data and station/event metadata is through ObsPy, a Python framework for seismology. The obspy.clients.fdsn module is in many cases the best option, because of the large number of data centers it supports and its handling of modern data formats. See the ObsPy tutorial for documentation and more information on this module. With this module, the user can obtain station metadata, event metadata and waveform data, and instantly plot the results. The complete source code for this template can be found here.
Start by initializing a client object specifying either the name of the webservice, e.g. "RESIF", or the base URL, e.g. "http://ws.resif.fr".
from obspy.clients.fdsn import Client
import numpy as np
from obspy.core import UTCDateTime, read, Stream
# specify webservice
client = Client("RESIF")
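Equivalently, the client can be initialized with the base URL of the webservice instead of its short name:
# the same client, initialized with the base URL instead of the short name
client = Client(base_url="http://ws.resif.fr")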
Three data requests that the client module supports are get_stations, get_events and get_waveforms, although not every data center supports all three services. As mentioned before, RESIF, among others, does not support get_events, which is why the client is switched to IRIS for that part of the workflow.
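Which services a given data center offers can be checked on the client object itself; printing the client gives a human-readable summary, and the services attribute can be queried programmatically, as in this small sketch:
# list the FDSN services advertised by the selected data center
print(client)
print(sorted(client.services.keys()))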
I. Station metadata at channel level
To get station information at channel level, use the get_stations method. An Inventory object is created, which has a network-station-channel hierarchy. Station codes and coordinates, amongst others, are attributes of the station objects that this request yields.
# request station metadata at channel level
station_chan = client.get_stations(network="FR,RA",station="*",channel="HN?",level="channel")
# plot stations
station_chan.plot()
# number of stations for FR network
nsta_FR = station_chan[0].selected_number_of_stations
# number of stations for RA network
nsta_RA = station_chan[1].selected_number_of_stations
# get coordinates of every station
FR_coords = np.empty([nsta_FR, 2])
RA_coords = np.empty([nsta_RA, 2])
for k in range(nsta_FR):
    FR_coords[k, 0] = station_chan[0][k].latitude
    FR_coords[k, 1] = station_chan[0][k].longitude
# get station names
FR_sta = ["" for x in range(nsta_FR)]
for p in range(nsta_FR):
    FR_sta[p] = station_chan[0][p].code
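Instead of indexing into the Inventory by position, the coordinates of a specific channel can also be looked up by its SEED identifier with get_coordinates; a small sketch (the SEED id below is only an illustrative example):
# look up coordinates by SEED id (illustrative example id)
coords = station_chan.get_coordinates("FR.GRN.00.HNZ")
print(coords["latitude"], coords["longitude"], coords["elevation"])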
II. Station metadata at response level
To get metadata at response level, simply change the level keyword argument in get_stations from "channel" to "response". Instrument responses can be plotted with plot_response (see figure below).
# request station metadata at response level
client = Client("RESIF")
station_resp = client.get_stations(network="FR,RA", station="*", channel="HN?", level="response")
# plot responses for network FR, station GRN
station_resp.plot_response(min_freq=0.001,network="FR",station="GRN")
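The response information for a single channel can also be extracted from the inventory, for instance to check the overall instrument sensitivity; a minimal sketch (SEED id and date are illustrative examples):
# extract the Response object for one channel at a given time (illustrative id and date)
resp = station_resp.get_response("FR.GRN.00.HNZ", UTCDateTime(2015, 1, 1))
print(resp.instrument_sensitivity.value, resp.instrument_sensitivity.input_units)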
III. Event data
The get_events method in the code below finds events from the first of January until the first of March 2002 and requests their metadata, creating a Catalog object. This information can be written to file in, for example, QuakeML format. Similar to the previous workflow example, events can be requested within a certain distance of a station using the maxradius parameter. To reduce computation time, only stations that were active in the requested time window are taken into consideration. For the events that are found, waveform data is requested using get_waveforms, creating Stream objects that contain one or more Trace objects. Common data processing operations, such as removing the instrument response, filtering and resampling, can be performed on Stream objects and their traces; a short processing sketch follows the code below. For an overview of built-in ObsPy methods, refer to the ObsPy documentation.
# get event metadata of events within radius=0.5 degrees of stations
starttime = UTCDateTime("2002-01-01")
endtime = UTCDateTime("2002-03-01")
event_radius = 0.5 # degrees
for p in range(nsta_FR):
    # create empty Stream object
    st = Stream()
    # check if station was active during requested time frame
    act = station_chan[0][p].is_active(starttime=starttime, endtime=endtime)
    if act:
        lat = FR_coords[p, 0]  # station latitude
        lon = FR_coords[p, 1]  # station longitude
        client = Client("IRIS")
        try:
            # produces a Catalog object
            FR_events = client.get_events(starttime=starttime, endtime=endtime,
                                          latitude=lat, longitude=lon,
                                          maxradius=event_radius)
            FR_events.write("%s_events.xml" % FR_sta[p], format="QUAKEML")
            n_events = FR_events.count()  # number of events found
            print('%i event(s) found for station %s' % (n_events, FR_sta[p]))
            # request waveform data from origin time minus 1 minute to origin time plus 10 minutes:
            client = Client("RESIF")
            for j in range(n_events):
                # event origin time
                otime = UTCDateTime(FR_events[j].origins[0].time)
                try:
                    st += client.get_waveforms("FR", FR_sta[p], "*", "HN?",
                                               otime - 60, otime + 10 * 60)
                    print('Waveform data found for station %s event %i!' % (FR_sta[p], j + 1))
                except Exception:
                    print('No waveform data found for station %s event %i...' % (FR_sta[p], j + 1))
                    continue
            FR_events.plot(projection="local", title="Event locations %s" % FR_sta[p])
            st.plot()
            # write data to file
            st.write("%s_data.MSEED" % FR_sta[p], format="MSEED")
        except Exception:
            print('No events found for station %s...' % FR_sta[p])
            continue
    else:
        print('Station %s was not active in requested time frame' % FR_sta[p])
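As mentioned above, standard processing can be applied to each resulting Stream, for instance directly after the waveform request inside the loop; a minimal sketch using the response-level inventory requested earlier (the filter corners, resampling rate and output unit are arbitrary example values):
# example post-processing of the requested Stream (example parameter values)
st.detrend("demean")
st.remove_response(inventory=station_resp, output="VEL")
st.filter("bandpass", freqmin=0.5, freqmax=20.0)
st.resample(50.0)
st.plot()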
Figures showing a) raw waveform data (miniSEED), b) event locations and magnitudes, and c) instrument frequency response for three components.