DWR: A module for accessing Daily Weather Report observations

This module provides a Python API to the data recently digitised from the UK Daily Weather Reports.

Access to the data is through the load_observations() function:

import DWR
import datetime
obs=DWR.load_observations('prmsl',
                          datetime.datetime(1903,10,1,0),
                          datetime.datetime(1903,10,31,23))

loads the mean-sea-level pressure (prmsl) observations for the whole of October 1903 as a pandas.DataFrame.

Observations are typically made twice a day, but it is often useful to estimate the observed value at other times of day. at_station_and_time() does this:

value=DWR.at_station_and_time(obs,'ABERDEEN',
                              datetime.datetime(1903,10,12,13,30))

estimates the Aberdeen observation at 1:30pm on 12 October 1903 by linear interpolation between the nearest previous and subsequent observations. obs is the pandas.DataFrame loaded above.

It’s nice to have easy access to station locations:

posn=DWR.get_station_location(obs,'ABERDEEN')

gets the position (posn[‘latitude’] and posn[‘longitude’]) of the Aberdeen station. Note that these positions have low accuracy: the DWR has poor station metadata and we often don’t know which station was in use at which time. This call returns the location of Aberdeen, rather than the location of the weather station in Aberdeen.

The station names are stored in ALLCAPS with no spaces; for pretty output we want correctly capitalised and spaced strings:

DWR.pretty_name('FORTWILLIAM')

returns ‘Fort William’.


DWR.at_station_and_time(obs, station, dte)

Get, from these observations, the value at the selected station and time.

Typically there are observations from each station only twice a day (sometimes less often). To get the observed value at the specified time we interpolate linearly in time, using only observations from the selected station.

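Parameters:
  • obs (pandas.DataFrame) – Observations, as returned by load_observations().
  • station (str) – Station name as in the data file (e.g. ‘ABERDEEN’).
  • dte (datetime.datetime) – Time at which to estimate the value.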
Returns:

Interpolated observed value from station at time.

Return type:

float

Raises:

StandardError – obs does not contain at least two values for the selected station, one before and one after the specified time, so interpolation is not possible.
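
The interpolation described above can be illustrated with a minimal standalone sketch. This is not the module's own implementation: the function name interpolate_in_time, the use of ValueError, and the two pressure readings are assumptions made for illustration only.

```python
import datetime

def interpolate_in_time(before, after, target):
    """Linearly interpolate between two (time, value) observations.

    before and after are (datetime, float) pairs that bracket target.
    Hypothetical helper, not part of the DWR module.
    """
    (t0, v0), (t1, v1) = before, after
    if not (t0 <= target <= t1):
        # The DWR module is documented to raise StandardError here;
        # ValueError is used in this sketch as a stand-in.
        raise ValueError("target time is not bracketed by the observations")
    if t1 == t0:
        return v0
    # Fraction of the way from the earlier to the later observation
    weight = (target - t0).total_seconds() / (t1 - t0).total_seconds()
    return v0 + weight * (v1 - v0)

# Made-up morning and evening pressure readings (hPa), for illustration only
morning = (datetime.datetime(1903, 10, 12, 8), 1010.0)
evening = (datetime.datetime(1903, 10, 12, 18), 1000.0)
value = interpolate_in_time(morning, evening,
                            datetime.datetime(1903, 10, 12, 13, 0))
print(value)  # 1005.0
```

The target time lies halfway between the two readings, so the estimate is the midpoint of the two values.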


DWR.at_station_and_time_with_distance(obs, station, dte)

Get, from these observations, the value at the selected station and time, along with the time gap (in seconds) between the given interpolated value and a real observation.

Typically there are observations from each station only twice a day (sometimes less often). To get the observed value at the specified time we interpolate linearly in time, using only observations from the selected station.

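Parameters:
  • obs (pandas.DataFrame) – Observations, as returned by load_observations().
  • station (str) – Station name as in the data file (e.g. ‘ABERDEEN’).
  • dte (datetime.datetime) – Time at which to estimate the value.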
Returns:

Interpolated observed value from station at time.

Return type:

float

Raises:

StandardError – obs does not contain at least two values for the selected station, one before and one after the specified time, so interpolation is not possible.
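
One way to compute such a time gap is sketched below. It assumes that the "real observation" is the one closest in time to the requested time, which the documentation above does not state explicitly; the helper name seconds_to_nearest and the observation times are hypothetical.

```python
import datetime

def seconds_to_nearest(obs_times, target):
    """Gap in seconds between target and the closest time in obs_times.

    Hypothetical helper, not part of the DWR module.
    """
    if not obs_times:
        raise ValueError("no observation times supplied")
    return min(abs((target - t).total_seconds()) for t in obs_times)

# Illustrative observation times: 08:00 and 18:00 on 12 October 1903
times = [datetime.datetime(1903, 10, 12, 8),
         datetime.datetime(1903, 10, 12, 18)]
gap = seconds_to_nearest(times, datetime.datetime(1903, 10, 12, 13, 30))
print(gap)  # 16200.0 (4.5 hours to the 18:00 observation)
```

A large gap indicates that the interpolated value is far from any actual measurement and should be treated with more caution.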


DWR.get_station_location(obs, station)

Get, from these observations, the location (lat and lon) of the named station.

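Parameters:
  • obs (pandas.DataFrame) – Observations, as returned by load_observations().
  • station (str) – Station name as in the data file (e.g. ‘ABERDEEN’).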
Returns:

Dictionary with keys ‘latitude’ and ‘longitude’.

Return type:

dict

Raises:

StandardError – obs does not contain any observations for selected station.


DWR.load_observations(variable, start, end)

Load observations from disc for the selected period.

Data must be available in directory $SCRATCH/DWR. A data file is required for every calendar month in the period from start to end.

Parameters:
  • variable (str) – Variable name (‘prmsl’, ‘air.2m’, ‘prate’, etc.)
  • start (datetime.datetime) – Get observations at or after this time.
  • end (datetime.datetime) – Get observations before this time.
Returns:

Dataframe of observations.

Return type:

pandas.DataFrame

Raises:

IOError – No data on disc for this variable, for a year and month in the selected period.
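
Because one data file is needed per calendar month, it can help to list the months a period touches before loading. The sketch below does this with the standard library only. The helper months_in_period is hypothetical and is not claimed to be how load_observations() works internally; it also includes the month containing end even when end falls exactly on a month boundary.

```python
import datetime

def months_in_period(start, end):
    """List the (year, month) pairs touched by the period start..end.

    Hypothetical helper, not part of the DWR module.
    """
    months = []
    year, month = start.year, start.month
    # Tuples compare element by element, so this stops after end's month
    while (year, month) <= (end.year, end.month):
        months.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months

print(months_in_period(datetime.datetime(1903, 11, 15),
                       datetime.datetime(1904, 1, 10)))
# [(1903, 11), (1903, 12), (1904, 1)]
```

Each pair could then be checked against the data directory, or passed to a per-month loader, to find missing files before an IOError is raised.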


DWR.load_observations_1file(variable, year, month)

Load all observations for one calendar month (for one variable).

Data must be available in directory ../../data.

Parameters:
  • variable (str) – Variable name (‘prmsl’, ‘air.2m’, ‘prate’, etc.)
  • year (int) – Calendar year of the observations.
  • month (int) – Calendar month of the observations (1-12).
Returns:

Dataframe of observations.

Return type:

pandas.DataFrame

Raises:

IOError – No data on disc for this variable, year, and month.


DWR.pretty_name(name)

Convert station names from DATAFORMAT to Print Format.

The station names included in the DWR data files are in all caps and contain no spaces. This function maps them to a readable format, so FORTWILLIAM becomes ‘Fort William’.

Parameters:
  • name (str) – Name as in data file (e.g. ‘CAPGRISNEZ’)
Returns:

Name in readable format (e.g. ‘Cap Gris-Nez’)

Return type:

str
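
Since the data-file names carry no word boundaries, a mapping of this kind most plausibly relies on a lookup table. The sketch below shows that idea. The table READABLE_NAMES, the function readable_name, and the capitalisation fallback are all assumptions for illustration; only the two name pairs come from the documentation above, and the module's actual mechanism may differ.

```python
# Hypothetical lookup table; only these two entries come from the text above.
READABLE_NAMES = {
    'FORTWILLIAM': 'Fort William',
    'CAPGRISNEZ': 'Cap Gris-Nez',
}

def readable_name(name):
    """Map a data-file station name to a readable one.

    Hypothetical helper, not the module's pretty_name().
    """
    # Assumed fallback: single-word names only need their case fixed.
    return READABLE_NAMES.get(name, name.capitalize())

print(readable_name('FORTWILLIAM'))  # Fort William
print(readable_name('ABERDEEN'))     # Aberdeen
```

An explicit table handles cases such as ‘Cap Gris-Nez’, where spacing, hyphenation, and capitalisation cannot be recovered from the all-caps form alone.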