How to cite this WIREs title:
WIREs Water
Impact Factor: 4.436

The ‘dirty dozen’ of freshwater science: detecting then reconciling hydrological data biases and errors


Sound water policy and management rest on sound hydrometeorological and ecological data. Conversely, unrepresentative, poorly collected, or erroneously archived data introduce uncertainty regarding the magnitude, rate, and direction of environmental change, in addition to undermining confidence in decision‐making processes. Unfortunately, data biases and errors can enter the information flow at various stages, from site selection, instrumentation, and sampling/measurement procedures through postprocessing to archiving systems. Techniques such as visual inspection of raw data, graphical representation and comparison between sites, outlier and trend detection, and referral to metadata can all help uncover spurious data. Tell‐tale signs of ambiguous and/or anomalous data are highlighted using 12 carefully chosen cases drawn mainly from hydrology (‘the dirty dozen’). These include evidence of changes in site or local conditions (due to land management, river regulation, or urbanization); modifications to instrumentation or inconsistent observer behavior; mismatched or misrepresentative sampling in space and time; and treatment of missing values, postprocessing, and data storage errors. As well as raising awareness of pitfalls, recommendations are provided for uncovering lapses in data quality after the information has been gathered. It is noted that error detection and attribution are more problematic for very large data sets, where observation networks are automated, or when various information sources have been combined. In these cases, more holistic indicators of data integrity are needed that reflect the overall information life‐cycle and application(s) of the hydrological data.

WIREs Water 2017, 4:e1209. doi: 10.1002/wat2.1209

This article is categorized under:
Science of Water > Methods
Science of Water > Water and Environmental Change
Archiving errors in river flow records. Station a: missing data code (198) interpreted as actual flow data; Station b: rounding of flows greater than 1 cubic meter per second (cumec) to whole numbers; Station c: decimalization change; Station d: suspect low values set to zero. Data sources: (a): FRIEND European Water Archive (EWA), Germany; (b): German Federal Institute of Hydrology (BfG), Germany (including data acquired from the water authorities of the German Federal States); (c): Office of Public Works (OPW), Ireland; (d): Centre for Hydrographic Studies (CEDEX), Spain.
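Two of the archiving errors in this caption (a missing data code stored as a flow, and rounding of larger flows to whole numbers) lend themselves to simple automated screening. The following is a minimal sketch, not the authors' method: the function names, the `min_share` threshold, and the sample flows are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def suspect_sentinels(flows, min_share=0.05):
    """Return values that make up at least `min_share` of the record.

    In a continuous flow series, one exactly repeated value that dominates
    the record may be a missing-data code archived as a measurement.
    The threshold is an assumption and should be tuned per record.
    """
    counts = Counter(flows)
    n = len(flows)
    return sorted(v for v, c in counts.items() if c / n >= min_share)


def whole_number_share(flows, threshold=1.0):
    """Share of flows above `threshold` that are whole numbers.

    A share close to 1 suggests that larger flows were rounded on entry.
    """
    high = [q for q in flows if q > threshold]
    if not high:
        return 0.0
    return sum(1 for q in high if float(q).is_integer()) / len(high)


# Hypothetical daily flows (cumecs); 198 stands in for a missing-data code.
flows = [0.42, 0.57, 198, 0.61, 198, 3.0, 7.0, 0.88, 12.0, 198]
print(suspect_sentinels(flows, min_share=0.2))
print(whole_number_share(flows))
```

Any value flagged this way is only a candidate for inspection; station metadata would still be needed to confirm that it is a code rather than a genuine flow.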
An information‐flow that begins by setting project objectives and ends with data archiving, dissemination and use. Data biases and errors can enter the information‐flow at any point in between.
Gauged river flow records for the River Rother, UK. (a) Data completeness for the three gauges, with light blue illustrating partial data; (b) histogram of river flows for gauge C, fitted with a log‐normal distribution (red); (c) schematic of gauge locations and metadata. Dark blue river reaches are measured by gauges A‐B‐C, light blue ones are not; numbers in brackets are the NRFA station codes; areas represent the upstream catchment size, and x symbols indicate other gauges. Data source: UK National River Flow Archive (http://nrfa.ceh.ac.uk/).
(a) Discharge and suspended sediment concentration (SSC) time series for the proglacial river of the Finsterwalder Glacier, Norway, showing two periods of very high values compared to background levels (East and West are rivers draining the glacier margins, which coalesce downstream to form the Outlet river); (b) the same data converted to suspended sediment load (SSL) and integrated over time, a process that yields improved characterization of the sediment transport regime compared with simple measures of central tendency and dispersion (the proximal flux is the sum of the East and West fluxes; the distal flux is the Outlet flux); (c) histogram of the SSC showing the multimodal nature of the data.
(a) 10‐s resolution turbidity record (gray) with a 1‐min moving average (black) over 10 h in a still‐water laboratory aquarium with silty substrate and one Signal Crayfish, which was left for 1 h near the beginning of the experiment and then removed. Note that spikes occur only when the crayfish is present, with gradually decreasing turbidity after crayfish removal. (b) 5‐min resolution turbidity record (black) for a tributary of the River Nene, UK, colonized by crayfish, showing more frequent spikes during night hours (labels are at midnight) and a strong diurnal structure in the mean turbidity. During this period other instruments confirmed that there were no changes in hydraulics capable of driving these turbidity fluctuations. It was concluded that individual spikes reflect fine sediment entrainment caused by foraging, burrowing, or fighting events, which increase at night because crayfish are nocturnal. The diurnal pattern reflects the net effect of this enhanced night‐time activity on mean turbidity. A second turbidity sensor (red line), identical to that in the river, was deployed in an open‐top aquarium filled with clean water and situated on the river bed adjacent to the first. The flat trace confirms that the signal from the river is not an instrument artifact driven by diurnal variations in light or temperature, which can affect the optical measurement of turbidity in some sensors. The small spikes that do occur fall within the manufacturer's stated error, are randomly distributed around the mean, and do not show any temporal structure, which suggests that they reflect instrument noise.
Schematic of continuous river discharge measurement with a schedule of discrete biological surveys (numbered 1–4) within an autumn sampling season. Eco‐sample 1 is collected under steady/low flow conditions; 2 during a period of catchment rewetting; 3 near to and 4 following the peak discharge. A denotes the start of the hydrological year in the UK.
Time of day when spot samples of river water temperatures were taken at Glutton, River Dove, Derbyshire, UK (Reprinted with permission from Ref ).
Evidence of observer biases in (a, b) recorded values and (c) day of the week for daily precipitation amounts recorded at Dushanbe, Tajikistan. Data source: NOAA Global Summary of the Day.
Pre‐ and postreservoir data for Shell Brook, UK (NRFA 41024): (a) daily river flow hydrographs in 1972 and 2005; (b) flow duration curves for 1971–1977 and 1978–2015; (c) 5th, 50th, and 95th flow quantiles for the same periods as (b).
Before and after correcting for change in datum. Stage records for the Comite river near Comite, Louisiana (USGS site number 07378000) are publicly available on the USGS National Water Information System website. The online Water Year Report states ‘From Oct. 1, 1978 to Sept. 30, 1996, at current datum. From Oct. 1, 1996 to Sept. 30, 2001, at datum 2.00 ft lower.’ Therefore, the stage time series were adjusted to the same datum by subtracting two feet from the measured stage between 1 October 1996 and 30 September 2001 (i.e., water years 1997–2001). The measurements made during this period are shown as red circles, before (a) and after (b) datum correction.
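The datum adjustment described in this caption amounts to a date‐conditional offset. A minimal sketch follows, assuming stage readings are held as (date, stage in feet) pairs; the dates and the 2.00 ft offset come from the caption, while the function name and the sample readings are hypothetical.

```python
from datetime import date

# Period recorded at a datum 2.00 ft lower, per the Water Year Report quoted above.
DATUM_OFFSET_FT = 2.00
LOWER_DATUM_START = date(1996, 10, 1)
LOWER_DATUM_END = date(2001, 9, 30)


def to_current_datum(obs_date, stage_ft):
    """Express a stage reading relative to the current datum.

    Readings taken while the datum was 2.00 ft lower are 2.00 ft too high
    relative to the current datum, so the offset is subtracted.
    """
    if LOWER_DATUM_START <= obs_date <= LOWER_DATUM_END:
        return stage_ft - DATUM_OFFSET_FT
    return stage_ft


# Hypothetical readings for illustration only (not USGS data).
records = [
    (date(1995, 5, 1), 10.0),
    (date(1998, 3, 15), 12.5),
    (date(2002, 1, 10), 9.0),
]
adjusted = [(d, to_current_datum(d, s)) for d, s in records]
for d, s in adjusted:
    print(d.isoformat(), s)
```

Only the middle reading falls inside water years 1997–2001, so only that value changes.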
AMAX series for the Harper's Brook at Old Mill Bridge, UK (NRFA 32003). A compound Crump profile weir was installed in 1965. The black dots show the linear trend for the whole record, with the equation given in the top right corner. Horizontal blue and red lines show the AMAX mean of the records pre‐ and post‐1965, respectively.
(a) Unadjusted monthly mean minimum temperatures smoothed with a 12‐month running mean at urban (USHCN ID: 166664) and rural (USHCN ID: 168163) weather stations. (b) The locations of the stations in (a) are shown on a map of night‐time lights generated from the Defense Meteorological Satellite Program's Operational Linescan System (http://ngdc.noaa.gov/eog/). The lower panel shows the locations in detail, with example stations marked with a white cross. The urban station is situated in the bright area of New Orleans.
Observed (blue) and modeled (red) annual mean flow for the River Boyne catchment 1952–2009 (Office of Public Works station number 07012). The gray shaded area represents the years in which arterial drainage took place (1969–1986). Dashed horizontal lines are median observed and modeled flows in pre‐ and postdrainage periods. The black vertical line is the change point in observed flow in 1978, detected by Pettitt's test (Reprinted with permission from Ref ).
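Pettitt's test, used in this caption to locate the change point, is a rank‐based test for a single shift in the median of a series. The following is a minimal sketch using the standard statistic and its usual approximate p‐value; the function name and the sample series are hypothetical, and this is not the code used to produce the figure.

```python
import math


def pettitt(x):
    """Pettitt's change-point test for a single shift in a series.

    Returns (t, K, p), where t is the number of values in the first segment
    at the most likely change point, K = max|U_t|, and p is the usual
    approximation 2 * exp(-6 K^2 / (n^3 + n^2)), capped at 1.
    """
    n = len(x)
    best_k, best_t = 0, 0
    for t in range(1, n):
        # U_t sums sign(x_i - x_j) over all i in the first t values
        # and all j in the remaining n - t values.
        u = sum((a > b) - (a < b) for a in x[:t] for b in x[t:])
        if abs(u) > best_k:
            best_k, best_t = abs(u), t
    p = min(1.0, 2.0 * math.exp(-6.0 * best_k ** 2 / (n ** 3 + n ** 2)))
    return best_t, best_k, p


# Hypothetical annual mean flows with an upward step after the sixth value.
flows = [10, 11, 9, 10, 12, 11, 15, 16, 14, 17, 15, 16]
t, k, p = pettitt(flows)
print(t, k, round(p, 4))
```

The double loop is quadratic in the record length for each candidate split, which is acceptable for annual series of a few decades but would need a rank‐based formulation for long daily records.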
(a) Comparison of the original (red line) and posthomogenized (blue line) annual rainfall record for MH, Ireland. Also shown are the regression equations for the linear trend in each series. (b) Double mass plots for original and homogenized annual precipitation series at MH compared with a nearby station in Derry.
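A double mass plot of the kind referred to in panel (b) pairs the cumulative totals of a target station with those of a reference station; a change in slope suggests an inhomogeneity in one of the records. The following is a minimal sketch with hypothetical function names and made‐up annual totals, not the homogenization procedure applied to the MH record.

```python
from itertools import accumulate


def double_mass(target, reference):
    """Return (cumulative reference, cumulative target) points."""
    return list(zip(accumulate(reference), accumulate(target)))


def segment_slope(points, start, end):
    """Slope of the double mass curve between two point indices."""
    (x0, y0), (x1, y1) = points[start], points[end]
    return (y1 - y0) / (x1 - x0)


# Hypothetical annual rainfall totals (mm). From the fifth year onward the
# target station reads 20% higher than the reference.
reference = [1000, 1100, 900, 1000, 1000, 1100, 900, 1000]
target = [1000, 1100, 900, 1000, 1200, 1320, 1080, 1200]

points = double_mass(target, reference)
early = segment_slope(points, 0, 3)
late = segment_slope(points, 3, 7)
print(early, late)
```

A stable slope before the break and a steeper one after it is the pattern that would prompt a check of station metadata for a change in gauge, site, or observing practice.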
