Improved Statistical Method for Quality Control of Hydrographic Observations

Type Article
Date 2020-05
Language English
Author(s) Gourrion Jerome1, Szekely Tanguy1, Killick Rachel2, Owens Breck3, Reverdin Gilles4, Chapron Bertrand5
Affiliation(s) 1 : OceanScope, Plouzané, France
2 : Met Office, Exeter, United Kingdom
3 : Woods Hole Oceanographic Institution, Woods Hole, Massachusetts
4 : Sorbonne-Université, CNRS/IRD/MNHN (LOCEAN UMR 7159), Paris, France
5 : LOPS, Ifremer, Plouzané, France
Source Journal of Atmospheric and Oceanic Technology (0739-0572) (American Meteorological Society), 2020-05, Vol. 37, N. 5, P. 789-806
DOI 10.1175/JTECH-D-18-0244.1
WOS© Times Cited 5
Keyword(s) Ocean, Climatology, Salinity, Temperature, Data quality control, Oceanic variability

Realistic ocean state prediction and its validation rely on the availability of high quality in situ observations. To detect data errors, adequate quality check procedures must be designed. This paper presents procedures that take advantage of the ever-growing observation databases that provide climatological knowledge of the ocean variability in the neighborhood of an observation location. Local validity intervals are used to estimate binarily whether the observed values are considered as good or erroneous. Whereas a classical approach estimates validity bounds from first- and second-order moments of the climatological parameter distribution, that is, mean and variance, this work proposes to infer them directly from minimum and maximum observed values. Such an approach avoids any assumption of the parameter distribution such as unimodality, symmetry around the mean, peakedness, or homogeneous distribution tail height relative to distribution peak. To reach adequate statistical robustness, an extensive manual quality control of the reference dataset is critical. Once the data have been quality checked, the local minima and maxima reference fields are derived and the method is compared with the classical mean/variance-based approach. Performance is assessed in terms of statistics of good and bad detections. It is shown that the present size of the reference datasets allows the parameter estimates to reach a satisfactory robustness level to always make the method more efficient than the classical one. As expected, insufficient robustness persists in areas with an especially low number of samples and high variability.
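The contrast between the two kinds of validity interval can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 3-sigma factor, and the sample temperatures are all assumptions made for illustration, and the reference values stand in for an already quality-checked local climatology.

```python
from statistics import mean, pstdev


def minmax_bounds(reference):
    """Validity interval taken directly from the extreme reference values."""
    return min(reference), max(reference)


def mean_std_bounds(reference, n_sigma=3.0):
    """Classical interval: mean +/- n_sigma standard deviations.

    n_sigma=3.0 is an assumed, commonly used factor, not a value from the paper.
    """
    centre = mean(reference)
    spread = pstdev(reference)
    return centre - n_sigma * spread, centre + n_sigma * spread


def is_good(value, bounds):
    """Binary check: True if the value lies inside the validity interval."""
    lower, upper = bounds
    return lower <= value <= upper


# Hypothetical, already validated reference temperatures (deg C) near one
# location. The distribution is deliberately bimodal (cold and warm regimes).
reference = [2.0, 2.1, 2.2, 2.0, 2.1, 2.3, 2.1, 2.0, 9.8, 10.0, 10.1, 9.9]

observation = 14.0  # hypothetical new observation to be checked

print("min/max bounds:", minmax_bounds(reference))
print("mean/std bounds:", mean_std_bounds(reference))
print("good by min/max:", is_good(observation, minmax_bounds(reference)))
print("good by mean/std:", is_good(observation, mean_std_bounds(reference)))
```

In this constructed bimodal case the standard deviation is inflated by the gap between the two regimes, so the mean/variance interval accepts a value well outside anything in the reference set, whereas the min/max interval rejects it. The sketch also shows why the abstract stresses manual quality control of the reference data: a single erroneous extreme left in `reference` would directly widen the min/max interval.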

Full Text
Publisher's official version: 18 pages, 3 MB, open access