Sequential process to choose efficient sampling design based on partial prior information data and simulations
|Author(s)||Kermorvant Claire1, Coube Sébastien1, D’amico Frank1, Bru Noëlle1, Caill-Milly Nathalie2|
|Affiliation(s)||1 : CNRS/Univ Pau & Pays Adour, Laboratoire de Mathématiques et de leurs Applications de Pau-Fédération MIRA, UMR 5142, 64600 Anglet, France
2 : Ifremer, Laboratoire Environnement Ressources d’Arcachon, France
|Source||Spatial Statistics (2211-6753) (Elsevier BV), 2020-08, Vol. 38, P. 100439 (13p.)|
|WOS© Times Cited||2|
|Keyword(s)||Accuracy, Cost-effectiveness, Survey, Simulations, Survey optimization, Virtual ecology|
Poorly defined sampling procedures have led to biased and controversial results in numerous studies. Choosing a relevant sampling design and number of samples is a difficult task when setting up or optimizing a survey. The choice of survey design is very important for avoiding bias and increasing the cost-efficiency of the survey. It can have a strong effect on the sample size needed to achieve a targeted accuracy in the results, and on the final cost of the procedure.
The sequential process we present here combines design-based and model-based sampling theories. Its objective is to help practitioners define a sampling design and a number of samples for their survey when inference to the whole population is wanted. The main idea is to mathematically reconstruct the distribution of the surveyed population, then to assess and compare the cost-effectiveness of various sampling designs on this population. This process allows predetermined level(s) of accuracy to be set for the targeted estimates and takes previous relevant data into account. The results are an optimal sampling design and an associated optimal sample size for a desired accuracy in the results. This accuracy is thus achieved without excess sampling. A strength of this process is that it is based on simulations. This allows a large number of combinations of sampling design, sample size and desired level of accuracy to be tried. The performance of sampling designs can thus be compared.
The user can decide which combination is best for their survey and apply it in practice. We discuss how to use available data to improve the survey, from the case where several historical data sets are provided to the case where no data are available.
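The simulation idea summarized in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The population model (a linear trend plus Gaussian noise along a transect), the two candidate designs, the 5% tolerance, the 95% level and every function name are assumptions chosen for illustration only. Sample size is used here as a simple proxy for survey cost.

```python
import random
import statistics


def make_population(n_units=400, seed=0):
    """Hypothetical reconstructed population: a trend plus noise along a transect."""
    rng = random.Random(seed)
    return [10.0 + 0.05 * i + rng.gauss(0.0, 3.0) for i in range(n_units)]


def simple_random(pop, n, rng):
    """Simple random sampling without replacement."""
    return rng.sample(pop, n)


def systematic(pop, n, rng):
    """Roughly evenly spaced units with a random start."""
    step = len(pop) / n
    start = rng.random() * step
    last = len(pop) - 1
    return [pop[min(int(start + k * step), last)] for k in range(n)]


DESIGNS = {"simple_random": simple_random, "systematic": systematic}


def success_rate(pop, design, n, n_sim, tol, rng):
    """Share of simulated surveys whose estimated mean falls within
    a relative tolerance `tol` of the true population mean."""
    true_mean = statistics.fmean(pop)
    hits = 0
    for _ in range(n_sim):
        estimate = statistics.fmean(design(pop, n, rng))
        if abs(estimate - true_mean) <= tol * abs(true_mean):
            hits += 1
    return hits / n_sim


def smallest_sufficient_n(pop, design, sizes, n_sim=500, tol=0.05,
                          level=0.95, seed=1):
    """First sample size in `sizes` reaching the target accuracy, else None."""
    rng = random.Random(seed)
    for n in sizes:
        if success_rate(pop, design, n, n_sim, tol, rng) >= level:
            return n
    return None


population = make_population()
sizes = range(5, 201, 5)
results = {name: smallest_sufficient_n(population, design, sizes)
           for name, design in DESIGNS.items()}

for name, n in results.items():
    print(f"{name}: smallest sample size meeting the target = {n}")
```

On this assumed population, which has a strong spatial trend, the systematic design should reach the target with fewer samples than simple random sampling. In a real application the simulated population would be reconstructed from whatever prior data are available, and a per-sample cost could replace the raw sample size.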