Biogeo: an R package for assessing and improving data quality of occurrence record datasets

Authors

Robertson, Mark P.
Visser, Vernon
Hui, Cang

Publisher

Wiley

Abstract

Occurrence data from museum and herbarium collections are valuable for mapping biodiversity patterns in space and time. Unfortunately, these collections datasets contain many errors and suffer from several data quality issues that can influence the quality of the products derived from them. It is up to the user to identify these errors and data quality issues when using these data. Despite the large number of potential users of these datasets, there are few software tools dedicated to error detection and correction of collections datasets. The R package biogeo was developed for detecting and correcting errors in, and assessing the data quality of, collections datasets consisting of occurrence records. Features of the package include detection of errors such as mismatches between the recorded country and the country where the record is plotted and records of terrestrial species that fall into the sea, as well as outlier detection. A key feature of the package is the ability to identify likely alternative positions for points that represent obvious errors in the dataset. The package also provides functions to explore records in geographical and environmental space in order to identify possible errors in the dataset. Further functions convert coordinates that are in various text formats into degrees, minutes and seconds and then into decimal degrees.
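
The coordinate conversion mentioned at the end of the abstract can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the biogeo API: the function names `dms_to_decimal` and `parse_dms` are hypothetical, and the single text format handled here (for example 33°55'31"S) is an assumption, whereas the package is described as handling various formats.

```python
import re

# Assumed input format for this sketch: degrees, minutes, seconds and a
# hemisphere letter, e.g. 33°55'31"S. Real datasets use many other formats.
DMS_PATTERN = re.compile(
    r"""^\s*(\d+)\s*°\s*(\d+)\s*'\s*(\d+(?:\.\d+)?)\s*"\s*([NSEW])\s*$""",
    re.IGNORECASE,
)


def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees, minutes, seconds and a hemisphere letter to decimal degrees."""
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes and seconds must be in the range [0, 60)")
    value = abs(degrees) + minutes / 60 + seconds / 3600
    # By convention, southern latitudes and western longitudes are negative.
    return -value if hemisphere.upper() in ("S", "W") else value


def parse_dms(text):
    """Parse a coordinate in the assumed text format and return decimal degrees."""
    match = DMS_PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognised coordinate: {text!r}")
    d, m, s, h = match.groups()
    return dms_to_decimal(int(d), int(m), float(s), h)
```

Splitting the work into a parsing step and an arithmetic step mirrors the two stages described in the abstract (text to degrees, minutes and seconds, then to decimal degrees), so unparseable records can be flagged before any conversion is attempted.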

Keywords

Biogeo, Assessing and improving data quality, R package, Occurrence record datasets

Citation

Robertson, MP, Visser, V & Hui, C 2016, 'Biogeo: an R package for assessing and improving data quality of occurrence record datasets', Ecography, vol. 39, no. 4, pp. 394-401.