Abstract:
Occurrence data from museum and herbarium collections are valuable for mapping
biodiversity patterns in space and time. Unfortunately, these collections datasets contain
many errors and suffer from several data quality issues that can reduce the quality of the
products derived from them. It is up to the user to identify these errors and data quality
issues when using these data. Despite the large number of potential users of these datasets,
there are few software tools dedicated to error detection and correction of collections
datasets. The R package biogeo was developed for detecting and correcting errors and for
assessing the data quality of collections datasets consisting of occurrence records. Features of
the package include the detection of mismatches between the recorded country and the
country where the record is plotted, the detection of records of terrestrial species that fall
into the sea, and outlier detection. A key feature of the package is the ability to identify
likely alternative positions for points that represent obvious errors in the dataset. The
package also provides functions to explore records in geographical and environmental space
in order to identify possible errors. Functions are also available for converting coordinates
that are in various text
formats into degrees, minutes and seconds and then into decimal degrees.
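
The final conversion step, from degrees, minutes and seconds to decimal degrees, can be illustrated with a minimal sketch. This is not the biogeo implementation or its API: the function name dms_to_decimal and its arguments are hypothetical, and the sketch assumes the coordinate has already been parsed into numeric parts plus a hemisphere letter.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees, minutes, seconds and a hemisphere letter to decimal degrees.

    Hypothetical helper for illustration only; it is not part of biogeo.
    """
    hemisphere = hemisphere.upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes and seconds must be in the range [0, 60)")

    # One minute is 1/60 of a degree and one second is 1/3600 of a degree.
    value = abs(degrees) + minutes / 60 + seconds / 3600

    # By convention, southern latitudes and western longitudes are negative.
    return -value if hemisphere in ("S", "W") else value


latitude = dms_to_decimal(33, 55, 30, "S")
longitude = dms_to_decimal(18, 25, 0, "E")
```

Rejecting minutes or seconds outside [0, 60) is a design choice in this sketch: such values usually signal a transcription error, which is the kind of problem the package is meant to flag rather than silently convert.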