To clean or not to clean: Cleaning open‐source data improves extinction risk assessments for threatened plant species - Royal Botanic Gardens, Kew research repository
Journal article

To clean or not to clean: Cleaning open‐source data improves extinction risk assessments for threatened plant species

17 November 2020

Abstract

Plants are under‐represented in conservation efforts, with only 9% of described species published on the IUCN Red List. Biodiversity aggregators including the Global Biodiversity Information Facility (GBIF) and the more recent Botanical Information and Ecology Network (BIEN) contain a wealth of potentially useful occurrence data. We investigate the influence of these data in accelerating plant extinction risk assessments for 225 endemic, near‐endemic, and socioeconomic Bolivian plant species. Geo‐referenced herbarium voucher specimens verified by taxonomic experts comprised our control data set. Open‐source data for 77 species were subjected to a two‐stage cleaning protocol (using an automated R package followed by a manual clean), and threat categories were computed based on extent of occurrence thresholds. Accuracy was highest using cleaned GBIF data (76%) and uncleaned BIEN data (79%). Sensitivity was highest for cleaned GBIF (73%) and BIEN (80%) data, suggesting that our cleaning protocol was essential to maximize sensitivity rates. Comparisons between the control, GBIF, and BIEN data sets revealed a paucity of occurrence data for 148 species (66%), 72% of which qualified for a threatened category. Data quantity and accuracy must be balanced when using open‐source data. Filling data gaps for threatened species is a conservation priority to improve the coverage of threatened species within biodiversity aggregators.
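
The threshold step and the two evaluation metrics mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code (the abstract refers to an R workflow that is not reproduced on this page). It assumes the standard IUCN Red List criterion B1 upper bounds on extent of occurrence (under 100 km² for Critically Endangered, under 5,000 km² for Endangered, under 20,000 km² for Vulnerable) and ignores the additional subcriteria a full assessment requires. The function names, the "not threatened" label, and the definitions of accuracy and sensitivity (a species counts as a positive if its category is threatened) are assumptions made for illustration.

```python
from typing import Sequence, Tuple

# IUCN criterion B1 upper bounds on extent of occurrence (EOO), in km^2.
# Ordered from most to least severe category.
EOO_THRESHOLDS = [(100.0, "CR"), (5_000.0, "EN"), (20_000.0, "VU")]
THREATENED = {"CR", "EN", "VU"}
NOT_THREATENED = "not threatened"  # hypothetical label for everything else


def category_from_eoo(eoo_km2: float) -> str:
    """Assign a threat category from EOO alone (subcriteria are ignored)."""
    if eoo_km2 < 0:
        raise ValueError("EOO must be non-negative")
    for upper_bound, category in EOO_THRESHOLDS:
        if eoo_km2 < upper_bound:
            return category
    return NOT_THREATENED


def accuracy_and_sensitivity(
    control: Sequence[str], test: Sequence[str]
) -> Tuple[float, float]:
    """Compare test categories with control categories, species by species.

    Assumed definitions: accuracy is the share of species whose
    threatened / not-threatened status matches the control; sensitivity is
    the share of control-threatened species also flagged as threatened.
    """
    if len(control) != len(test) or not control:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = [(c in THREATENED, t in THREATENED) for c, t in zip(control, test)]
    matches = sum(1 for c, t in pairs if c == t)
    true_pos = sum(1 for c, t in pairs if c and t)
    control_pos = sum(1 for c, _ in pairs if c)
    if control_pos == 0:
        raise ValueError("sensitivity is undefined without threatened controls")
    return matches / len(pairs), true_pos / control_pos


if __name__ == "__main__":
    # Made-up EOO values (km^2) for four hypothetical species.
    control = [category_from_eoo(x) for x in (50, 3_000, 12_000, 45_000)]
    cleaned = [category_from_eoo(x) for x in (900, 25_000, 15_000, 60_000)]
    print(control, cleaned)
    print(accuracy_and_sensitivity(control, cleaned))
```

The EOO values in the usage block are invented solely to exercise the functions; they are not taken from the study.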

Files

File name: csp2.311.pdf
Date uploaded: 19 Nov 2020
Visibility: Public
File size: 2.83 MB