In many areas of research, product development and software engineering, analytical pipelines (workflows that connect the output of multiple pieces of software) are key for processing and running tests on data. They can provide results in a consistent, modular and transparent manner. Pipelines also make it easier to demonstrate the reproducibility of one's research and enable analyses that update as new data are added. Not all analyses, however, can be run or coded in one's favoured programming language, as different parts of an analysis may require external software or packages. Integrating a variety of programs can lead to portability issues (additional software may not run on all operating systems) and versioning errors (arguments may differ between versions of the additional software). In the ideal pipeline, it should be possible to install and run any command-line software from within the pipeline's main programming language, without concern for software versions or operating system. R (CRAN, 2019) is one of the most popular programming languages amongst researchers, and many packages exist for calling programs and code from non-R sources (e.g. sys (Ooms, 2019) for shell commands, reticulate (RStudio, 2019) for Python and rJava (Urbanek, 2019) for Java). To our knowledge, however, no R package exists with the ability to launch external programs originating from any UNIX command-line source. The outsider packages work through Docker (Docker Inc., 2020a), a service that uses OS-level virtualization to deploy isolated software "containers", and a code-sharing service, e.g. GitHub (GitHub, 2019), to allow a user to install and run, in theory, any external command-line program or package on any of the major operating systems (Windows, Linux, OSX).
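The container-based approach described above can be illustrated with a minimal sketch. It is not the outsider packages' API, and it is written in Python rather than R purely for illustration. The helper name `docker_run_argv`, the image `ubuntu:20.04` and the mount point `/data` are all hypothetical choices. The sketch only assembles the argument list for a standard `docker run` call, showing how an external command can be wrapped so that it runs in an isolated container regardless of the host operating system.

```python
import shlex
from typing import List, Optional


def docker_run_argv(image: str, command: List[str],
                    workdir: Optional[str] = None) -> List[str]:
    """Build (but do not execute) the argument list for a `docker run` call.

    Hypothetical helper for illustration only; it is not part of outsider.
    """
    # --rm removes the container once the command has finished.
    argv = ["docker", "run", "--rm"]
    if workdir is not None:
        # Bind-mount the host directory at /data (an assumed mount point)
        # and make it the working directory inside the container.
        argv += ["-v", f"{workdir}:/data", "-w", "/data"]
    argv.append(image)
    argv.extend(command)
    return argv


if __name__ == "__main__":
    argv = docker_run_argv(
        "ubuntu:20.04",  # assumed image name
        ["echo", "hello from a container"],
        workdir="/tmp/project",
    )
    # Print the shell-quoted command; passing argv to subprocess.run
    # would execute it on a machine where Docker is installed.
    print(shlex.join(argv))
```

Because the program and its dependencies live in the image, the same call should behave identically on any host that can run Docker, which is the portability property the abstract emphasises.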