Installing and Removing R Packages With Ansible

I was asked by some of our Data Scientists to get a few R packages onto their server, which I configure with Ansible. R seems to be a bit funny compared to other programming languages because its package installation happens inside R code, rather than with a dedicated command-line utility.

A quick Google search turned up a blog post by Svend Vanderveken on how to install R packages with Ansible but, as the author points out, it reports ‘changed’ even when the package is already installed. This is not great, so here’s a version that only reports ‘changed’ when a package was actually installed:

- name: r - packages
  command: >
    Rscript --slave --no-save --no-restore-history -e "if (! ('{{ item }}' %in% installed.packages()[,'Package'])) { install.packages(pkgs='{{ item }}', repos=c('http://ftp.heanet.ie/mirrors/cran.r-project.org/')); print('Added'); } else { print('Already installed'); }"
  register: r_result
  failed_when: "r_result.rc != 0 or 'had non-zero exit status' in r_result.stderr"
  changed_when: "'Added' in r_result.stdout"
  with_items:
    - getopt

Since it’s hard to read as a one-liner, here’s that R code expanded:

if (! ('{{ item }}' %in% installed.packages()[,'Package'])) {
    install.packages(pkgs='{{ item }}', repos=c('http://ftp.heanet.ie/mirrors/cran.r-project.org/'));
    print('Added');
} else {
    print('Already installed');
}

The R code prints the status of the package, and the Ansible task then checks the output so that it only reports ‘changed’ if the package was not already installed :)

Whilst we’re at it, here’s code for removing R packages:

- name: r - remove packages
  command: >
    /usr/bin/Rscript --slave --no-save --no-restore-history -e "if (! ('{{ item }}' %in% installed.packages()[,'Package'])) { print('Not installed'); } else { remove.packages(pkgs='{{ item }}'); print('Removed'); }"
  register: r_result
  failed_when: r_result.rc != 0
  changed_when: '"Removed" in r_result.stdout'
  with_items:
    - getopt

Expanded again for readability:

if (! ('{{ item }}' %in% installed.packages()[,'Package'])) {
    print('Not installed');
} else {
    remove.packages(pkgs='{{ item }}');
    print('Removed');
}

Of course, what we’d really like to see is an R package module, but I’m not very experienced with R so I don’t know how easy it would be to make this cross-platform compatible.

One last hint! Don’t install R packages from CRAN if you don’t have to. In general, OS packages will be faster to install, so you’ll get your provisioning done much quicker - this certainly holds for numpy in Python. So, whilst I’ve used getopt in this example, I actually ended up using the apt action to install r-cran-getopt.
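For reference, the apt route might look something like the task below. This is only a sketch: it assumes a Debian or Ubuntu host where r-cran-getopt is available from the configured apt sources, and the task name is just one I made up to match the others.

```yaml
# Sketch only: assumes a Debian/Ubuntu host with r-cran-getopt in its apt sources.
# The task name is illustrative, chosen to match the tasks above.
- name: r - packages from apt
  apt:
    name: r-cran-getopt
    state: present
```

Because the apt module tracks package state itself, there is no need for the register / changed_when dance used in the Rscript tasks.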


Tags: ansible, r