OBSOLETE … Binary production now handled by BiocKubeInstall

This document is retained for historical purposes only.

Basic concepts

When R is used within a Docker container, users often need to install additional packages. This can be time-consuming when compilation is required. To address this, we have created a repository of container-based binary packages. Installing these binaries during an R session is fast because it involves only file transfer, with no compilation.

A container for use with NHGRI AnVIL

We will use the workspace


to run a custom environment with 64 CPUs, 240 GB RAM, and 300 GB disk.

We use the runtime container


The manifest for Bioconductor 3.10 software packages

We use the following function to generate a character vector with names of all software packages in a given release.

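get_bioc_packagelist = function(rel = "RELEASE_3_10") {
 # clone the Bioconductor manifest repository (requires SSH access)
 system("git clone git@git.bioconductor.org:admin/manifest")
 owd = getwd()
 on.exit(setwd(owd))  # return to the starting directory on exit
 setwd("manifest")
 system(paste("git checkout", rel))
 proc_software.txt = function() {
  x = readLines("software.txt")[-1]  # first line is comment
  tmp = x[nchar(x) > 0]  # drop blank separator lines
  sub("^Package: ", "", tmp)
 }
 proc_software.txt()
}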
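
The parsing step can be tried without cloning anything. The sketch below applies the same logic to a short in-memory excerpt. The excerpt is illustrative only and is assumed to follow the layout the function expects: one comment line, then "Package: " entries separated by blank lines.

```r
# Illustrative excerpt in the assumed software.txt layout (not the real file)
manifest_lines = c(
  "## comment line",
  "Package: a4",
  "",
  "Package: a4Base",
  "",
  "Package: a4Classif"
)

x = manifest_lines[-1]           # drop the leading comment line
x = x[nchar(x) > 0]              # drop the blank separator lines
pkgs = sub("^Package: ", "", x)  # strip the field label
pkgs
# [1] "a4"        "a4Base"    "a4Classif"
```

Filtering with nchar(x) > 0 also behaves sensibly when there are no blank lines at all, whereas negative indexing with an empty index vector would return an empty result.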

We’ll generate the vector of package names and copy it to a Google Cloud Storage bucket for retrieval in our AnVIL session.

software_3.10_2020_04_28 = get_bioc_packagelist()
save(software_3.10_2020_04_28, file="software_3.10_2020_04_28.rda")
system("gsutil cp software_3.10_2020_04_28.rda gs://biocbbs_2020a")
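
The transfer relies on save() and load() restoring an object under its original name. A minimal local sketch of that round trip follows. It uses a temporary file and a stand-in vector named software_demo, both of which are assumptions for illustration and not part of the original workflow.

```r
# Illustrative stand-in for the real vector of package names
software_demo = c("a4", "a4Base", "a4Classif")

f = tempfile(fileext = ".rda")
save(software_demo, file = f)

# load() into a fresh environment so the restored object is easy to inspect
e = new.env()
restored = load(f, envir = e)  # load() returns the names of the objects it created
restored
identical(e$software_demo, software_demo)
```

Because the object name travels with the file, the receiving session needs only the .rda file to recover the vector under the same name.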

We will pass this vector to BiocManager::install with Ncpus = 45, so that source packages are installed using up to 45 parallel processes.
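
A minimal sketch of that installation call is given here, assuming the BiocManager package is available in the session. The helper name install_manifest is hypothetical, and skipping packages that are already installed is an addition for illustration, not necessarily what was done originally. Ncpus is passed through BiocManager::install to install.packages.

```r
# Sketch only: assumes BiocManager is installed and that pkgs holds the
# package names, e.g. the vector loaded from software_3.10_2020_04_28.rda.
install_manifest = function(pkgs, ncpus = 45) {  # hypothetical helper name
  stopifnot(is.character(pkgs), length(pkgs) > 0)
  # assumption: skip packages already present in the library
  todo = setdiff(pkgs, rownames(installed.packages()))
  if (length(todo) == 0) return(invisible(character()))
  # Ncpus is forwarded to install.packages for parallel installation
  BiocManager::install(todo, Ncpus = ncpus, ask = FALSE, update = FALSE)
  invisible(todo)
}
```

In the AnVIL session this would be called as install_manifest(software_3.10_2020_04_28) once the vector has been loaded.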

Tasks within AnVIL

We set up the runtime based on


In RStudio, we retrieve the vector of package names.

> getwd()
[1] "/home/rstudio"
> system("gsutil cp gs://biocbbs_2020a/software_3.10_2020_04_28.rda .")
Copying gs://biocbbs_2020a/software_3.10_2020_04_28.rda...
/ [1 files][ 11.6 KiB/ 11.6 KiB]                                                
Operation completed over 1 objects/11.6 KiB.                                     
> dir()
[1] "entrypoint.out"               "kitematic"                    "software_3.10_2020_04_28.rda" "welder.log"                  
> load("software_3.10_2020_04_28.rda")
> length(software_3.10_2020_04_28)
[1] 1823