renv: Project Environments for R
Kevin Ushey
2020-01-30

The Motivation

Have you ever finished a project, come back a year later, and asked:

Why is my dplyr pipeline suddenly throwing an error? I swear it worked before...
What happened to my ggplot2 plots? Why are the bars upside down? I swear it worked before...
Okay, I re-ran the analysis, but now nlme is complaining about model convergence: I swear this didn't happen before...

How can we make sure this never happens, ever again?

What is Packrat?

Before we talk about renv, let's talk about Packrat, our first effort to solve this problem.

The story so far: In the beginning Packrat was created. This has made a lot of people very angry and been widely regarded as a bad move.

-- Douglas Adams, "The Hitchhiker's Guide to Reproducibility"

Packrat's main problem is that it is not a pit of success.

It works, but for the average user, it's frustrating to use, and it's challenging to recover when errors arise.

renv's goal is to be a better Packrat.

What is renv?

renv is a toolkit used to manage project-local libraries of R packages.

You can use renv to make your projects more:

Isolated: Each project gets its own library of R packages, so you can feel free to upgrade and change package versions in one project without worrying about breaking your other projects.
Portable: Because renv captures the state of your R packages within a lockfile, you can more easily share and collaborate on projects with others, and ensure that everyone is working from a common base.
Reproducible: Use renv::snapshot() to save the state of your R library to the lockfile renv.lock. You can later use renv::restore() to restore your R library exactly as specified in the lockfile.

renv attempts to prescribe a default workflow that "just works" for most, but remains flexible enough that alternate workflows can be built on top of the tools provided by renv.

The Lay of the Land

Before we discuss renv specifically, let's briefly outline the default state of the world, in terms of R package installations.

You've started a new project, and you're ready to analyze some data. You've downloaded dplyr, and you're ready to load and use it in your project:

library(dplyr)

When this code is run, R searches the active library paths for an installed copy of the dplyr package, then loads it. The natural questions that arise are:

What is a library path?
What are the active library paths?
How does R search these library paths when loading a package?

Let's explore each of these concepts quickly.

What is a Library Path?

We can borrow the definition from the R extensions manual:

A directory into which packages are installed.

That's it -- it's just a directory. Nothing more, nothing less.

Let any remaining mystique be dispelled.

Each R session can be configured to use multiple library paths -- that is, the default set of directories that R will search when attempting to find and load a package.

The .libPaths() function is used to get and set the library paths for an R session. For example, on my macOS machine running with R 3.6:

> .libPaths()
[1] "/Users/kevinushey/Library/R/3.6/library"
[2] "/Library/Frameworks/R.framework/Versions/3.6/Resources/library"

In this case, I have my own user library, which is first in the list. This is where packages I choose to download and install normally get installed.

The last library in the list is the system library. This is where the default packages that come with R are installed.

You may also have one or more site libraries -- think of these as administrator-managed directories of packages.
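As a quick illustration, these kinds of library can be inspected from any R session using base R alone. The sketch below only reads state, and the paths it prints will differ from machine to machine.

```r
# Inspect the library paths active in the current R session.
paths <- .libPaths()
print(paths)

# The system library, where the packages shipped with R are installed.
print(.Library)

# Any administrator-managed site libraries (frequently empty).
print(.Library.site)
```

If no user library has been created yet, `.libPaths()` may report only the system library.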

R searches the active library paths, in order, to find an installation of the requested package.

The first installation of the package that is discovered is loaded. The package's dependencies are also loaded from these same library paths.

So, back to our example with:

library(dplyr)

When this code is run, R will search the active library paths for an installation of the dplyr package, and load the first one it finds.

You can use find.package() to search the library paths for an installed package:

> find.package("dplyr")
[1] "/Users/kevinushey/Library/R/3.6/library/dplyr"
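To make the search order concrete, here is a minimal sketch that lists every active library containing a given package, in the order R would search them. The helper name `where_installed` is hypothetical (it is not part of R or renv), and the example uses the stats package only because it ships with every R installation.

```r
# Hypothetical helper: report each library that holds an installed copy of
# `pkg`, preserving the order of the library paths.
where_installed <- function(pkg, lib.loc = .libPaths()) {
  hits <- file.path(lib.loc, pkg)
  hits[file.exists(file.path(hits, "DESCRIPTION"))]
}

candidates <- where_installed("stats")
print(candidates)

# find.package() reports a single location: the copy that would be used.
print(find.package("stats"))
```

If a package appears in more than one library, the first entry is the one `library()` would load. Note that `find.package()` checks already-loaded namespaces before the library paths when `lib.loc` is left at its default.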

The Challenge

By default, each R session uses the same set of library paths.

This implies that if you were to install dplyr 0.8.2, that version would become the one available to every project on the machine.

However, different projects may have different package dependencies. For example, suppose:

Project 1 requires dplyr 0.7.8,
Project 2 requires dplyr 0.8.2, and
Project 3 requires the development version of dplyr.

Unfortunately, these projects share the same library paths!

Hence, installing a new version of dplyr implies changing the version of dplyr used by each project.

This could spell disaster -- especially if you had no record of which versions of dplyr (or its dependencies!) were used for a particular project.

The Solution

The solution, then, is to use project-local libraries, to ensure that each project gets its own unique library of R packages.

By using project-local libraries, you can rest assured that upgrading the packages used in one project will not risk breaking your other projects.

It is from this idea -- the use of project-local libraries -- that the renv package is born. We:

Give each project its own project-local library,
Make it simple and straightforward for R sessions to use the project-local library,
Provide tools for managing the R packages installed in these project-local libraries,

And make the experience as seamless as possible, so that one can use renv without being an expert on renv.
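The underlying idea can be sketched with base R alone. This is not how renv is implemented; it only illustrates what "project-local library" means. The directory name `demo-project` is a throwaway placeholder created under the session's temporary directory.

```r
# A minimal sketch of a project-local library using base R only.
project <- file.path(tempdir(), "demo-project")
local_lib <- file.path(project, "library")
dir.create(local_lib, recursive = TRUE, showWarnings = FALSE)

# Remember the current library paths so they can be restored afterwards.
old_paths <- .libPaths()

# Prepend the project library: R now searches it first, and
# install.packages() would install into it by default.
.libPaths(c(local_lib, old_paths))
print(.libPaths()[1])

# Restore the previous library paths when done.
.libPaths(old_paths)
```

Doing this by hand for every project, and keeping a record of what was installed, quickly becomes tedious, which is the gap renv aims to fill.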

Initializing a Project

The first step in activating renv for a project is:

renv::init()

This function forks the state of your default R libraries into a project-local library, and then prepares the infrastructure required to use renv in that project.

In particular, a project-local .Rprofile is created (or amended), which is then used by new R sessions to automatically initialize renv and ensure the project-local library is used.

After calling renv::init(), you can continue working exactly as you did before. The only difference is that packages will now be installed to, and loaded from, your project-local library.

There are two main observable differences you'll see in your R session after renv has been activated in a project.

Firstly, a small banner will be displayed in the console:

* Project '~/projects/2020-rstudio-conf' loaded. [renv 0.9.2]
>

Secondly, the library paths (as reported by .libPaths()) will now be changed.

> .libPaths()
[1] "/Users/kevinushey/projects/2020-rstudio-conf/renv/library/R-3.6/x86_64-apple-darwin15.6.0"
[2] "/private/var/folders/b4/2422hswx71z8mgwtv4rhxchr0000gn/T/RtmpmchotD/renv-system-library"

You'll notice the use of a project-local library path, as well as a separate system library path (used for sandboxing, which we'll discuss later).

Saving and Loading

We've discussed renv's first goal -- make it simple to use project-local libraries.

The second goal is to make it possible to save and load the state of your project-local libraries.

Or, in the parlance of renv, you can snapshot and restore the state of your project-local libraries.

renv::snapshot()  # save the project's library state
renv::restore()   # load the project's library state

Let's discuss how these functions work.

Snapshot

You can capture the state of a project library using:

> renv::snapshot()
The following package(s) will be updated in the lockfile:
# CRAN ===============================
- markdown    [* -> 1.1]
- rmarkdown   [* -> 2.1]
- yaml        [* -> 2.2.0]
Do you want to proceed? [y/N]: y
* Lockfile written to '~/projects/2020-rstudio-conf/renv.lock'.

The state of your project library will be encoded into a lockfile, called renv.lock.

The lockfile is a text (JSON) file, enumerating the packages installed in your project, their versions, and their sources.

Lockfile Example

{
  "R": {
    "Version": "3.6.1",
    "Repositories": [
      {
        "Name": "CRAN",
        "URL": "https://cran.rstudio.com"
      }
    ]
  },
  "Packages": {
    "renv": {
      "Package": "renv",
      "Version": "0.9.2",
      "Source": "Repository",
      "Repository": "CRAN"
    },
    < ... other package records ... >
  }
}
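Because the lockfile is plain JSON, it can be read with any JSON parser. The sketch below parses a lockfile trimmed to the fields shown above and tabulates its package records. Using the jsonlite package is an assumption made for illustration, so the parsing step is skipped if jsonlite is not installed.

```r
# A lockfile reduced to the fields shown above (illustrative content only).
lockfile_text <- '{
  "R": {
    "Version": "3.6.1",
    "Repositories": [{ "Name": "CRAN", "URL": "https://cran.rstudio.com" }]
  },
  "Packages": {
    "renv": {
      "Package": "renv",
      "Version": "0.9.2",
      "Source": "Repository",
      "Repository": "CRAN"
    }
  }
}'

# jsonlite is an assumption here; any JSON parser would do.
if (requireNamespace("jsonlite", quietly = TRUE)) {
  lock <- jsonlite::fromJSON(lockfile_text, simplifyVector = FALSE)

  # One row per package record: name, version, and source.
  records <- do.call(rbind, lapply(lock$Packages, function(p) {
    data.frame(Package = p$Package, Version = p$Version, Source = p$Source,
               stringsAsFactors = FALSE)
  }))
  print(records)
}
```

For a real project, the same parsing call could be pointed at the project's renv.lock file instead of an inline string.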

Lockfiles

The lockfile encodes the information required to later recover and re-install packages as necessary. This is useful for:

Time capsules, where you might want to freeze a project such that you can later return to the project with a record of the packages originally used to run the project,
Collaborative workflows, where you might want to ensure all collaborators are working with the exact same set of packages, and
Deployments, where you'd like to be sure that your project, when run remotely, uses the exact same set of packages that you were testing with locally.

We'll see how you can use renv.lock to restore a project library next.

Restore

Given a lockfile renv.lock previously created by renv::snapshot(), you can restore the state of your project library using renv::restore():

> renv::restore()
The following package(s) will be updated:
# CRAN ===============================
- markdown    [* -> 1.1]
- rmarkdown   [* -> 2.1]
- yaml        [* -> 2.2.0]
Do you want to proceed? [y/N]: y
Installing markdown [1.1] ...
    OK (linked cache)
Installing yaml [2.2.0] ...
    OK (linked cache)
Installing rmarkdown [2.1] ...
    OK (linked cache)

Calling renv::restore() will download and re-install all of the declared packages as necessary.
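Conceptually, the plan printed above comes from comparing the versions recorded in the lockfile against what is currently installed. The following is a rough base-R sketch of that comparison, not renv's actual implementation. The helper `needs_update` and the package name `notARealPackage123` are hypothetical and exist only for this example.

```r
# Hypothetical helper: compare wanted versions against installed versions.
# `wanted` is a named character vector mapping package name to version.
needs_update <- function(wanted, lib.loc = .libPaths()) {
  pkgs <- names(wanted)
  target <- unname(wanted)

  # Look up the installed version of each package; NA if it is not installed.
  current <- vapply(pkgs, function(pkg) {
    tryCatch(
      as.character(utils::packageVersion(pkg, lib.loc = lib.loc)),
      error = function(e) NA_character_
    )
  }, character(1), USE.NAMES = FALSE)

  action <- ifelse(is.na(current), "install",
                   ifelse(current == target, "none", "change"))

  data.frame(Package = pkgs, Wanted = target, Installed = current,
             Action = action, stringsAsFactors = FALSE)
}

# stats ships with R and shares R's version number, so it needs no action;
# the made-up package is not installed, so it would need installing.
wanted <- c(stats = as.character(getRversion()), notARealPackage123 = "1.0.0")
plan <- needs_update(wanted)
print(plan)
```

A real restore also has to know where each package came from, which is why the lockfile records sources as well as versions.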

renv contains the machinery required to install packages from many sources, including:

CRAN
Bioconductor
GitHub
GitLab
Bitbucket

renv also understands how to authenticate with private repositories. See https://rstudio.github.io/renv/articles/renv.html#authentication for more details.

Summary

We've now covered the three primary features of renv:

Use renv::init() to initialize a project with a project-local library,
Use renv::snapshot() to save the project-local library's state, and
Use renv::restore() to restore the project-local library's state.

The rest of this talk will focus on some of the extra features provided by renv.

21 / 32

Global Package Cache

One major issue with project-local libraries is the duplication of identical packages across projects.

For example, if you had 10 projects using dplyr 0.9.2, then you would also have 10 project libraries with dplyr 0.9.2 installed.

This is costly, both in disk space and in installation time.

Imagine having to re-install all packages from the tidyverse every time you started a new project.

> system.time(install.packages("tidyverse"))
Installing package into '/home/kevin/R/x86_64-pc-linux-gnu-library/3.6'
(as 'lib' is unspecified)
also installing the dependencies <... 82 other packages ...>
< ... >
   user  system elapsed 
780.492 157.073 375.784

🤮

22 / 32

Global Package Cache

renv solves this problem through the use of a global package cache.

When dplyr 0.9.2 is installed, renv will move that installation into the global cache, and then link that installation into the project library as requested.
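
The "install once, link many times" idea can be mimicked with a few lines of base R. This is only a toy model: the directory layout is invented for illustration and is not renv's actual cache layout, and it assumes a Unix-like system where file.symlink() works without extra privileges.

```r
# One "cached" installation of a package, in a throwaway directory.
root  <- tempfile("cache-demo-")
cache <- file.path(root, "cache", "dplyr", "0.9.2", "dplyr")
dir.create(cache, recursive = TRUE)
writeLines(c("Package: dplyr", "Version: 0.9.2"),
           file.path(cache, "DESCRIPTION"))

# Each project library gets a symlink to the cached copy, not a real copy.
link_into_library <- function(project) {
  library_dir <- file.path(root, project, "library")
  dir.create(library_dir, recursive = TRUE)
  link <- file.path(library_dir, "dplyr")
  file.symlink(cache, link)
  link
}

links    <- vapply(c("project-a", "project-b"), link_into_library, character(1))
resolved <- unname(normalizePath(links))
print(resolved)
```

Both project libraries resolve to the same directory on disk, so the package is stored once no matter how many projects use it.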

23 / 32

Package Installation

renv contains a helper function, renv::install(), that can be used to install packages.

It borrows from the model used by the remotes package for installing packages.

For example:

renv::install("r-lib/rlang")

can be used to install the development version of the rlang package from GitHub.

24 / 32

Package Installation

The primary bonus is that renv will use the global package cache as appropriate when attempting to install a package -- thereby avoiding an unnecessary attempt to download and re-install an already-cached version of a package.

system.time(renv::install("tidyverse"))
< ... >
Installing rvest [0.3.5] ...
    OK (linked cache)
Installing tidyverse [1.3.0] ...
    OK (linked cache)
   user  system elapsed 
  0.802   0.141   1.054

☺️

25 / 32

Dependency Discovery

renv also includes some machinery which can be used to find which R packages are used in your project.

renv::dependencies()

This function crawls the files in your project to determine which R packages your code references. It supports .R, .Rnw, and .Rmd files.

For example, the project hosting these slides uses the following packages:

> renv::dependencies()
Finding R package dependencies ... Done!
      Source   Package Require Version   Dev
1  renv.lock      renv                 FALSE
2 slides.Rmd rmarkdown                 FALSE
3 slides.Rmd  xaringan                 FALSE
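
To give a feel for what "crawling" means here, the following toy scanner looks for library() / require() calls and pkg:: references in a single file. It is emphatically not renv's implementation (which handles many more cases and file types); find_packages() is a hypothetical helper written only for this sketch.

```r
# Toy dependency scanner: NOT renv's algorithm, just the general idea.
find_packages <- function(path) {
  code <- paste(readLines(path, warn = FALSE), collapse = "\n")
  extract <- function(pattern) {
    regmatches(code, gregexpr(pattern, code, perl = TRUE))[[1]]
  }
  # library(pkg) or require(pkg), with or without quotes
  attached <- extract("(?:library|require)\\([\"']?[A-Za-z][A-Za-z0-9.]*")
  attached <- sub("^(?:library|require)\\([\"']?", "", attached, perl = TRUE)
  # pkg::fun or pkg:::fun
  qualified <- extract("[A-Za-z][A-Za-z0-9.]*(?=:::?)")
  sort(unique(c(attached, qualified)))
}

# A small throwaway script to scan.
script <- tempfile(fileext = ".R")
writeLines(c(
  "library(dplyr)",
  "require(\"ggplot2\")",
  "fit <- nlme::lme(y ~ x, data = d, random = ~ 1 | g)"
), script)

pkgs <- find_packages(script)
print(pkgs)
```

A regex-based scan like this misses packages loaded indirectly, which is one reason to rely on renv::dependencies() rather than a hand-rolled version.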

26 / 32

Snapshot Types

Sometimes, it can be important to control which packages enter the renv.lock lockfile. By default, only packages used in the project, as reported by renv::dependencies(), will enter the lockfile.

In other words, if you have a package installed in your project library, but you don't actually reference or use that package in your project, it will not enter the lockfile!

This can be useful if you have auxiliary packages that you use during project development, but don't actually require those packages at runtime.

The snapshot type can be configured on a per-project basis, with:

renv::settings$snapshot.type("packrat")  # used packages
renv::settings$snapshot.type("simple")   # all packages
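
The difference between the two types can be modelled as a simple set operation. The sketch below is a simplification under stated assumptions: the package names are made up, lockfile_packages() is a hypothetical helper, and the recursive dependencies that a real snapshot would also record are ignored.

```r
# Hypothetical project state, for illustration only.
installed <- c("devtools", "dplyr", "ggplot2", "usethis")  # in the project library
used      <- c("dplyr", "ggplot2")                          # found in project code

# Which packages would enter the lockfile under each snapshot type?
lockfile_packages <- function(type, installed, used) {
  switch(type,
    packrat = intersect(installed, used),  # only packages the project uses
    simple  = installed,                   # everything in the library
    stop("unknown snapshot type: ", type)
  )
}

print(lockfile_packages("packrat", installed, used))
print(lockfile_packages("simple", installed, used))
```

Here the development-only packages (devtools, usethis) are kept out of the lockfile under the "packrat" type and included under "simple".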

27 / 32

Version Control

By default, every project contains only a single renv.lock. A natural question, then, is:

How do I manage the history of my renv.lock?

renv delegates this responsibility to your version control system, and provides some special helpers for working specifically with Git.

# find prior commits in which renv.lock has changed
renv::history()
# revert renv.lock to its state at a prior commit
renv::revert(commit = "abc123")

28 / 32

Configuring renv

While renv tries to provide a default workflow that works well in most cases, you may still find it doesn't quite fit your particular use case.

Fortunately, renv is fairly extensible and configurable:

See ?renv::paths for configuration of the various paths used by renv (e.g. library, cache paths),
See ?renv::config for configuration of user-level behaviors,
See ?renv::settings for configuration of project-level settings.

For example, you can disable the use of the renv cache in a particular project, with:

renv::settings$use.cache(FALSE)
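
Path configuration, by contrast, is driven by environment variables. As a minimal sketch, assuming the RENV_PATHS_CACHE variable described in ?renv::paths, the cache can be pointed at a different directory. The location used here is a throwaway placeholder; in practice the variable would more likely be set in an .Renviron file before renv is loaded.

```r
# Hypothetical cache location, chosen only for this example.
cache_dir <- file.path(tempdir(), "shared-renv-cache")

# Set the variable for the current session.
Sys.setenv(RENV_PATHS_CACHE = cache_dir)

# Confirm what renv would see in its environment.
configured <- Sys.getenv("RENV_PATHS_CACHE")
print(configured)
```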

29 / 32

What's Missing?

renv, on its own, is not a panacea. It solves only one part of the problem -- reproduction of a project's R package dependencies. There are a myriad of other factors that can affect the results of an analysis, including (but not limited to):

The version of R,
The operating system in use,
The compiler flags / versions used when R and packages are built,
The LAPACK / BLAS system(s) in use,

And so on. Docker is a tool that helps solve this problem through the use of containers. Very roughly speaking, one can think of a container as a small, self-contained system within which different applications can be run. For more details, please see https://environments.rstudio.com/docker.

30 / 32

What's Missing?

You can view renv and Docker as complementary tools. Use Docker to manage your system-level requirements, and use renv to manage your R package-level requirements.

In addition, the rocker project brings R to Docker, providing pre-built Docker images with R installed on top of the Debian Linux operating system.

See https://rstudio.github.io/renv/articles/docker.html for more details.

31 / 32

https://github.com/kevinushey/2020-rstudio-conf

Thanks for attending!

Install renv from CRAN with:

install.packages("renv")

View the renv pkgdown documentation online at:

https://rstudio.github.io/renv/

Learn more about RStudio's tools for environment management:

https://environments.rstudio.com/

32 / 32