Environments

In this lab, we will learn how to get information about environments and the search path.

Goal: by the end of this lab, you should be able to explain how attaching packages affects the search path.

The search path

Understanding the search path is crucial to understanding how R looks up the values that are bound to names. When you start a new R session, the search path contains only the global environment and the packages that are attached by default.

search()
[1] ".GlobalEnv"        "package:stats"     "package:graphics" 
[4] "package:grDevices" "package:utils"     "package:datasets" 
[7] "package:methods"   "Autoloads"         "package:base"     

Note that if we call a function from another package with the :: operator, the package is loaded, but it is not attached (i.e., it is not added to the search path).

rlang::search_envs()
[[1]] $ <env: global>
[[2]] $ <env: package:stats>
[[3]] $ <env: package:graphics>
[[4]] $ <env: package:grDevices>
[[5]] $ <env: package:utils>
[[6]] $ <env: package:datasets>
[[7]] $ <env: package:methods>
[[8]] $ <env: Autoloads>
[[9]] $ <env: package:base>
  1. How can you tell from the previous result that the rlang package is not part of the search path?
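
The difference between loading and attaching can also be checked with base R alone. The sketch below uses the tools package purely for illustration, on the assumption that it ships with R and is not attached in a fresh session; the file name is made up.

```r
# Call one function through `::`; this loads the tools namespace if needed.
ext <- tools::file_ext("report.csv")

# The namespace is now loaded ...
loaded <- isNamespaceLoaded("tools")

# ... but the package is attached only if it appears on the search path.
attached <- "package:tools" %in% search()

print(c(loaded = loaded, attached = attached))
```

If the assumption holds, `loaded` is TRUE while `attached` is FALSE, which mirrors what happened with rlang above.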

To add a package to the search path, use the library() command.

library(rlang)
search()
 [1] ".GlobalEnv"        "package:rlang"     "package:stats"    
 [4] "package:graphics"  "package:grDevices" "package:utils"    
 [7] "package:datasets"  "package:methods"   "Autoloads"        
[10] "package:base"     
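
To see where library() puts a newly attached package, we can compare the search path before and after the call. This is a minimal sketch that assumes the parallel package (which ships with R) is not already attached; any other unattached package would work the same way.

```r
before <- search()
library(parallel)   # attach a package that ships with R
after <- search()

# The entry that library() added to the search path.
added <- setdiff(after, before)

# Its position: by default, library() attaches at position 2,
# directly after the global environment.
position <- match("package:parallel", after)

print(added)
print(position)
```

Because every new package is inserted at position 2, the most recently attached package is always searched first after the global environment.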

The tidyverse is a kind of meta-package that attaches several other packages. Note the order in which the new packages appear on the search path.

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ purrr::%@%()         masks rlang::%@%()
✖ dplyr::filter()      masks stats::filter()
✖ purrr::flatten()     masks rlang::flatten()
✖ purrr::flatten_chr() masks rlang::flatten_chr()
✖ purrr::flatten_dbl() masks rlang::flatten_dbl()
✖ purrr::flatten_int() masks rlang::flatten_int()
✖ purrr::flatten_lgl() masks rlang::flatten_lgl()
✖ purrr::flatten_raw() masks rlang::flatten_raw()
✖ purrr::invoke()      masks rlang::invoke()
✖ dplyr::lag()         masks stats::lag()
✖ purrr::splice()      masks rlang::splice()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
search()
 [1] ".GlobalEnv"        "package:lubridate" "package:forcats"  
 [4] "package:stringr"   "package:dplyr"     "package:purrr"    
 [7] "package:readr"     "package:tidyr"     "package:tibble"   
[10] "package:ggplot2"   "package:tidyverse" "package:rlang"    
[13] "package:stats"     "package:graphics"  "package:grDevices"
[16] "package:utils"     "package:datasets"  "package:methods"  
[19] "Autoloads"         "package:base"     
  1. How many different packages did the tidyverse add to the search path? Why do you think the developers chose the order they did?

Your environment

You can find out what environment you are in with current_env().

current_env()
<environment: R_GlobalEnv>

To see what is in an environment, use env_print().

env_print()
<environment: global>
Parent: <environment: package:lubridate>
Bindings:
• .QuartoInlineRender: <fn>
• posted: <lgl>
• .main: <fn>
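
The Parent line in this output is worth a second look: the parent of the global environment is whichever environment sits second on the search path. A small sketch with base R functions (used here in place of the rlang helpers, so it runs without any packages attached):

```r
# The parent of the global environment ...
global_parent <- parent.env(globalenv())
parent_name <- environmentName(global_parent)

# ... is the second entry on the search path.
second_entry <- search()[2]

print(parent_name)
print(second_entry)
```

The two printed names should match, whatever packages happen to be attached in your session.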

Note that while the current environment is usually the global environment, that is not always the case.

global_env()
<environment: R_GlobalEnv>

Let’s write a function that returns its execution environment, that is, the environment R creates each time the function is called. Note that this is not the global environment.

func_env <- function() {
  x <- "what?"
  current_env()
}

env_print(func_env())
<environment: 0x559b2ab1df68>
Parent: <environment: global>
Bindings:
• x: <chr>
  1. What is the parent environment of the execution environment shown above?
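
Execution environments are created fresh on every call. The sketch below illustrates this with base R functions (environment() stands in for current_env() here); the function name make_env is hypothetical and exists only for this example.

```r
make_env <- function() {
  x <- "what?"
  environment()   # base R counterpart of rlang::current_env()
}

e1 <- make_env()
e2 <- make_env()

# Each call gets its own execution environment.
same_env <- identical(e1, e2)

# Both have the same parent: the environment where make_env was defined.
same_parent <- identical(parent.env(e1), parent.env(e2))
parent_is_enclosure <- identical(parent.env(e1), environment(make_env))

print(c(same_env = same_env,
        same_parent = same_parent,
        parent_is_enclosure = parent_is_enclosure))
print(ls(e1))   # names bound inside the execution environment
```

Two calls never share an execution environment, but they do share a parent, because the parent is fixed when the function is defined rather than when it is called.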

Name masking

Let’s now change the global environment by writing a function called filter(). This function will just pass its argument to dplyr::filter() after printing a message to the screen.

filter <- function(.data, ...) {
  message(
    paste(
      "Filtering a", 
      first(class(.data)), 
      "object with", 
      nrow(.data), 
      "rows..."
    )
  )
  dplyr::filter(.data, ...)
}
  1. Use the new filter() function to find all the human characters in starwars. Does it work? Why or why not?

  2. How is this similar to or different from what we did with print() in the S3 lab?

# SAMPLE SOLUTION

starwars %>%
  filter(species == "Human")
Filtering a tbl_df object with 87 rows...
# A tibble: 35 × 14
   name     height  mass hair_color skin_color eye_color birth_year sex   gender
   <chr>     <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> <chr> 
 1 Luke Sk…    172    77 blond      fair       blue            19   male  mascu…
 2 Darth V…    202   136 none       white      yellow          41.9 male  mascu…
 3 Leia Or…    150    49 brown      light      brown           19   fema… femin…
 4 Owen La…    178   120 brown, gr… light      blue            52   male  mascu…
 5 Beru Wh…    165    75 brown      light      blue            47   fema… femin…
 6 Biggs D…    183    84 black      light      brown           24   male  mascu…
 7 Obi-Wan…    182    77 auburn, w… fair       blue-gray       57   male  mascu…
 8 Anakin …    188    84 blond      fair       blue            41.9 male  mascu…
 9 Wilhuff…    180    NA auburn, g… fair       blue            64   male  mascu…
10 Han Solo    180    80 brown      fair       brown           29   male  mascu…
# ℹ 25 more rows
# ℹ 5 more variables: homeworld <chr>, species <chr>, films <list>,
#   vehicles <list>, starships <list>

Note that our new function filter() is now in the global environment, which is first in the search path.

env_has(global_env(), "filter")
filter 
  TRUE 
search()
 [1] ".GlobalEnv"        "package:lubridate" "package:forcats"  
 [4] "package:stringr"   "package:dplyr"     "package:purrr"    
 [7] "package:readr"     "package:tidyr"     "package:tibble"   
[10] "package:ggplot2"   "package:tidyverse" "package:rlang"    
[13] "package:stats"     "package:graphics"  "package:grDevices"
[16] "package:utils"     "package:datasets"  "package:methods"  
[19] "Autoloads"         "package:base"     
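
Masking of this kind is not permanent: removing the global binding makes the package versions visible again. The sketch below assumes it is run at the top level of a session and uses only base R, so it wraps stats::filter() instead of dplyr::filter(); the wrapper's message text is made up.

```r
# A global wrapper with the same name as a function in the stats package.
filter <- function(x, ...) {
  message("Calling the wrapper defined in the global environment...")
  stats::filter(x, ...)
}

masked <- find("filter")     # search-path entries that bind "filter"

rm(filter)                   # delete the global binding
unmasked <- find("filter")   # the package versions are still there

print(masked)
print(unmasked)
```

After rm(), a bare call to filter() resolves to the first package on the search path that defines it.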

But there are other environments that contain objects called filter. We can use map_lgl() to check each environment on the search path in turn.

search_envs() %>%
  map_lgl(env_has, "filter")
           global package:lubridate   package:forcats   package:stringr 
             TRUE             FALSE             FALSE             FALSE 
    package:dplyr     package:purrr     package:readr     package:tidyr 
             TRUE             FALSE             FALSE             FALSE 
   package:tibble   package:ggplot2 package:tidyverse     package:rlang 
            FALSE             FALSE             FALSE             FALSE 
    package:stats  package:graphics package:grDevices     package:utils 
             TRUE             FALSE             FALSE             FALSE 
 package:datasets   package:methods         Autoloads      package:base 
            FALSE             FALSE             FALSE             FALSE 
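
The same search can be written by hand, which makes the lookup order explicit: start at the global environment and follow parent links until the empty environment is reached. This is a rough sketch in base R; the helper name where_all is hypothetical.

```r
# Walk the chain of environments starting at `env` and collect the name
# of every environment that has its own binding for `name`.
where_all <- function(name, env = globalenv()) {
  hits <- character()
  while (!identical(env, emptyenv())) {
    if (exists(name, envir = env, inherits = FALSE)) {
      hits <- c(hits, environmentName(env))
    }
    env <- parent.env(env)
  }
  hits
}

print(where_all("filter"))
print(where_all("sum"))   # note: the base environment reports itself as "base"
```

R uses the first hit when it evaluates a bare name, which is why a binding in the global environment masks everything further along the path.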

We can also use find() to show us directly which environments contain an object that is bound to the name filter.

find("filter")
[1] ".GlobalEnv"    "package:dplyr" "package:stats"
  1. How can we use the filter() function from the stats package?
# SAMPLE SOLUTION

# Use the :: operator to bypass the search path, e.g., a 3-point moving average:
stats::filter(1:10, rep(1/3, 3))

Engagement

Prompt: If you could clarify one thing about environments, what would it be?