Introduction

This “Getting Started with bbr” vignette takes the user through some basic scenarios for modeling with NONMEM using bbr, introducing you to its standard workflow and functionality.

bbr is an R interface for running bbi. Together they provide a solution for managing projects involving modeling and simulation with a number of software solutions used in pharmaceutical sciences. Currently, only NONMEM is supported, so this vignette will only address that.

Setup

Installing bbi

The first time you use bbr on a system or disk, you must first install bbi.

Setting bbi_exe_path

The bbi executable is a single binary file, you just need to tell bbr where to find it. This can be done by setting the bbr.bbi_exe_path option like so:

options("bbr.bbi_exe_path" = "/data/apps/bbi")

Please note that this must be set for every new R session, so it’s recommended to include the above snippet in your .Rprofile, replacing the "/data/apps/bbi" with the absolute path from your system/project, if you have installed it somewhere different.

Also note that this can be pointed to anywhere on your system; installing to /data/apps/bbi is merely a convention used at Metrum Research Group, where bbr was developed.

Installing with use_bbi()

bbr provides an installer function named use_bbi() that makes it easy to install bbi from within R.

bbr::use_bbi()

With no arguments passed, use_bbi() will default to installing bbi at the path you have set in options("bbr.bbi_exe_path"). This is why we recommend setting this first (and putting it in your .Rprofile so that it is sourced for every new R session). If this option is not set, use_bbi() will fall back to the bbi in my $PATH (accessed via Sys.which("bbi")) and then an OS-dependent default, in that order. See ?use_bbi() if you would like more details.

The use_bbi() function also optionally takes the absolute path where you want to install bbi, though we recommend most users to follow the pattern above: first specifying a path in options("bbr.bbi_exe_path"), and then letting use_bbi() install to that path by default.

Updating bbi or checking if it’s installed

If you’re not sure if you have bbi installed, you can use bbi_version() to check. This will return the version number of your installation, if one is found, or an empty string if you do not have bbi installed. You can also use bbi_current_release() to see the most current version available and run use_bbi() as specified above if you want to update.

You can also check the bbi documentation for manual installation instructions from the command line.

bbi.yaml configuration file

To actually submit models with bbi, you will also need a bbi.yaml configuration file. Think of this as containing “global default settings” for your bbi runs. Those settings will not be discussed here, but know that they can be modified globally by editing that file, or model-by-model as described in the “Passing arguments to bbi” section below.

bbi_init()

The bbi_init() function will create a bbi.yaml, with the default settings, in the specified directory. Optionally, you can pass the path to a bbi.yaml to the .config_path argument of any bbr function that needs one. However, by default these functions will look for one in the same folder as the model being submitted or manipulated. Therefore, if you will have a number of models in the same folder, it is easiest to put a bbi.yaml that folder.

MODEL_DIR <- "../model/nonmem/basic"  # this should be relative to your working directory

bbi_init(.dir = MODEL_DIR,            # the directory to create the bbi.yaml in
         .nonmem_dir = "/opt/NONMEM", # location of NONMEM installation
         .nonmem_version = "nm74gf")  # default NONMEM version to use

Here we define a define variable MODEL_DIR with the path to the directory containing our models. Notice that you must also pass the path to a working installation of NONMEM, and a default NONMEM version to use.

Note this only needs to be done once for each folder you are modeling in. Once the bbi.yaml exists, you will not need to run bbi_init() again unless you want to create another one; for example if you move to modeling in a different directory, or if you want a different set of default settings for some specific subset of models.

Initial modeling run

Create model object

To begin modeling, first create a model object using new_model(). This is an S3 object which you will pass to all of the other bbr functions that submit, summarize, or manipulate models.

The first argument (.path) must be the path to your model control stream without file extension. For instance, the call below assumes you have a control stream named 1.ctl or 1.mod in the directory you just assigned to MODEL_DIR above.

mod1 <- new_model(file.path(MODEL_DIR, 1))

This new_model() call will also create a 1.yaml file in that same directory, which stores model metadata like description and tags (discussed below). If you ever need to recreate this model object in memory, just run mod1 <- read_model(file.path(MODEL_DIR, 1)) to rebuild it from the YAML file on disk.

The model object you have just created can be passed to various functions which you will see in a minute. Now that we’ve created a model, the first thing we will do is submit the model to be run.

Submit model

This will return a process object. We won’t discuss this object in this vignette, but it contains some information about the submission call. Please note that checking on a model run in progress is not fully implemented. For now, users should check on their runs manually (by looking at the output directory) and only proceed to the next steps once it has successfully completed.

Passing arguments to bbi

There are a number of arguments that bbi can take to modify how models are run. You can print a list of available arguments using the print_bbi_args() helper function.

print_bbi_args()
#> additional_post_work_envs (character) -- Any additional values (as ENV KEY=VALUE) to provide for the post execution environment (sets CLI flag `--additional_post_work_envs`)
#> background (logical) -- RAW NMFE OPTION - Tells nonmem not to scan StdIn for control characters (sets CLI flag `--background`)
#> clean_lvl (numeric) -- clean level used for file output from a given (set of) runs (default 1) (sets CLI flag `--clean_lvl`)
#> config (character) -- Path (relative or absolute) to another bbi.yaml to load (sets CLI flag `--config`)
#> copy_lvl (numeric) -- copy level used for file output from a given (set of) runs (sets CLI flag `--copy_lvl`)
#> debug (logical) -- debug mode (sets CLI flag `--debug`)
#> delay (numeric) -- Selects a random number of seconds between 1 and this value to stagger / jitter job execution. Assists in dealing with large volumes of work dealing with the same data set. May avoid NMTRAN issues about not being able read / close files (sets CLI flag `--delay`)
#> ext_file (character) -- name of custom ext-file (sets CLI flag `--ext-file`)
#> git (logical) -- whether git is used (sets CLI flag `--git`)
#> json (logical) -- json tree of output, if possible (sets CLI flag `--json`)
#> licfile (character) -- RAW NMFE OPTION - Specify a license file to use with NMFE (Nonmem) (sets CLI flag `--licfile`)
#> log_file (character) -- If populated, specifies the file into which to store the output / logging details from bbi (sets CLI flag `--log_file`)
#> maxlim (numeric) -- RAW NMFE OPTION - Set the maximum values set for the buffers used by Nonmem (default 100) (sets CLI flag `--maxlim`)
#> mpi_exec_path (character) -- The fully qualified path to mpiexec. Used for nonmem parallel operations (default '/usr/local/mpich3/bin/mpiexec') (sets CLI flag `--mpi_exec_path`)
#> nm_version (character) -- Version of nonmem from the configuration list to use (sets CLI flag `--nm_version`)
#> nm_qual (logical) -- Whether or not to execute with nmqual (autolog.pl (sets CLI flag `--nmqual`)
#> nobuild (logical) -- RAW NMFE OPTION - Skips recompiling and rebuilding on nonmem executable (sets CLI flag `--nobuild`)
#> no_ext_file (logical) -- do not use ext file (sets CLI flag `--no-ext-file`)
#> no_grd_file (logical) -- do not use grd file (sets CLI flag `--no-grd-file`)
#> no_shk_file (logical) -- do not use shk file (sets CLI flag `--no-shk-file`)
#> overwrite (logical) -- Whether or not to remove existing output directories if they are present (sets CLI flag `--overwrite`)
#> parafile (character) -- Location of a user-provided parafile to use for parallel execution (sets CLI flag `--parafile`)
#> parallel (logical) -- Whether or not to run nonmem in parallel mode (sets CLI flag `--parallel`)
#> parallel_timeout (numeric) -- The amount of time to wait for parallel operations in nonmem before timing out (default 2147483647) (sets CLI flag `--parallel_timeout`)
#> post_work_executable (character) -- A script or binary to run when job execution completes or fails (sets CLI flag `--post_work_executable`)
#> prcompile (logical) -- RAW NMFE OPTION - Forces PREDPP compilation (sets CLI flag `--prcompile`)
#> prsame (logical) -- RAW NMFE OPTION - Indicates to nonmem that the PREDPP compilation step should be skipped (sets CLI flag `--prsame`)
#> preview (logical) -- preview action, but don't actually run command (sets CLI flag `--preview`)
#> save_config (logical) -- Whether or not to save the existing configuration to a file with the model (default true) (sets CLI flag `--save_config`)
#> threads (numeric) -- number of threads to execute with (default 4) (sets CLI flag `--threads`)
#> verbose (logical) -- verbose output (sets CLI flag `--verbose`)

As discussed in “Setup” above, these can be set globally in the bbi.yaml file, and you can see the default values of them in that file. However, specific arguments can also be set or changed for each model. This can be done in two ways:

Note that any bbi_args attached to a model object will override the relevant settings in bbi.yaml, and that .bbi_args passed into a submit_ or _summary call will override the relevant settings in both bbi.yaml and bbi_args attached to the model object.

See the docs for any of the functions just mentioned for more details on usage and syntax.

Overwriting output from a previously run model

It is common to run a model, make some tweaks to it, and then run it again. However, to avoid accidentally deleting model outputs, bbr will error by default if it sees existing output when trying to submit a model. To automatically overwrite any previous model output, just pass overwrite = TRUE to the .bbi_args argument described in the previous section. For example:

mod1 %>% submit_model(.bbi_args = list(overwrite = TRUE))

You can also change this setting globally by setting overwrite: true in the bbi.yaml file for your project.

Summarize model

Once the model run has completed, users can get a summary object containing much of the commonly used diagnostic information in a named list.

sum1 <- mod1 %>% model_summary()
print(names(sum1))
#> [1] "absolute_model_path" "run_details"         "run_heuristics"     
#> [4] "parameters_data"     "parameter_names"     "ofv"                
#> [7] "condition_number"    "shrinkage_details"

These elements can be accessed manually or extracted with built-in helper functions like so:

param_df1 <- sum1 %>% param_estimates()
param_df1
#> # A tibble: 9 × 8
#>   parameter_names estimate   stderr random_effect_sd random_effect_sdse fixed
#>   <chr>              <dbl>    <dbl>            <dbl>              <dbl> <lgl>
#> 1 THETA1            2.32   8.72e- 2           NA              NA        FALSE
#> 2 THETA2           54.6    3.38e+ 0           NA              NA        FALSE
#> 3 THETA3          463.     3.03e+ 1           NA              NA        FALSE
#> 4 THETA4           -0.0820 5.61e- 2           NA              NA        FALSE
#> 5 THETA5            4.18   1.38e+ 0           NA              NA        FALSE
#> 6 OMEGA(1,1)        0.0985 2.03e- 2            0.314           3.24e- 2 FALSE
#> 7 OMEGA(2,1)        0      1   e+10            0               1   e+10 TRUE 
#> 8 OMEGA(2,2)        0.157  2.72e- 2            0.396           3.44e- 2 FALSE
#> 9 SIGMA(1,1)        1      1   e+10            1               1   e+10 TRUE 
#> # … with 2 more variables: diag <lgl>, shrinkage <dbl>

To see how to load summaries of multiple models to an easy-to-read tibble, see the Creating a Model Summary Log vignette.

You can also use the nm_join() function to load the input data and any output tables from your model into a single tibble.

join_df <- nm_join(mod1) 
join_df %>% 
  select(ID, TIME, DV, CL, V, NPDE, CWRES)
#> # A tibble: 779 × 7
#>       ID  TIME    DV    CL     V  NPDE CWRES
#>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1     1  0     1.22  43.4  463. 0.185 0.291
#>  2     1  0.25 12.6   43.4  463. 2.47  3.01 
#>  3     1  0.5  11.2   43.4  463. 2.71  2.68 
#>  4     1  0.75 17.7   43.4  463. 2.94  4.24 
#>  5     1  1    21.9   43.4  463. 2.94  5.25 
#>  6     1  1.25 16.2   43.4  463. 2.94  3.88 
#>  7     1  1.5  18.7   43.4  463. 2.94  4.47 
#>  8     1  1.75 23.8   43.4  463. 2.94  5.69 
#>  9     1  2    16.5   43.4  463. 2.94  3.94 
#> 10     1  2.25 17.3   43.4  463. 2.94  4.15 
#> # … with 769 more rows

Iteration

Much of the benefit of bbr is leveraged in the model iteration workflow, and the run log summaries that can be created afterwards. For example, imagine you look at these model results and want to begin iterating on them with a new model.

If you are now in a new R session and no longer have your mod1 object in memory, you can easily rebuild it from the YAML file on disk with read_model():

mod1 <- read_model(file.path(MODEL_DIR, 1))

copy_model_from()

Now you can create a new model object, based on the original, copying and renaming the control stream in the process. By default, the copy_model_from() function will increment the name of the new model to the next available integer in your modeling directory, but you can also pass a custom name to the .new_model argument. See ?copy_model_from for more details this, and other optional arguments.

The copy_model_from() call below will create both 2.ctl and 2.yaml files in the same directory as the parent model, and return the model object corresponding to them. (copy_model_from() also stores the model’s “ancestry” which can be useful later in the project, as shown in the Using the based_on field vignette.)

mod2 <- copy_model_from(mod1) %>%
  update_model_id()
#> replacing 1 with 2 in /data/GHE/deployment/deployments/2021-11-29/renv/library/R-4.1/x86_64-pc-linux-gnu/bbr/model/nonmem/basic/2.ctl

The new control stream file 2.ctl will be identical to its parent model, though you can use update_model_id() (as above) to update to the new model ID in places like $TABLE statements.

Your new control stream can now be edited with any desired changes and then submitted and summarized exactly as above. At any time, you can also see the difference between a model and its parent using model_diff().

# manually edit control stream, then...
model_diff(mod2)
mod2 %>% submit_model()
mod2 %>% model_summary()

Adding tags and notes

After looking at these results, the user can add tags, which can later be used to organize your modeling runs.

mod1 <- mod1 %>% add_tags("orig acop model")
mod2 <- mod2 %>% add_tags("2 compartment")

Note that using free text for your tags is discouraged, for reasons mentioned in the ?modify_tags help page. For simplicity’s sake, we ignore that advice here, but please read it before using tags extensively in the wild.

In addition to tags, the user can add notes that can be referenced later.

mod2 <- mod2 %>% 
  add_notes("2 compartment model more appropriate than 1 compartment")

The model_diff() and tags_diff() function make it easy to compare models to their parent model. See the “Using based_on” vignette for examples.

Continue to iterate…

Now the iteration process continues with a third model. Note that you can tell copy_model_from() to inherit the tags from the parent model and automatically add them to the new model.

mod3 <- copy_model_from(mod2, .inherit_tags = TRUE)

Submit and summarize as before.

# manually edit control stream, then...
mod3 %>% submit_model()
mod3 %>% model_summary()

Add tags and notes for filtering in run log, described next.

mod3 <- mod3 %>% 
        add_tags(c("combined RUV", "iiv CL")) %>%
        add_notes("Added combined error structure because it seemed like a good idea")

Run log

At any point, the user can easily construct a “run log” tibble to summarize all models run up to this point.

Before we move on, note that you can get even more information about your models from the config_log() and summary_log() functions, as well as add_config() and add_summary() which automatically join the columns output from those functions against the tibble output from run_log(). See the “Further Reading” section below for links to vignettes demonstrating those functions.

log_df <- run_log(MODEL_DIR)
log_df
#> # A tibble: 3 × 9
#>   absolute_model_… run   yaml_md5 model_type description bbi_args based_on tags 
#>   <chr>            <chr> <chr>    <chr>      <chr>       <list>   <list>   <lis>
#> 1 /data/GHE/deplo… 1     c7486e7… nonmem     <NA>        <NULL>   <NULL>   <chr…
#> 2 /data/GHE/deplo… 2     bc60c6b… nonmem     <NA>        <NULL>   <chr [1… <chr…
#> 3 /data/GHE/deplo… 3     5a4970e… nonmem     <NA>        <NULL>   <chr [1… <chr…
#> # … with 1 more variable: notes <list>

The run_log() returns a tibble which can be manipulated like any other tibble. However, several of the columns (tags, notes, and based_on for example) are list columns, which complicates how you can interact with them. We provide some helper functions to more seamlessly interact with these log tibbles, as well as some sample tidyverse code below.

Viewing tags example

The bbr::collapse_to_string() function can collapse any list column into a string representation of its contents. It is specifically designed for collapsing columns like tags, notes, and based_on into a more human-readable format.

log_df %>% 
  collapse_to_string(tags, notes) %>%
  select(run, tags, notes)
#> # A tibble: 3 × 3
#>   run   tags                                notes                               
#>   <chr> <chr>                               <chr>                               
#> 1 1     orig acop model                     <NA>                                
#> 2 2     2 compartment                       2 compartment model more appropriat…
#> 3 3     2 compartment, combined RUV, iiv CL Added combined error structure beca…

Filtering tags example

This code uses purrr::map_lgl to filter the run log to only rows containing a specific tag ("2 compartment").

log_df %>% 
  filter(map_lgl(tags, ~ "2 compartment" %in% .x)) %>%
  collapse_to_string(tags, notes) %>%
  select(run, tags, notes)
#> # A tibble: 2 × 3
#>   run   tags                                notes                               
#>   <chr> <chr>                               <chr>                               
#> 1 2     2 compartment                       2 compartment model more appropriat…
#> 2 3     2 compartment, combined RUV, iiv CL Added combined error structure beca…

Further reading

Hopefully this has given you a good start on understanding the capabilities and basic workflow of bbr. Please see the other vignettes for demonstrations of more advanced or specific functionality.