About

The yspec package reads your data specification file when it is written in a specific yaml format.

For each data set in your project that needs documentation, create a yaml file that lists the columns in the data set along with details about the data in that column. This yaml file can be loaded into your R session into an object that you can work with. This is referred to as a spec object. The term spec refers to a single documented data object / data file.

Once all of your data sets have been documented with their own yaml file, you can create another object called a yproj object. This is used to template the rendering of a single integrated data definitions file for your entire project.

Keep reading the vignette to see how it works.


library(yspec)

Example spec file

An example specification file looks like:

specfile <- ys_help$file()

cat(readLines(specfile)[1:29], sep = "\n")
  SETUP__:
    description: Example PopPK analysis data set.
    sponsor: example-project
    projectnumber: examp101F
    use_internal_db: true
    glue: 
      mugL: "$\\mu$g/L"
  C:
  NUM:
  ID:
  SUBJ: !look
  TIME:
    short: time after first dose
    unit: hour
  SEQ: 
    short: data type
    values: {observation: 0, dose: 1}
  CMT:
  EVID:
  AMT: !look
    unit: mg
  DV:  !look
    unit: "<<mugL>>"
  AGE:
  WT:
  CRCL:
    about: [creatinine clearance, ml/min]
  ALB:
  BMI:

Once the data specification yaml file is written, it can be loaded into R. Printing the resulting spec object lists every column it documents:
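The load call itself is not shown in this section. A minimal sketch, assuming the package's ys_load() function is the loader and reusing the example file located with ys_help$file() earlier:

```r
library(yspec)

# Locate the example specification file that ships with the package
specfile <- ys_help$file()

# Assumption: ys_load() parses the yaml file and returns the spec object
spec <- ys_load(specfile)

# Printing the object shows one row per documented column
spec
```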

   name  c d unit     short                         source           
   C     + + .        comment character             ysdb_internal.yml
   NUM   - - .        record number                 ysdb_internal.yml
   ID    - - .        subject identifier            ysdb_internal.yml
   SUBJ  + - .        subject identifier            ysdb_internal.yml
   TIME  - - hour     time after first dose         .                
   SEQ   - + .        data type                     .                
   CMT   - - .        compartment number            ysdb_internal.yml
   EVID  - - .        event ID                      ysdb_internal.yml
   AMT   - - mg       dose amount                   ysdb_internal.yml
   DV    - - <<mugL>> dependent variable            ysdb_internal.yml
   AGE   - - years    age                           ysdb_internal.yml
   WT    - - kg       weight                        ysdb_internal.yml
   CRCL  - - ml/min   creatinine clearance          .                
   ALB   - - g/dL     albumin                       ysdb_internal.yml
   BMI   - - m2/kg    BMI                           ysdb_internal.yml
   AAG   - - mg/dL    alpha-1-acid glycoprotein     .                
   SCR   - - mg/dL    serum creatinine              .                
   AST   - - .        aspartate aminotransferase    .                
   ALT   - - .        alanine aminotransferase      .                
   HT    - - cm       height                        ysdb_internal.yml
   CP    - + .        Child-Pugh score              .                
   TAFD  - - hours    time after first dose         .                
   TAD   - - hours    time after dose               .                
   LDOS  - - mg       last dose amount              .                
   MDV   - - .        MDV                           ysdb_internal.yml
   BLQ   - - .        below limit of quantification .                
   PHASE - - .        study phase indicator         .                
   STUDY - + .        study number                  .                
   RF    + + .        renal function stage          .

Data from specific columns can be printed
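As a sketch of how a single column might be pulled out, assuming the spec object supports `$` extraction by column name and was loaded with ys_load() (the load call is an assumption, since it is not shown in this section):

```r
library(yspec)

# Assumption: load the example spec the same way as earlier in this document
spec <- ys_load(ys_help$file())

# Extract the entry for the WT column; printing it shows
# the column name, type, short description, unit, and range
wt <- spec$WT
wt
```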

   name  value  
   col   WT     
   type  numeric
   short weight 
   unit  kg     
   range .

or summarized

summary(spec, WT, DV, EGFR)
      name c d     unit                         short            source
  1      C + +        .             comment character ysdb_internal.yml
  2    NUM - -        .                 record number ysdb_internal.yml
  3     ID - -        .            subject identifier ysdb_internal.yml
  4   SUBJ + -        .            subject identifier ysdb_internal.yml
  5   TIME - -     hour         time after first dose                 .
  6    SEQ - +        .                     data type                 .
  7    CMT - -        .            compartment number ysdb_internal.yml
  8   EVID - -        .                      event ID ysdb_internal.yml
  9    AMT - -       mg                   dose amount ysdb_internal.yml
  10    DV - - <<mugL>>            dependent variable ysdb_internal.yml
  11   AGE - -    years                           age ysdb_internal.yml
  12    WT - -       kg                        weight ysdb_internal.yml
  13  CRCL - -   ml/min          creatinine clearance                 .
  14   ALB - -     g/dL                       albumin ysdb_internal.yml
  15   BMI - -    m2/kg                           BMI ysdb_internal.yml
  16   AAG - -    mg/dL     alpha-1-acid glycoprotein                 .
  17   SCR - -    mg/dL              serum creatinine                 .
  18   AST - -        .    aspartate aminotransferase                 .
  19   ALT - -        .      alanine aminotransferase                 .
  20    HT - -       cm                        height ysdb_internal.yml
  21    CP - +        .              Child-Pugh score                 .
  22  TAFD - -    hours         time after first dose                 .
  23   TAD - -    hours               time after dose                 .
  24  LDOS - -       mg              last dose amount                 .
  25   MDV - -        .                           MDV ysdb_internal.yml
  26   BLQ - -        . below limit of quantification                 .
  27 PHASE - -        .         study phase indicator                 .
  28 STUDY - +        .                  study number                 .
  29    RF + +        .          renal function stage                 .

Check a data set against the spec

Use the ys_check() function, with the data frame as the first argument and the spec object as the second argument. When the data agree with the spec, a confirmation message is printed:
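A minimal sketch of the check, following the argument order described above. It assumes that ys_help$data() returns the example data set matching the example spec; with your own project you would read your data file into a data frame instead.

```r
library(yspec)

# Assumption: load the example spec and its matching example data set
spec <- ys_load(ys_help$file())
data <- ys_help$data()

# Data frame first, spec object second
ys_check(data, spec)
```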

## The data set passed all checks.

Example to render spec

The specification object can be rendered to a specification document with the ys_document() function. The stem argument sets the base name of the output file.

ys_document(spec, stem = "working_document")

With output here.

ys_document() passes additional arguments along to rmarkdown::render(), so you can control how the document is rendered. You can also create custom output formats to get the data table to render the way you like.

Example project object

To create a project-wide listing of documented data sets, we create a yproj, or project, object. We create this from the spec objects introduced in the previous sections. Let’s load a second spec object to use along with the one we already have.

pdspec <- load_spec_ex("DEM104101F_PKPD.yml")

Now, we have two objects to work with:

head(spec)
##    name c d     unit                 short            source
## 1     C + +        .     comment character ysdb_internal.yml
## 2   NUM - -        .         record number ysdb_internal.yml
## 3    ID - -        .    subject identifier ysdb_internal.yml
## 4  SUBJ + -        .    subject identifier ysdb_internal.yml
## 5  TIME - -     hour time after first dose                 .
## 6   SEQ - +        .             data type                 .
## 7   CMT - -        .    compartment number ysdb_internal.yml
## 8  EVID - -        .              event ID ysdb_internal.yml
## 9   AMT - -       mg           dose amount ysdb_internal.yml
## 10   DV - - <<mugL>>    dependent variable ysdb_internal.yml
head(pdspec)
##    name c d           unit       short source
## 1     C + -              .           C      .
## 2   MDV - -              .         MDV      .
## 3   SEQ - +              .         SEQ      .
## 4   AMT - -             mg         AMT      .
## 5    II - -          hours          II      .
## 6   CMT - -              . Compartment      .
## 7  TAFD - -          hours        TAFD      .
## 8    WT - -             kg      Weight      .
## 9  EGFR - - ml/min/1.73 m2        eGFR      .
## 10  SEX - +              .         SEX      .

We can create a project object from both spec objects. Printing it shows the project number, the sponsor, and one row per data file:
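The call that builds the project object is not shown in this section. A sketch, assuming ys_project() is the constructor and that it accepts the spec objects as separate arguments:

```r
library(yspec)

# Assumption: load the two example specs as in the earlier sections
spec   <- ys_load(ys_help$file())
pdspec <- load_spec_ex("DEM104101F_PKPD.yml")

# Assumption: ys_project() combines the spec objects into one project object
proj <- ys_project(spec, pdspec)

proj
```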

## projectnumber:  examp101F 
## sponsor:        example-project 
## --------------------------------------------
## datafiles: 
##  name            description                       data_stem      
##  analysis1       Example PopPK analysis data set.  analysis1      
##  DEM104101F_PKPD Population PKPD analysis data set DEM104101F_PKPD

Render a project file

Working document

To render the project file we’ll use the same ys_document() function.
This time, we’ll add some extra (optional) arguments that will help us get the document to look the way we want:
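The exact call is not reproduced here. The following is only a sketch of what it might look like: the stem, author, and title values are placeholders, and mrgtemplate() is assumed to be the package helper that supplies the branded build directory. Rendering also requires a working LaTeX installation.

```r
library(yspec)

# Assumption: build the project object as in the previous section
spec   <- ys_load(ys_help$file())
pdspec <- load_spec_ex("DEM104101F_PKPD.yml")
proj   <- ys_project(spec, pdspec)

# Placeholder values; replace with your own document details
ys_document(
  proj,
  stem      = "project_working",             # base name of the output file
  build_dir = mrgtemplate(),                 # assumed helper for the branded template
  author    = "Author Name",                 # placeholder
  title     = "Analysis Data Specification"  # placeholder
)
```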


With output here.

Using the build_dir argument gets us the document rendered with Metrum Research Group branding. Also, author and title are passed into the configuration fields for this document.

Regulatory document

To get a document that is formatted according to FDA requirements, use:
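The call for this step is not shown either. As a hedged sketch, we assume ys_document() takes a type argument and that "regulatory" selects the regulatory layout; the stem is a placeholder. Check ?ys_document for the options available in your installed version.

```r
library(yspec)

# Assumption: build the project object as in the previous section
spec   <- ys_load(ys_help$file())
pdspec <- load_spec_ex("DEM104101F_PKPD.yml")
proj   <- ys_project(spec, pdspec)

# Assumption: type = "regulatory" switches to the regulatory layout
ys_document(
  proj,
  type = "regulatory",
  stem = "project_regulatory"  # placeholder base name for the output file
)
```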


With output here.