add_encrypt() adds an encryption step to a transformation pipeline. When run as a transformation, each specified variable is replaced by the output of an encryption hashing function that depends on the hash_key and seed supplied.

add_encrypt(object, ..., hash_key = "", seed = NA)

Arguments

object

Either a data.frame, tibble, or existing DeidentList pipeline.

...

variables to be transformed.

hash_key

a random alphanumeric key to control encryption

seed

a random alphanumeric string to concatenate to the value being encrypted

Value

A 'DeidentList' representing the untrained transformation pipeline. The object contains fields:

  • deident_methods a list of each step in the pipeline (consisting of variables and method)

and methods:

  • mutate apply the pipeline to a new data set

  • to_yaml serialize the pipeline to a '.yml' file

Examples


# Basic usage; without setting a `hash_key` or `seed`, the encryption is weak.
pipe.encrypt <- add_encrypt(ShiftsWorked, Employee)
pipe.encrypt$mutate(ShiftsWorked)
#> # A tibble: 3,100 × 7
#>    `Record ID` Employee   Date       Shift `Shift Start` `Shift End` `Daily Pay`
#>          <int> <hash>     <date>     <chr> <chr>         <chr>             <dbl>
#>  1           1 312d52572… 2015-01-01 Night 17:01         00:01              78.1
#>  2           2 027f4cb5c… 2015-01-01 Day   08:01         16:01             155. 
#>  3           3 7abda183b… 2015-01-01 Day   08:01         16:01              77.8
#>  4           4 a9233521d… 2015-01-01 Day   08:01         15:01             203. 
#>  5           5 3dbf8cef9… 2015-01-01 Night 16:01         23:01             211. 
#>  6           6 a20aa7587… 2015-01-01 Night 17:01         00:01             142. 
#>  7           7 1cfba46f6… 2015-01-01 Rest  NA            NA                  0  
#>  8           8 09f892f6f… 2015-01-01 Night 17:01         00:01             213. 
#>  9           9 f86b94c94… 2015-01-01 Night 16:01         00:01             219. 
#> 10          10 7752db482… 2015-01-01 Night 16:01         00:01             242. 
#> # ℹ 3,090 more rows

# Once `hash_key` and `seed` are set, the encryption is more secure, provided
# neither is exposed.
pipe.encrypt.secure <- add_encrypt(ShiftsWorked, Employee, hash_key = "hash1", seed = "Seed2")
pipe.encrypt.secure$mutate(ShiftsWorked)
#> # A tibble: 3,100 × 7
#>    `Record ID` Employee   Date       Shift `Shift Start` `Shift End` `Daily Pay`
#>          <int> <hash>     <date>     <chr> <chr>         <chr>             <dbl>
#>  1           1 53af26ae1… 2015-01-01 Night 17:01         00:01              78.1
#>  2           2 bb45dae19… 2015-01-01 Day   08:01         16:01             155. 
#>  3           3 67097a8d1… 2015-01-01 Day   08:01         16:01              77.8
#>  4           4 48c337406… 2015-01-01 Day   08:01         15:01             203. 
#>  5           5 8bff17034… 2015-01-01 Night 16:01         23:01             211. 
#>  6           6 70636e28d… 2015-01-01 Night 17:01         00:01             142. 
#>  7           7 0dec10cca… 2015-01-01 Rest  NA            NA                  0  
#>  8           8 eb692692b… 2015-01-01 Night 17:01         00:01             213. 
#>  9           9 1e13a7eca… 2015-01-01 Night 16:01         00:01             219. 
#> 10          10 83b9ae7db… 2015-01-01 Night 16:01         00:01             242. 
#> # ℹ 3,090 more rows
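
The difference between the two examples above can be illustrated outside the package. This page does not state which algorithm add_encrypt() uses, so the following is only a minimal sketch under stated assumptions: it uses the 'digest' package (not 'deident'), assumes SHA-256, assumes the seed is appended after the value, and treats hash_key as an HMAC key. The helpers plain_hash() and keyed_hash() and the employee names are hypothetical. It shows why an unkeyed hash can be reversed by hashing a list of candidate values, while a keyed and salted hash resists that lookup as long as the key and seed stay private.

```r
library(digest)

# Hypothetical helper (not part of deident): plain, unkeyed SHA-256 of each
# value. digest() is not vectorised, so it is applied element by element.
plain_hash <- function(x) {
  vapply(x, digest, character(1),
         algo = "sha256", serialize = FALSE, USE.NAMES = FALSE)
}

# Hypothetical helper (not part of deident): append the seed to each value
# (assumed order), then compute a keyed SHA-256 digest (HMAC) of the result.
keyed_hash <- function(x, hash_key, seed) {
  salted <- paste0(x, seed)
  vapply(salted,
         function(value) hmac(hash_key, value, algo = "sha256"),
         character(1), USE.NAMES = FALSE)
}

employees <- c("Alice", "Bob", "Alice")  # made-up values

# Dictionary attack: hash a list of candidate names and match on the hash.
candidates <- c("Alice", "Bob", "Carol")
lookup <- setNames(candidates, plain_hash(candidates))

released_plain <- plain_hash(employees)
recovered <- unname(lookup[released_plain])   # original names are recovered

released_keyed <- keyed_hash(employees, hash_key = "hash1", seed = "Seed2")
not_recovered <- unname(lookup[released_keyed])  # no matches without the key
```

In this sketch equal inputs still map to equal hashes, so the hashed column can still be used for grouping or joining, and rerunning with the same key and seed reproduces the same values.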