add_pseudonymize.Rd
add_pseudonymize()
adds a pseudonymization step to a transformation pipeline.
When run as a transformation, terms that have not been seen before are given a new
random alphanumeric string, while terms that have been transformed previously
reuse the replacement they were already assigned.
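The behaviour described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helpers `random_token()` and `pseudonymize_sketch()` are hypothetical names, the token length of 5 is only inferred from the example output, and treating `lookup` as a set of pre-seeded term-to-replacement pairs is an assumption.

```r
# Hypothetical helper (not part of the package): draw one random
# alphanumeric string of length `n`.
random_token <- function(n = 5) {
  paste(sample(c(letters, LETTERS, 0:9), n, replace = TRUE), collapse = "")
}

# Hypothetical sketch of lookup-based pseudonymization.
# `lookup` is a named list mapping original term -> replacement.
pseudonymize_sketch <- function(values, lookup = list()) {
  for (term in unique(values)) {
    if (is.na(term)) next
    # Only terms that have not been seen before get a new token.
    if (is.null(lookup[[term]])) {
      repeat {
        token <- random_token()
        # Redraw in the unlikely event that the token is already in use.
        if (!token %in% unlist(lookup)) break
      }
      lookup[[term]] <- token
    }
  }
  # Replace every value with its assigned token; NA stays NA.
  out <- vapply(
    values,
    function(v) if (is.na(v)) NA_character_ else lookup[[v]],
    character(1),
    USE.NAMES = FALSE
  )
  # Return the updated lookup so later calls can reuse the same mapping.
  list(values = out, lookup = lookup)
}

res <- pseudonymize_sketch(
  c("Ann", "Bob", "Ann", "Kyle Wilson"),
  lookup = list("Kyle Wilson" = "Kyle")
)
res$values
```

In this sketch both occurrences of "Ann" receive the same random string, "Kyle Wilson" is replaced by the pre-seeded value "Kyle", and passing `res$lookup` to a later call keeps the mapping stable across data sets.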
add_pseudonymize(object, ..., lookup = list())
A 'DeidentList' representing the untrained transformation pipeline. The object contains fields:
deident_methods
a list of each step in the pipeline (consisting of variables and method)
and methods:
mutate
apply the pipeline to a new data set
to_yaml
serialize the pipeline to a '.yml' file
# Basic usage
pipe.pseudonymize <- add_pseudonymize(ShiftsWorked, Employee)
pipe.pseudonymize$mutate(ShiftsWorked)
#> # A tibble: 3,100 × 7
#> `Record ID` Employee Date Shift `Shift Start` `Shift End` `Daily Pay`
#> <int> <chr> <date> <chr> <chr> <chr> <dbl>
#> 1 1 n6ajf 2015-01-01 Night 17:01 00:01 78.1
#> 2 2 IwNIF 2015-01-01 Day 08:01 16:01 155.
#> 3 3 J26Z1 2015-01-01 Day 08:01 16:01 77.8
#> 4 4 ox8RD 2015-01-01 Day 08:01 15:01 203.
#> 5 5 Grs7g 2015-01-01 Night 16:01 23:01 211.
#> 6 6 WOLOF 2015-01-01 Night 17:01 00:01 142.
#> 7 7 dlqdf 2015-01-01 Rest NA NA 0
#> 8 8 siZKP 2015-01-01 Night 17:01 00:01 213.
#> 9 9 59DXe 2015-01-01 Night 16:01 00:01 219.
#> 10 10 sfcIr 2015-01-01 Night 16:01 00:01 242.
#> # ℹ 3,090 more rows
# Usage with a pre-defined lookup
pipe.pseudonymize2 <- add_pseudonymize(ShiftsWorked, Employee,
  lookup = list("Kyle Wilson" = "Kyle")
)
pipe.pseudonymize2$mutate(ShiftsWorked)
#> # A tibble: 3,100 × 7
#> `Record ID` Employee Date Shift `Shift Start` `Shift End` `Daily Pay`
#> <int> <chr> <date> <chr> <chr> <chr> <dbl>
#> 1 1 CSmIB 2015-01-01 Night 17:01 00:01 78.1
#> 2 2 raxif 2015-01-01 Day 08:01 16:01 155.
#> 3 3 ZxbqT 2015-01-01 Day 08:01 16:01 77.8
#> 4 4 ZChKS 2015-01-01 Day 08:01 15:01 203.
#> 5 5 X4eLw 2015-01-01 Night 16:01 23:01 211.
#> 6 6 qoGA5 2015-01-01 Night 17:01 00:01 142.
#> 7 7 atM46 2015-01-01 Rest NA NA 0
#> 8 8 bTiKo 2015-01-01 Night 17:01 00:01 213.
#> 9 9 MQtf6 2015-01-01 Night 16:01 00:01 219.
#> 10 10 q95aY 2015-01-01 Night 16:01 00:01 242.
#> # ℹ 3,090 more rows