On 1/19/21 11:17 AM, Bill Dunlap wrote:
Your translate... function seems unnecessarily complicated and reusing the
name 'var' for both the input and the data.frame containing the input makes
it confusing to me.  The following replacement, f, uses your algorithm but
I think gets the answer you want.


I was thinking that the tidyverse might already have a recode-like operation. But the dplyr::recode appears to be deprecated and you get referred to case_when. Perhaps following an example from the `case_when` help page:


case_SEER_tsize <- function(tsize, upper, exceptions){

    case_when(tsize <=upper ~tsize,

              tsize %in% exceptions$bif ~ exceptions$new[match(tsize, exceptions$bif)])}


I'm guessing that my lack of tidyversatility means there's probably a `match`-equivalent that I'm not familiar with.


> test1 <- data.frame(old = c(99,95,93, 8));lup <- data.frame(bif = c(93, 95, 99), +                                                            new = c(3, 5, NA))
>
> test1 %>%
+     mutate(varb = case_SEER_tsize(.$old, 90, lup))
  old varb
1  99   NA
2  95    5
3  93    3
4   8    8

--

David.


f <-
function(var, upper, lookup) {
     names(lookup) <- c('old','new')
     var_df <- data.frame(old = var)
     lookup2 <- data.frame(old = c(1:upper),
                           new = c(1:upper))
     lookup3 <- rbind(lookup, lookup2)
     res <- left_join(var_df, lookup3, by = 'old')
     res$new # return a vector, not a data.frame or tibble.
}
E.g.,
data.frame(XXX=c(95,93,10,20), YYY=c(55,66,93,98)) %>% mutate( YYY_mm =
f(YYY, 90, lup))
   XXX YYY YYY_mm
1  95  55     55
2  93  66     66
3  10  93      3
4  20  98     NA

You can modify this so that it names the output column based on the name of
the input column (by returning a data.frame/tibble instead of a numeric
vector):

f1 <-
function(var, upper, lookup,  new_varname =
paste0(deparse1(substitute(var)), "_mm")) {
     names(lookup) <- c('old',new_varname)
     var_df <- data.frame(old = var)
     lookup2 <- data.frame(old = c(1:upper),
                           new = c(1:upper))
     names(lookup2)[2] <- new_varname
     lookup3 <- rbind(lookup, lookup2)
     res <- left_join(var_df, lookup3, by = 'old')[2]
     res
}
E.g.,
data.frame(XXX=c(95,93,10,20), YYY=c(55,66,93,98)) %>% mutate( f1(YYY,
90, lup))
   XXX YYY YYY_mm
1  95  55     55
2  93  66     66
3  10  93      3
4  20  98     NA

-Bill

On Tue, Jan 19, 2021 at 10:24 AM Steven Rigatti <sjriga...@gmail.com> wrote:

I am having some problems with what seems like a pretty simple issue. I
have some data where I want to convert numbers. Specifically, this is
cancer data and the size of tumors is encoded using millimeter
measurements. However, if the actual measurement is not available the
coding may imply a less specific range of sizes. For instance numbers 0-89
may indicate size in mm, but 90 indicates "greater than 90 mm" , 91
indicates "1 to 2 cm", etc. So, I want to translate 91 to 90, 92 to 15,
etc.

I have many such tables so I would like to be able to write a function
which takes as input a threshold over which new values need to be looked
up, and the new lookup table, returning the new values.

I successfully wrote the function:

translate_seer_numeric <- function(var, upper, lookup) {
     names(lookup) <- c('old','new')
     names(var) <- 'old'
     var <- as.data.frame(var)
     lookup2 <- data.frame(old = c(1:upper),
                           new = c(1:upper))
     lookup3 <- rbind(lookup, lookup2)
  print(var)
     res <- left_join(var, lookup3, by = 'old') %>%
          select(new)

     res

}

test1 <- data.frame(old = c(99,95,93, 8))lup <- data.frame(bif = c(93, 95,
99),
                   new = c(3, 5, NA))
translate_seer_numeric(test1, 90, lup)

The above test generates the desired output:

   old1  992  953  934   8
   new1  NA2   53   34   8

My problem comes when I try to put this in line with pipes and the mutate
function:

test1 %>%
      mutate(varb = translate_seer_numeric(var = old, 90, lup))####
  Error: Problem with `mutate()` input `varb`.
x Join columns must be present in data.
x Problem with `old`.
i Input `varb` is `translate_seer_numeric(var = test1$old, 90, lup)`.

Thoughts??

         [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to