On 06/04/2019 10:03 a.m., Amit Govil wrote:
Hi,

I have a bunch of csv files to read in R. I'm unable to read them correctly
because in some of the files there is a column ("Role") whose values contain
commas.

Sample data:

User, Role, Rule, GAPId
Sam, [HadoopAnalyst, DBA, Developer], R46443

I'm trying to play with the below code but it doesn't work:

Since you didn't give a reproducible example, you should at least say what "doesn't work" means.

But here's some general advice: if you want to debug code, don't write huge expressions like the chain of functions below. Put things in temporary variables and make sure you get what you were expecting at each stage.

Instead of

files <- list.files(pattern='.*REDUNDANT(.*).csv$')

tbl <- sapply(files, function(f) {
   gsub('\\[|\\]', '"', readLines(f)) %>%
     read.csv(text = ., check.names = FALSE)
}) %>%
   bind_rows(.id = "id") %>%
   select(id, User, Rule) %>%
   distinct()

try


files <- list.files(pattern='.*REDUNDANT(.*).csv$')

tmp1 <- sapply(files, function(f) {
  gsub('\\[|\\]', '"', readLines(f)) %>%
    read.csv(text = ., check.names = FALSE)
})

tmp2 <- tmp1 %>% bind_rows(.id = "id")

tmp3 <- tmp2 %>% select(id, User, Rule)

tbl <- tmp3 %>% distinct()

(You don't need pipes here, but keeping them will make it easier to put the giant expression back together at the end.)
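
The same splitting can be applied inside the function passed to sapply(). Here is a rough sketch using only the two sample lines from the question, held in memory instead of read from a file. The names sample_lines, quoted and parsed are made up for illustration, and the strip.white and stringsAsFactors arguments are additions that were not in the original code.

```r
# The two sample lines from the question, used in place of readLines(f).
sample_lines <- c(
  "User, Role, Rule, GAPId",
  "Sam, [HadoopAnalyst, DBA, Developer], R46443"
)

# Step 1: replace the square brackets with double quotes, then look at
# the text that read.csv() is actually going to see.
quoted <- gsub("\\[|\\]", "\"", sample_lines)
print(quoted)

# Step 2: parse the quoted text. strip.white = TRUE drops the blanks
# that follow each comma; stringsAsFactors = FALSE keeps plain strings.
parsed <- read.csv(text = quoted, check.names = FALSE,
                   strip.white = TRUE, stringsAsFactors = FALSE)

# Check that the columns and values are what you expected.
str(parsed)
```

Note that the sample header names four columns while the sample row has only three fields, so it is worth checking which column each value ended up in.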

Then look at tmp1, tmp2, tmp3 as well as tbl to see where things went wrong.
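
For instance, checking the shape of tmp1 might look like the sketch below. It uses small made-up data frames rather than the real files (fake_files and read_one are hypothetical names), and it shows one thing worth checking: when every result has the same number of columns, sapply() simplifies the results into a matrix instead of returning a list of data frames. Whether that is what went wrong here can't be told from the post.

```r
# Hypothetical stand-ins for the real file names and the per-file reader.
fake_files <- c("a.csv", "b.csv")
read_one <- function(f) {
  data.frame(User = c("Sam", "Ann"), Rule = c("R1", "R2"),
             stringsAsFactors = FALSE)
}

# Same shape of call as in the code above.
tmp1 <- sapply(fake_files, read_one)

# Inspect the result before passing it on to the next stage.
class(tmp1)
dim(tmp1)

# With simplify = FALSE, sapply() returns a named list of data frames.
tmp1_list <- sapply(fake_files, read_one, simplify = FALSE)
class(tmp1_list)
names(tmp1_list)
```

If the real tmp1 turns out to be a matrix rather than a list, that would be the stage to fix before looking at tmp2.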

Duncan Murdoch

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
