On 06/04/2019 10:03 a.m., Amit Govil wrote:
Hi,
I have a bunch of csv files to read in R. I'm unable to read them correctly
because in some of the files, there is a column ("Role") which has comma in
the values.
Sample data:
User, Role, Rule, GAPId
Sam, [HadoopAnalyst, DBA, Developer], R46443
I'm trying to play with the below code but it doesn't work:
Since you didn't give a reproducible example, you should at least say
what "doesn't work" means.
But here's some general advice: if you want to debug code, don't write
huge expressions like the chain of functions below. Put things in
temporary variables and make sure you get what you were expecting at
each stage.
Instead of
files <- list.files(pattern='.*REDUNDANT(.*).csv$')
tbl <- sapply(files, function(f) {
  gsub('\\[|\\]', '"', readLines(f)) %>%
    read.csv(text = ., check.names = FALSE)
}) %>%
  bind_rows(.id = "id") %>%
  select(id, User, Rule) %>%
  distinct()
try
files <- list.files(pattern='.*REDUNDANT(.*).csv$')
tmp1 <- sapply(files, function(f) {
  gsub('\\[|\\]', '"', readLines(f)) %>%
    read.csv(text = ., check.names = FALSE)
})
tmp2 <- tmp1 %>% bind_rows(.id = "id")
tmp3 <- tmp2 %>% select(id, User, Rule)
tbl <- tmp3 %>% distinct()
(You don't need pipes here, but keeping them will make it easier to put
the giant expression back together at the end.)
Then look at tmp1, tmp2, tmp3 as well as tbl to see where things went
wrong.
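As a rough sketch of what checking the first stage could look like, the
base R snippet below breaks the tmp1 step into smaller pieces and prints
each one. The temporary directory, the file name demo_REDUNDANT_1.csv and
the variables quoted and one are made up for illustration; only the two
sample lines, the gsub() pattern and the read.csv() call come from the
question.

```r
# Hypothetical setup: a single made-up file in a temporary directory,
# containing the two sample lines from the question.
demo_dir <- tempfile("csvdemo")
dir.create(demo_dir)
writeLines(c("User, Role, Rule, GAPId",
             "Sam, [HadoopAnalyst, DBA, Developer], R46443"),
           file.path(demo_dir, "demo_REDUNDANT_1.csv"))

files <- list.files(demo_dir, pattern = '.*REDUNDANT(.*).csv$',
                    full.names = TRUE)

# Stage 1a: what does the text look like after the bracket replacement?
quoted <- gsub('\\[|\\]', '"', readLines(files[1]))
print(quoted)

# Stage 1b: what does read.csv() make of that text?
# Look at the number of columns and at the exact column names.
one <- read.csv(text = quoted, check.names = FALSE)
str(one)
print(names(one))

# Stage 1c: the same read over all files, as in tmp1.
# sapply() may simplify its result, so check whether this really is
# a list of data frames before passing it on to the next stage.
tmp1 <- sapply(files, function(f) {
  read.csv(text = gsub('\\[|\\]', '"', readLines(f)), check.names = FALSE)
})
print(class(tmp1))
str(tmp1, max.level = 1)
```

The same print(), str() and names() calls can be applied to tmp2 and
tmp3. The first stage whose output is not what you expected is the one
to look at more closely.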
Duncan Murdoch
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.