On Jun 29, 2015, at 5:03 PM, Rich Shepard wrote:

> Moving from interactive use of R to scripts and functions and have bumped
> into what I believe is a problem with variable names. Did not see a solution
> in the two R programming books I have or from my Web searches. Inexperience
> with ess-tracebug keeps me from refining my bug tracking.
>
> Here's a test data set (cleverly called 'testset.dput'):
>
> structure(list(stream = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
> 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("B", "J",
> "S"), class = "factor"),
> sampdate = structure(c(8121, 8121, 8121, 8155, 8155, 8155,
> 8185, 8185, 8185, 8205, 8205, 8205, 8236, 8236, 8236, 8257,
> 8257, 8257, 8308, 8785, 8785, 8785, 8785, 8785, 8785, 8785,
> 8847, 8847, 8847, 8847, 8847, 8847, 8847, 8875, 8875, 8875,
> 8875, 8875, 8875, 8875, 8121, 8121, 8121, 8155, 8155, 8155,
> 8185, 8185, 8185, 8205, 8205, 8205, 8236, 8236, 8236, 8257,
> 8257, 8257, 8301, 8301, 8301), class = "Date"), param = structure(c(2L,
> 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L, 2L,
> 6L, 7L, 2L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L,
> 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L,
> 2L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L
> ), .Label = c("Ca", "Cl", "K", "Mg", "Na", "SO4", "pH"), class = "factor"),
> quant = c(4, 33, 8.43, 4, 32, 8.46, 4, 31, 8.43, 6, 33, 8.32,
> 5, 33, 8.5, 5, 32, 8.5, 5, 59.9, 3.46, 1.48, 29, 7.54, 64.6,
> 7.36, 46, 2.95, 1.34, 21.8, 5.76, 48.8, 7.72, 74.2, 5.36,
> 2.33, 38.4, 8.27, 141, 7.8, 3, 76, 6.64, 4, 74, 7.46, 2,
> 82, 7.58, 5, 106, 7.91, 3, 56, 7.83, 3, 51, 7.6, 6, 149,
> 7.73)), .Names = c("stream", "sampdate", "param", "quant"
> ), row.names = c(NA, -61L), class = "data.frame")
>
> I want to subset that data.frame on each of the stream names: B, J, and S.
> This is the function that has the naming error (eda.R):
>
> extstream = function(alldf) {
>     sname = alldf$stream
>     sdate = alldf$sampdate
>     comp = alldf$param
>     value = alldf$quant
>     for (i in sname) {
>         sname <- subset(alldf, alldf$stream, select = c(sdate, comp, value))

Never pass an expression of the form dfrm$colname as the `subset` argument of subset(). That argument must evaluate to a logical vector, and you can see that 'stream' is a factor, right?

Furthermore, by inspection you can see that there is no column named 'sdate' in `alldf`, so I would guess that would be your next error. The same goes for 'comp' and 'value'. Oh, now I see: you created them outside of `alldf`. Then how is that supposed to work? The subset function looks inside `alldf` to find those column names.

Perhaps:

    subset(alldf, stream %in% c('B', 'J', 'S'), ....

.... but I have not figured out why you used 'subset' if you wanted:

    select = c(sdate, comp, value))

Furthermore, it is generally error-prone to use `subset` inside functions. The help page warns against the practice. Better to use "[".

>         return(sname)
>     }
> }
>
> This is the result of running source('eda.R') followed by
>
>> extstream(testset)
> Error in subset.data.frame(alldf, alldf$stream, select = c(sdate, comp, :
>   'subset' must be logical
>
> I've tried using sname for the rows to select, but that produces a
> different error of trying to select undefined columns.

Right. Those are not column names in any dataframe.

>
> A pointer to the correct syntax for subset() is needed.

No. A pointer to the correct use of "[" is needed.

-- 
David Winsemius
Alameda, CA, USA

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
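
For the record, here is a minimal sketch of what the "[" approach might look like. It assumes the goal is one data frame per stream level, keeping only the sampdate, param and quant columns; that goal is my guess at the intent, not something stated in the question. The rewritten extstream() and the small `toy` data frame are made up for illustration, with the same column names as the posted test set.

```r
# Hypothetical rewrite of extstream(): assumes one data frame per stream
# level is wanted, with only the sampdate, param and quant columns.
extstream <- function(alldf) {
    keep <- c("sampdate", "param", "quant")  # real column names, as strings
    streams <- levels(alldf$stream)          # assumes 'stream' is a factor
    # Logical row index plus character column index, both passed to "["
    out <- lapply(streams, function(s) alldf[alldf$stream == s, keep])
    names(out) <- streams
    out                                      # named list of data frames
}

# Made-up data with the same column names as the posted test set
toy <- data.frame(
    stream   = factor(c("B", "B", "J", "S")),
    sampdate = as.Date(c("1992-03-27", "1992-04-30", "1994-01-20", "1992-03-27")),
    param    = factor(c("Cl", "pH", "Ca", "Cl")),
    quant    = c(4, 8.43, 59.9, 3)
)

res <- extstream(toy)
names(res)    # one list element per stream level
res$B         # rows for stream B, three columns
```

The row index is an ordinary logical vector built from alldf$stream, and the columns are named with character strings, so nothing depends on the non-standard evaluation that makes subset() fragile inside functions. If 'stream' could contain NA values, wrapping the comparison in which() would avoid NA rows in the result.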