Hello,

I've been very impressed by the reshape package and how easy it makes reorganizing statistical data structures. This makes me wonder whether there's a package out there that handles a different set of tasks one often does when preparing data for analysis.

For any particular set of analyses, one typically recodes variables and deletes cases and variables. It would be really nice to have a package that, for example, after one selects cases from a larger data set based on the values of certain variables, inspects the resulting data and drops any variables that have the same value for all cases. Similarly, if any cases are entirely zero or NA, the package could (under user control) drop these cases. Finally, it could store a set of data transformations as an object, so that the same selection/reshape/streamlining can easily be applied to similar data sets.
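
To make the idea concrete, here is a minimal sketch in base R of the kind of thing I have in mind. The function names (drop_constant_vars, drop_empty_cases, make_pipeline, select_state) and the toy data frame are hypothetical and made up for illustration; they are not taken from reshape or any other existing package.

```r
# Hypothetical helper: drop variables that have the same value for all cases.
drop_constant_vars <- function(df) {
  keep <- vapply(df, function(x) length(unique(x)) > 1, logical(1))
  df[, keep, drop = FALSE]
}

# Hypothetical helper: drop cases in which every value is zero or NA.
drop_empty_cases <- function(df) {
  is_empty <- function(x) {
    if (is.numeric(x)) is.na(x) | x == 0 else is.na(x)
  }
  # Start from "all empty" and AND in each column's emptiness flags.
  all_empty <- Reduce(`&`, lapply(df, is_empty), rep(TRUE, nrow(df)))
  df[!all_empty, , drop = FALSE]
}

# Hypothetical helper: store a sequence of transformations as one object
# (a function) that can be applied to any similar data frame.
make_pipeline <- function(...) {
  steps <- list(...)
  function(df) Reduce(function(d, step) step(d), steps, df)
}

# Selection step, parameterized by state code (assumed column name: state).
select_state <- function(code) {
  force(code)
  function(df) df[df$state == code, , drop = FALSE]
}

# Toy data, invented purely for this example.
employment <- data.frame(
  state  = c("RI", "RI", "RI", "MA"),
  sector = c("retail", "retail", "retail", "retail"),
  jobs   = c(120, 0, 85, 40),
  wages  = c(3.1, NA, 2.7, 1.9),
  stringsAsFactors = FALSE
)

prepare_ri <- make_pipeline(
  select_state("RI"),
  drop_constant_vars,
  drop_empty_cases
)

ri <- prepare_ri(employment)
```

In this sketch the order of the steps matters: the constant variables (state and sector) are dropped first, so that the case with zero jobs and missing wages then counts as entirely empty and is dropped as well. The same prepare_ri object could be applied unchanged to another data frame with the same layout.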

My motivation for this came from working with employment data this morning. I started out with 11 variables and 35569 cases for Rhode Island; a few selections later I had only 420 cases and 3 variables. It struck me that the process I went through, which included not only making selections but also inspecting the results and deleting unnecessary cases/variables, could be automated, at least enough to eliminate the inspection step. Also, since I want to do the same thing with data for other states, automation would be very nice indeed.

I realize that programming this kind of stuff in R is relatively easy, but the reshape package makes me wonder if someone has already done it.

Thanks
    Marsh Feldman

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
