On Wed, Aug 8, 2012 at 10:58 AM, DG Christensen <d...@enservio.com> wrote:
> Hello all, I would like some advice on how to order elements in a vector.
>
> Background: my company is running a k-means clustering model on our
> historical data warehouse of products, which will produce a matrix of
> cluster centers. Then, on our production web servers, we will take
> newly created products and find the cluster that is closest to the new
> product (we're calling this "scoring" the product). Simple stuff. The
> complex part is that the data source for the model is different from the
> source of the new product.
>
> My concern is how to best ensure that the order of the product
> attributes in the clustering model matches the attributes of the new
> product vector. Here's what I'm considering doing:
>
> Say my company keeps the attributes height, width, and length on our
> products (in reality we'll have over 200 attributes). I will create a
> constant of the column (i.e. attribute) names:
>
>     PRODUCT.ATTRIBUTE.COLS <- c("H","W","L")
>     PRODUCT.ATTRIBUTE.COUNT <- length( PRODUCT.ATTRIBUTE.COLS )
>
> All new vectors (both during modeling and scoring) will be created with
> NaN values:
>
>     product.vector <- rep(NaN, PRODUCT.ATTRIBUTE.COUNT)
>     names( product.vector ) <- PRODUCT.ATTRIBUTE.COLS
>
> The vector will then be populated with attribute values like this. The
> values will be retrieved from whatever DB we're using:
>
>     product.vector["H"] <- height.from.db
>     product.vector["W"] <- width.from.db
>     product.vector["L"] <- length.from.db
>
> Is this a reasonable way to do this? If so, one thing I'd like to add
> is error checking that validates that the attribute name exists, so if
> the code attempted to do:
>
>     product.vector["WEIGHT"] <- weight.from.db
>
> it would throw some sort of error. What's the best way for handling
> that? Can I set the length of the vector to a fixed size?
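On the ordering concern itself: one option is to index by name rather than by position at scoring time, so that both sources are forced into the same column order before any distance is computed. A minimal sketch, assuming the cluster centers are a matrix whose column names are the attribute names; `centers`, `new.product`, and `score.product` are hypothetical names and the numbers are made up for illustration:

```r
PRODUCT.ATTRIBUTE.COLS <- c("H", "W", "L")

# Hypothetical cluster centers: one row per cluster, columns named by attribute.
centers <- matrix(c( 1,  2,  3,
                    10, 20, 30),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(NULL, c("H", "W", "L")))

# Hypothetical new product whose attributes arrive in a different order.
new.product <- c(L = 29, H = 11, W = 19)

score.product <- function(product, centers, cols = PRODUCT.ATTRIBUTE.COLS) {
  # Fail early if either side is missing an expected attribute.
  stopifnot(all(cols %in% names(product)),
            all(cols %in% colnames(centers)))
  # Reorder both by name so position i means the same attribute in each.
  p <- product[cols]
  m <- centers[, cols, drop = FALSE]
  # Squared Euclidean distance from the product to each center.
  d <- rowSums(sweep(m, 2, p)^2)
  unname(which.min(d))
}

closest <- score.product(new.product, centers)
```

Because both `product[cols]` and `centers[, cols]` are selected by name, the order in which the attributes were stored on either side no longer matters.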
Hi DG,

You can define your own class which errors out when accessing names
which don't exist. E.g.,

    as.strictvec <- function(x){
      stopifnot(is.atomic(x))
      class(x) <- c("strictvec", class(x))
      x
    }

    `[<-.strictvec` <- function(x, i, j, value){
      stopifnot(j %in% colnames(x))
      NextMethod()
    }

    z <- matrix(1:3, ncol = 3); colnames(z) <- letters[1:3]
    z.strict <- as.strictvec(z)

    z[, "d"] <- 5
    z.strict[, "d"] <- 5 # Error!

Adapt as needed.

Cheers,
Michael

>
> Thanks for any guidance,
> DG
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
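
For the one-dimensional case in the original question, the same idea can be adapted to check `names(x)` instead of `colnames(x)`. This matters because assigning to an unknown name on a plain named vector silently appends a new element. A rough sketch, not a tested production version: the error message wording and the choice to check only character subscripts are assumptions made for illustration.

```r
as.strictvec <- function(x) {
  # Require an atomic vector that already carries names.
  stopifnot(is.atomic(x), !is.null(names(x)))
  class(x) <- c("strictvec", oldClass(x))
  x
}

`[<-.strictvec` <- function(x, i, value) {
  # Only character subscripts are checked; numeric or logical ones pass through.
  if (!missing(i) && is.character(i)) {
    unknown <- setdiff(i, names(x))
    if (length(unknown) > 0) {
      stop("unknown attribute(s): ", paste(unknown, collapse = ", "))
    }
  }
  NextMethod()
}

PRODUCT.ATTRIBUTE.COLS <- c("H", "W", "L")
product.vector <- as.strictvec(
  setNames(rep(NaN, length(PRODUCT.ATTRIBUTE.COLS)), PRODUCT.ATTRIBUTE.COLS)
)

product.vector["H"] <- 12.5   # fine: "H" is a known attribute

# Capture the error message instead of stopping the script.
result <- tryCatch(
  { product.vector["WEIGHT"] <- 3; "no error" },
  error = function(e) conditionMessage(e)
)
```

Note that this only guards `[<-`. Assignments through `[[<-` or `names<-` are not covered and would need their own methods if the vector has to stay fixed.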