There is a very useful and apparently fundamental feature of R (or of the package pls) which I don't understand.
For datasets with many independent (X) variables, such as chemometric datasets, there is a convenient formula and data frame construction that lets one refer to the entire X matrix with a single term. Consider the gasoline dataset available in the pls package. For the model statement in the plsr function one can write:

    octane ~ NIR

NIR refers to a (wide) matrix that is stored as part of the data frame. The naming of its columns is of the form:

    'NIR.xxxx nm'

names(gasoline) returns...

    $names
    [1] "octane" "NIR"

instead of...

    $names
    [1] "octane" "NIR.1000 nm" "NIR.1001 nm" ...

How do I construct and manipulate such data frames and the column names that go with them? Does the use of these types of formulas and data frames generalize to other modeling functions?

Some specific clues for a help search might be enough; I've tried many.

Regards,
Brent
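
P.S. A minimal sketch of the structure described above, using made-up data and base R only. It assumes that wrapping the matrix in I() is what keeps it as a single data frame column; whether gasoline was actually built this way is not confirmed here. The names spectra, y, d, flat and fit are hypothetical.

```r
# Made-up data: 8 samples, 3 "wavelengths" (all names are hypothetical)
set.seed(1)
spectra <- matrix(rnorm(24), nrow = 8,
                  dimnames = list(NULL, paste(seq(1000, 1004, by = 2), "nm")))
y <- rnorm(8)

# Assumption: I() protects the matrix so data.frame() keeps it as one column
d <- data.frame(y = y, NIR = I(spectra))

names(d)         # "y" "NIR"
dim(d$NIR)       # 8 3
colnames(d$NIR)  # the wavelength labels set on the matrix above

# Without I(), data.frame() splits the matrix into one column per wavelength
flat <- data.frame(y = y, NIR = spectra, check.names = FALSE)
ncol(flat)       # 4

# The matrix column can be used as a single formula term, here with lm()
fit <- lm(y ~ NIR, data = d)
length(coef(fit))  # intercept plus one coefficient per matrix column
```

The lm() call is only there to show that a matrix term in a formula is not specific to plsr; it is not meant as a sensible model for spectral data.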