There is a very useful and apparently fundamental feature of R (or of
the package pls) which I don't understand.

For datasets with many independent (X) variables, such as chemometric
datasets, there is a convenient formula and dataframe construction that
allows one to refer to the entire X matrix with a single term.

Consider the gasoline dataset available in the pls package. For the
model formula in the plsr function one can write: octane ~ NIR

NIR refers to a (wide) matrix which is a portion of a dataframe. The
naming of the columns is of the form: 'NIR.xxxx nm'

names(gasoline) returns...

$names
[1] "octane" "NIR"   

instead of...

$names
[1] "octane" "NIR.1000 nm" "NIR.1001 nm" ... 

How do I construct and manipulate such dataframes and the column names
that go with them?
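
To make the question concrete, here is a minimal sketch of the kind of
object I mean, built by hand in base R without the pls package. The toy
data, the object names (spectra, toy) and the column labels are
hypothetical, and using I() to protect the matrix is my assumption about
how such a dataframe could be built, not necessarily how gasoline was made.

```r
# Hypothetical toy data: 5 samples, 3 "wavelength" columns.
set.seed(1)
spectra <- matrix(rnorm(15), nrow = 5,
                  dimnames = list(NULL, c("900 nm", "902 nm", "904 nm")))

# I() keeps the matrix as ONE dataframe column instead of letting
# data.frame() split it into three separate columns.
toy <- data.frame(octane = rnorm(5), NIR = I(spectra))

names(toy)         # "octane" "NIR"
dim(toy$NIR)       # 5 3
colnames(toy$NIR)  # "900 nm" "902 nm" "904 nm"

# The inner column names can be changed through the matrix itself.
colnames(toy$NIR) <- paste0("w", 1:3)
```

Assigning the matrix after the fact (toy$NIR <- spectra) appears to give
the same single-column layout without I().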

Does the use of these types of formulas and dataframes generalize to
other modeling functions?
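
As a small check of my own, the sketch below tries the same single-term
idea with lm from base R. The data and names (X, d, fit) are made up for
illustration. It only shows that lm accepts a matrix term; whether other
modeling functions behave the same way is exactly what I am unsure about.

```r
# Hypothetical toy data: 20 observations, a 2-column predictor matrix.
set.seed(2)
X <- matrix(rnorm(40), nrow = 20, dimnames = list(NULL, c("a", "b")))

d <- data.frame(y = rnorm(20))
d$X <- X                      # store the whole matrix as one column

# A single term on the right-hand side stands for every column of X.
fit <- lm(y ~ X, data = d)

names(coef(fit))              # "(Intercept)" "Xa" "Xb"
```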

Some specific search terms for the help system might be enough; I have
tried many without success.

Regards,
Brent


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.