[R] Column header strategy

Peter Ehlers ehlers at ucalgary.ca
Fri Jul 9 00:36:49 CEST 2010


On 2010-07-08 13:14, Addi Wei wrote:
>
> Hopefully simple question:  What is the best way to name, and treat factor
> columns for data that has lots of columns?
>
> This is my column list:
> id pID50 D.1 D.2 D.3 D.4 D.5 , etc. all the way to D.185
>
> I was under the impression from several R examples in pls that if you name
> your columns like above, you should be able to simply call all the D factors
> with "D", instead of going in and putting a plus sign between each column.
> miceD<- plsr(pID50~D, ncomp=10,data = micetitletest)
> Error in model.frame.default(formula = pID50 ~ D, data = micetitletest) :
>    invalid type (closure) for variable 'D'
>
> VS.
>
> miceD<- plsr(pID50 ~ D.1 + D.2 + D.3 + D.4 etc. to D.185 , ncomp=10, data =
> micetitletest)
>
> What am I missing above that's causing that error message in bold?  Is there
> a better strategy for naming my columns in order to make R use easier?

From the help page for plsr():

"The formula argument should be a symbolic formula of the
form response ~ terms, where response is the name of the
response vector or matrix (for multi-response models) and
terms is the name of one or more predictor _matrices_
(emphasis added), usually separated by +, e.g.,
water ~ FTIR or y ~ X + Z."

Note the word _matrices_; you may not have set up
your data correctly. Compare the 'yarn' dataset

  str(yarn)

with your data

  str(micetitletest)

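And, as David says, don't use D for the name of your
predictor matrix (although it will probably work). D is
already a function in R (stats::D), which is why your
error mentions a 'closure': there is no column named D
in your data, so the function D was found instead.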
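
A minimal sketch of one way to set the data up, assuming the
predictor columns are all named D.1, D.2, ... as in the question.
The simulated micetitletest, the new data frame 'mice', and the
matrix name 'desc' are made up for illustration; I() keeps the
matrix as a single column of the data frame, as 'yarn' does with
its NIR matrix.

```r
# Simulated stand-in for the poster's data (hypothetical values)
set.seed(1)
n <- 20
X <- matrix(rnorm(n * 185), nrow = n,
            dimnames = list(NULL, paste0("D.", 1:185)))
micetitletest <- data.frame(id = seq_len(n), pID50 = rnorm(n), X)

# Pick out the D.1 ... D.185 columns by name
desc_cols <- grep("^D\\.[0-9]+$", names(micetitletest), value = TRUE)

# New data frame with the response and ONE matrix-valued column
mice <- data.frame(
  pID50 = micetitletest$pID50,
  desc  = I(as.matrix(micetitletest[desc_cols]))
)

str(mice)  # two columns; 'desc' is a 20 x 185 matrix

# Fit only if the pls package is installed
if (requireNamespace("pls", quietly = TRUE)) {
  miceD <- pls::plsr(pID50 ~ desc, ncomp = 10, data = mice)
}
```

With the data in this shape, the formula names the matrix once
instead of listing 185 terms joined by +.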

   -Peter Ehlers


