[R] How to properly build model matrices
    Yang Zhang 
    yanghatespam at gmail.com
       
    Thu Feb  9 22:39:39 CET 2012
    
    
  
I always bump into a few (very minor) problems when building model
matrices with e.g.:

    train = model.matrix(label~., read.csv('train.csv'))
    target = model.matrix(label~., read.csv('target.csv'))

(1) The two may have different factor levels, yielding matrices with
different columns.  I usually first rbind the data frames together to
"meld" the factors, and then split them apart and convert each one to
a model matrix.
(2) The target set that I'm predicting on typically doesn't have
labels.  I usually manually append dummy labels to the target data
frame.
(3) I almost always remove the Intercept from the model matrices,
since it seems to always be redundant (I usually use caret).
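
For points (1) and (2), one possible approach is to reuse the terms and
factor levels from the training frame, which is roughly what predict.lm
does internally. The sketch below is only an illustration: the toy data
frames, their column names (color, size), and the variable names are
assumptions standing in for train.csv and target.csv.

```r
# Toy data standing in for train.csv / target.csv (hypothetical columns).
train_df  <- data.frame(label = c(1, 0, 1, 0),
                        color = c("red", "green", "blue", "red"),
                        size  = c(1.5, 2.0, 0.5, 1.0),
                        stringsAsFactors = TRUE)
# The target has no label column and is missing the level "blue".
target_df <- data.frame(color = c("green", "red"),
                        size  = c(3.0, 0.2),
                        stringsAsFactors = TRUE)

# Build the training model frame once; its terms and factor levels
# become the reference for every later matrix.
train_mf <- model.frame(label ~ ., data = train_df)
train    <- model.matrix(attr(train_mf, "terms"), train_mf)

# Drop the response from the terms so the target needs no dummy labels,
# and pass the training factor levels through xlev.
tt        <- delete.response(terms(train_mf))
xlev      <- .getXlevels(tt, train_mf)
target_mf <- model.frame(tt, data = target_df, xlev = xlev)
target    <- model.matrix(tt, target_mf)

# Both matrices now share the same columns.
stopifnot(identical(colnames(train), colnames(target)))
```

With this layout, model.frame stops with an error if the target contains
a factor level that was never seen in training, which the rbind approach
would silently accept.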
None of these is a big deal at all, but I'm just curious if I'm
missing something simple in how I'm doing things.  Thanks.
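
On point (3), the two common ways of removing the intercept do not give
the same matrix when factors are involved, so it may matter which one is
used. A minimal sketch, again with a made-up data frame and hypothetical
names (df, x_drop, x_noint):

```r
df <- data.frame(label = c(1, 0, 1, 0),
                 color = factor(c("red", "green", "blue", "red")),
                 size  = c(1.5, 2.0, 0.5, 1.0))

# Option A: build with the intercept, then drop that column.
# Each factor keeps its treatment contrasts (k - 1 columns).
mm     <- model.matrix(label ~ ., df)
x_drop <- mm[, colnames(mm) != "(Intercept)", drop = FALSE]

# Option B: remove the intercept in the formula itself.
# The first factor is then expanded to all k indicator columns.
x_noint <- model.matrix(label ~ . - 1, df)

colnames(x_drop)   # "colorgreen" "colorred" "size"
colnames(x_noint)  # "colorblue" "colorgreen" "colorred" "size"
```

Option A keeps the column set the same as the intercept version minus
one column; Option B adds an extra indicator for the first factor.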
-- 
Yang Zhang
http://yz.mit.edu/
    
    