[R] Making a group membership matrix

J.R. Lockwood lockwood at rand.org
Tue Jul 22 18:00:13 CEST 2003


Hi,

The resulting matrix will be *really* large (314482 rows by 39 columns),
so make sure you have enough RAM.  You can reduce its size by dropping
the levels of the factor that have zero counts.  As for the solution,
it is a one-liner, but not really a cryptic one:

model.matrix(~foo-1)

(assuming that you haven't changed the default contrasts in options() )
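
To make both steps concrete, here is a minimal sketch on a toy factor.
The toy data and the size.mb() helper are made up for illustration and
are not part of your data; the size estimate assumes a dense matrix of
8-byte doubles and ignores any overhead.

```r
# Toy stand-in for the real data: a factor with one unused level.
foo <- factor(c("Red Fir", "Barren", "Red Fir", "Aspen"),
              levels = c("Aspen", "Barren", "Cropland", "Red Fir"))

# Re-creating the factor drops levels with zero counts ("Cropland" here).
foo <- factor(foo)

# "- 1" removes the intercept, so every remaining level gets its own
# 0/1 indicator column.
foo.mat <- model.matrix(~ foo - 1)

# model.matrix() prefixes the column names with the variable name
# (e.g. "fooAspen"); restore the plain level names.
colnames(foo.mat) <- levels(foo)

# Hypothetical helper: rough size in megabytes of a dense double matrix.
size.mb <- function(n.rows, n.cols) n.rows * n.cols * 8 / 1e6

size.mb(314482, 39)   # all 39 levels
size.mb(314482, 22)   # only the 22 levels with non-zero counts
```

By that estimate, keeping only the 22 levels that actually occur in
your summary(foo) output brings the matrix from roughly 98 MB down to
roughly 55 MB.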


> Hi Helpers:
> 
> I have a factor object that has 314k entries of 39 land cover types.
> (This object can be coerced to characters neatly should that be easier
> to work with.) 
> > length(foo)
> [1] 314482
> > foo[1:10]
>  [1] Montane Chaparral Barren            Red Fir           Red Fir
>  [5] Red Fir           Red Fir           Red Fir           Red Fir
>  [9] Red Fir           Red Fir
> 39 Levels: Alpine-Dwarf Shrub Annual Grassland Aspen Barren ... White Fir
> > summary(foo)
>          Alpine-Dwarf Shrub            Annual Grassland                       Aspen
>                        7402                           0                         582
>                      Barren                 Bitterbrush      Blue Oak-Foothill Pine
>                       69111                           9                           0
>           Blue Oak Woodland  Chamise-Redshank Chaparral    Closed-Cone Pine-Cypress
>                           0                           0                           0
>                    Cropland                Desert Scrub                 Douglas-Fir
>                           0                           0                           0
>               Eastside Pine Freshwater Emergent Wetland                Jeffrey Pine
>                           0                           0                       11342
>                 Joshua Tree                     Juniper                  Lacustrine
>                           0                        1293                         501
>              Lodgepole Pine                    Low Sage             Mixed Chaparral
>                       60332                          31                        1043
>           Montane Chaparral            Montane Hardwood    Montane Hardwood-Conifer
>                        6648                         326                           0
>            Montane Riparian        Orchard and Vineyard         Perennial Grassland
>                         180                           0                          17
>              Pinyon-Juniper              Ponderosa Pine                     Red Fir
>                         968                         708                       66263
>                    Riverine                   Sagebrush       Sierran Mixed Conifer
>                           0                        2292                       14264
>           Subalpine Conifer                       Urban    Valley-Foothill Riparian
>                       66237                           0                           0
>         Valley Oak Woodland                  Wet Meadow                   White Fir
>                           0                        2216                        2717
> >
> 
> 
> I want to make a matrix that has the cover types as columns and
> length(foo) rows. I want each matrix entry to be one if the row has
> that cover type, and zero otherwise.
> 
> foo.mat <- matrix(data = 0, nrow = length(foo), 
>                   ncol = nlevels(foo))
> 
> colnames(foo.mat) <- levels(foo)
> 
> That is easy enough, but I'm at a loss as to how to populate it properly.
> 
> In case I'm not being clear, this is what I want:
> 
> > foo[1]
> [1] Montane Chaparral
> 39 Levels: Alpine-Dwarf Shrub Annual Grassland Aspen Barren ... White Fir
> > foo.mat[1,]

J.R. Lockwood
412-683-2300 x4941
lockwood at rand.org
http://www.rand.org/methodology/stat/members/lockwood/



