[R] vectors levels are carried through to subsets...

Barry Rowlingson b.rowlingson at lancaster.ac.uk
Tue Sep 29 19:52:06 CEST 2009


On Tue, Sep 29, 2009 at 6:47 PM, chipmaney <chipmaney at hotmail.com> wrote:
>
> I have a dataset.  Initially, it has 25 levels for a certain factor,
> Description.
>
> However, I then subset it, because I am only interested in 2 of the 25
> levels.  When I subset it, I get the following. The vector contains only
> those two values, yet there remain 25 levels:
>
>> Quadrats.df$Description
>  [1] Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent 25x75
> Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent
> 25x75
> [10] Emergent 25x75  Emergent 25x75  Emergent 25x75  Emergent 25x75
> Emergent 25x75  Emergent 25x75  Hydroseed 25x75 Hydroseed 25x75 Hydroseed
> 25x75
> [19] Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75
> Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75 Hydroseed
> 25x75
> [28] Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75 Hydroseed 25x75
> 25 Levels: Black Cottonwood Black Cottonwood Enhanced Emergent Emergent
> 25x75 Floodplain 1 Floodplain 2 Floodplain 3 Hydroseed 25x75 ... Western Red
> Cedar Enhanced
>
> This seems rather innocuous; however, when I run a by statement, it returns
> a list with 25 entries, 23 of which are of course NA....is there a way to
> avoid this?
>

 Just re-factor() it when you select a subset. Also, it's nice if you
give us a simple example - all your Emergent this and Hydroseed that
doesn't look very clear!

 Like this:

# make a factor:
> x=factor(sample(letters,10))
> x
 [1] z x f i n b y e p c
Levels: b c e f i n p x y z

# a subset:

> x[1:3]
[1] z x f
Levels: b c e f i n p x y z

# - still has all the levels. So re-"factor()":

> factor(x[1:3])
[1] z x f
Levels: f x z

 et voila?
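
 The same idea should carry over to a data frame column and by(). Here
is a minimal sketch, assuming a made-up data frame (df, Description and
Cover are hypothetical stand-ins for your Quadrats.df, not your real
data). droplevels() is shown as an alternative, but it only exists in
newer versions of R, so treat that part as optional:

```r
# Hypothetical data standing in for Quadrats.df (names and values are made up)
df <- data.frame(
  Description = factor(c("A", "A", "B", "B", "C", "C")),
  Cover = c(10, 20, 30, 40, 50, 60)
)

# Subsetting rows keeps all three levels on the factor column
sub <- df[df$Description %in% c("A", "B"), ]
levels_before <- nlevels(sub$Description)

# Re-factor the column so only the levels actually present remain
sub$Description <- factor(sub$Description)
levels_after <- nlevels(sub$Description)

# by() now produces one entry per remaining level, with no NA groups
res <- by(sub, sub$Description, function(d) mean(d$Cover))

# Alternative (newer R only): drop unused levels from every factor column
sub2 <- droplevels(df[df$Description %in% c("A", "B"), ])
```

 Re-factoring straight after the subset means everything downstream
(by, table, tapply and so on) only sees the levels you kept.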

Barry



