[R] two apparent anomalies

analyst41 at hotmail.com
Sat Jan 22 16:11:29 CET 2011

On Jan 22, 9:50 am, Berwin A Turlach <ber... at maths.uwa.edu.au> wrote:
> On Sat, 22 Jan 2011 06:16:43 -0800 (PST)
>
> "analys... at hotmail.com" <analys... at hotmail.com> wrote:
> > (1)
>
> > > a = c("a","b")
> > > mode(a)
> > [1] "character"
> > > b = c(1,2)
> > > mode(b)
> > [1] "numeric"
> > > c = data.frame(a,b)
> > > mode(c$a)
> > [1] "numeric"
>
> R> str(c)
> 'data.frame':   2 obs. of  2 variables:
>  $ a: Factor w/ 2 levels "a","b": 1 2
>  $ b: num  1 2
>
> Character vectors are turned into factors by default by data.frame().
>
> OTOH:
>
> R> c = data.frame(a,b, stringsAsFactors=FALSE)
> R> mode(c$a)
> [1] "character"
>
> > (2)
>
> > > a = c("a","a","b","b","c")
> > > levels(as.factor(a))
> > [1] "a" "b" "c"
> > > levels(as.factor(a[1:3]))
> > [1] "a" "b"
> > > a = as.factor(a)
> > > levels(a)
> > [1] "a" "b" "c"
> > > levels(a[1:3])
> > [1] "a" "b" "c"
>
> Subsetting factors does not get rid of no-longer used levels by default.
>
> OTOH:
>
> R> levels(a[1:3, drop=TRUE])
> [1] "a" "b"
>
> or
>
> R> levels(factor(a[1:3]))
> [1] "a" "b"
>
> HTH.
>
> Cheers,
>
>         Berwin
>

Thanks for both responses.

Is there a difference between the "as.factor" and "factor" commands,
and also between "as.data.frame" and "data.frame"?
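
One part of the factor half of this question can be checked directly. The
following is only a minimal sketch, reusing the values from example (2); the
names x, sub, keeps_levels and drops_levels are illustrative and do not come
from the thread. It assumes that as.factor() hands back an existing factor
unchanged, while factor() rebuilds the level set from the values present.

```r
# Sketch only: x, sub, keeps_levels, drops_levels are illustrative names.
x <- as.factor(c("a", "a", "b", "b", "c"))
sub <- x[1:3]    # values a, a, b; level "c" is kept but no longer used

# as.factor() on something that is already a factor returns it as-is,
# so the unused level "c" is still there.
keeps_levels <- levels(as.factor(sub))

# factor() recomputes the levels from the values actually present,
# so the unused level "c" is dropped.
drops_levels <- levels(factor(sub))

print(keeps_levels)
print(drops_levels)
```

This matches the output quoted above, where levels(a[1:3]) still lists "c"
but levels(factor(a[1:3])) does not. It says nothing about "as.data.frame"
versus "data.frame".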

