[R] how to remove factors from whole dataframe?

Rolf Turner r@turner @end|ng |rom @uck|@nd@@c@nz
Sun Sep 19 11:32:34 CEST 2021


On Sun, 19 Sep 2021 10:17:51 +0200
Luigi Marongiu <marongiu.luigi using gmail.com> wrote:

> Hello,
> I would like to remove factors from all the columns of a dataframe.

What on earth do you mean by that?  After struggling with your
(inadequate) example for a while, I conjecture that what you want to do
is to drop unused levels from all factor columns in a data frame.

Is that correct?

> I can do it one column at a time with
> ```
> 
> df <- data.frame(region=factor(c('A', 'B', 'C', 'D', 'E')),
>                  sales = c(13, 16, 22, 27, 34), country=factor(c('a',
> 'b', 'c', 'd', 'e')))
> 
> new_df$region <- droplevels(new_df$region)
> ```

Before executing the foregoing command, you would have to create
new_df.  *Perhaps* you intended to do "new_df <- df" initially.

If this is the case, then new_df will be exactly the same as df
after you've applied droplevels() to new_df$region.

Note that droplevels() removes unused levels from the levels of a
factor.  The factor df$region in your confusing example has no unused
levels, so droplevels() has no effect upon it.
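
To make the distinction concrete, here is a minimal sketch.  The
objects f and g are made up for illustration and are not taken from
your post.

```r
# A factor in which every level is in use: droplevels() leaves the
# set of levels unchanged.
f <- factor(c("A", "B", "C"))
identical(levels(droplevels(f)), levels(f))

# Subsetting a factor keeps all of the original levels, so g has an
# unused level "C".
g <- factor(c("A", "B", "C"))[1:2]
levels(g)

# droplevels() discards the unused level.
levels(droplevels(g))
```

Only the second factor has anything for droplevels() to remove.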

> 
> What is the syntax to remove all factors at once (from all columns)?
> For this does not work:
> ```
> > str(df)
> 'data.frame': 5 obs. of  3 variables:
>  $ region : Factor w/ 5 levels "A","B","C","D",..: 1 2 3 4 5
>  $ sales  : num  13 16 22 27 34
>  $ country: Factor w/ 5 levels "a","b","c","d",..: 1 2 3 4 5
> > df = droplevels(df)
> > str(df)
> 'data.frame': 5 obs. of  3 variables:
>  $ region : Factor w/ 5 levels "A","B","C","D",..: 1 2 3 4 5
>  $ sales  : num  13 16 22 27 34
>  $ country: Factor w/ 5 levels "a","b","c","d",..: 1 2 3 4 5
> ```
> Thank you

I believe the reason you think "this does not work" is that your
example is inadequate.  If the factors in "df" actually had any unused
levels, then droplevels(df) would indeed remove them.
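
A sketch of an example that would have illustrated the point, assuming
that what you want is to drop unused levels after subsetting.  The
names sales_data and kept are hypothetical; the data are those from
your post.

```r
# Same data as in the original post, under a different (made-up) name.
sales_data <- data.frame(region  = factor(c("A", "B", "C", "D", "E")),
                         sales   = c(13, 16, 22, 27, 34),
                         country = factor(c("a", "b", "c", "d", "e")))

# Keep only the rows with sales above 20.  Both factor columns still
# carry all five of their original levels.
kept <- sales_data[sales_data$sales > 20, ]
nlevels(kept$region)

# droplevels() applied to the data frame processes every factor
# column in one call; non-factor columns are left alone.
kept <- droplevels(kept)
levels(kept$region)
levels(kept$country)
```

After the call, each factor column retains only the levels that
actually occur in the remaining rows.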

(a) In future please present your questions in a comprehensible manner.

(b) Also please construct your examples so that they are actually
capable of illustrating what you are trying to accomplish.

You are asking others for help.  Have a little consideration for the
helpers, who are giving of their time and effort free of charge!

(c) Note that "df" is a lousy name for a data frame, since it is the
name of a base R function (the density function for the F distribution).
No harm is done in the current context, but such nomenclature can at
times lead to errors such as "object of type 'closure' is not
subsettable", which mystify most users.
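
A small sketch of how that error can arise, assuming a fresh session
in which no data frame called "df" has been created.  The name res is
made up for illustration.

```r
# With no user-defined object called 'df', the name resolves to the
# function stats::df(), and a function cannot be subsetted with '$'.
# tryCatch() captures the error message rather than halting.
res <- tryCatch(df$region, error = function(e) conditionMessage(e))
res
```

Choosing a more distinctive name for the data frame turns such a slip
into a plain "object not found" error, which is much easier to
diagnose.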

cheers,

Rolf Turner

-- 
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276


