[R] Refactor all factors in a data frame
Hilmar Berger
hilmar.berger at imise.uni-leipzig.de
Tue Jun 5 15:01:09 CEST 2007
Hi,
the best solution I found so far is (assuming <data> is your data.frame):
# identify all factor variables
factor.list = colnames(data)[sapply(data, is.factor)]
# use transform to apply factor() to all factor variables
trans.vars = paste(factor.list, "=factor(", factor.list, ")", sep="", collapse=",")
data = eval(parse(text=paste("transform(data,",trans.vars,")")))
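
The same idea can probably be written without eval(parse()). What follows is only a minimal sketch on a made-up data frame: the columns id and grp and the helper variable is.fac are hypothetical and not taken from the original question. It re-applies factor() to each factor column, which drops the levels that no longer occur.

```r
# Hypothetical example data: subsetting leaves the unused level "c" behind
data <- data.frame(id = 1:4, grp = factor(c("a", "b", "c", "a")))
data <- subset(data, grp != "c")

# Logical index of the factor columns (is.fac is an illustrative name)
is.fac <- sapply(data, is.factor)

# Re-create every factor column with factor(); other columns stay untouched
data[is.fac] <- lapply(data[is.fac], factor)

levels(data$grp)
```

In R 2.12.0 and later, droplevels(data) should give the same result in a single call.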
Regards,
Hilmar
Hilmar Berger wrote:
> Hi all,
>
> Assume I have a data frame with numerical and factor variables that I
> got through merging various other data frames and subsetting the
> resulting data frame afterwards. The number of levels of the factors seems
> to be the same as in the original data frames, probably because subset()
> calls [.factor without drop = TRUE (that's what I gather from scanning
> the mailing lists).
>
> I wonder if there is an easy way to refactor all factors in the data
> frame at once. I noted that fix(data_frame) does the trick, however,
> this needs user interaction, which I'd like to avoid. Subsequent
> write.table / read.table would be another option but I'm not sure if R
> can guess the factor/char/numeric-type correctly when reading the table.
>
> So, is there any way to drop the unused factor levels from *all* factors
> of a data frame without import/export ?
>
> Thanks in advance,
> Hilmar
>
--
Hilmar Berger
Studienkoordinator
Institut für medizinische Informatik, Statistik und Epidemiologie
Universität Leipzig
Härtelstr. 16-18
D-04107 Leipzig
Tel. +49 341 97 16 101
Fax. +49 341 97 16 109
email: hilmar.berger at imise.uni-leipzig.de