[R] understanding how R determines numbers and characters when creating a data frame
Domenico Vistocco
vistocco at unicas.it
Wed Feb 18 22:32:50 CET 2009
Alan Smith wrote:
> Hello R Users and Developers,
>
> I have a basic question about how R works. Over the past few years I have
> struggled when I try to generate a new data frame that I believe should
> contain numeric data in some columns and character data in others only to
> find everything converted to character data. Is there a general method to
> create data frames that contain the data in the desired format: numbers as
> numeric and character as a factor etc? I often have this problem and in the
> worst case I have to export the file and read it back in. I have
> emulated a simple example of the problem. It often happens while using
> "for" loops. Could someone explain how to avoid this problem by properly
> creating data frames in for loops that can contain both numeric and
> character data.
>
>
>
> ********Question for example 1.
>
> Why does the cbind command convert the numeric data to character data? Why
> can't the character data be converted to numeric data using the fix command?
>
See ?cbind for a detailed explanation.
In short, when cbind/rbind is applied to vectors or matrices it returns a
matrix. A matrix can hold only one type of data (see An Introduction to
R), so when you combine character and numeric data the numeric values
are implicitly coerced to the more general type (character).
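As a small illustration of this coercion (a sketch with made-up values,
not taken from the original example; stringsAsFactors is set explicitly
so the result does not depend on the R version's default):

```r
# cbind() on plain vectors returns a matrix, and a matrix holds one type only
m <- cbind(species = "setosa", obsnum = 15)
is.matrix(m)
typeof(m)          # the number is now stored as a character string

# data.frame() keeps each column's own type
d <- data.frame(species = "setosa", obsnum = 15, stringsAsFactors = FALSE)
sapply(d, class)
```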
>
> ### Example 1 #############
>
> data(iris)
>
> obsnum<-NULL
>
> results<-NULL
>
> for(s in unique(as.character(iris$Species))){
>
> temp1<-iris[iris$Species==s,]
>
> obsnum<-length(unique(temp1$Sepal.Length)) # a number
>
>
Instead of using cbind here:
> out1<-cbind(species=as.character(paste(s)),obsnum) # number converted to character
>
using data.frame:
out1 <- data.frame(species=as.character(paste(s)),obsnum)
you are telling R to convert the character column to a factor (the
default behaviour) and to keep the numeric column numeric:
c(class(results$species),mode(results$species))
c(class(results$obsnum),mode(results$obsnum))
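A short sketch of what these class/mode checks report, using toy vectors
`f` and `g` that are assumptions for illustration and not part of the
original example. It also shows the usual idiom for recovering numbers
from a numeric-looking column that has ended up as a factor:

```r
# A factor is stored as integer codes, so its mode is "numeric"
f <- factor(c("setosa", "virginica"))
c(class(f), mode(f))

# A numeric-looking column that ended up as a factor
g <- factor(c("15", "21"))
as.numeric(g)                 # the internal codes, not the values
as.numeric(as.character(g))   # the values themselves
```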
You can keep the column as character by using the stringsAsFactors
argument of the data.frame() function:
out1 <- data.frame(species=as.character(paste(s)),obsnum,
stringsAsFactors=FALSE)
And then:
class(results$species)
The message is: if you want to mix different data types you need a list
(and a data.frame is a special type of list in which every component has
the same number of elements).
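For the loop in the question, one possible pattern (a sketch, not the
only way) is to build a one-row data.frame per species, collect the
pieces in a list, and bind them once at the end. The name `pieces` is
just an illustrative choice, and stringsAsFactors=FALSE is set
explicitly so the result does not depend on the R version's default:

```r
data(iris)

pieces <- list()
for (s in unique(as.character(iris$Species))) {
  temp1 <- iris[iris$Species == s, ]
  # one row per species; each column keeps its own type
  pieces[[s]] <- data.frame(
    species = s,
    obsnum = length(unique(temp1$Sepal.Length)),
    stringsAsFactors = FALSE
  )
}

# rbind() on data frames returns a data frame, so the types are preserved
results <- do.call(rbind, pieces)
rownames(results) <- NULL
str(results)
```

Because every piece is already a data.frame, rbind never goes through a
matrix and obsnum stays numeric without any use of fix().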
Ciao,
domenico
> results<-rbind(out1,results)
>
> }
>
> results
>
> #fix(results) # cannot convert obsnum to numeric using fix
>
> ####################################
>
>
>
> ******Question for example 2
>
> Why does adding the data.frame command allow the character data to be
> converted to numeric data using fix command?
>
> ### Example 2 #############
>
> data(iris)
>
> obsnum<-NULL
>
> results<-NULL
>
> for(s in unique(as.character(iris$Species))){
>
> temp1<-iris[iris$Species==s,]
>
> obsnum<-length(unique(temp1$Sepal.Length))
>
> out1<-data.frame(cbind(species=as.character(paste(s)),obsnum)) # number converted to character
>
> results<-rbind(out1,results)
>
> }
>
> results
>
> #fix(results) # can now convert obsnum to numeric using fix
>
>
>
> ######
>
>
>
>
>
> Thank you,
>
> Alan Smith
>
>