[R] understanding how R determines numbers and characters when creating a data frame
Greg Snow
Greg.Snow at imail.org
Wed Feb 18 22:27:15 CET 2009
The culprit is the cbind function. When given two plain vectors (and no data frames), cbind creates a matrix, not a data frame. A matrix can hold only one type, so the numbers are converted to character. In your first example you never actually create a data frame; you just build a matrix (try str(results)), so fix cannot change a single column of it to numeric. In the second example you do create a data frame, so fix will let you change columns, but the cbind inside the call to data.frame still creates a matrix (converting numeric to character) before it is included in the data frame. Remove the cbind and just do:
out1 <- data.frame(species=as.character(paste(s)),obsnum=obsnum)
and then out1 will be a data frame without ever converting the number obsnum to a character.
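As a rough sketch of the difference, here is a small example using the iris data from your post. The names m and d and the value 15L are only illustrative and not from your code, and the loop is one possible rewrite of your Example 1, not the only way to do it:

```r
# cbind() on a character value and a number builds a character matrix
s <- "setosa"
obsnum <- 15L                        # illustrative value only
m <- cbind(species = s, obsnum)
is.matrix(m)                         # TRUE
typeof(m)                            # "character"

# data.frame() keeps each column's own type
d <- data.frame(species = s, obsnum = obsnum)
class(d$obsnum)                      # "integer"

# The loop from Example 1 with the cbind() removed
results <- NULL
for (s in unique(as.character(iris$Species))) {
  temp1 <- iris[iris$Species == s, ]
  obsnum <- length(unique(temp1$Sepal.Length))    # a number, and it stays one
  out1 <- data.frame(species = s, obsnum = obsnum)
  results <- rbind(out1, results)
}
str(results)                         # obsnum is shown as int, not chr
```

Whether the species column comes out as a factor or as character depends on the stringsAsFactors setting in your version of R; obsnum stays numeric either way.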
Hope this helps,
--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Alan Smith
> Sent: Wednesday, February 18, 2009 2:01 PM
> To: r-help at r-project.org
> Subject: [R] understanding how R determines numbers and characters when
> creating a data frame
>
> Hello R Users and Developers,
>
> I have a basic question about how R works. Over the past few years I have
> struggled when I try to generate a new data frame that I believe should
> contain numeric data in some columns and character data in others, only to
> find everything converted to character data. Is there a general method to
> create data frames that contain the data in the desired format: numbers as
> numeric and character as a factor etc? I often have this problem and in the
> worst case I have to export the file and read it back in. I have
> emulated a simple example of the problem. It often happens while using
> "for" loops. Could someone explain how to avoid this problem by properly
> creating data frames in for loops that can contain both numeric and
> character data?
>
>
>
> ********Question for example 1.
>
> Why does the cbind command convert the numeric data to character data? Why
> can't the character data be converted to numeric data using the fix command?
>
>
> ### Example 1 #############
>
> data(iris)
> obsnum <- NULL
> results <- NULL
> for (s in unique(as.character(iris$Species))) {
>   temp1 <- iris[iris$Species == s, ]
>   obsnum <- length(unique(temp1$Sepal.Length))  # a number
>   out1 <- cbind(species = as.character(paste(s)), obsnum)  # number converted to character
>   results <- rbind(out1, results)
> }
> results
> #fix(results) # cannot convert obsnum to numeric using fix
>
> ####################################
>
>
>
> ******Question for example 2
>
> Why does adding the data.frame command allow the character data to be
> converted to numeric data using fix command?
>
> ### Example 2 #############
>
> data(iris)
> obsnum <- NULL
> results <- NULL
> for (s in unique(as.character(iris$Species))) {
>   temp1 <- iris[iris$Species == s, ]
>   obsnum <- length(unique(temp1$Sepal.Length))
>   out1 <- data.frame(cbind(species = as.character(paste(s)), obsnum))  # number converted to character
>   results <- rbind(out1, results)
> }
> results
> #fix(results) # can now convert obsnum to numeric using fix
>
>
>
> ######
>
>
>
>
>
> Thank you,
>
> Alan Smith
>
>