[R] Error occurred during mean calculation of a column of a data frame, which is apparently contents numeric data

R. Michael Weylandt michael.weylandt at gmail.com
Wed Feb 29 14:16:07 CET 2012


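Factors are stored internally as integer codes (enums, if you have used
other programming languages) plus a set of labels -- that is more
memory efficient than storing the whole string over and over.
as.numeric() on a factor returns those codes, not the numbers shown in
the labels, which is what you are seeing.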
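
A small sketch of the effect, in case it helps. The pulse values below
are made up for illustration (they are not your data), and the variable
names are only placeholders:

```r
# Hypothetical pulse values held as character strings (not the poster's data)
pulse_chr <- c("68.0", "72.0", "66.0", "72.0", "80.0")

# Roughly what as.data.frame() does to a character column when
# stringsAsFactors is TRUE
pulse_fac <- factor(pulse_chr)

levels(pulse_fac)       # the labels, sorted as strings
as.integer(pulse_fac)   # the internal codes: index of each value in levels()
as.numeric(pulse_fac)   # the same codes, not the pulse values

# Two common ways to get the numbers back from the labels
pulse_num  <- as.numeric(as.character(pulse_fac))
pulse_num2 <- as.numeric(levels(pulse_fac))[pulse_fac]

mean(pulse_num)
```

Going through as.character() first converts the labels rather than the
codes, so the mean is taken over the values you actually meant.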

Michael

On Wed, Feb 29, 2012 at 5:49 AM, Aniruddha Mukherjee
<aniruddha.mukherjee at tcs.com> wrote:
> Hello Berend.
>
> Many thanks for your prompt reply and that helped me a lot. One more
> thing, if you please explain, I shall be highly obliged.
> Why in my case (i.e. when stringsAsFactors was TRUE by default),
>> as.numeric(matr1$Pulse_rate)
> displays the following
>  [1]  4  5  7  5  9  8  6 10  3  2  5  1 10 10
> ?
>
> Best regards.
>
>
> From: Berend Hasselman <bhh at xs4all.nl>
> To: Aniruddha Mukherjee <aniruddha.mukherjee at tcs.com>
> Cc: R-help <r-help at r-project.org>
> Date: 02/29/2012 03:57 PM
> Subject: Re: [R] Error occurred during mean calculation of a column of a data
> frame, which is apparently contents numeric data
>
> On 29-02-2012, at 09:45, Aniruddha Mukherjee wrote:
>
>> Hello R people,
>>
>> How can I compute the mean of the "Pulse_rate" column of the data frame or
>> matrix from the following character object called "str_got". It has 14
>> entries and each entry has 8 values, separated by commas. Please go thru
>> the following R commands to know how I tried to unstring and unlist the
>> values to form a data frame.
>>> str_got
>> [1] "bp,67,2011-12-09T19:59:44.044+05:30,9830576102,68.0,124.0,58.0,66.0"
>> "bp,67,2011-12-09T20:19:31.031+05:30,9830576102,72.0,133.0,93.0,40.0"
>> .....
>>>
>> matr<-matrix(unlist(strsplit(str_got, ",")), nrows, byrow=T)
>
> nrows?
> I assume this was set somewhere in your script and not shown.
> Is it length(str_got)?
>
>>> matr
>>        [,1]   [,2]                                              [,3]
>>       [,4]               [,5]        [,6]       [,7]       [,8]
>> [1,] "bp" "67"    "2011-12-09T19:59:44.044+05:30" "9830576102" "68.0"
>> ......
>
>> Note column names must be inserted before computing the desired mean
>> value.
>> matr1<-as.data.frame(matr)
>
> Use matr1 <- as.data.frame(matr, stringsAsFactors=FALSE)
>
> If you don't use stringsAsFactors=FALSE the column will be a factor, and
> a factor is not equivalent to a numeric vector.
>
> What's wrong with
>
> matr1$Pulse_rate <- as.numeric(matr1$Pulse_rate)
>
> Then you can calculate the desired mean with
>
> mean(matr1$Pulse_rate)
>
> or
>
> mean(matr1[,"Pulse_rate"])
>
> Berend