[R] aggregate() function, strange behavior for augmented data
David Afshartous
dafshartous at med.miami.edu
Mon Jun 16 17:50:05 CEST 2008
Everything was read in the same way, and str(junk1) confirms that both data
frames have the same structure. This is very strange.
## original data:
> str(junk1)
'data.frame': 96 obs. of 3 variables:
$ Hour: int 0 3 5 0 3 5 0 3 5 0 ...
$ Drug: Factor w/ 2 levels "D","P": 2 2 2 1 1 1 2 2 2 1 ...
$ Aldo: int 9 15 4 8 13 3 5 11 5 7 ...
## augmented data:
> str(junk1)
'data.frame': 108 obs. of 3 variables:
$ Hour: int 0 3 5 0 3 5 0 3 5 0 ...
$ Drug: Factor w/ 2 levels "D","P": 2 2 2 1 1 1 2 2 2 1 ...
$ Aldo: int 9 15 4 8 13 3 5 11 5 7 ...
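For anyone comparing the two cases, a minimal sketch of how the grouping column could be inspected and, if needed, recoded after aggregate(). The data frame toy and its values are invented for illustration and only mimic the column layout of junk1; whether Hour comes back as a factor may depend on the R version, so the conversion is guarded by is.factor().

```r
# Hypothetical toy data with the same columns as junk1 (values are made up)
toy <- data.frame(
  Hour = rep(c(0L, 3L, 5L), times = 4),
  Drug = factor(rep(c("P", "D"), each = 3, times = 2)),
  Aldo = c(9, 15, 4, 8, 13, 3, 5, 11, 5, 7, 12, 6)
)

# Same call pattern as in the original post
toy.mean <- aggregate(toy[3], toy[c(1, 2)], mean)

# Inspect how the grouping column came back
class(toy.mean$Hour)

# If it is a factor, go through character so that the labels (0, 3, 5)
# are recovered rather than the internal integer codes (1, 2, 3)
if (is.factor(toy.mean$Hour)) {
  toy.mean$Hour <- as.numeric(as.character(toy.mean$Hour))
}
```

Calling as.numeric() directly on a factor returns the level codes, not the original values, which is why the sketch converts via as.character() first.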
On 6/16/08 11:37 AM, "markleeds at verizon.net" <markleeds at verizon.net> wrote:
>
> hi: do str(junk1) and it will tell you what the components of junk1 are.
>
> the only thing i can think of is that you used stringsAsFactors=FALSE
> when you ( probably ) used read.table to read in junk but you didn't use
> that option when you used read.table to read in junk1 ?
>
>
> On Mon, Jun 16, 2008 at 11:30 AM, David Afshartous wrote:
>
>> All,
>>
>> I'm re-running some analysis that has been augmented with additional data.
>> When I use the exact same code for the augmented data, the behavior of the
>> aggregate function is very strange, viz., one of the resulting variables is
>> now coded as a factor while it was coded as numeric for the original data.
>> Unfortunately, I cannot provide a reproducible code example since it only
>> seems to occur with this data. I've checked and re-checked the structure of
>> both the original and augmented data but nothing appears inconsistent. Any
>> suggestions much appreciated. See below for specifics.
>>
>> Cheers,
>> David
>>
>> # original data
>>> dim(junk1)
>> [1] 96 3
>>> junk1[1,]
>> Hour Drug Aldo
>> 1 0 P 9
>>> junk1$Hour
>>  [1] 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3
>> [39] 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0
>> [77] 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 ### not coded as a factor
>>> junk1.mean.time.drug = aggregate(junk1[3], junk1[c(1,2)], mean)
>>> junk1.mean.time.drug$Hour
>> [1] 0 3 5 0 3 5 ### not coded as a factor
>>
>> # augmented data
>>> dim(junk1)
>> [1] 108 3
>>> junk1[1,]
>> Hour Drug Aldo
>> 1 0 P 9
>>> junk1$Hour
>>   [1] 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3
>>  [51] 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0 3 5 0
>> [101] 3 5 0 3 5 0 3 5 ### not coded as a factor
>>> junk1.mean.time.drug = aggregate(junk1[3], junk1[c(1,2)], mean)
>>> junk1.mean.time.drug$Hour
>> [1] 0 3 5 0 3 5
>> Levels: 0 3 5 ################## coded as a factor now!
>>
>> ## of course, I can recode it again but I'm curious as to why this is
>> ## changing here
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.