[R] Output of tapply function as data frame: Problem Fixed

Rui Barradas ruipbarradas using sapo.pt
Fri Mar 29 06:39:23 CET 2024


At 01:43 on 29/03/2024, Ogbos Okike wrote:
> Dear Rui,
> Thanks again for resolving this. I have already started using the version
> that works for me.
> 
> But to clarify the second part, please let me paste what I did and the
> error message:
> 
>> set.seed(2024)
>> data <- data.frame(
> +    Date = sample(seq(Sys.Date() - 5, Sys.Date(), by = "1 days"), 100L,
> + TRUE),
> +    count = sample(10L, 100L, TRUE)
> + )
>>
>> # coerce tapply's result to class "data.frame"
>> res <- with(data, tapply(count, Date, mean)) |> as.data.frame()
> Error: unexpected '>' in "res <- with(data, tapply(count, Date, mean)) |>"
>> # assign a dates column from the row names
>> res$Date <- row.names(res)
> Error in row.names(res) : object 'res' not found
>> # cosmetics
>> names(res)[2:1] <- names(data)
> Error in names(res)[2:1] <- names(data) : object 'res' not found
>> # note that the row names are still tapply's names vector
>> # and that the columns order is not Date/count. Both are fixed
>> # after the calculations.
>> res
> 
> You can see that the error message is on the pipe. Please let me know
> what I am missing.
> Thanks.
> 
> On Wed, Mar 27, 2024 at 10:45 PM Rui Barradas <ruipbarradas using sapo.pt> wrote:
> 
>> At 08:58 on 27/03/2024, Ogbos Okike wrote:
>>> Dear Rui,
>>> Nice to hear from you!
>>>
>>> I am sorry for the omission and I have taken note.
>>>
>>> Many thanks for responding. The second solution looks elegant, as it
>>> quickly resolved the problem.
>>>
>>> Please take a second look at the first solution. It refused to run. It
>>> looks as if the pipe is not properly positioned. My efforts to correct it
>>> and get it to run failed. If you can look further, it would be great. If
>>> time does not permit, I am fine too.
>>>
>>> But having the two solutions will certainly make the subject more
>>> interesting.
>>> Thank you so much.
>>> With warmest regards from
>>> Ogbos
>>>
>>> On Wed, Mar 27, 2024 at 8:44 AM Rui Barradas <ruipbarradas using sapo.pt> wrote:
>>>
>>>> At 04:30 on 27/03/2024, Ogbos Okike wrote:
>>>>> Warm greetings to you all.
>>>>>
>>>>> Using the tapply function below:
>>>>> data <- read.table("FD1month", col.names = c("Dates", "count"))
>>>>> x <- data$count
>>>>> f <- factor(data$Dates)
>>>>> AB <- tapply(x, f, mean)
>>>>>
>>>>>
>>>>> I made a simple calculation. The result, stored in AB, is of the form
>>>>> below. But an effort to write AB to a file as a data frame fails. When I
>>>>> use write.table, it only writes the count column and strips off the
>>>>> first column (date).
>>>>>
>>>>> 2005-11-01 2005-12-01 2006-01-01 2006-02-01 2006-03-01 2006-04-01 2006-05-01
>>>>>  -4.106887  -4.259154  -5.836090  -4.756757  -4.118011  -4.487942  -4.430705
>>>>> 2006-06-01 2006-07-01 2006-08-01 2006-09-01 2006-10-01 2006-11-01 2006-12-01
>>>>>  -3.856727  -6.067103  -6.418767  -4.383031  -3.985805  -4.768196 -10.072579
>>>>> 2007-01-01 2007-02-01 2007-03-01 2007-04-01 2007-05-01 2007-06-01 2007-07-01
>>>>>  -5.342338  -4.653128  -4.325094  -4.525373  -4.574783  -3.915600  -4.127980
>>>>> 2007-08-01 2007-09-01 2007-10-01 2007-11-01 2007-12-01 2008-01-01 2008-02-01
>>>>>  -3.952150  -4.033518  -4.532878  -4.522941  -4.485693  -3.922155  -4.183578
>>>>> 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01 2008-08-01 2008-09-01
>>>>>  -4.336969  -3.813306  -4.296579  -4.575095  -4.036036  -4.727994  -4.347428
>>>>> 2008-10-01 2008-11-01 2008-12-01
>>>>>  -4.029918  -4.260326  -4.454224
>>>>>
>>>>> But the format I want only appears on the terminal, leading me to copy
>>>>> it and paste it into a text file. That is, when I enter AB on the
>>>>> terminal, it returns output of the form:
>>>>>
>>>>> 2008-02-01  -4.183578
>>>>> 2008-03-01  -4.336969
>>>>> 2008-04-01  -3.813306
>>>>> 2008-05-01  -4.296579
>>>>> 2008-06-01  -4.575095
>>>>> 2008-07-01  -4.036036
>>>>> 2008-08-01  -4.727994
>>>>> 2008-09-01  -4.347428
>>>>> 2008-10-01  -4.029918
>>>>> 2008-11-01  -4.260326
>>>>> 2008-12-01  -4.454224
>>>>>
>>>>> Now, my question: How do I write out two columns displayed by AB on the
>>>>> terminal to a file?
>>>>>
>>>>> I have tried using AB<-data.frame(AB) but it doesn't work either.
>>>>>
>>>>> Many thanks for your time.
>>>>> Ogbos
>>>>>
>>>> Hello,
>>>>
>>>> The main trick is to pipe to as.data.frame. But the result will have only
>>>> one column, so you must assign the dates from the data frame's row names.
>>>> I also include an aggregate solution.
>>>>
>>>>
>>>>
>>>> # create a test data set
>>>> set.seed(2024)
>>>> data <- data.frame(
>>>>      Date = sample(seq(Sys.Date() - 5, Sys.Date(), by = "1 days"), 100L,
>>>> TRUE),
>>>>      count = sample(10L, 100L, TRUE)
>>>> )
>>>>
>>>> # coerce tapply's result to class "data.frame"
>>>> res <- with(data, tapply(count, Date, mean)) |> as.data.frame()
>>>> # assign a dates column from the row names
>>>> res$Date <- row.names(res)
>>>> # cosmetics
>>>> names(res)[2:1] <- names(data)
>>>> # note that the row names are still tapply's names vector
>>>> # and that the columns order is not Date/count. Both are fixed
>>>> # after the calculations.
>>>> res
>>>> #>               count       Date
>>>> #> 2024-03-22 5.416667 2024-03-22
>>>> #> 2024-03-23 5.500000 2024-03-23
>>>> #> 2024-03-24 6.000000 2024-03-24
>>>> #> 2024-03-25 4.476190 2024-03-25
>>>> #> 2024-03-26 6.538462 2024-03-26
>>>> #> 2024-03-27 5.200000 2024-03-27
>>>>
>>>> # fix the columns' order
>>>> res <- res[2:1]
>>>>
>>>>
>>>>
>>>> # better all in one instruction
>>>> aggregate(count ~ Date, data, mean)
>>>> #>         Date    count
>>>> #> 1 2024-03-22 5.416667
>>>> #> 2 2024-03-23 5.500000
>>>> #> 3 2024-03-24 6.000000
>>>> #> 4 2024-03-25 4.476190
>>>> #> 5 2024-03-26 6.538462
>>>> #> 6 2024-03-27 5.200000
>>>>
>>>>
>>>>
>>>> Also,
>>>> I'm glad to help as always, but Ogbos, you have been an R-Help
>>>> contributor for quite a while, so please post data in dput format. Given
>>>> the problem, the output of the following is more than enough.
>>>>
>>>>
>>>> dput(head(data, 20L))
>>>>
>>>>
>>>> Hope this helps,
>>>>
>>>> Rui Barradas
>>>>
>>>>
>>>> --
>>>> This e-mail has been checked for viruses by AVG antivirus software.
>>>> www.avg.com
>>>>
>>>
>> Hello,
>>
>> This pipe?
>>
>>
>> with(data, tapply(count, Date, mean)) |> as.data.frame()
>>
>>
>> I am not seeing anything wrong with it. I have tried it again just now
>> and it runs with no problems, as it did before.
>> One solution is not to pipe and to separate the instructions instead.
>>
>>
>> res <- with(data, tapply(count, Date, mean))
>> res <- as.data.frame(res)
>>
>>
>> But this should be equivalent to the pipe; I cannot think of a way to
>> have these separate instructions run but not the pipe.
>>
>> Hope this helps,
>>
>> Rui Barradas
>>
>>
>>
> 
Hello,

Yes, the problem seems to be the pipe, but there is nothing wrong with
the code.
The native pipe operator |> was introduced in R 4.1.0. What is your version of R?
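
A quick way to check is sketched below. On a version older than 4.1.0 the parser does not know |> as a single operator, so it reads | followed by >, which would explain the "unexpected '>'" message. The variable name has_native_pipe is only illustrative.

```r
# Print the version string of the running R session
print(R.version.string)

# TRUE if this R is new enough to parse the native pipe |>
# (has_native_pipe is a hypothetical name used only for this check)
has_native_pipe <- getRversion() >= "4.1.0"
print(has_native_pipe)
```

If this prints FALSE, either update R or use the pipe-free forms.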

You can always avoid the pipe:


res <- as.data.frame(with(data, tapply(count, Date, mean)))
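
For completeness, here is a minimal pipe-free sketch that also writes both columns to a file, which was the original question. It reuses the test data from earlier in the thread. The names means, out_file and check are my own, and the temporary file only stands in for whatever path you actually want. Instead of as.data.frame it builds the two columns directly from the names and values of tapply's result, so no reordering is needed afterwards.

```r
# same test data as earlier in the thread
set.seed(2024)
data <- data.frame(
  Date = sample(seq(Sys.Date() - 5, Sys.Date(), by = "1 days"), 100L, TRUE),
  count = sample(10L, 100L, TRUE)
)

# tapply returns a named 1-d array: the names are the dates,
# the values are the group means
means <- with(data, tapply(count, Date, mean))

# build the two columns explicitly, no pipe involved
res <- data.frame(
  Date = names(means),
  count = as.vector(means),
  row.names = NULL
)

# write both columns to a file (a temporary file is assumed here)
out_file <- tempfile(fileext = ".txt")
write.table(res, out_file, row.names = FALSE, quote = FALSE)

# read the file back to confirm that both columns were written
check <- read.table(out_file, header = TRUE)
check
```

With row.names = FALSE the dates are written as an ordinary first column rather than as row names.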


Hope this helps,

Rui Barradas


