[R] Output of tapply function as data frame: Problem Fixed
Rui Barradas
ruipbarradas using sapo.pt
Wed Mar 27 22:45:49 CET 2024
At 08:58 on 27/03/2024, Ogbos Okike wrote:
> Dear Rui,
> Nice to hear from you!
>
> I am sorry for the omission and I have taken note.
>
> Many thanks for responding. The second solution looks elegant as it quickly
> resolved the problem.
>
> Please take a second look at the first solution. It refused to run; it looks
> as if the pipe is not properly positioned. My efforts to correct it and get it
> to run failed. If you can look further, that would be great. If time does not
> permit, I am fine too.
>
> But having the two solutions will certainly make the subject more
> interesting.
> Thank you so much.
> With warmest regards from
> Ogbos
>
> On Wed, Mar 27, 2024 at 8:44 AM Rui Barradas <ruipbarradas using sapo.pt> wrote:
>
>> At 04:30 on 27/03/2024, Ogbos Okike wrote:
>>> Warm greetings to you all.
>>>
>>> Using the tapply function below:
>>> data<-read.table("FD1month",col.names = c("Dates","count"))
>>> x=data$count
>>> f<-factor(data$Dates)
>>> AB<- tapply(x,f,mean)
>>>
>>>
>>> I made a simple calculation. The result, stored in AB, is of the form
>>> below. But an effort to write AB to a file as a data frame fails. When I
>>> use write.table, it writes only the count column and strips off the
>>> first column (the dates).
>>>
>>> 2005-11-01 2005-12-01 2006-01-01 2006-02-01 2006-03-01 2006-04-01
>>> 2006-05-01
>>> -4.106887 -4.259154 -5.836090 -4.756757 -4.118011 -4.487942
>>> -4.430705
>>> 2006-06-01 2006-07-01 2006-08-01 2006-09-01 2006-10-01 2006-11-01
>>> 2006-12-01
>>> -3.856727 -6.067103 -6.418767 -4.383031 -3.985805 -4.768196
>>> -10.072579
>>> 2007-01-01 2007-02-01 2007-03-01 2007-04-01 2007-05-01 2007-06-01
>>> 2007-07-01
>>> -5.342338 -4.653128 -4.325094 -4.525373 -4.574783 -3.915600
>>> -4.127980
>>> 2007-08-01 2007-09-01 2007-10-01 2007-11-01 2007-12-01 2008-01-01
>>> 2008-02-01
>>> -3.952150 -4.033518 -4.532878 -4.522941 -4.485693 -3.922155
>>> -4.183578
>>> 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01 2008-08-01
>>> 2008-09-01
>>> -4.336969 -3.813306 -4.296579 -4.575095 -4.036036 -4.727994
>>> -4.347428
>>> 2008-10-01 2008-11-01 2008-12-01
>>> -4.029918 -4.260326 -4.454224
>>>
>>> But the normal format I wish to display only appears on the terminal,
>>> leading me to copy it and paste it into a text file. That is, when I enter AB
>>> on the terminal, it returns a format in the form:
>>>
>>> 2008-02-01 -4.183578
>>> 2008-03-01 -4.336969
>>> 2008-04-01 -3.813306
>>> 2008-05-01 -4.296579
>>> 2008-06-01 -4.575095
>>> 2008-07-01 -4.036036
>>> 2008-08-01 -4.727994
>>> 2008-09-01 -4.347428
>>> 2008-10-01 -4.029918
>>> 2008-11-01 -4.260326
>>> 2008-12-01 -4.454224
>>>
>>> Now, my question: How do I write out two columns displayed by AB on the
>>> terminal to a file?
>>>
>>> I have tried using AB<-data.frame(AB) but it doesn't work either.
>>>
>>> Many thanks for your time.
>>> Ogbos
>>>
>> Hello,
>>
>> The main trick is to pipe tapply's result to as.data.frame. But the result
>> will have only one column, so you must create the dates column from the
>> data frame's row names.
>> I also include an aggregate solution.
>>
>>
>>
>> # create a test data set
>> set.seed(2024)
>> data <- data.frame(
>>   Date = sample(seq(Sys.Date() - 5, Sys.Date(), by = "1 days"), 100L, TRUE),
>>   count = sample(10L, 100L, TRUE)
>> )
>>
>> # coerce tapply's result to class "data.frame"
>> res <- with(data, tapply(count, Date, mean)) |> as.data.frame()
>> # assign a dates column from the row names
>> res$Date <- row.names(res)
>> # cosmetics
>> names(res)[2:1] <- names(data)
>> # note that the row names are still tapply's names vector
>> # and that the columns order is not Date/count. Both are fixed
>> # after the calculations.
>> res
>> #> count Date
>> #> 2024-03-22 5.416667 2024-03-22
>> #> 2024-03-23 5.500000 2024-03-23
>> #> 2024-03-24 6.000000 2024-03-24
>> #> 2024-03-25 4.476190 2024-03-25
>> #> 2024-03-26 6.538462 2024-03-26
>> #> 2024-03-27 5.200000 2024-03-27
>>
>> # fix the columns' order
>> res <- res[2:1]
>> # and drop the now redundant row names
>> row.names(res) <- NULL
>>
>>
>>
>> # better all in one instruction
>> aggregate(count ~ Date, data, mean)
>> #> Date count
>> #> 1 2024-03-22 5.416667
>> #> 2 2024-03-23 5.500000
>> #> 3 2024-03-24 6.000000
>> #> 4 2024-03-25 4.476190
>> #> 5 2024-03-26 6.538462
>> #> 6 2024-03-27 5.200000
>>
>>
>>
>> Also,
>> I'm glad to help as always, but Ogbos, you have been an R-Help
>> contributor for quite a while, so please post data in dput format. For
>> this problem, the output of the following is more than enough.
>>
>>
>> dput(head(data, 20L))
>>
>>
>> Hope this helps,
>>
>> Rui Barradas
>>
>>
>> --
>> This e-mail was scanned for viruses by AVG antivirus software.
>> www.avg.com
>>
>
Hello,
Do you mean this pipe?

with(data, tapply(count, Date, mean)) |> as.data.frame()

I do not see anything wrong with it. I have just tried it again and it runs
with no problems, as it did before.
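
One possible explanation, which is only an assumption and is not confirmed anywhere in this thread, is the R version: the native pipe |> was introduced in R 4.1.0, and older versions stop with a syntax error when they parse it. A quick check, as a sketch:

```r
# Assumption: the failure comes from an R version older than 4.1.0,
# where the native pipe |> does not exist.
has_native_pipe <- getRversion() >= "4.1.0"

if (has_native_pipe) {
  message("R ", as.character(getRversion()), " supports the native pipe |>")
} else {
  message("R ", as.character(getRversion()),
          " predates |>; use nested or separate calls instead")
}
```

If the check reports an older version, the pipe-free form is the one to use.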

One workaround is not to pipe and to run the instructions separately.

res <- with(data, tapply(count, Date, mean))
res <- as.data.frame(res)

But this should be equivalent to the pipe; I cannot think of a way for
these separate instructions to run when the pipe does not.
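
For completeness, the pipe-free version can be carried through to the file the original question asked for. This is a minimal sketch rather than part of the solution posted earlier: the toy data, the output path (a temporary file here) and the write.table arguments are illustrative assumptions.

```r
# Toy data standing in for the real file (illustrative values only)
data <- data.frame(
  Date  = as.Date(c("2024-03-25", "2024-03-25", "2024-03-26", "2024-03-26")),
  count = c(1, 3, 5, 7)
)

# The same two steps, without the pipe
res <- with(data, tapply(count, Date, mean))
res <- as.data.frame(res)

# The dates live only in the row names; copy them into a real column
res$Date <- row.names(res)
names(res)[1] <- "count"
res <- res[c("Date", "count")]
row.names(res) <- NULL

# Write both columns; row.names = FALSE avoids an extra index column
out_file <- tempfile(fileext = ".txt")  # hypothetical output path
write.table(res, out_file, row.names = FALSE, quote = FALSE)
```

Replacing the temporary file with a real path should give a two-column text file with a Date/count header line.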
Hope this helps,
Rui Barradas