[R] Arrange data
Md. Moyazzem Hossain
hossainmm using juniv.edu
Sun Aug 9 21:59:07 CEST 2020
Dear Rui,
Thank you for your nice help.
Take care and be safe.
Md
On Tue, Aug 4, 2020 at 10:45 PM Rui Barradas <ruipbarradas using sapo.pt> wrote:
> Hello,
>
> Please keep cc-ing the list; R-help is threaded, and questions and answers
> might be of help to others in the future.
>
> As for the question, see if the following code does what you want.
> First, create a logical index i of the months between 7 and 3 and use
> that index to subset the original data.frame. Then, a cumsum trick gives
> a vector M defining the data grouping. Group and compute the Value means
> with aggregate. Finally, since each group spans a year border, create a
> more meaningful Years column and put everything together.
>
> df1 <- read.csv("mddat.csv")
>
> i <- with(df1, (Month >= 7 & Month <= 12) | (Month >= 1 & Month <= 3))
> df2 <- df1[i, ]
> M <- cumsum(c(FALSE, diff(as.integer(row.names(df2))) > 1))
>
> agg <- aggregate(Value ~ M, df2, mean)
> Years <- sapply(split(df2$Year, M), function(x){paste(x[1],
> x[length(x)], sep = "-")})
> final <- cbind.data.frame(Years, Value = agg[["Value"]])
>
> head(final)
> # Years Value
> #0 1975-1975 87.00000
> #1 1975-1976 89.44444
> #2 1976-1977 85.77778
> #3 1977-1978 81.55556
> #4 1978-1979 71.55556
> #5 1979-1980 75.77778
>
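A minimal sketch of the cumsum trick described above, run on made-up data (the data frame below is invented for illustration and is not mddat.csv). It assumes the rows are sorted by Year and Month and carry the default row names 1, 2, 3, ..., because the grouping is read off gaps in those row names.

```r
# Made-up data: two years of monthly values (not the poster's mddat.csv)
df1 <- data.frame(Year  = rep(1975:1976, each = 12),
                  Month = rep(1:12, times = 2),
                  Value = 1:24)

# Keep July-December and January-March
i   <- with(df1, Month >= 7 | Month <= 3)
df2 <- df1[i, ]

# Row names survive subsetting, so a jump larger than 1 marks a dropped
# April-June block and therefore the start of a new group
rn <- as.integer(row.names(df2))
M  <- cumsum(c(FALSE, diff(rn) > 1))

table(M)   # group sizes: 3, 9 and 6 rows
aggregate(Value ~ M, df2, mean)
```

Group 0 holds only January-March of the first year, which is why the first row of the output above is labelled 1975-1975. Each later group runs from July of one year to March of the next.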
>
> Hope this helps,
>
> Rui Barradas
>
>
>
> At 20:44 on 04/08/20, Md. Moyazzem Hossain wrote:
> > Dear Rui,
> >
> > Thanks a lot for your help.
> >
> > It is working. Now I am also trying to find the average of values for
> > *July 1975 to March 1976* and record as the value of the year 1975.
> > Moreover, I want to continue it up to the year 2017. You may check the
> > attached file for data (mddat.csv).
> >
> > I used the following call but got an error:
> > aggregate(Value ~ Year, data = subset(df1, Month >= 7 & Month <= 3), FUN
> > = mean)
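A note on why a call like this comes up empty, with a possible alternative: no month can be both >= 7 and <= 3, so the subset has zero rows and aggregate has nothing to work with. The sketch below uses made-up data (not mddat.csv), and StartYear is a hypothetical helper column introduced only for illustration. It labels each July-March season by the year in which it starts, so that July 1975 to March 1976 is recorded under 1975.

```r
# Made-up data: two years of monthly values (not the poster's mddat.csv)
df1 <- data.frame(Year  = rep(1975:1976, each = 12),
                  Month = rep(1:12, times = 2),
                  Value = 1:24)

# No month is both >= 7 and <= 3, so this subset has zero rows
nrow(subset(df1, Month >= 7 & Month <= 3))

# Use | instead of & to keep July-December and January-March
df2 <- subset(df1, Month >= 7 | Month <= 3)

# StartYear is a hypothetical helper column: January-March rows are
# assigned to the previous calendar year, where their season began
df2$StartYear <- with(df2, ifelse(Month >= 7, Year, Year - 1L))

aggregate(Value ~ StartYear, data = df2, FUN = mean)
```

Unlike grouping by gaps in the row names, this does not depend on the row order, but the first and last seasons may be incomplete and might need to be dropped.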
> >
> > Please help me again. Thanks in advance.
> >
> > Best Regards,
> > Md
> >
> > On Mon, Aug 3, 2020 at 11:28 PM Rui Barradas <ruipbarradas using sapo.pt> wrote:
> >
> > Hello,
> >
> > And here is another way, with aggregate.
> >
> > Make up test data.
> >
> > set.seed(2020)
> > df1 <- expand.grid(Year = 2000:2018, Month = 1:12)
> > df1 <- df1[order(df1$Year),]
> > df1$Value <- sample(20:30, nrow(df1), TRUE)
> > head(df1)
> >
> >
> > #Use subset to keep only the relevant months
> > aggregate(Value ~ Year, data = subset(df1, Month <= 7), FUN = mean)
> >
> >
> > Hope this helps,
> >
> > Rui Barradas
> >
> > At 12:33 on 03/08/2020, Rasmus Liland wrote:
> > > On 2020-08-03 21:11 +1000, Jim Lemon wrote:
> > >> On Mon, Aug 3, 2020 at 8:52 PM Md. Moyazzem Hossain <hossainmm using juniv.edu> wrote:
> > >>> Hi,
> > >>>
> > >>> I have a dataset having monthly
> > >>> observations (from January to
> > >>> December) over a period of time like
> > >>> (2000 to 2018). Now, I am trying to
> > >>> take an average of the values from
> > >>> January to July of each year.
> > >>>
> > >>> The data looks like
> > >>> Year Month Value
> > >>> 2000 1 25
> > >>> 2000 2 28
> > >>> 2000 3 22
> > >>> .... ...... .....
> > >>> 2000 12 26
> > >>> 2001 1 27
> > >>> ....... ........
> > >>> 2018 11 30
> > >>> 2018 12 29
> > >>>
> > >>> Can someone help me in this regard?
> > >>>
> > >>> Many thanks in advance.
> > >> Hi Md,
> > >> One way is to form a subset of your
> > >> data, then calculate the means by
> > >> year:
> > >>
> > >> # assume your data is named mddat
> > >> # keep January to July; column names are capitalised as in the data shown above
> > >> mddat2<-mddat[mddat$Month <= 7,]
> > >> jan2jul<-by(mddat2$Value,mddat2$Year,mean)
> > >>
> > >> Jim
> > > Hi Md,
> > >
> > > you can also define the period in a new
> > > column, and use aggregate like this:
> > >
> > > Md <- structure(list(
> > > Year = c(2000L, 2000L, 2000L,
> > > 2000L, 2001L, 2018L, 2018L),
> > > Month = c(1L, 2L, 3L, 12L, 1L,
> > > 11L, 12L),
> > > Value = c(25L, 28L, 22L, 26L,
> > > 27L, 30L, 29L)),
> > > class = "data.frame",
> > > row.names = c(NA, -7L))
> > >
> > > Md[Md$Month %in% 1:6,"Period"] <- "first six months of the year"
> > > Md[Md$Month %in% 7:12,"Period"] <- "last six months of the year"
> > >
> > > aggregate(
> > > formula=Value~Year+Period,
> > > data=Md,
> > > FUN=mean)
> > >
> > > Rasmus
> > >
> > > ______________________________________________
> > > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
> >
>
--
Best Regards,
Md. Moyazzem Hossain
Associate Professor
Department of Statistics
Jahangirnagar University
Savar, Dhaka-1342
Bangladesh
Website: http://www.juniv.edu/teachers/hossainmm
Research: *Google Scholar
<https://scholar.google.com/citations?user=-U03XCgAAAAJ&hl=en&oi=ao>*;
*ResearchGate
<https://www.researchgate.net/profile/Md_Hossain107>*; *ORCID iD
<https://orcid.org/0000-0003-3593-6936>*