[R] Arrange data

Md. Moyazzem Hossain hossainmm using juniv.edu
Sun Aug 9 21:59:07 CEST 2020


Dear Rui,

Thank you for your nice help.

Take care and be safe.

Md

On Tue, Aug 4, 2020 at 10:45 PM Rui Barradas <ruipbarradas using sapo.pt> wrote:

> Hello,
>
> Please keep cc-ing the list. R-help is threaded, and questions and answers
> might be of help to others in the future.
>
> As for the question, see if the following code does what you want.
> First, create a logical index i of the months from July (7) through March (3)
> and use that index to subset the original data.frame. Then, a cumsum trick
> on the gaps in the remaining row numbers gives a vector M defining the data
> grouping. Group and compute the Value means with aggregate. Finally, since
> each group spans a year boundary, create a more meaningful Years column and
> put everything together.
>
> df1 <- read.csv("mddat.csv")
>
> i <- with(df1, (Month >= 7 & Month <= 12) | (Month >= 1 & Month <= 3))
> df2 <- df1[i, ]
> M <- cumsum(c(FALSE, diff(as.integer(row.names(df2))) > 1))
>
> agg <- aggregate(Value ~ M, df2, mean)
> Years <- sapply(split(df2$Year, M), function(x){paste(x[1],
> x[length(x)], sep = "-")})
> final <- cbind.data.frame(Years, Value = agg[["Value"]])
>
> head(final)
> #      Years    Value
> #0 1975-1975 87.00000
> #1 1975-1976 89.44444
> #2 1976-1977 85.77778
> #3 1977-1978 81.55556
> #4 1978-1979 71.55556
> #5 1979-1980 75.77778
>
>
> Hope this helps,
>
> Rui Barradas
>
>
>
> At 20:44 on 04/08/20, Md. Moyazzem Hossain wrote:
> > Dear Rui,
> >
> > Thanks a lot for your help.
> >
> > It is working. Now I am also trying to find the average of the values from
> > *July 1975 to March 1976* and record it as the value for the year 1975.
> > Moreover, I want to continue this up to the year 2017. You may check the
> > attached file for the data (mddat.csv).
> >
> > I used the following call but got an error:
> > aggregate(Value ~ Year, data = subset(df1, Month >= 7 & Month <= 3), FUN
> > = mean)
> >
> > Please help me again. Thanks in advance.
> >
> > Best Regards,
> > Md
> >
> > On Mon, Aug 3, 2020 at 11:28 PM Rui Barradas <ruipbarradas using sapo.pt> wrote:
> >
> >     Hello,
> >
> >     And here is another way, with aggregate.
> >
> >     Make up test data.
> >
> >     set.seed(2020)
> >     df1 <- expand.grid(Year = 2000:2018, Month = 1:12)
> >     df1 <- df1[order(df1$Year),]
> >     df1$Value <- sample(20:30, nrow(df1), TRUE)
> >     head(df1)
> >
> >
> >     #Use subset to keep only the relevant months
> >     aggregate(Value ~ Year, data = subset(df1, Month <= 7), FUN = mean)
> >
> >
> >     Hope this helps,
> >
> >     Rui Barradas
> >
> >     At 12:33 on 03/08/2020, Rasmus Liland wrote:
> >      > On 2020-08-03 21:11 +1000, Jim Lemon wrote:
> >      >> On Mon, Aug 3, 2020 at 8:52 PM Md. Moyazzem Hossain <hossainmm using juniv.edu> wrote:
> >      >>> Hi,
> >      >>>
> >      >>> I have a dataset of monthly
> >      >>> observations (January to
> >      >>> December) over a period of years
> >      >>> (2000 to 2018). Now, I am trying to
> >      >>> take the average of the values from
> >      >>> January to July of each year.
> >      >>>
> >      >>> The data looks like
> >      >>> Year    Month  Value
> >      >>> 2000    1         25
> >      >>> 2000    2         28
> >      >>> 2000    3         22
> >      >>> ....    ......      .....
> >      >>> 2000    12       26
> >      >>> 2001     1       27
> >      >>> .......         ........
> >      >>> 2018    11       30
> >      >>> 2018    12      29
> >      >>>
> >      >>> Can someone help me in this regard?
> >      >>>
> >      >>> Many thanks in advance.
> >      >> Hi Md,
> >      >> One way is to form a subset of your
> >      >> data, then calculate the means by
> >      >> year:
> >      >>
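> >      >> # assume your data is named mddat, with columns Year, Month, Value
> >      >> # keep January to July (months 1 to 7)
> >      >> mddat2 <- mddat[mddat$Month <= 7, ]
> >      >> # mean of Value within each Year
> >      >> jan2jul <- by(mddat2$Value, mddat2$Year, mean)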
> >      >>
> >      >> Jim
> >      > Hi Md,
> >      >
> >      > you can also define the period in a new
> >      > column, and use aggregate like this:
> >      >
> >      >       Md <- structure(list(
> >      >       Year = c(2000L, 2000L, 2000L,
> >      >       2000L, 2001L, 2018L, 2018L),
> >      >       Month = c(1L, 2L, 3L, 12L, 1L,
> >      >       11L, 12L),
> >      >       Value = c(25L, 28L, 22L, 26L,
> >      >       27L, 30L, 29L)),
> >      >       class = "data.frame",
> >      >       row.names = c(NA, -7L))
> >      >
> >      >       Md[Md$Month %in% 1:6, "Period"] <- "first six months of the year"
> >      >       Md[Md$Month %in% 7:12, "Period"] <- "last six months of the year"
> >      >
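> >      >       # pass the formula unnamed so the call works across R versions
> >      >       # (the argument was renamed from formula to x in R 4.2.0)
> >      >       aggregate(
> >      >         Value ~ Year + Period,
> >      >         data = Md,
> >      >         FUN = mean)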
> >      >
> >      > Rasmus
> >      >
> >      > ______________________________________________
> >      > R-help using r-project.org <mailto:R-help using r-project.org> mailing list
> >     -- To UNSUBSCRIBE and more, see
> >      > https://stat.ethz.ch/mailman/listinfo/r-help
> >      > PLEASE do read the posting guide
> >     http://www.R-project.org/posting-guide.html
> >      > and provide commented, minimal, self-contained, reproducible code.
> >
>

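For readers following the thread, the cumsum trick in Rui's reply can be seen in isolation on a small vector. This is only a sketch: the vector `rows` is made up for illustration and stands in for the integer row names that remain after subsetting.

```r
# Hypothetical row numbers left after subsetting: three runs separated by gaps.
rows <- c(1, 2, 3, 7, 8, 9, 10, 11, 12, 13, 14, 15, 19, 20)

# diff(rows) > 1 is TRUE at the first row after each gap; the leading FALSE
# keeps the result the same length as rows. cumsum() then turns those marks
# into a running group number that starts at 0.
M <- cumsum(c(FALSE, diff(rows) > 1))
M
# [1] 0 0 0 1 1 1 1 1 1 1 1 1 2 2

# Each run of consecutive row numbers ends up in its own group.
groups <- split(rows, M)
lengths(groups)
```

Note that this relies on the data frame still having its default 1, 2, 3, ... row names and being sorted by year and month before subsetting.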

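An alternative sketch for the July-to-March average, which labels each period by its starting year as asked (July 1975 to March 1976 recorded as 1975). It uses made-up data rather than the real mddat.csv, and the column name `Season` is introduced here only for illustration. It also shows why the condition `Month >= 7 & Month <= 3` selects no rows: no month can satisfy both parts at once, so the two parts have to be combined with `|`.

```r
# Hypothetical test data: monthly values for 1975-1977 (not the real file).
set.seed(1)
df1 <- expand.grid(Month = 1:12, Year = 1975:1977)
df1$Value <- sample(60:100, nrow(df1), replace = TRUE)

# The condition from the failed call never holds, so the subset is empty.
n_empty <- nrow(subset(df1, Month >= 7 & Month <= 3))
n_empty
# [1] 0

# Keep July-December OR January-March.
df2 <- df1[df1$Month >= 7 | df1$Month <= 3, ]

# Assign January-March to the previous year, so that July 1975 to
# March 1976 all carry the label 1975.
df2$Season <- ifelse(df2$Month <= 3, df2$Year - 1L, df2$Year)

# Mean per season, plus the number of months that went into each mean.
res <- aggregate(Value ~ Season, data = df2, FUN = mean)
n   <- aggregate(Value ~ Season, data = df2, FUN = length)

# Drop incomplete seasons at either end (fewer than 9 months).
res <- res[n$Value == 9, ]
res
```

Unlike the row-name approach, this does not depend on the row order, but it does assume that Year and Month are numeric columns.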
-- 
Best Regards,
Md. Moyazzem Hossain
Associate Professor
Department of Statistics
Jahangirnagar University
Savar, Dhaka-1342
Bangladesh
Website: http://www.juniv.edu/teachers/hossainmm
Research: *Google Scholar
<https://scholar.google.com/citations?user=-U03XCgAAAAJ&hl=en&oi=ao>*;
*ResearchGate
<https://www.researchgate.net/profile/Md_Hossain107>*; *ORCID iD
<https://orcid.org/0000-0003-3593-6936>*
