[R] How to average minutes per hour per month in the form of '# hours #minutes'
Jeff Newmiller
jdnewm|| @end|ng |rom dcn@d@v|@@c@@u@
Fri Mar 26 23:24:58 CET 2021
Avi, I see no limitations in using R for this task, so throwing Python into the discussion seems only to confuse the issue. I just see multiple ways to interpret the desired calculation of the result, as illustrated by your discussion and elaborated in my next paragraph, and I would rather the OP did the work of clarifying what his needs are so the question becomes about R instead of about different people guessing what the goal is.
For example, you could calculate an average daily hours billed and multiply by seven, or you could prorate hours in partial weeks and calculate the average of those weekly values, or you could simply exclude partial weeks, and then there is the issue of whether a week begins on Monday or Sunday. There is the discrepancy of asking for "by week by month" and then saying he really wants three specific months at a time... does he plan to calculate weekly summaries within months and then average three months? This is not a reasonable question for this mailing list yet.
I acknowledge that OP may not have these answers now but answers (intentional or not) will have to be embedded in whatever solution is proposed, even if some random method in a Python package purports to solve this in one line of code. So either answers from the insurance company or arbitrarily selected by OP should be identified.
On March 26, 2021 2:28:21 PM PDT, Avi Gross via R-help <r-help using r-project.org> wrote:
>There are rather straightforward ways to manipulate your data step by
>step to make harder things possible, or you can use creative ways
>harder for people to understand.
>
>So adding columns to your data that take existing times/dates and
>record them with names like Q1Y2021 can give you abilities but as noted
>they will NOT line up with weeks as in 1 to 52.
>
>You can calculate the sum of hours per week, if you had the ability to
>group by week, and place that in a column that repeats that number for
>each day recorded for that week. You can then take the same data and
>group by quarter and take some kind of average of that column but it
>probably will be WRONG if you did the above as it will take the average
>of whatever rows it encounters and that may include partial weeks or
>other anomalies like when you only recorded three days for that week.
>
>So consider other plans. What if you kept track of the number of weeks
>per month as in 28 days is 4 weeks and 31 days is 4.43 or so weeks. You
>could simply calculate the sum of hours for that month and divide by
>the number of weeks by that measure in that month. Would that number
>satisfy them?
>
>And, again, rather than trying to SORT Month names, consider adding a
>column with a numerical version. Sure, you can play with factors so the
>months are recorded in the order you want and some things like ggplot
>will then honor that order.
>
>If and when you become more expert, much of what you want might be done
>other ways without making columns for real. But it may make sense to
>start simple.
>
>Here is an example of a simple change to Months Abbreviations to be
>made into a factor in order:
>
> df$mo <- factor(df$mo,levels=month.abb)
>
>Similar ideas involve how you convert hours and minutes to just minutes
>for averaging by adding calculated columns and you can convert the
>results back to whatever format you need later.
>
>Just FYI, many database programs might let you do much of this
>internally. Python using the tools you are using is arguably much more
>flexible.
>
>-----Original Message-----
>From: R-help <r-help-bounces using r-project.org> On Behalf Of Dr Eberhard W
>Lisse
>Sent: Friday, March 26, 2021 3:22 AM
>To: r-help using r-project.org
>Subject: Re: [R] How to average minutes per hour per month in the form
>of '# hours #minutes'
>
>Jeff,
>
>thank you. However, if I knew how to do this, I would probably not have
>asked :-)-O
>
>I think I have been reasonably comprehensive in describing my issue,
>but let me do it now with the real life problem:
>
>My malpractice insurance gives me a discount if I consult up to 22
>hours per week in a 3 months period.
>
>I add every patient, date and minutes whenever I see her into a MySQL
>database. I want to file the report of my hours worked with them for
>the first 3 month period (November to January and not properly
>quarterly unfortunately :-)-0), and while I can generate this with
>LyX/LateX and knitR producing a (super)tabular table containing the
>full list, and tables for time per week and time per month I really
>can't figure out is how to average the hours worked per week for each
>month (even if weeks don't align with months properly :-)-O)
>
>While I am at it how would I get this to sort properly (year, month) if
>I used the proper names of the months, ie '%Y %B' or '%B %Y'?
>
> CONSMINUTES %>%
> select(datum, dauer) %>%
> group_by(month = format(datum, '%Y %m'),
> week = format(datum, '%V')) %>%
> summarise_if(is.numeric, sum) %>%
> mutate(hm=sprintf("%d Hour%s %d Minutes", dauer %/% 60,
> ifelse((dauer %/% 60) == 1, " ", "s"), dauer %% 60)) %>%
> select(-dauer)
>
>
>Any help, or just pointers to where I can read this up, are highly
>appreciated.
>
>greetings, el
>
>
>On 2021-03-25 22:37 , Jeff Newmiller wrote:
> > This is a very unclear question. Weeks don't line up with months..
>> so you need to clarify how you would do this or at least give an >
>explicit example of input data and result data.
> >
>> On March 25, 2021 11:34:15 AM PDT, Dr Eberhard W Lisse
><nospam using lisse.NA> wrote:
> >> Thanks, that is helpful.
> >>
> >> But, how do I group it to produce hours worked per week per month?
> >>
> >> el
> >>
> >>
> >> On 2021-03-25 19:03 , Greg Snow wrote:
> >>> Here is one approach:
> >>>
> >>> tmp <- data.frame(min=seq(0,150, by=15)) >>> >>> tmp %>%
> >>> mutate(hm=sprintf("%2d Hour%s %2d Minutes",
> >>> min %/% 60, ifelse((min %/% 60) == 1, " ", "s"),
> >>> min %% 60))
> >>>
>>>> You could replace `sprintf` with `str_glue` (and update the syntax
>>>> as well) if you realy need tidyverse, but you would also loose some
> >>> formatting capability.
> >>>
>>>> I don't know of tidyverse versions of `%/%` or `%%`. If you need
>>>> the numeric values instead of a string then just remove the >>>
>`sprintf` and use mutate directly with `min %/% 60` and `min %% 60`.
> >>>
>>>> This of course assumes all of your data is in minutes (by the time
>>>> you pipe to this code) and that all hours have 60 minutes (I don't
>>>> know of any leap hours.
> >>>
>>>> On Sun, Mar 21, 2021 at 8:31 AM Dr Eberhard W Lisse
><nospam using lisse.na> >> wrote:
> >>>>
> >>>> Hi,
> >>>>
>>>>> I have minutes worked by day (with some more information) >>>>
>>>>> which when using >>>>
> >>>> library(tidyverse)
> >>>> library(lubridate)
> >>>>
> >>>> run through
> >>>>
> >>>> CONSMINUTES %>%
> >>>> select(datum, dauer) %>%
> >>>> arrange(desc(datum))
> >>>>
> >>>> look somewhat like
> >>>>
> >>>> # A tibble: 142 x 2
> >>>> datum dauer
> >>>> <date> <int>
> >>>> 1 2021-03-18 30
> >>>> 2 2021-03-17 30
> >>>> 3 2021-03-16 30
> >>>> 4 2021-03-16 30
> >>>> 5 2021-03-16 30
> >>>> 6 2021-03-16 30
> >>>> 7 2021-03-11 30
> >>>> 8 2021-03-11 30
> >>>> 9 2021-03-11 30
> >>>> 10 2021-03-11 30
> >>>> # … with 132 more rows
> >>>>
> >>>> I can extract minutes per hour
> >>>>
> >>>> CONSMINUTES %>%
> >>>> select(datum, dauer) %>%
> >>>> group_by(week = format(datum, '%Y %V'))%>%
> >>>> summarise_if(is.numeric, sum)
> >>>>
> >>>> and minutes per month
> >>>>
> >>>> CONSMINUTES %>%
> >>>> select(datum, dauer) %>%
> >>>> group_by(month = format(datum, '%Y %m'))%>%
> >>>> summarise_if(is.numeric, sum)
> >>>>
>>>>> I need to show the time worked per week per month in the format of
> >>>>
> >>>> '# hours # minutes'
> >>>>
>>>>> and would like to also be able to show the average time per week
>>>>> per month.
> >>>>
> >>>> How can I do that (preferably with tidyverse :-)-O)?
> >>>>
> >>>> greetings, el
>
>______________________________________________
>R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
>
>______________________________________________
>R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
--
Sent from my phone. Please excuse my brevity.
More information about the R-help
mailing list