[R] Question on creating Date variable

Christofer Bogaso bogaso.christofer at gmail.com
Tue Jan 1 06:40:25 CET 2013


On 01 January 2013 03:00:18, David Winsemius wrote:
>
> On Dec 31, 2012, at 11:57 AM, David Winsemius wrote:
>
>>
>> On Dec 31, 2012, at 11:54 AM, Christofer Bogaso wrote:
>>
>>> On 01 January 2013 01:29:53, David Winsemius wrote:
>>>>
>>>> On Dec 31, 2012, at 11:35 AM, Christofer Bogaso wrote:
>>>>
>>>>> On 01 January 2013 00:17:50, David Winsemius wrote:
>>>>>>
>>>>>> On Dec 31, 2012, at 9:12 AM, Christofer Bogaso wrote:
>>>>>>
>>>>>>> Hello all,
>>>>>>>
>>>>>>> Let's say I have the following (numeric) vector:
>>>>>>>
>>>>>>> > x
>>>>>>>  [1] 11.00 11.25 11.35 12.01 11.14 13.00 13.25 13.35 14.01 13.14 14.50
>>>>>>> [12] 14.75 14.85 15.51 14.64
>>>>>>>
>>>>>>> Now, I want to create a 'Date' variable (i.e. I should be able to do
>>>>>>> all calculations pertaining to date/time and also time-series
>>>>>>> plotting etc.) like
>>>>>>>
>>>>>>> 2012-12-31 11:00:00 AM, 2012-12-31 11:25:00 AM, 2012-12-31 11:35:00 AM,
>>>>>>> 2012-12-31 12:01:00 PM, . . . .
>>>>>>>
>>>>>>
>>>>>> Those _times_ ( _not_ Dates) cannot possibly be in "%M.%S" format,
>>>>>> given the number of items to the right of the decimal point that are
>>>>>> greater than 60. So I will proceed on the arguably more likely
>>>>>> assumption that they are in fractional minutes. To recover from that
>>>>>> problem, one might consider:
>>>>>>
>>>>>> > as.POSIXct(paste( floor(x), round(60*(x-floor(x))) ),
>>>>>> format="%M %S")
>>>>>> [1] "2012-12-31 00:11:00 PST" "2012-12-31 00:11:15 PST"
>>>>>> [3] "2012-12-31 00:11:21 PST" "2012-12-31 00:12:01 PST"
>>>>>> [5] "2012-12-31 00:11:08 PST" "2012-12-31 00:13:00 PST"
>>>>>> [7] "2012-12-31 00:13:15 PST" "2012-12-31 00:13:21 PST"
>>>>>> [9] "2012-12-31 00:14:01 PST" "2012-12-31 00:13:08 PST"
>>>>>> [11] "2012-12-31 00:14:30 PST" "2012-12-31 00:14:45 PST"
>>>>>> [13] "2012-12-31 00:14:51 PST" "2012-12-31 00:15:31 PST"
>>>>>> [15] "2012-12-31 00:14:38 PST"
>>>>>>
>>>>>
>>>>> I understand that some of those elements are not "dates". However
>>>>> what I want is the ***"PM/AM" suffix*** on those elements which are
>>>>> considered as Dates.
>>>>>
>>>>> ***Getting those suffixes*** and doing calculations on those changed
>>>>> variables is my primary concern.
>>>>
>>>> That's the first time that AM/PM has been mentioned, and I suppose if
>>>> those were fractional hours rather than my guess of fractional minutes
>>>> that there might be representatives of both in the numeric data you
>>>> offered. Why don't you clarify what these numbers do in fact represent?
>>>> And what problem you are trying to solve?
>>>>
>>>
>>> Basically those are artificial data! Actually I do not have the
>>> right to give out the original data in any public forum. So I
>>> created those artificial data so that I can get the fundamental idea
>>> ...........
>>>
>>> Each element (assuming they are legitimate times) represents the time
>>> on a particular day when some event pops up, like 11AM, 11.30AM,
>>> 12.05PM etc. I could work with something like 11.00, 11.30, 12.05,
>>> 15.00 etc.; however, I believe adding an AM/PM suffix will make my
>>> report more eye-catching.
>>>
>>> Please let me know if you need more clarification.
>>
>> So what's with the values above 59 in the minutes?
>
> Failing an answer to that question, this code shows how to input
> date-time vectors from character vectors and then output them from
> date-time class to character class:
>
>  # This will come in as a numeric vector
>  x <- scan(text="11.00 11.25 11.35 12.01 11.14 13.00 13.25 13.35 14.01 13.14 14.50 14.75 14.85 15.51 14.64")
>
> ?strptime     # for the available format specifications
> format( as.POSIXct(as.character(x), format="%H.%M"),  # the input format
>              format="%I.%M %p")     # the output format
>  [1] NA         "11.25 AM" "11.35 AM" "12.01 PM" "11.14 AM" NA
>  [7] "01.25 PM" "01.35 PM" "02.01 PM" "01.14 PM" "02.05 PM" NA
> [13] NA         "03.51 PM" NA
>
> I suspect that the NA when minutes are ".00" comes from the implicit
> loss of the trailing digits:
>
> > as.character(0.00)
> [1] "0"
>
> The claim that this data is proprietary and cannot be presented in its
> original form sounds somewhat ridiculous.  Simply post:
>
>  dput(head(dfrm$time_data_column_name, 20))
>
> How could that represent any disclosure of proprietary information if
> presented with no context?
>

'How could that represent any disclosure of proprietary information if
presented with no context?' I must agree with you. But I just don't
want to take any risk! (The job scenario in my country is not very
optimistic and I want to give my boss minimal chance/reason to fire me!)

And secondly, with your approach I can't do any calculations. Take the
following example:

y <- format( as.POSIXct(as.character(x), format="%H.%M"),  # the input format
             format="%I.%M %p")

y[3] - y[2]

This gives me the following error:

Error in y[3] - y[2] : non-numeric argument to binary operator
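
The error arises because format() returns character strings, and "-" is not defined for characters. A minimal sketch of one way around it, assuming the values really are in hour.minute form: keep the POSIXct vector for calculations and call format() only when a display string is needed. The date "2012-12-31", tz = "UTC", and the names tm, gap and label are assumptions made for illustration. sprintf() is used instead of as.character() so that trailing zeros are kept (11.00 stays "11.00"). Sample values with more than 59 in the minutes position (such as 14.75) would still parse to NA under this reading, so they are left out here.

```r
# Subset of the sample values that are valid under an hour.minute reading
x <- c(11.00, 11.25, 11.35, 12.01, 11.14, 13.00)

# sprintf() keeps two decimal places, so 11.00 becomes "11.00" and not "11"
stamp <- paste("2012-12-31", sprintf("%.2f", x))   # the date is an assumption

# Parse once into POSIXct; tz = "UTC" is an assumption to avoid DST effects
tm <- as.POSIXct(stamp, format = "%Y-%m-%d %H.%M", tz = "UTC")

# Arithmetic is done on the POSIXct values, not on formatted strings
gap <- difftime(tm[3], tm[2], units = "mins")
print(gap)   # Time difference of 10 mins

# Format only for display; what %p prints depends on the locale
label <- format(tm, "%Y-%m-%d %I:%M %p")
print(label)
```

With this split, tm can go into differences, sorting and time-series plotting, while label is used only in the report.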

I get the same error with David's approach as well:

> y <- as.POSIXct(paste( floor(x), round(60*(x-floor(x))) ), format="%H %M")
> z <- format(y, format="%Y-%m-%d %I:%M %p")
> z[2] - z[1]
Error in z[2] - z[1] : non-numeric argument to binary operator.
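
The same reasoning applies here: z is a character vector, while y is still POSIXct and can be subtracted directly. If the values are instead read as fractional hours, as raised earlier in the thread, a hedged sketch would be to convert them to seconds and add them to midnight of the day in question. The date, tz = "UTC", and the names midnight and tm are assumptions for illustration only.

```r
# A few of the sample values, read here as fractional hours (an assumption)
x <- c(11.00, 11.25, 14.75, 14.85)

# Midnight of the assumed day; tz = "UTC" avoids daylight-saving effects
midnight <- as.POSIXct("2012-12-31 00:00:00", tz = "UTC")

# Adding a number to a POSIXct value adds that many seconds
tm <- midnight + round(x * 3600)

# Differences are taken on the POSIXct vector itself
gap <- difftime(tm[2], tm[1], units = "mins")
print(gap)   # Time difference of 15 mins

# Formatting is the last step; what %p prints depends on the locale
print(format(tm, "%I:%M %p"))
```

Under this reading every sample value is a valid time, including those with more than 59 after the decimal point.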

Thanks and regards,



