[R] Question on creating Date variable

David Winsemius dwinsemius at comcast.net
Mon Dec 31 22:15:18 CET 2012


On Dec 31, 2012, at 11:57 AM, David Winsemius wrote:

>
> On Dec 31, 2012, at 11:54 AM, Christofer Bogaso wrote:
>
>> On 01 January 2013 01:29:53, David Winsemius wrote:
>>>
>>> On Dec 31, 2012, at 11:35 AM, Christofer Bogaso wrote:
>>>
>>>> On 01 January 2013 00:17:50, David Winsemius wrote:
>>>>>
>>>>> On Dec 31, 2012, at 9:12 AM, Christofer Bogaso wrote:
>>>>>
>>>>>> Hello all,
>>>>>>
>>>>>> Let's say I have the following (numeric) vector:
>>>>>>
>>>>>> > x
>>>>>> [1] 11.00 11.25 11.35 12.01 11.14 13.00 13.25 13.35 14.01 13.14  
>>>>>> 14.50
>>>>>> 14.75 14.85 15.51 14.64
>>>>>>
>>>>>> Now, I want to create a 'Date' variable (i.e. I should be able  
>>>>>> to do
>>>>>> all calculations pertaining to date/time and also time-series
>>>>>> plotting etc.) like
>>>>>>
>>>>>> 2012-12-31 11:00:00 AM, 2012-12-31 11:25:00 AM, 2012-12-31  
>>>>>> 11:35:00
>>>>>> AM, 2012-12-31 12:01:00 PM, . . . .
>>>>>>
>>>>>
>>>>> Those _times_ ( _not_ Dates) cannot possibly be in "%M.%S" format,
>>>>> given the number of items to the right of the decimal point that  
>>>>> are
>>>>> greater than 60. So will proceed on the arguably more likely
>>>>> assumption that they are in fractional minutes. To recover from  
>>>>> that
>>>>> problem, one might consider:
>>>>>
>>>>> > as.POSIXct(paste( floor(x), round(60*(x-floor(x))) ),  
>>>>> format="%M %S")
>>>>> [1] "2012-12-31 00:11:00 PST" "2012-12-31 00:11:15 PST"
>>>>> [3] "2012-12-31 00:11:21 PST" "2012-12-31 00:12:01 PST"
>>>>> [5] "2012-12-31 00:11:08 PST" "2012-12-31 00:13:00 PST"
>>>>> [7] "2012-12-31 00:13:15 PST" "2012-12-31 00:13:21 PST"
>>>>> [9] "2012-12-31 00:14:01 PST" "2012-12-31 00:13:08 PST"
>>>>> [11] "2012-12-31 00:14:30 PST" "2012-12-31 00:14:45 PST"
>>>>> [13] "2012-12-31 00:14:51 PST" "2012-12-31 00:15:31 PST"
>>>>> [15] "2012-12-31 00:14:38 PST"
>>>>>
>>>>
>>>> I understand that some of those elements are not "dates". However
>>>> what I want is the ***"PM/AM" suffix*** on those elements which are
>>>> considered as Dates.
>>>>
>>>> ***Getting those suffix*** and doing calculations on those changed
>>>> variables is my primary concern.
>>>
>>> That's the first time that AM/PM has been mentioned and I suppose if
>>> those were fractional hours rather than my guess of fractional  
>>> minutes
>>> that there might be representatives of both in the numeric data you
>>> offered. Why don't you clarify what these numbers do in fact
>>> represent?
>>> And what problem you are trying to solve?
>>>
>>
>> Basically those are artificial data! Actually I do not have the  
>> right to give out the original data in any public forum. So I  
>> created those artificial data so that I can get the fundamental  
>> idea ...........
>>
>> Each element (assuming they are legitimate times) represents the
>> time on a particular day when some event pops up, like 11AM,
>> 11.30AM, 12.05PM etc. I could work with something like 11.00,
>> 11.30, 12.05, 15.00 etc.; however, I believe adding an AM/PM suffix
>> will make my report more eye-catching.
>>
>> Please let me know if you need more clarification.
>
> So what's with the values above 59 in the minutes?

Failing an answer to that question, this code shows how to read
date-time values in from a character vector and then write them back
out from date-time class to character class:

  # scan() brings this in as a numeric vector
  x <- scan(text="11.00 11.25 11.35 12.01 11.14 13.00 13.25 13.35
14.01 13.14 14.50 14.75 14.85 15.51 14.64")

?strptime     # for the available format specifications
format( as.POSIXct(as.character(x), format="%H.%M"),  # the input format
              format="%I.%M %p")                       # the output format
 [1] NA         "11.25 AM" "11.35 AM" "12.01 PM" "11.14 AM" NA
 [7] "01.25 PM" "01.35 PM" "02.01 PM" "01.14 PM" "02.05 PM" NA
[13] NA         "03.51 PM" NA

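I suspect that the NA when minutes are ".00" comes from the implicit
loss of the trailing zeros in as.character() (the same loss turns 14.50
into "14.5", which is parsed as 14:05 and shows up above as "02.05 PM";
the remaining NAs are the values with "minutes" above 59):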

 > as.character(0.00)
[1] "0"
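
If that suspicion is right, one possible workaround is to build the
character strings with sprintf() so that both decimal places survive.
The following is only a sketch: it assumes the numbers really are clock
times written as HH.MM, uses a shortened made-up vector, and picks
tz="UTC" purely for illustration. The AM/PM text produced by %p also
depends on the locale.

```r
# Sketch only: assumes x holds clock times written as HH.MM
x <- c(11.00, 11.25, 13.00, 14.50, 14.75)

# sprintf() keeps both decimal places, unlike as.character()
x_chr <- sprintf("%.2f", x)

# tz = "UTC" is an arbitrary choice that sidesteps local DST gaps
tm <- as.POSIXct(x_chr, format = "%H.%M", tz = "UTC")

# Values whose "minutes" exceed 59 (14.75 here) still come back as NA
out <- format(tm, format = "%I.%M %p")
print(out)
```

With this, 11.00 and 13.00 are no longer lost and 14.50 is read as
14:50 rather than 14:05, but entries such as 14.75 remain NA because
they are not valid under this reading of the data.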

The claim that this data is proprietary and cannot be presented in its
original form sounds somewhat ridiculous.  Simply post:

  dput(head(dfrm$time_data_column_name, 20))

How could that represent any disclosure of proprietary information if  
presented with no context?
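
For illustration, here is roughly what that looks like. The data frame
and its values below are invented stand-ins (dfrm and
time_data_column_name are just the placeholder names used above), so
this is a sketch rather than anything taken from the real data:

```r
# Hypothetical stand-in for the real data frame; the values are invented
dfrm <- data.frame(time_data_column_name = c(11.00, 11.25, 11.35, 12.01))

# dput() prints an R expression that rebuilds the object exactly
dput(head(dfrm$time_data_column_name, 20))

# A reader can paste that printed expression straight back into a session
x <- c(11, 11.25, 11.35, 12.01)
```

The printed expression carries the values and their class but nothing
about where they came from.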

-- 

David Winsemius, MD
Alameda, CA, USA



