[R] Check for date variable in an arbitrary dataset

Gabor Grothendieck ggrothendieck at gmail.com
Tue Nov 25 13:20:59 CET 2008


If it's a date, you should ensure it's of "Date" class prior to
performing this analysis, rather than representing it as something
else.  See R News 4/1.
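
As a rough sketch of how that might look when the dates arrive as
character strings: the helper below, is_date_like(), is a hypothetical
name and not part of base R or any package, and the three candidate
formats are assumptions about what the data might contain.  It simply
tries as.Date() with each format and accepts a column only if every
non-missing value parses.

```r
# Hypothetical helper (not from any package): TRUE if every non-missing
# value of x parses as a date under at least one candidate format.
# The default formats are assumptions; adjust them to the data at hand.
is_date_like <- function(x,
                         formats = c("%Y-%m-%d", "%d/%m/%Y", "%m/%d/%Y")) {
  if (inherits(x, "Date")) return(TRUE)
  if (is.factor(x)) x <- as.character(x)
  if (!is.character(x)) return(FALSE)
  x <- x[!is.na(x)]
  if (length(x) == 0) return(FALSE)
  for (f in formats) {
    # as.Date() returns NA for strings that do not match the format
    parsed <- as.Date(x, format = f)
    if (!anyNA(parsed)) return(TRUE)
  }
  FALSE
}

# Made-up example data, for illustration only
DF <- data.frame(
  id   = 1:3,
  when = c("25/11/2008", "24/11/2008", "01/12/2008"),
  name = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Flag candidate date columns, then convert with the format that applies
sapply(DF, is_date_like)
DF$when <- as.Date(DF$when, format = "%d/%m/%Y")
sapply(DF, class)
```

Two caveats: a column such as "01/02/2008" matches both dd/mm/yyyy and
mm/dd/yyyy, so the check can say that a column looks like dates but not
which of the two orders was intended; and as.Date() ignores trailing
characters after the matched part, so this is a heuristic rather than
strict validation.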

On Tue, Nov 25, 2008 at 2:28 AM, Harsh <singhalblr at gmail.com> wrote:
> Thank you Gabor for your prompt reply.
>
> I had tried checking for class, but it returns only three types for my
> dataset: numeric, integer and character.
> The problem with that is, I need to classify some columns as
> categorical and in doing so I have a cut off of 100 or less unique
> values in the column/variable.
>
> A date column cannot be treated as categorical: in a 100,000-row
> dataset the dates will take more than 100 unique values, and
> checking the column's class returns character.
> If I could somehow know it was a date, then I could provide a time
> series analysis for it.
> The same may hold true for, for example, a serial number column.
>
> I tried using regexpr to check for the presence of "/" or "-" to
> detect a date column, but it only returns the position of the first
> "/", which, if it is at position 3, could mean a date format of
> dd/mm/yyyy or mm/dd/yyyy.
>
> This would be a long-winded approach, and I am looking for something
> more efficient.
>
> Thank you for your time.
>
>
> Harsh Singhal
>
>
> On Mon, Nov 24, 2008 at 7:06 PM, Gabor Grothendieck
> <ggrothendieck at gmail.com> wrote:
>> The classes of the columns are:
>>
>> sapply(DF, class)
>>
>>
>> On Mon, Nov 24, 2008 at 3:39 AM, Harsh <singhalblr at gmail.com> wrote:
>>> Hello,
>>> This is my first time posting to the R-help list and I apologize for
>>> the apparent triviality of my query.
>>> I am creating an R script to produce a univariate exploratory analysis of
>>> an input dataset (with no metadata to provide extra information about each
>>> column).
>>> Providing summary statistics is possible in case of numeric data and
>>> using all.is.numeric() from the Hmisc package allows me to filter out
>>> those columns with alpha-numeric content.
>>>
>>> I have tried to check if a column is a date field or not, but have not
>>> been able to do so. Are Regular Expressions the only answer? I've also
>>> looked for CRAN packages but haven't found any.
>>> Being a newbie user of R, I do not possess the requisite knowledge to
>>> write my own function for the above objective.
>>>
>>> Thank you for your time
>>> Harsh
>>> Decisions Systems Group
>>> Mu Sigma Inc.
>>> Chicago, IL
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>


