[R] linear regression with dates

David Winsemius dwinsemius at comcast.net
Tue Dec 28 04:14:41 CET 2010


On Dec 27, 2010, at 8:56 PM, Entropi ntrp wrote:

> Thanks for the response. I provided the necessary details below, and I
> also have a general question about how to deal with dates in R. Is there
> a way to make R read dates as numbers?
>
> Here are the details of the R code:
>
> egfr <- read.csv(file.choose(), header=TRUE, sep=",")   # egfr is a matrix read from a .csv file.

Well, whatever file you chose might have been a matrix in its former  
life, but is _now_ a dataframe.

>
> egfr_value=egfr$VALUE  #a vector that contains all the values of y

If egfr has a column named "VALUE" then that might be so. But what is  
this "y-value" notion you are carrying on about?

>
> test_egfr=egfr_value[1:14]  #a vector with the first 14 values of y

1st 14 values of egfr_value, anyway.

>
> lab_date=egfr$LAB  # a vector that contains all the values of x, which are dates
>
> test_lab_date=lab_date[1:14] # a vector with the first 14 values of x

Of uncertain class at the moment.
>
> test_egfr
> [1] 16.8 16.9 20.4 16.8 19.5 20.2 17.2 17.8 15.9 15.6 17.3 15.3 17.4 15.9
> 666 Levels:  <5 0 10 10.1 10.2 10.3 10.4 10.5 10.6 10.7 10.8  
> 10.9 ... There
> is

Aha! A factor-classed variable, not the numeric values you were
expecting. Your dataset had a non-numeric value (note the "<5" among
the levels), so read.csv() decided the column was really a string and
made it a factor. Look at the read.table help page and read about
as.is and stringsAsFactors.
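
A minimal sketch of getting numbers back out of such a factor. The
short vector below is made up for illustration and only stands in for
egfr$VALUE; it assumes the non-numeric entries look like the "<5" seen
in the levels.

```r
# Hypothetical stand-in for egfr$VALUE as read.csv() delivered it (a factor).
egfr_value <- factor(c("16.8", "16.9", "<5", "20.4"))

# as.numeric() applied directly to a factor returns the internal level
# codes, not the values that were printed.
codes <- as.numeric(egfr_value)

# Convert to character first; entries such as "<5" cannot be parsed and
# become NA (R warns about this, suppressed here to keep the output clean).
egfr_num <- suppressWarnings(as.numeric(as.character(egfr_value)))
print(egfr_num)
```

Reading the file with stringsAsFactors=FALSE (or as.is=TRUE) would
leave the column as character, so only the as.numeric() step would be
needed. How to treat values like "<5" is a separate decision; here
they are simply turned into NA.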


>
>> test_lab_date
> [1] 11/12/1999 11/29/1999 1/4/2000   1/14/2000  1/31/2000  2/8/2000   2/17/2000
> [8] 2/19/2000  2/22/2000  2/23/2000  2/4/1997   2/25/1997  3/11/1997  3/25/1997
> 3538 Levels:  1/1/2004 1/1/2005 1/1/2006 1/1/2007 1/1/2009 1/1/2010 ... 9/9/2010
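
On the general question of reading dates as numbers: one possible
approach is to convert the column to class Date and then to numeric. A
sketch, assuming the strings are month/day/year (an entry like
11/29/1999 suggests so) and using a few of the printed dates as a
stand-in for egfr$LAB:

```r
# Hypothetical stand-in for egfr$LAB, using dates from the printed output.
lab_date <- factor(c("11/12/1999", "11/29/1999", "1/4/2000", "1/14/2000"))

# factor -> character -> Date, stating the assumed month/day/year format.
lab_date2 <- as.Date(as.character(lab_date), format = "%m/%d/%Y")

# A Date is stored as the number of days since 1970-01-01, so
# as.numeric() gives a day count that can be used like any other number.
lab_days <- as.numeric(lab_date2)
print(lab_date2)
print(diff(lab_days))  # gaps between successive dates, in days
```
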
>
> R code:
>> lm.egfr=lm(test_lab_date~test_egfr)
>
>
> Error message after running the above line:
>
> Error in storage.mode(y) <- "double" :
>  invalid to change the storage mode of a factor
> In addition: Warning message:
> In model.response(mf, "numeric") :
>  using type="numeric" with a factor response will be ignored
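
The error says the response (the left-hand side of the formula) is a
factor, and lm() needs a numeric response. A sketch of a fit once both
variables are converted, using the first six values and dates from the
printed output. It assumes month/day/year dates and, as a guess at the
intent, models the value as a function of time rather than the other
way round:

```r
# First six values and dates copied from the printed output, entered as
# factors to mimic what read.csv() produced.
test_egfr <- factor(c("16.8", "16.9", "20.4", "16.8", "19.5", "20.2"))
test_lab_date <- factor(c("11/12/1999", "11/29/1999", "1/4/2000",
                          "1/14/2000", "1/31/2000", "2/8/2000"))

# Convert both factors to plain numeric vectors.
value <- as.numeric(as.character(test_egfr))
days  <- as.numeric(as.Date(as.character(test_lab_date), format = "%m/%d/%Y"))

# Assumed direction: value as response, time in days as predictor.
lm.egfr <- lm(value ~ days)
print(coef(lm.egfr))
```
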
>
> Thanks,
>
>
> On Mon, Dec 27, 2010 at 3:04 PM, Entropi ntrp <entropy053 at gmail.com> wrote:
>
>> Hi,
>> I am trying to do simple linear regression using dates in R but am
>> receiving error messages. With the data shown below, I would like to
>> regress x on y.
>>
>> x           y
>> 11/12/1999  56.8
>> 11/29/1999  17.9
>> 01/04/2000  27.4
>> 1/14/2000   96.8
>> 1/31/2000   49.5
>> R gives the following error messages after reading the linear
>> regression command:
>>
>>  Error in storage.mode(y) <- "double" :
>>  invalid to change the storage mode of a factor
>> In addition: Warning message:
>> In model.response(mf, "numeric") :
>>  using type="numeric" with a factor response will be ignored
>>
>> Can someone explain to me how to resolve this issue?
>>
>> I appreciate your help in advance.
-- 

David Winsemius, MD
West Hartford, CT


