[R] linear regression with dates
David Winsemius
dwinsemius at comcast.net
Tue Dec 28 04:14:41 CET 2010
On Dec 27, 2010, at 8:56 PM, Entropi ntrp wrote:
> Thanks for the response. I provided the necessary details below, and also
> have a general question about how to deal with dates in R. Is there a way
> to make R read dates as numbers?
>
> Here are the details of the R code:
>
> egfr <- read.csv(file.choose(), header=TRUE, sep=",") #egfr is a matrix read from a .csv file.
Well, whatever file you chose might have been a matrix in its former
life, but is _now_ a dataframe.
>
> egfr_value=egfr$VALUE #a vector that contains all the values of y
If egfr has a column named "VALUE" then that might be so. But what is
this "y-value" notion you are carrying on about?
>
> test_egfr=egfr_value[1:14] #a vector with the first 14 values of y
1st 14 values of egfr_value, anyway.
>
> lab_date=egfr$LAB #a vector that contains all the values of x, which are dates
>
> test_lab_date=lab_date[1:14] # a vector with the first 14 values of x
Of uncertain class at the moment.
>
> test_egfr
> [1] 16.8 16.9 20.4 16.8 19.5 20.2 17.2 17.8 15.9 15.6 17.3 15.3 17.4 15.9
> 666 Levels: <5 0 10 10.1 10.2 10.3 10.4 10.5 10.6 10.7 10.8 10.9 ... There is
Aha! A factor-classed variable! .... not the numeric value you were
expecting. Your dataset had at least one non-numeric value (note the "<5"
among the levels), so read.csv() decided the column was really character
and made it a factor. Look at the read.table help page and read about
as.is and stringsAsFactors.
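A minimal sketch of what I mean, using a few made-up rows rather than your file. The column names LAB and VALUE come from your code; the rows and the name egfr_demo are invented for illustration. Turning "<5" into NA is only one possible way to handle it, and you may prefer something else:

```r
# Hypothetical stand-in for the CSV file; only the column names match the post
csv_text <- "LAB,VALUE\n11/12/1999,16.8\n11/29/1999,<5\n1/4/2000,20.4"

# Keep strings as character instead of letting them become factors
egfr_demo <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# VALUE is character because of "<5"; entries that are not numbers become NA
vals <- suppressWarnings(as.numeric(egfr_demo$VALUE))

# If the column is already a factor, go through as.character() first;
# as.numeric() applied directly to a factor returns the level codes instead
f <- factor(egfr_demo$VALUE)
vals_from_factor <- suppressWarnings(as.numeric(as.character(f)))

print(vals)
```

Both routes give the same numeric vector here; the difference is only whether the factor gets created in the first place.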
>
>> test_lab_date
> [1] 11/12/1999 11/29/1999 1/4/2000 1/14/2000 1/31/2000 2/8/2000 2/17/2000
> [8] 2/19/2000 2/22/2000 2/23/2000 2/4/1997 2/25/1997 3/11/1997 3/25/1997
> 3538 Levels: 1/1/2004 1/1/2005 1/1/2006 1/1/2007 1/1/2009 1/1/2010 ... 9/9/2010
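On the general question of reading dates as numbers: the dates are also a factor at this point. One way to proceed, sketched here with four of the strings from your output (the names lab_chr, lab_date and lab_num are mine), is to convert to the Date class and then, if a plain number is really wanted, to numeric:

```r
# Date strings in month/day/year form, copied from the printed output
lab_chr <- factor(c("11/12/1999", "11/29/1999", "1/4/2000", "1/14/2000"))

# as.character() first, because the input is a factor; then spell out the format
lab_date <- as.Date(as.character(lab_chr), format = "%m/%d/%Y")

# A Date is stored as the number of days since 1970-01-01
lab_num <- as.numeric(lab_date)

print(lab_date)
print(diff(lab_num))  # gaps between successive measurements, in days
```

See ?as.Date and ?strptime for the format codes.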
>
> R code:
>> lm.egfr=lm(test_lab_date~test_egfr)
>
>
> Error message after running the above line:
>
> Error in storage.mode(y) <- "double" :
> invalid to change the storage mode of a factor
> In addition: Warning message:
> In model.response(mf, "numeric") :
> using type="numeric" with a factor response will be ignored
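The variable on the left of the ~ is the response, so lm(test_lab_date ~ test_egfr) asks lm() to treat the date factor as y, which is what storage.mode(y) is complaining about. A hedged sketch of a fit that should run once both columns have usable classes, assuming the intent is eGFR as a function of time. It uses the first six dates and values from your printed output; the names d, days and fit are mine:

```r
# First six date/value pairs taken from the printed vectors in the post
d <- data.frame(
  lab  = as.Date(c("11/12/1999", "11/29/1999", "1/4/2000",
                   "1/14/2000", "1/31/2000", "2/8/2000"),
                 format = "%m/%d/%Y"),
  egfr = c(16.8, 16.9, 20.4, 16.8, 19.5, 20.2)
)

# Days elapsed since the first measurement, as a plain numeric predictor
d$days <- as.numeric(d$lab - min(d$lab))

# Numeric response on the left, time on the right
fit <- lm(egfr ~ days, data = d)
print(coef(fit))
```

The slope is then in eGFR units per day. Whether a straight line is a sensible model for these data is a separate question.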
>
> Thanks,
>
>
> On Mon, Dec 27, 2010 at 3:04 PM, Entropi ntrp <entropy053 at gmail.com> wrote:
>
>> Hi,
>> I am trying to do simple linear regression using dates in R but receiving
>> error messages. With the data shown below, I would like to regress x on y.
>>
>> x          y
>> 11/12/1999 56.8
>> 11/29/1999 17.9
>> 01/04/2000 27.4
>> 1/14/2000  96.8
>> 1/31/2000  49.5
>> R gives the following error messages after reading the linear regression
>> command:
>>
>> Error in storage.mode(y) <- "double" :
>> invalid to change the storage mode of a factor
>> In addition: Warning message:
>> In model.response(mf, "numeric") :
>> using type="numeric" with a factor response will be ignored
>>
>> Can someone explain to me how to resolve this issue?
>>
>> I appreciate your help in advance.
--
David Winsemius, MD
West Hartford, CT