[R-SIG-Finance] Error in lm prediction

amol gupta amolgupta87 at gmail.com
Thu Aug 3 08:12:11 CEST 2017


All

Thank you all for the responses. I resolved the issue by separating the
formula from the data, which is where I was making the error.
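
A minimal sketch of what separating the formula from the data can look
like, using simulated data rather than the original xts series. The names
idx, c1, c2, c3 and the sizes are assumptions made up for illustration:

```r
set.seed(1)
n <- 1700   # total number of observations (assumed)
a <- 300    # in-sample size, as in the original question

# Simulated stand-ins for the constituents and the index (hypothetical names)
dat <- data.frame(c1 = rnorm(n), c2 = rnorm(n), c3 = rnorm(n))
dat$idx <- 0.5 * dat$c1 + 0.3 * dat$c2 + 0.2 * dat$c3 + rnorm(n, sd = 0.1)

in_sample  <- dat[1:a, ]
out_sample <- dat[(a + 1):n, ]

# The formula refers to column names only; the data is passed separately
m1 <- lm(idx ~ c1 + c2 + c3, data = in_sample)

# predict() looks up c1, c2, c3 by name in newdata
y_hat <- predict(m1, newdata = out_sample)
r2 <- y_hat - out_sample$idx

length(y_hat)  # 1400, one prediction per out-of-sample row
```

With xts inputs, converting to a data frame first (for example with
as.data.frame) should give column names the formula can refer to, though
this sketch has not been checked against the original data.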

Joshua Ulrich,
I will try to send the data set along, or otherwise ensure that the problem
is reproducible.



On Fri, Jul 28, 2017 at 7:28 PM, Kevin Dhingra <
kevin.dhingra at appliedacademics.com> wrote:

> Hi Amol,
>
> The lm function is not really intended to be used the way you are calling
> it. Even though you can pass y and x as actual data in the formula argument
> (y~x), it's better to pass the data set in the data argument and use column
> names in the formula, especially when you want to use predict on the fitted
> object. predict.lm looks up the formula's variables by name in newdata and,
> if it does not find them there, falls back to the environment of the
> formula. In your example newdata has no variable named x1, so the original
> x1 with 300 rows is used and y_hat ends up with length 300.
>
> There might be some clever way to get around this with the same function
> call that you used (for example, making the variable names in the new data
> match the column names in x), but I would rather suggest this:
>
> a<-300
> data_fit = data.frame(x = matrix(rnorm(1700*5), ncol = 5), y =
> matrix(rnorm(1700)))
> data_fit_is = data_fit[1:a,] #In Sample
> data_fit_os = data_fit[(a+1):nrow(data_fit), ] #Out of Sample
> m1 = lm(y~., data = data_fit_is)
> length(predict(m1, data_fit_os[, 1:5])) #Should be equal to 1400 now, not 300
>
> Regards,
> Kshitij Dhingra
>
> On Sun, Jul 16, 2017 at 4:31 PM, amol gupta <amolgupta87 at gmail.com> wrote:
>
>> Hi
>>
>> I am most likely making an error in trying to predict using a linear
>> regression (lm) model. Please help me figure out what I am doing wrong. I
>> am trying to regress an index on its constituents. Here is the code:
>>
>>
>> #split ts into two parts
>> a<-300;
>>
>> x1<-x[1:a,];
>> y1<-y[1:a,];
>>
>> x2<-x[(a+1):nrow(x),];
>> y2<-y[(a+1):nrow(y),];
>>
>>
>> #regression
>> m1<-lm( y1~x1)
>> r1<-residuals(m1)
>> coef(m1)
>>
>> ##out of sample
>> y_hat<-predict.lm(m1,x2);
>> r2<-y_hat-y2;
>>
>>
>> x and y are xts objects; x contains multiple time series. y_hat turns out
>> to have only 300 samples, whereas x2 contains 1400 samples.
>>
>> Please help me figure out how to predict using the model that I have
>> fitted by regression.
>>
>>
>> --
>> Regards
>> Amol
>> +91-9897860992
>> +91-8889676918
>>
>>
>> _______________________________________________
>> R-SIG-Finance at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-finance
>> -- Subscriber-posting only. If you want to post, subscribe first.
>> -- Also note that this is not the r-help list where general R questions
>> should go.
>>
>
>
>
> --
> Kshitij Dhingra
> Applied Academics LLC
> Office: +1.917.262.0516
> Mobile: +1.206.696.5945
> Email: kshitij.dhingra at appliedacademics.com
> Website: http://www.AppliedAcademics.com
>



-- 
Regards
Amol
+91-9897860992
+91-8889676918
