[R] fitting a sinus curve

Marianne.ZEYRINGER at ec.europa.eu Marianne.ZEYRINGER at ec.europa.eu
Mon Aug 1 16:58:31 CEST 2011


Dear David and Hans- Werner,
Thank you very much for your help. I would like to compare now if a
polynomial or the sinus model fits better. How can I see R-squared or
the F- Statistic for the sinus regression, so as to be able to compare
it with the polynomial model?
Thanks a lot and have a nice evening.
Best, 
Mairanne

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Hans W Borchers
Sent: Friday, July 29, 2011 12:21 PM
To: r-help at stat.math.ethz.ch
Subject: Re: [R] fitting a sinus curve

David Winsemius <dwinsemius <at> comcast.net> writes:

> 
> 
> On Jul 28, 2011, at 1:07 PM, Hans W Borchers wrote:
> 
> > maaariiianne <marianne.zeyringer <at> ec.europa.eu> writes:
> >
> >> Dear R community!
> >> I am new to R and would be very grateful for any kind of help. I am

> >> a PhD student and need to fit a model to an electricity load 
> >> profile of a household (curve with two peaks). I was thinking of 
> >> looking if a polynomial of 4th order,  a sinus/cosinus combination 
> >> or a combination of 3 parabels fits the data best. I have problems 
> >> with the sinus/cosinus
> >> regression:

time <- c(
0.00, 0.15,  0.30,  0.45, 1.00, 1.15, 1.30, 1.45, 2.00, 2.15, 2.30,
2.45, 3.00, 3.15, 3.30, 3.45, 4.00, 4.15, 4.30, 4.45, 5.00, 5.15, 5.30,
5.45, 6.00, 6.15, 6.30, 6.45, 7.00, 7.15, 7.30, 7.45, 8.00, 8.15, 8.30,
8.45, 9.00, 9.15, 9.30, 9.45, 10.00, 10.15, 10.30, 10.45, 11.00, 11.15,
11.30, 11.45, 12.00, 12.15, 12.30, 12.45, 13.00, 13.15, 13.30, 13.45,
14.00, 14.15, 14.30, 14.45, 15.00, 15.15, 15.30, 15.45, 16.00, 16.15,
16.30, 16.45, 17.00, 17.15, 17.30, 17.45, 18.00, 18.15, 18.30, 18.45,
19.00, 19.15, 19.30, 19.45, 20.00, 20.15, 20.30, 20.45, 21.00, 21.15,
21.30, 21.45, 22.00, 22.15, 22.30, 22.45, 23.00, 23.15, 23.30, 23.45) 

watt <- c(
94.1, 70.8, 68.2, 65.9, 63.3, 59.5, 55, 50.5, 46.6, 43.9, 42.3, 41.4,
40.8, 40.3, 39.9, 39.5, 39.1, 38.8, 38.5, 38.3, 38.3, 38.5, 39.1, 40.3,
42.4, 45.6, 49.9, 55.3, 61.6, 68.9, 77.1, 86.1, 95.7, 105.8, 115.8,
124.9, 132.3, 137.6, 141.1, 143.3, 144.8, 146, 147.2, 148.4, 149.8,
151.5, 153.5, 156, 159, 162.4, 165.8, 168.4, 169.8, 169.4, 167.6, 164.8,
161.5, 158.1, 154.9, 151.8, 149, 146.5, 144.4, 142.7, 141.5, 140.9,
141.7, 144.9, 151.5, 161.9, 174.6, 187.4, 198.1, 205.2, 209.1, 211.1,
212.2, 213.2, 213, 210.4, 203.9, 192.9, 179, 164.4, 151.5, 141.9, 135.3,
131, 128.2, 126.1, 124.1, 121.6, 118.2, 113.4, 107.4, 100.8)

> >> df<-data.frame(time,  watt)
> >> lmfit <- lm(time ~ watt + cos(time) + sin(time),  data = df)
> >
> > Your regression formula does not make sense to me.
> > You seem to expect a periodic function within 24 hours, and if not 
> > it would still be possible to subtract the trend and then look at a 
> > periodic solution.
> > Applying a trigonometric regression results in the following
> > approximations:

    library(pracma)
    plot(2*pi*time/24, watt, col="red")
    ts  <- seq(0, 2*pi, len = 100)
    xs6 <- trigApprox(ts, watt, 6)
    xs8 <- trigApprox(ts, watt, 8)
    lines(ts, xs6, col="blue", lwd=2)
    lines(ts, xs8, col="green", lwd=2)
    grid()

> > where as examples the trigonometric fits of degree 6 and 8 are used.
> > I would not advise to use higher orders, even if the fit is not 
> > perfect.
> 
> Thank you ! That is a real gem of a worked example. Not only did it 
> introduce me to a useful package I was not familiar with, but there 
> was even a worked example in one of the help pages that might have 
> specifically answered the question about getting a 2nd(?) order trig 
> regression. If I understood the commentary on that page, this method 
> might also be appropriate for an irregular time series, whereas 
> trigApprox and trigPoly would not?


That's true. For the moment, the trigPoly() function works correctly
only with equidistant data between 0 and 2*pi.


> This is adapted from the trigPoly help page in Hans Werner's pracma
> package:


The error I made myself was to take the 'time' variable literally,
though obviously the numbers after the decimal point were meant as
minutes. Thus

  time <- seq(0, 23.75, len = 96)

would be a better choice.
The rest in your adaptation is absolutely correct.

  A <- cbind(1, cos(pi*time/24),   sin(pi*time/24),
                cos(2*pi*time/24), sin(2*pi*time/24))
  (ab <- qr.solve(A, watt))
  # [1] 127.29131 -26.88824 -10.06134 -36.22793 -38.56219
  ts <- seq(0, pi, length.out = 100)
  xs <- ab[1] + ab[2]*cos(ts)   + ab[3]*sin(ts)   +
                ab[4]*cos(2*ts) + ab[5]*sin(2*ts)
  plot(pi*time/24, watt, col = "red",
       xlim=c(0, pi), ylim=range(watt), main = "Trigonometric
Regression")
  lines(ts, xs, col="blue")


> Hans:  I corrected the spelling of "Trigonometric", but other than 
> that I may well have introduced other errors for which I would be 
> happy to be corrected. For instance, I'm unsure of the terminology 
> regarding the ordinality of this model. I'm also not sure if my pi/24 
> and 2*pi/24 factors were correct in normalizing the time scale, 
> although the prediction seemed sensible.


And yes, this curve is the best trigonometric approximation you can get
for this order(?). You will see the same result when you apply and plot

  xs1 <- trigApprox(ts, watt, 1)

But I see your problem with the term 'order' I will have a closer look
at this and clarify the terminology on the help page.

[All this reminds me of an article in the Mathematical Intelligencer
some  years ago where it was convincingly argued that the universal
constant \pi  should have the value 2*pi (in today's notation).]

Thanks, Hans Werner


> 
> >
> > Hans Werner
> >
> >> Thanks a lot,
> >> Marianne
>

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list