[R] Alternatives to integrate?

Mon Sep 5 16:27:57 CEST 2011

Hi, continuing the improvements...

I've prepared a new code:

ddae <- function(individuals, frac, sad, samp="pois", trunc=0, ...) {
  dots <- list(...)
  Compound <- function(individuals, frac, n.species, sad, samp, dots) {
    print(c("Size:", length(individuals), "Compound individuals:",
individuals, "End."))
    RegDist <- function(n.species, sad, dots) {  # "RegDist" may be
Exponential, Gamma, etc.
      dcom <- paste("d", as.name(sad), sep="")
      dots <- as.list(c(n.species, dots))
      ans <- do.call(dcom, dots)
      return(ans)
    }
    SampDist <- function(individuals, frac, n.species, samp) {  #
"SampDist" may be Poisson or Negative Binomial
      dcom <- paste("d", samp, sep="")
      lambda <- frac * n.species
      dots <- as.list(c(individuals, lambda))
      ans <- do.call(dcom, dots)
      return(ans)
    }
    ans <- RegDist(n.species, sad, dots) * SampDist(individuals, frac,
n.species, samp)
    return(ans)
  }
  IntegrateScheme <- function(Compound, individuals, frac, sad, samp, dots) {
    print(c("Size:", length(individuals), "Integrate individuals:",
individuals))
    ans <- integrate(Compound, 0, 2000, individuals, frac, sad, samp,
dots)$value
    return(ans)
  }
  ans <- IntegrateScheme(Compound, individuals, frac, sad, samp, dots)
  return(ans)
}

ddae(2, 0.05, "exp")

Now I can't understand what happen to "individuals", why is it
changing in value and size? I've tried to "traceback()" and "debug()",
but I was not smart enough to understand what is going on.

Could you, please, give some more help?

Thanks in advance.

On Thu, Sep 1, 2011 at 10:41 PM, R. Michael Weylandt
<michael.weylandt at gmail.com> wrote:
> Actually, it's very easy to integrate a function of two variables in a
> single variable for a given value of the other variable.
>
> Using your example:
>
> MySum <- function(x,y) {
>      ans = x + y
>      return(ans)
> }
>
> Note a things about how I wrote this. One, I broke the function out and used
> curly braces to enclose the body of the expression; secondly, I kept the
> body of the function at a constant indent level using spaces, not hard tabs;
> thirdly, I gave it a meaningful (if somewhat silly) name. (There are so many
> things that have names like "func" or "f" in R that you really don't want to
> risk overloading something important) Finally, I used the (technically
> unnecessary) return() command to say specifically what values my function
> will be return. The use of "ans" is a personal preference, but I think it
> makes clear what the function is aiming at.
>
> Suppose I want to integrate this over [0,1] with y = 3. This can be coded
>
> R> integrate(MySum, 0, 1, 3)
> 3.5
>
> If you read the documentation for integrate (? integrate) you'll see that
> there is an optional "..." argument that allows further parameters to be
> passed to the integrand. Here, this is only the value of y.
>
> Now suppose I want to define a function that integrates over that same unit
> interval but takes y as an argument. This can be done as
>
> BadIntegrateMySum <- function(y) {
>      ans = integrate(MySum, 0, 1, y)
>      return(ans)
> }
>
> However, this is a potentially dangerous thing to do because it requires
> MySum to just show up inside of BadIntegrateMySum. R is able to try to help
> you out, but really it's very dangerous so don't rely on it. Rather, define
> MySum inside of the first function as a helper inside of the larger
> function:
>
> GoodIntegrateMySum <- function(y) {
>
>     MySumHelper <- function(x,y) {
>         ans = x + y
>         return(ans)
>     }
>
>     ans = integrate(MySumHelper, 0, 1, y)
>     return(ans)
> }
>
> Hopefully this is much clearer. There's a slightly contentious stylistic
> point here -- whether it's ok to use y in the definition of the helper and
> in the bigger function -- but I think it's ok in this circumstance because
> the two instances specifically correspond to each other.
>
> A more general form of this could take in "MySumHelper" as an argument (yes
> functions can be passed like that)
>
> # MySum as above
>
> GoodIntegrateUnitInterval <- function(xIntegrand, yParameter) {
>     # Requires xIntegrand to be a function of two variables x,y
>     # You can actually do this in the code, but for now let's just assume no
> user error and that xIntegrand is the right sort of thing.
>     ans = integrate(xIntegrand, 0, 1, yParameter)
>     return(ans)
> }
>
> R> GoodIntegrateUnitInverval(MySum, 3)
> 3.5
>
> as before.
>
> There's nothing wrong with using "result" like I've used "ans," but I do
> hesitate to see it used as a function rather than a variable. A good rule of
> thumb is to check if a variable is already defined as a function name using
> the apropos() command.
>
> I don't have time or inclination to rework your whole code right now, but
> take a stab at formatting it with consistent+informative variable and
> function names, a well reasoned use of scoping, and appropriate use of
> integrate() and I'll happily comment on it.
>
> Hope this helps,
>
> Michael Weylandt
>
> On Thu, Sep 1, 2011 at 8:57 PM, . . <xkziloj at gmail.com> wrote:
>>
>> Thanks for your reply Michael, it seems I have a lot of things to
>> learn yet but for sure, your response is being very helpful in this
>> proccess. I will try to explore every point you said:
>>
>> A doubt I have is, if I define "func <- function(x,y) x + y" how can I
>> integrate it only in "x"? My solution for this would be to define
>> "func <- function(x) x + y". Is not ok?
>>
>> Also, with respect to the helper functions I'd created, I am wondering
>> if you can see a better organization for my code. It is so because
>> this is the only way I can see. Particularly I do not like how I am
>> using "results", but I can not think in another form.
>>
>> Thanks in advance.
>>
>> On Thu, Sep 1, 2011 at 2:44 PM, R. Michael Weylandt
>> <michael.weylandt at gmail.com> wrote:
>> > Leaving aside some other issues that this whole email chain has opened
>> > up,
>> >
>> > I'd guess that your most immediate problem is that you are trying to
>> > numerically integrate the PMF of a discrete distribution but you are
>> > treating it as a continuous distribution. If you took the time to
>> > properly
>> > debug (as you were instructed yesterday) you'd probably find that
>> > whenever
>> > you call dpois(x, lambda) for x not an integer you get a warning
>> > message.
>> >
>> > Specifically, check this out
>> >
>> >> integrate(dpois,0,Inf,1)
>> > 9.429158e-13 with absolute error < 1.7e-12
>> >
>> >> n = 0:1000; sum(dpois(n,1))
>> > 1
>> >
>> > I could be entirely off base here, but I'm guessing that many of your
>> > problems derive from this.
>> >
>> >
>> >
>> > On another basis, please, please read this:
>> > http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html
>> > or this
>> > http://had.co.nz/stat405/resources/r-style-guide.html
>> >
>> > And, perhaps most importantly, don't rely on the black magic of values
>> > moving in and out of functions (lexical scoping). Seriously, just don't
>> > do
>> > it.
>> >
>> > If you have helper functions that need values, actively pass them: you
>> > will
>> > save yourself hours of trouble when (not if) you debug your functions.
>> > I'm
>> > looking, for example, at g() in the first big block of code you
>> > provided.
>> > Call it g(a,n) and spend the extra 4 keystrokes to pass the values. It
>> > makes
>> > everyone happier.
>> >
>> > Michael
>> >
>> > On Thu, Sep 1, 2011 at 12:37 PM, . . <xkziloj at gmail.com> wrote:
>> >>
>> >> So, please excuse me Michael, you are completely sure. I will try
>> >> describe I am trying to do, please let me know if I can provide more
>> >> info.
>> >>
>> >> The idea is provide to "func" two probability density functions(PDFs)
>> >> and obtain another PDF that is a compound of them. In a final analysis
>> >> this characterize an abundance distribution for me. The two PDFs are
>> >> provided through "f" and "g" and there is some manipulation here
>> >> because I need flexibility to easily change this two funcions.
>> >>
>> >> In the code provided, "f" is the Exponential distribution and "g" is
>> >> the Poisson distribution. For this case, I have the analytical
>> >> solution, below. This way I can check the result. But I am also
>> >> considering other combinations of  "f" and "g" that have difficult, or
>> >> even does not have analitical solution. This is the reason why I am
>> >> trying to develop "func".
>> >>
>> >> func2 <- function(y, frac, rate, trunc=0, log=FALSE) {
>> >>    is.wholenumber <- function(x, tol = .Machine$double.eps^0.5)
>> >>        abs(x - round(x)) < tol
>> >>    if(FALSE %in% sapply(y,is.wholenumber))
>> >>        print("y must be integer because dpoix is a discrete PDF.")
>> >>    else {
>> >>        f <- function(y){
>> >>            b <- y*log(frac)
>> >>            m <- log(rate)
>> >>            n <- (y+1)*log(rate+frac)
>> >>            if(log)b+m-n else exp(b+m-n)
>> >>        }
>> >>        f(y)/(1-f(trunc))
>> >>    }
>> >> }
>> >> > func2(200,0.05,0.001)
>> >> [1] 0.000381062
>> >>
>> >> In theory, the interval of integration is 0 to Inf, but for some tests
>> >> I did, go up to 2000 may still provide reasonable results.
>> >>
>> >> Also, as it seems, I am still writing my first functions in R and
>> >> suggestions are welcome, please.
>> >>
>> >> Again, appologies for my previous mistake. It was not my intention to
>> >> blame about "integrate".
>> >>
>> >> On Thu, Sep 1, 2011 at 11:49 AM, R. Michael Weylandt
>> >> <michael.weylandt at gmail.com> wrote:
>> >> > I'm going to try to put this nicely:
>> >> >
>> >> > What you provided is not a problem with integrate. Instead, you
>> >> > provided
>> >> > a
>> >> > rather unintelligible and badly-written piece of code that
>> >> > (miraculously)
>> >> > seems to work, though it's not well documented so I have no idea if
>> >> > 1.3e-21
>> >> > is what you want to get.
>> >> >
>> >> > Let's try this again: per your original request, what is the problem
>> >> > with
>> >> > integrate?
>> >> >
>> >> > If instead you feel there's something wrong with your code, might I
>> >> > suggest
>> >> > you just say that and ask for help, rather than passing the blame
>> >> > onto a
>> >> > perfectly useful base function.
>> >> >
>> >> > Oh, and since you asked that I propose something: comment your code.
>> >> >
>> >> > Michael
>> >> >
>> >> > On Thu, Sep 1, 2011 at 10:33 AM, . . <xkziloj at gmail.com> wrote:
>> >> >>
>> >> >> Hi Michael,
>> >> >>
>> >> >> This is the problem:
>> >> >>
>> >> >> func <- Vectorize(function(x, a, sad, samp="pois", trunc=0, ...) {
>> >> >>  result <- function(x) {
>> >> >>    f1 <- function(n) {
>> >> >>                        f <- function() {
>> >> >>        dcom <- paste("d", sad, sep="")
>> >> >>        dots <- c(as.name("n"), list(...))
>> >> >>        do.call(dcom, dots)
>> >> >>                        }
>> >> >>      g <- function() {
>> >> >>        dcom <- paste("d", samp, sep="")
>> >> >>        lambda <- a * n
>> >> >>        dots <- c(as.name("x"), as.name("lambda"))
>> >> >>        do.call(dcom, dots)
>> >> >>      }
>> >> >>      f() * g()
>> >> >>    }
>> >> >>    integrate(f1,0,2000)$value
>> >> >> #     adaptIntegrate(f1,0,2000)$integral
>> >> >>
>> >> >> #     n <- 0:2000
>> >> >> #     trapz(n,f1(n))
>> >> >>
>> >> >> #     area(f1, 0, 2000, limit=10000, eps=1e-100)
>> >> >>  }
>> >> >>  return(result(x) / (1 - result(trunc)))
>> >> >> }, "x")
>> >> >> func(200, 0.05, "exp", rate=0.001)
>> >> >>
>> >> >> If you could propose something I will be gratefull.
>> >> >>
>> >> >> Thanks in advance.
>> >> >>
>> >> >> On Thu, Sep 1, 2011 at 10:55 AM, R. Michael Weylandt
>> >> >> <michael.weylandt at gmail.com> wrote:
>> >> >> > Mr ". .",
>> >> >> >
>> >> >> > MASS::area comes to mind but it may be more helpful if you could
>> >> >> > say
>> >> >> > what
>> >> >> > you are looking for / why integrate is not appropriate it is for
>> >> >> > whatever
>> >> >> > you are doing.
>> >> >> >
>> >> >> > Strictly speaking, I suppose there are all sorts of "alternatives"
>> >> >> > to
>> >> >> > integrate() if you are willing to be really creative and build
>> >> >> > something
>> >> >> > from scratch: diff(), cumsum(), lm(), hist(), t(), c(), ....
>> >> >> >
>> >> >> > Michael Weylandt
>> >> >> >
>> >> >> > On Thu, Sep 1, 2011 at 9:53 AM, B77S <bps0002 at auburn.edu> wrote:
>> >> >> >>
>> >> >> >> package "caTools"
>> >> >> >> see ?trapz
>> >> >> >>
>> >> >> >>
>> >> >> >> . wrote:
>> >> >> >> >
>> >> >> >> > Hi all,
>> >> >> >> >
>> >> >> >> > is there any alternative to the function integrate?
>> >> >> >> >
>> >> >> >> > Any comments are welcome.
>> >> >> >> >
>> >> >> >> > Thanks in advance.
>> >> >> >> >
>> >> >> >> > ______________________________________________
>> >> >> >> > R-help at r-project.org mailing list
>> >> >> >> > https://stat.ethz.ch/mailman/listinfo/r-help
>> >> >> >> > PLEASE do read the posting guide
>> >> >> >> > http://www.R-project.org/posting-guide.html
>> >> >> >> > and provide commented, minimal, self-contained, reproducible
>> >> >> >> > code.
>> >> >> >> >
>> >> >> >>
>> >> >> >> --
>> >> >> >> View this message in context:
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> http://r.789695.n4.nabble.com/Alternatives-to-integrate-tp3783624p3783645.html
>> >> >> >> Sent from the R help mailing list archive at Nabble.com.
>> >> >> >>
>> >> >> >> ______________________________________________
>> >> >> >> R-help at r-project.org mailing list
>> >> >> >> https://stat.ethz.ch/mailman/listinfo/r-help
>> >> >> >> PLEASE do read the posting guide
>> >> >> >> http://www.R-project.org/posting-guide.html
>> >> >> >> and provide commented, minimal, self-contained, reproducible
>> >> >> >> code.
>> >> >> >
>> >> >> >
>> >> >
>> >> >
>> >
>> >
>
>