[R] Uniroot and Newton-Raphson Anomaly

Mon Mar 16 16:43:33 CET 2009

Harold,

Here is an approach that might work well for all the students (it doesn't
use derivatives):

	require(BB)  # I use the dfsane() root-finder from the "B" package

	f <- function(a,R,c1,q1) sum((1 - (1-R)^a)^(1/a)) - c1 * q1  # I
have re-named the variables 

	dfsane(par=0.01, fn=f,  R = c(.2,.4), c1=.5, q1=2)  # Note it works
even for extreme initial values

The dfsane() algoithm is derivative free, and is robust to poor starting
values.  Granted, this is bit of an overkill to use an algorithm designed
for large-scale systems to solve a univariate problem, but it seems to work.

Hope this helps,
Ravi.

----------------------------------------------------------------------------
-------

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: rvaradhan at jhmi.edu

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

----------------------------------------------------------------------------
--------

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Doran, Harold
Sent: Monday, March 16, 2009 10:34 AM
To: r-help at r-project.org
Subject: [R] Uniroot and Newton-Raphson Anomaly

I have the following function for which I need to find the root of a:

f <- function(R,a,c,q) sum((1 - (1-R)^a)^(1/a)) - c * q

To give context for the problem, this is a psychometric issue where R is a
vector denoting the percentage of students scoring correct on test item i in
class j, c is the proportion correct on the test by student k, and q is the
number of items on the test in total.

I have programmed this using Newton-Raphson as follows:

numer <- function(R,a,c,q){
         result <- sum((1-(1-R)^a)^(1/a))-c*q
         result 
      }

      denom <- function(R,a,c,q){
         result <- sum(-((1 - (1 - R)^a)^(1/a) * (log((1 - (1 - R)^a)) *
(1/a^2)) + 
            (1 - (1 - R)^a)^((1/a) - 1) * ((1/a) * ((1 - R)^a * log((1 -
R))))))
         result
      }

   aConst <- function(R, c, q, con){
      a <- .5 # starting value for a
      change <- 1
      while(abs(change) > con) {
	   r1 <- numer(R,a,c,q)
	   r2 <- denom(R,a,c,q)
         change <- r1/r2
         a      <- a - change
      }
      a
   }

The function now works as follows. Assume there are two test items on the
test. Assume the proportion correct on the items in class j is .2 and .4,
and assume student k scored correctly on one item only.

aConst(R = c(.2,.4), c=.5, q=2, con=.001)

Now, one might notice that the first derivative of the function (in the
function denom) has in it log(1-R). In cases where all students in a class
answer the item correct, R = 1, and this creates an anomaly in that log(0)
is undefined. One cheap trick, I think, is to correct the vector R such that
any values of 1 become .999 and any values of 0 become .001 (0 is also an
anomaly).

I am accustomed to Newton-Raphson and don't use bisection or the uniroot
function. So, given the anomaly above, I am thinking a change of mindset
might be necessary, although I am not certain if the same issue that affects
NR will propagate to other root finding algorithms. 

Now, to use uniroot, my understanding is that I need to start with two
guesses for a lower and upper limit of a such that:

f(x_l)*f(x_u) < 0

Where f(x_l) and f(x_u) are the lower and upper limits, respectively.
With this, I can try:

f <- function(R,a,c,q) sum((1 - (1-R)^a)^(1/a)) - c * q uniroot(f, c(.5,2),
R = c(.2,.4), c = .5, q=2)

Which gives the same root as my NR function. Now, the issue I am running
into is that this function is applied to each student in the data, which can
be in the hundreds of thousands, and the only value that is fixed is q. The
values of c vary by student and the values in the vector R vary by class.
So, as I have applied this to a larger dataset, I often can't find the root
because the values I use for the search interval are not universal to
maintain the necessary condition that f(x_l)*f(x_u) < 0 for student k' in
class j'. As a result, the error that the endpoints are not of opposite sign
appears. 

Now, it is not the error that confuses me, that I believe is rather
transparent in meaning. It is how to apply a search interval that can be
universally applied when implementing this function to many students when
the values of R and c vary.  Obviously, with hundreds of thousands of
students I cannot toy around with the function for each student to find a
search interval to maintain f(x_l)*f(x_u) < 0. So, after pondering this over
the weekend, I wonder if the list might have reactions on the following two
questions;

1) First, would the issue I note about NR having log(0) propagate into other
root finding algorithms? I realize bisection does not make use of
derivatives, and this occurs in the first derivative, but I'm not savvy
enough to see if this is an artifact of the function.

2) Second, is there a more thoughtful way to identify a search interval that
can be automatically programmed when iterating over hundreds of thousands of
cases such that "universal" values for the search interval can be used?

Many thanks,

Harold

> sessionInfo()
R version 2.8.1 (2008-12-22)
i386-pc-mingw32 

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] MiscPsycho_1.4 statmod_1.3.8 
> 

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.