[R] Incidence Function Model in R help
David Winsemius
dwinsemius at comcast.net
Mon Jul 6 16:54:08 CEST 2009
It appears to me that you really have no idea what the original author
was doing. I honestly don't either. I never used the sweep operator
and I don't know what his "S"'s represent. I do understand what
offsets represent in the context of glm(), and this set of operations
does not fit my understanding. It makes little statistical sense that
an offset would be something that was constructed through a search for
an optimum parameter driven by data. Maybe someone on the list can
offer me a counter-example.
Furthermore the range of A in your data seems extreme and the degree
of variablility remains low in terms of your "p" variable . To this
point your problems have been of a mathematical nature and are
amenable to critique purely on that basis, but it appears that you
need more fundamental statistical support and you should either seek a
real statistical consult or contact Dr Oksanen for guidance.
Attempting further debugging, I was surprised that I actually had a
"d" object lying around in my workspace and realized it was from your
question yesterday. It should not have existed from the code in you
most recent posting since all those operations were inside a
function. Messing around with the earlier code and setting alpha=0.1
in tha code and recalculating S with the new d matrix, I got a non-
zero S vector. Staring from that point I still got the error you got.
When I limited the range for the "alphascan" to( 0.1, 10) I got an
error, but when I limited it on the other end to (0.01, 0.1) I got
these results:
> (sol<-optimize(alphascan,c(0.1,10),d=d,A=A,p=p))
Error: NA/NaN/Inf in foreign function call (arg 4)
>
> (sol<-optimize(alphascan,c(0.01,.1),d=d,A=A,p=p))
$minimum
[1] 0.0312918
$objective
[1] 144.1746
As suggested above this could be mathematical gibberish. I cannot
recommend thinking of this particular alpha as scientifically
meaningful without careful review by someone who can apply a critical
view using domain knowledge to this outcome.
--
David Winsemius
On Jul 5, 2009, at 5:15 PM, sjkimble wrote:
>
> R Help:
>
>>
>> When I look at the object S, I see about half of them are
>> zeroes.
>
> I've address the object S being zero so often by changing the value of
> alpha, which I'm allowed to do according to the author. All values
> of S are
> non zero and should not give Inf.
>
>> The "expectation" of a binomial model is for the LHS of the
>> formulat to be either a 1/0 vector or something of the form
>> cbind(events, non_events). You have satisfied that expectation with p
>> but there are only 2 such cases. It seems unreasonable to my thinking
>> to expect that a logistic regression model can deliver sensible
>> output
>> from only 2 events. And that holds doubly (or perhaps infinitely?)
>> true when you are starting out with half of your covariates equal to
>> log(0) = -Inf.
>
> Object p is presence (1) or absence (0) of a species on the 12
> islands, and
> some of them are rare so they are mostly zeros. Nevertheless I tried
> with a
> more common species whose data are:
>
> x.crd y.crd A p
> 1 361763 1034071 94.169 1
> 2 370325 1027277 127.642 1
> 3 370416 1027166 127.961 1
> 4 370471 1027148 1804.846 1
> 5 369050 1031312 1790.493 1
> 6 370908 1026354 199.103 1
> 7 361562 1034311 2047.637 1
> 8 365437 1022188 1622678.961 1
> 9 347047 1025334 21.169 1
> 10 349186 1024441 408556.801 1
> 11 361762 1034052 3.414 0
> 12 370799 1026557 103.994 0
>
> The code, plus error message:
>
>> haliclona_reniera_manglaris<-read.table(file.choose(),header=T)
>> attach(haliclona_reniera_manglaris)
>> alphascan<-function(alpha,d,A,p){
> + edis<-as.matrix(exp(-alpha*d))
> + edis<-sweep(edis,2,A,"*")
> + S<-rowSums(edis[,p>0])
> + mod<-glm(p~offset(2*log(S))+log(A),family=binomial)
> + deviance(mod)
> + }
>> (sol<-optimize(alphascan,c(0.001,10),d=d,A=A,p=p))
> Error: NA/NaN/Inf in foreign function call (arg 4)
>
>
> Details:
> Reference: Incidence Function Model in R, J Oksanen, 2004, available
> at:
> cc.oulu.fi/~jarioksa/opetus/openmeta/metafit.pdf
> OS: Mac OS 10.5.7
> R version: 2.9.0
> GUI: 1.28 Tiger build 32-bit
>
> Thanks,
> Steve Kimble
> Department of Forestry and Natural Resources
> Purdue UniversityWest Lafayette, IN 47907
>
> --
> View this message in context: http://www.nabble.com/Incidence-Function-Model-in-R-help-tp24138251p24346992.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
On Jul 6, 2009, at 9:52 AM, David Winsemius wrote:
>
> On Jul 5, 2009, at 5:15 PM, sjkimble wrote:
>
>>
>> R Help:
>>
>>>
>>> When I look at the object S, I see about half of them are
>>> zeroes.
>>
>> I've address the object S being zero so often by changing the value
>> of
>> alpha, which I'm allowed to do according to the author. All values
>> of S are
>> non zero and should not give Inf.
>>
>>> The "expectation" of a binomial model is for the LHS of the
>>> formulat to be either a 1/0 vector or something of the form
>>> cbind(events, non_events). You have satisfied that expectation
>>> with p
>>> but there are only 2 such cases. It seems unreasonable to my
>>> thinking
>>> to expect that a logistic regression model can deliver sensible
>>> output
>>> from only 2 events. And that holds doubly (or perhaps infinitely?)
>>> true when you are starting out with half of your covariates equal to
>>> log(0) = -Inf.
>>
>> Object p is presence (1) or absence (0) of a species on the 12
>> islands, and
>> some of them are rare so they are mostly zeros. Nevertheless I
>> tried with a
>> more common species whose data are:
>>
>> x.crd y.crd A p
>> 1 361763 1034071 94.169 1
>> 2 370325 1027277 127.642 1
>> 3 370416 1027166 127.961 1
>> 4 370471 1027148 1804.846 1
>> 5 369050 1031312 1790.493 1
>> 6 370908 1026354 199.103 1
>> 7 361562 1034311 2047.637 1
>> 8 365437 1022188 1622678.961 1
>> 9 347047 1025334 21.169 1
>> 10 349186 1024441 408556.801 1
>> 11 361762 1034052 3.414 0
>> 12 370799 1026557 103.994 0
>
> So now you have gone from a situation with only two events to one
> with only two non-events. That is not a situation that is really any
> better from a statistical point of view.
>>
>> The code, plus error message:
>>
>>> haliclona_reniera_manglaris<-read.table(file.choose(),header=T)
>>> attach(haliclona_reniera_manglaris)
>>> alphascan<-function(alpha,d,A,p){
>> + edis<-as.matrix(exp(-alpha*d))
>> + edis<-sweep(edis,2,A,"*")
>> + S<-rowSums(edis[,p>0])
>
> Hey,.... you told me you did <something> to get rid of the zeros in
> S. When I take this code out of the functional wrapper I get:
>
> > S
> 1 2 3 4
> 5 6 7 8
> 6.965905e-07 5.139157e-07 8.325874e-60 1.318603e-22
> 1.329174e-22 0.000000e+00 1.390849e-94 1.755094e-97
> 9 10 11 12
> 1.411310e-134 0.000000e+00 0.000000e+00 0.000000e+00
>
> So now instead of 3/4 of them being zero only 1/3 of them are still
> zero. That does not seem to be satisfactory.
>
>> + mod<-glm(p~offset(2*log(S))+log(A),family=binomial)
>> + deviance(mod)
>> + }
>>> (sol<-optimize(alphascan,c(0.001,10),d=d,A=A,p=p))
>
>
> So maybe the problem remain the same as before and supplying the
> missing d will not be enough. You still (I think) need a dataset
> that supplies enough information for a statistical approach.
>
>
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
More information about the R-help
mailing list