John Sorkin
JSorkin at grecc.umaryland.edu
Wed Sep 23 00:01:11 CEST 2015
Bert
I am surprised by your response. Statistics serves two purposes: estimation and hypothesis testing. Sometimes we are fortunate and theory, physiology, physics, or something else tells us what the correct, or perhaps I should say most adequate, model is. Sometimes theory fails us and we wish to choose between two competing models. This is my case. The cell sizes may come from one normal distribution (theory 1) or two (theory 2). Choosing between the models will help us postulate about physiology. I want to use statistics to help me decide between the two competing models, and thus inform my understanding of physiology. It is true that statistics can't tell me which model is the "correct" or "true" model, but it should be able to help me select the more "adequate" or "appropriate" or "closer to the truth" model.
In any event, I still don't know how to fit a single normal distribution and get a measure of fit, e.g. the log likelihood.
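For what it is worth, a single normal needs no EM at all: the maximum-likelihood estimates are the sample mean and the divide-by-n standard deviation, and the log likelihood is the sum of the log densities at those estimates. A minimal sketch in base R follows. The vector x is simulated and only stands in for the scaled GLUT values, and the names mu_hat, sigma_hat and loglik1 are made up for illustration.

```r
# Simulated stand-in for the scaled cell-size data
# (assumption: these are NOT the real GLUT values).
set.seed(1)
x <- as.numeric(scale(rnorm(80)))

# Maximum-likelihood estimates for a single normal distribution.
mu_hat    <- mean(x)
sigma_hat <- sqrt(mean((x - mu_hat)^2))  # ML estimate divides by n, not n - 1

# Log likelihood evaluated at the ML estimates.
loglik1 <- sum(dnorm(x, mean = mu_hat, sd = sigma_hat, log = TRUE))
loglik1

# The same quantity in closed form, as a cross-check.
n <- length(x)
loglik_closed <- -n / 2 * (log(2 * pi * sigma_hat^2) + 1)
all.equal(loglik1, loglik_closed)
```

MASS::fitdistr(x, "normal") followed by logLik() should give the same numbers if a packaged route is preferred.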
John
>>> Bert Gunter <bgunter.4567 at gmail.com> 09/22/15 4:48 PM >>>
I'll be brief in my reply to you both, as this is off topic.
So what? All this statistical stuff is irrelevant baloney (and of
questionable accuracy, since based on asymptotics and strong
assumptions, anyway). The question of interest is whether a mixture
fit better suits the context, which only the OP knows and which none
of us can answer.
I know that many will disagree with this -- maybe a few might agree --
but please send all replies, insults, praise, and learned discourse to
me privately, as I have already occupied more space on the list than
I should.
Cheers,
Bert
Bert Gunter
"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
-- Clifford Stoll
> That's true, but if he uses an AIC or BIC criterion that penalizes the
> number of parameters, then he might see something else? This (comparing
> mixtures to non-mixtures) is not something I deal with, so I'm just
> throwing it out there.
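As a rough illustration of the AIC/BIC suggestion, both criteria can be computed from a log likelihood and a count of free parameters: 2 for a single normal (mu, sigma) and 5 for a two-component mixture with free means and standard deviations (two mu, two sigma, one lambda). In the sketch below only the mixture log likelihood comes from the thread; n_obs and loglik_one are hypothetical placeholders, and info_criteria is an illustrative helper, not a function from any package.

```r
# AIC and BIC from a log likelihood, a parameter count and a sample size.
info_criteria <- function(loglik, n_par, n_obs) {
  c(AIC = -2 * loglik + 2 * n_par,
    BIC = -2 * loglik + log(n_obs) * n_par)
}

n_obs      <- 80         # HYPOTHETICAL sample size; the thread does not state it
loglik_mix <- -110.8037  # reported in the normalmixEM summary in the thread
loglik_one <- -113.0     # HYPOTHETICAL placeholder for the single-normal fit

# Smaller values indicate the preferred model under each criterion.
rbind(one_normal  = info_criteria(loglik_one, n_par = 2, n_obs = n_obs),
      two_normals = info_criteria(loglik_mix, n_par = 5, n_obs = n_obs))
```

With the real single-normal log likelihood and the real sample size substituted, the row with the smaller AIC or BIC is the one the criterion favours.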
>> > I have data that may be the mixture of two normal distributions (one
>> > contained within the other) vs. a single normal.
>> > I used normalmixEM to get estimates of parameters assuming two normals:
>> >
>> >
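>> > library(mixtools)
>> > GLUT <- scale(na.omit(data[,"FCW_glut"]))
>> > GLUT
>> > # k=2 requests the two-component fit summarized below
>> > mixmdl = normalmixEM(GLUT,k=2,arbmean=TRUE)
>> > summary(mixmdl)
>> > plot(mixmdl,which=2)
>> > # overlay a kernel density estimate of the same scaled data
>> > lines(density(GLUT), lty=2, lwd=2)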
>> >
>> > summary of normalmixEM object:
>> >            comp 1   comp 2
>> > lambda  0.7035179 0.296482
>> > mu     -0.0592302 0.140545
>> > sigma   1.1271620 0.536076
>> > loglik at estimate:  -110.8037
>> >
>> >
>> >
>> > I would like to see if the two normal distributions are a better fit
>> > than one normal. I have two problems:
>> > (1) normalmixEM does not seem to want to fit a single normal (even if I
>> > address the error message produced):
>> >
>> >
>> >> mixmdl = normalmixEM(GLUT,k=1)
>> > Error in normalmix.init(x = x, lambda = lambda, mu = mu, s = sigma, k =
>> > k, :
>> > arbmean and arbvar cannot both be FALSE
>> >> mixmdl = normalmixEM(GLUT,k=1,arbmean=TRUE)
>> > Error in normalmix.init(x = x, lambda = lambda, mu = mu, s = sigma, k =
>> > k, :
>> > arbmean and arbvar cannot both be FALSE
>> >
>> >
>> >
>> > (2) Even if I had the loglik from a single normal, I am not sure how
>> > many DFs to use when computing the -2LL ratio test.
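On the degrees-of-freedom question: the usual chi-square reference distribution is generally not trusted for one component versus two, because the single normal sits on the boundary of the mixture's parameter space. One common workaround is a parametric bootstrap of the -2LL statistic. The sketch below assumes the mixtools package is installed, uses simulated data in place of GLUT, and uses helper names (loglik_one_normal, lrt_stat) invented for this example.

```r
library(mixtools)  # assumed installed; provides normalmixEM()

# Log likelihood of a single normal at its ML estimates.
loglik_one_normal <- function(x) {
  mu <- mean(x)
  s  <- sqrt(mean((x - mu)^2))
  sum(dnorm(x, mean = mu, sd = s, log = TRUE))
}

# -2 log likelihood ratio: one normal vs. a two-component mixture.
lrt_stat <- function(x) {
  # capture.output() only silences the iteration messages from normalmixEM()
  invisible(capture.output(fit2 <- normalmixEM(x, k = 2)))
  2 * (fit2$loglik - loglik_one_normal(x))
}

set.seed(1)
x   <- as.numeric(scale(rnorm(80)))  # simulated stand-in, NOT the real data
obs <- lrt_stat(x)

# Simulate from the fitted single normal (the null model), refit both
# models each time, and collect the statistic.
B <- 200
mu0 <- mean(x)
s0  <- sqrt(mean((x - mu0)^2))
null_stats <- replicate(B, lrt_stat(rnorm(length(x), mean = mu0, sd = s0)))

# Bootstrap p-value: share of null statistics at least as large as observed.
p_boot <- (1 + sum(null_stats >= obs)) / (B + 1)
p_boot
```

A single EM start can stop at a local maximum, so in practice several starting values per fit and a larger B would be prudent.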
>> >
>> >
>> > Any suggestions for comparing the two-normal vs. one normal distribution
>> > would be appreciated.
>> >
>> >
>> > Thanks
>> > John
