[R] question about result of loglinear analysis
Lao Meng
laomeng.3 at gmail.com
Tue Jan 25 02:58:56 CET 2011
Thanks for your help, sir.
I followed your suggestion to use scatterplot3d, and I find that whether I use
data_Analysis or x (as you suggested in your last mail), the two plots are the
same (see the attachment). So I wonder which data I should use to draw the 3D
plot, or are both datasets (x and data_Analysis) OK?
Thanks!
My best
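
One way to settle whether two objects hold the same data is to compare them
directly rather than by eye. The following is only a minimal sketch: the small
data frames are made-up stand-ins for x and data_Analysis, not the real file.

```r
# Hypothetical stand-ins: two data frames built from the same values.
# In the thread these would be `x` and `data_Analysis`.
x <- data.frame(area   = c(1L, 1L, 2L),
                nation = c(1L, 2L, 1L),
                fre    = c(0L, 85L, 2L))
data_Analysis <- x

# Strict comparison: same values, same column types, same attributes
same_strict <- identical(x, data_Analysis)

# Looser comparison: tolerates tiny numeric differences
same_loose <- isTRUE(all.equal(x, data_Analysis))

print(c(strict = same_strict, loose = same_loose))
```

If both comparisons are TRUE, either object should give the same plot. If they
differ, str() on each object shows where the column types disagree.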
2011/1/19 Mike Marchywka <marchywka at hotmail.com>
> > Date: Wed, 19 Jan 2011 01:20:06 -0800
> > From: djmuser at gmail.com
> > To: laomeng.3 at gmail.com
> > CC: r-help at r-project.org
> > Subject: Re: [R] question about result of loglinear analysis
> > Well, you fit a saturated model. How many degrees of freedom do you have
> > left for error? The fact that the standard errors are so huge relative to
> > the estimates is a clue.
> >
> > Taking a look at your data, it's pretty clear that nation 3 is an
> > outstanding outlier on its own. It is clearly - nay, blatantly - different
> > from the other nations in the sample. Look at
> > boxplot(fre ~ nation, data = data_Analysis)
> > boxplot(sqrt(fre) ~ nation, data = data_Analysis)
> I'm scrolling back through my cygwin window; last night I used this
> (reading the data into "x", not data_Analysis):
>
> > x<-read.table("area_nation.txt",header=TRUE)
> > str(x)
> 'data.frame': 77 obs. of 3 variables:
> $ area : int 1 1 1 1 1 1 1 1 1 1 ...
> $ nation: int 1 2 3 4 5 6 7 8 9 10 ...
> $ fre : int 0 0 85 2 0 0 0 0 1 0 ...
> > library(scatterplot3d)
> > library(rgl)
> > scatterplot3d(x$area,x$nation,x$fre,type="h")
> > scatterplot3d(x$area,x$nation,log(x$fre+1),type="h")
> There is always a discussion here about "looking at pictures" and post hoc
> analysis, or what is legitimate to do with outliers, that may be confusing to
> some readers, but you always need to keep in mind your overall objectives
> here.
> It often helps to forget for a minute that you are doing something
> intellectual or pompous and just stare at the pictures (or, as someone else
> quoted a statistician, get rat droppings under your fingernails, presumably
> meaning get more familiar with the details of your data acquisition system, LOL).
> > the latter to deal with the huge outlier near 1200 in the original data.
> > Even on the square root scale, nation 3 sticks out like a sore thumb. 43/77
> > of your responses have zero frequency, so you should probably be looking
> > into zero-inflated Poisson models and some of their relatives. Here is one
> > citation to get you started:
> >
> > http://www.jstatsoft.org/v27/i08/paper
> >
> > Package VGAM also has functionality to fit these types of models.
> >
> > Using package sos, I typed
> >
> > # Install package sos first if you don't have it:
> > library(sos)
> > findFn('zero Poisson')
> >
> > which found 255 matches; you should find several packages that pertain to
> > zero-inflated/zero-altered Poisson models.
> >
> > In the absence of the scientific background behind the data, the dominance
> > of nation 3 may well mask more subtle effects among the other nations, so
> > you might want to consider analyses with and without nation 3.
> >
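
The two suggestions above (check for excess zeros, and fit with and without
nation 3) can be sketched in base R. This is only an illustration under stated
assumptions: the data are simulated, the 7 x 11 layout is made up to give 77
rows like the str() output, and the main-effects formula is chosen for the
example rather than taken from the thread.

```r
# Synthetic stand-in for the thread's data: all counts are made up,
# only the column names (area, nation, fre) follow the thread.
set.seed(1)
toy <- expand.grid(nation = factor(1:11), area = factor(1:7))
toy$fre <- rpois(nrow(toy), lambda = 0.8)
is3 <- toy$nation == "3"
toy$fre[is3] <- toy$fre[is3] + 50   # make nation 3 dominate, as in the thread

# Main-effects Poisson fit with and without nation 3
fit_all <- glm(fre ~ area + nation, family = poisson, data = toy)
toy_no3 <- droplevels(subset(toy, nation != "3"))
fit_no3 <- glm(fre ~ area + nation, family = poisson, data = toy_no3)

# Observed zeros versus the number of zeros each fitted Poisson model expects
obs_zero_all <- sum(toy$fre == 0)
exp_zero_all <- sum(dpois(0, fitted(fit_all)))
obs_zero_no3 <- sum(toy_no3$fre == 0)
exp_zero_no3 <- sum(dpois(0, fitted(fit_no3)))

print(rbind(all_nations = c(observed = obs_zero_all, expected = exp_zero_all),
            no_nation_3 = c(observed = obs_zero_no3, expected = exp_zero_no3)))
```

On real data, many more observed zeros than the Poisson fit expects would be
one informal sign that a zero-inflated or hurdle model is worth trying.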
> > On Tue, Jan 18, 2011 at 5:45 PM, Lao Meng wrote:
> >
> > > Hi all:
> > > Here's a question about the result of a loglinear analysis.
> > > There are 2 factors: area and nation. The raw data is in the attachment.
> > >
> > > I fit the saturated loglinear model with the command:
> > > glm_sat<-glm(fre~area*nation, family=poisson, data=data_Analysis)
> > >
> > > After that, I extract the coefficients:
> > > result_sat<-summary(glm_sat)
> > > result_coe<-result_sat$coefficients
> > >
> > > I find that all the coefficients are 1 or very near to 1.
> > >
> > > How does this happen? Why are all the coefficients 1 or very near to 1?
> > >
> > > Thanks!
> > >
> > > My best
> > >
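
The degrees-of-freedom point raised in the reply can be checked on a small
example. This is a minimal sketch on made-up counts, assuming area and nation
are factors with one count per cell; it is not the data from the attachment.
It also prints the column names of the coefficient table, since
summary()$coefficients holds four columns and it is worth confirming which one
is being read.

```r
# Made-up 3 x 4 table of counts, one observation per area/nation cell
toy <- expand.grid(area = factor(1:3), nation = factor(1:4))
toy$fre <- c(5, 2, 7, 1, 9, 3, 4, 6, 2, 8, 1, 5)

# Saturated loglinear model: main effects plus the full interaction
fit_sat <- glm(fre ~ area * nation, family = poisson, data = toy)

# As many parameters as cells, so nothing is left over for error
n_par  <- length(coef(fit_sat))
df_res <- df.residual(fit_sat)
print(c(cells = nrow(toy), parameters = n_par, residual_df = df_res))

# The coefficient table: estimates, standard errors, z values and p-values
coef_table <- coef(summary(fit_sat))
print(colnames(coef_table))
```

With real data that contain many zero cells, the same fit can also produce
very large standard errors, which is the symptom described in the reply.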
