[R--gR] Modelformulae

Thu Aug 19 15:22:04 CEST 2004

just 2 comments:

> 
> ~ (A+B)*C|D & B*E|E & D|E   or equivalently
> ~ b((A+B)*C|D, B*E|E, D|E)

I think you will have to use the b() notation. The '&' operator will
probably confuse the formula parser.
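
To make this concrete: as far as I can tell, R's parser does accept '&' inside a formula, but '&' binds more tightly than '|', so the chain components would not be grouped as intended. A small sketch using the formula from the quoted proposal (nothing is evaluated, only parsed):

```r
# Formula taken from the quoted proposal; only its parse tree is inspected
f <- ~ (A+B)*C|D & B*E|E & D|E
rhs <- f[[2]]                  # right-hand side of the one-sided formula

# '|' ends up as the outermost operator, applied left to right
print(as.character(rhs[[1]]))  # name of the outermost call
print(deparse(rhs[[3]]))       # right operand of the outermost '|'
print(deparse(rhs[[2]][[3]]))  # operand that was grouped by '&'
```

The last line shows 'E & D' grouped together, i.e. '&' does not separate the three components.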

> * Conditioning symbol | is followed by a simple variable list, eg
>   (X,Y,A)
> 
> eg ~ A*(X+Y+Z)^2|(X,Y,Z)

This is illegal in R. You will need to use a separator like + or again a
grouping function:

~ A * (X+Y+Z) | X + Y + Z
or
~ A * (X+Y+Z) | l(X,Y,Z)
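
A quick way to check this, as a sketch: parse the three variants as text and see which ones R accepts. The grouping function l() is only a placeholder name here; it does not need to exist, since the formula is never evaluated.

```r
# The comma-separated list is a syntax error in R
bad <- tryCatch(parse(text = "~ A*(X+Y+Z)^2 | (X,Y,Z)"),
                error = function(e) e)
print(inherits(bad, "error"))

# Both alternatives parse; '+' binds more tightly than '|',
# so everything after '|' stays together
f1 <- ~ A * (X+Y+Z) | X + Y + Z
f2 <- ~ A * (X+Y+Z) | l(X, Y, Z)   # l() is a hypothetical grouping function

# The conditioning variables can be recovered from either form
vars1 <- all.vars(f1[[2]][[3]])
vars2 <- all.vars(f2[[2]][[3]])    # all.vars() skips the function name l
print(vars1)
print(vars2)
```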

just my 2c

David

> 
> 3. Functions can be used, eg ~Z+log(X), sqrt(x-min(x)) 
> 
> 4. Ramifications of ':'
> * My understanding is that the use of ':' rather than '*' relates to
>   different parametrisations of the same space. In principle, when
>   specifying a model this should be irrelevant. Or do we want to
>   commit ourselves to a certain parametrisation - if so, why?
> * I suppose if ':' is allowed we should also allow %in% and /
>   (nested).
> 
> 5. Question to the (ha)R(d)-core: can the existing R formula parser be
> used with these formulae? Or how should it be done?
> If we need a special parser, what should this return?
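
On question 5, one possible answer, offered only as a sketch: since a formula is an ordinary R language object, the existing parser can build the tree and a small walker can pull out the chain components. The helpers flatten_plus() and split_component() below are hypothetical names, not part of gRbase or any existing package, and the returned list structure is just one conceivable choice.

```r
# Hypothetical helper: split an expression at its top-level binary '+'
flatten_plus <- function(e) {
  if (is.call(e) && identical(e[[1]], as.name("+")) && length(e) == 3) {
    c(flatten_plus(e[[2]]), flatten_plus(e[[3]]))
  } else {
    list(e)
  }
}

# Hypothetical helper: split one component "terms | parents" at '|'
split_component <- function(e) {
  # strip enclosing parentheses
  while (is.call(e) && identical(e[[1]], as.name("("))) e <- e[[2]]
  if (is.call(e) && identical(e[[1]], as.name("|"))) {
    list(terms = deparse(e[[2]]), parents = all.vars(e[[3]]))
  } else {
    list(terms = deparse(e), parents = character(0))
  }
}

# One of the example formulae from the original proposal
f <- ~ (A*B)+(C*D|D)+(B*E+D*E|E)
comps <- lapply(flatten_plus(f[[2]]), split_component)
str(comps)
```

This relies on every chain component being wrapped in parentheses, as the proposal requires.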
> 
> Best regards
> David 
> 
> -----Original Message-----
> From: r-sig-gr-bounces at stat.math.ethz.ch
> [mailto:r-sig-gr-bounces at stat.math.ethz.ch] On Behalf Of Steffen
> Lauritzen
> Sent: 19. august 2004 11:39
> To: gRlist
> Subject: [R--gR] Modelformulae
> 
> 
> Dear gR-folks
> 
> The Danish gR-gang have been talking about describing a model language
> for graphical models that
> 
> 1) could specify at least chain graph models, based on the most
> general hierarchical mixed models as described in Lauritzen (1996) [my
> book], section 6.4, pages 199-216. (More general than MIM-models).
> 
> 2) did not confuse people who were accustomed to glim-type notation
> and formulae
> 
> 3) did not conflict too much with existing formula conventions (MIM,
> ggm)
> 
> 4) was clear and unambiguous, and immediately understandable without
> too much explanation
> 
> 5) did not conflict too much with the whole idea and setup of
> graphical interaction models
> 
> 6) accommodates the idea of multiple response variables
> 
> Here is a first attempt. It may well work, but I would appreciate
> feedback if I have overlooked any nasty conflicts or drawbacks.
> 
> The whole issue is somewhat plagued by the "coincidental" fact that
> *intrinsically multivariate* log-linear models via "the Poisson trick"
> can be described through univariate response models for the counts.
> 
> Below I will first describe the basic general setup, then some
> conventions which enable people to use alternative, more traditional
> approaches, without ambiguity.
> 
> What do you all think of this? Please reply to the entire list...;-)
> 
> If it works, the suggestion would be for gRbase to adopt it and
> abandon MIM-notation altogether, as the latter is slightly different
> in style.
> 
> Hopefully it can also be extended to cover BUGS-type models without
> too many direct conflicts.
> 
> Best regards
> Steffen
> 
> --
> Steffen L. Lauritzen
> Department of Statistics, University of Oxford
> 1 South Parks Road, Oxford OX1 3TG, United Kingdom
> Tel: +44 1865 272877; Fax: +44 1865 272595
> email: steffen at stats.ox.ac.uk URL: www.stats.ox.ac.uk/~steffen/
> 
> ---------
> 
> 
> The following signs are (at least) permissible: ~ , + , * , : , ^ , .
> and |
> 
> ~ indicates the beginning of a formula. Implicitly think of
> 
> log f ~ ....
> 
> | denotes parenthood in graph, equiv to normalising/conditioning
> 
> + denotes multiplicative combination (log-additive). Chain components
> must be contained within parentheses.
> 
> * or : denotes (tensor)product of interaction terms, decomposed into
> terms of lower order or not, i.e. A*B*C specifies all subsets of ABC,
> whereas A:B:C only uses ABC.
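
The distinction between '*' and ':' described here matches what R's standard terms() machinery already does for ordinary model formulae, which can be checked directly. This only illustrates the existing expansion rules; whether gm() would reuse terms() is an open question.

```r
# '*' expands into all lower-order terms, ':' keeps only the top-order term
star  <- attr(terms(~ A*B*C), "term.labels")
colon <- attr(terms(~ A:B:C), "term.labels")
print(star)
print(colon)

# The nesting operator '/' mentioned earlier in the thread: A/B means A + A:B
nested <- attr(terms(~ A/B), "term.labels")
print(nested)
```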
> 
> strength of bindings   (*,:)   >   +   >  |
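
The proposed binding strengths are close to R's built-in operator precedence, where ':' binds tighter than '*', which binds tighter than '+', then '|', then '~'. The one difference I am aware of is that R puts ':' above '*' rather than on the same level. A small sketch inspecting the parse trees:

```r
# '|' is the outermost operator on the right-hand side, '+' comes next
f <- ~ A:B + C*D | D
rhs <- f[[2]]
outer_op <- as.character(rhs[[1]])        # operator applied last
inner_op <- as.character(rhs[[2]][[1]])   # operator inside the left operand
print(c(outer_op, inner_op))

# In R, ':' is applied before '*', so A*X:Y is read as A*(X:Y)
g <- ~ A*X:Y
g_op <- as.character(g[[2]][[1]])
print(g_op)
print(deparse(g[[2]][[3]]))
```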
> 
> examples of legal formulae (same model with three chain components
> specified)
> 
> m <- gm( log f ~ (A:B+C:D|D)+(B*E|E)+(D*E|E))
> 
> m <- gm( ~ (A:B+C*D|D), ~(B*E|E)+(D*E|E))
> 
> hierarchical models, as in CoCoCg and Lauritzen (1996), cf. p. 213
> 
> ~ A+B:X+B*Y+A*B*X^2+A*X:Y+Y^2 not a mim-model
> 
> ~ A+B:X+A*Y+A*(X+Y)^2 = mim(A+B/AX+BY/AXY)
> 
> some different models
> 
> m1<- gm(~A*B+C*D|B*D) equiv  gm(~A*B+C*D+B*D|B*D)
> 
> m2<-gm(~((B+D)*E)|E)
> 
> m<-b(m1,m2)
> 
> m <- gm( ~ (A*B)+(C*D|D)+(B*E+D*E|E))
> 
> m<- gm( ~ (A*B)+(C|D)+(B+D|E))
> 
> CONVENTION for compatibility with standard regression and ggm:
> 
> Y~X+U:A is the same as ~(Y:X+Y:U:A |XUA) = ~(Y:(X+U:A) |XUA),
> 
> that is: *If* there is a variable on the left hand side of ~, this is
> a response to the variables on the right hand side, and the
> interaction structure is the product of right and left hand sides.
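
The convention can be sketched mechanically with base R, under the assumption of a single response variable: take the right-hand-side terms, prefix each with the response, and use the right-hand-side variables as parents. The variable names are those of the example; the parents are joined with '+' rather than written as XUA so that the result stays legal R.

```r
f <- Y ~ X + U:A

resp      <- all.vars(f[[2]])                 # response variable(s)
rhs_terms <- attr(terms(f), "term.labels")    # terms on the right-hand side
parents   <- all.vars(f[[3]])                 # variables conditioned on

# Product of left and right hand sides: Y:X and Y:U:A
interactions <- paste(resp, rhs_terms, sep = ":")

# Rebuild the equivalent one-sided formula as text, then parse it
one_sided <- as.formula(paste0("~ (", paste(interactions, collapse = " + "),
                               " | ", paste(parents, collapse = " + "), ")"))
print(interactions)
print(one_sided)
```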
> 
> Work still needs to be done to identify when models are legal, the
> same, and parse them for proper and correct analysis.
> 
> Is this the way ahead?
> 
> _______________________________________________
> R-sig-gR mailing list
> R-sig-gR at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/r-sig-gr
> 

-- 
Dr. David Meyer
Department of Information Systems

Vienna University of Economics and Business Administration
Augasse 2-6, A-1090 Wien, Austria, Europe
Fax: +43-1-313 36x746 
Tel: +43-1-313 36x4393
HP:  http://wi.wu-wien.ac.at/Wer_sind_wir/meyer/