[R] Getting started with R

Tue Jan 15 20:02:51 CET 2002

If I understand what you're asking, it's essentially the same thing I
asked the list for a week or so ago.  

First, if conditions A, B, and C are mutually exclusive, then yes, I would
suggest working with a single variable with three values. As a rule of
thumb (more from database theory than statistics), you should avoid
designing data structures that can hold invalid data: three separate 0/1
columns allow a row to claim two conditions at once, or none at all.
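
A minimal sketch of that conversion, in case it helps. The data frame and
its values are made up for illustration; I'm assuming your indicator
columns are literally named A, B and C, and "angle" stands in for your
angle-error column (a hyphen is not valid in an R variable name).

```r
# Hypothetical data: one row per trial, with 0/1 indicator columns A, B, C
df <- data.frame(
  A     = c(1, 0, 0, 1),
  B     = c(0, 1, 0, 0),
  C     = c(0, 0, 1, 0),
  angle = c(3.5, -1.2, 0.8, 2.1)
)

# Sanity check: exactly one indicator must be set on every row
ind <- as.matrix(df[, c("A", "B", "C")])
stopifnot(all(rowSums(ind) == 1))

# Collapse the three indicators into a single factor.
# Multiplying by 1:3 gives the column position of the 1 in each row.
idx <- as.vector(ind %*% 1:3)
df$condition <- factor(c("A", "B", "C")[idx])

# The indicator columns are now redundant and can be dropped
df <- df[, c("condition", "angle")]
```

The rowSums() check is the point of the exercise: it catches rows that
the three-column layout would have let through silently.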

I quote below the responses from Rossini and Lumley to my original query:

On 9 Jan 2002, A.J. Rossini wrote:

> >>>>> "AP" == Andrew Perrin <andrew_perrin at unc.edu> writes:
>
>     AP> I'd like to get summary statistics (really just a mean would
>     AP> be fine) for a vector in a data frame, but split based on the
>     AP> value of another vector.  That is, I have a data frame
>     AP> (hcd.df) with variables datecat (which is always 1 or 2) and
>     AP> auth.sum (-8..+8).  I've used xtabs to get chi-square
>     AP> comparisons, but what I need now is a simple mean of auth.sum
>     AP> where datecat is 1 and another where datecat is 2. Thanks for
>     AP> any advice.
>
> Something like :
>
>         lapply(split(hcd.df$auth.sum,hcd.df$datecat),mean)
>

Or
    tapply(hcd.df$auth.sum, hcd.df$datecat, mean)

or (in 1.4.0)

    with(hcd.df, tapply(auth.sum, datecat, mean))

        -thomas

Thomas Lumley                   Asst. Professor, Biostatistics
tlumley at u.washington.edu        University of Washington, Seattle

-----

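In your case, I'd say something like

    tapply(df$angle, df$condition, summary)

is probably right, where df is your data frame, angle is the angle-error
column, and condition is the single three-level variable.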
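
Here is a small self-contained sketch of how that might look. The data are
invented, and the column names (subject, condition, angle) are my
assumptions about how you would name things, not anything from your file.

```r
# Hypothetical data: 2 subjects, 3 conditions, 2 trials per cell
df <- data.frame(
  subject   = factor(rep(c("s1", "s2"), each = 6)),
  condition = factor(rep(c("A", "B", "C"), times = 4)),
  angle     = c(3, -1, 2, 5, 0, 1, -2, 4, 6, 0, 2, 3)
)

# summary() of the angle error, split by condition
by_cond <- tapply(df$angle, df$condition, summary)
print(by_cond)

# Just the means, split by condition
cond_means <- tapply(df$angle, df$condition, mean)

# Passing a list of factors splits by subject and condition together,
# giving a subject-by-condition table of means
subj_cond_means <- tapply(df$angle, list(df$subject, df$condition), mean)
print(subj_cond_means)
```

For a quick picture, boxplot(angle ~ condition, data = df) should draw one
box per condition from the same data frame.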

----------------------------------------------------------------------
Andrew J Perrin - andrew_perrin at unc.edu - http://www.unc.edu/~aperrin
 Assistant Professor of Sociology, U of North Carolina, Chapel Hill
      269 Hamilton Hall, CB#3210, Chapel Hill, NC 27599-3210 USA

On Tue, 15 Jan 2002, Jay Pfaffman wrote:

> I've got a background in computer science & have been using Linux for
> nearly a decade.  I'm working on a Ph.D. in education and technology
> and I essentially live in emacs and do all of my writing in LaTeX.
> To me R seems like the perfect stats package.  Unfortunately, the
> learning curve is killing me.  I feel like that if I'd waded through
> pulling down menus in SPSS or SAS I could have gotten a bit more done
> by now, but I don't want to use those programs.
> 
> What I'd like is a cookbook of a few basic procedures.  I think I'm
> more interested in the R code than I am statistical explication,
> though I don't object to the latter.  Is Venables and Ripley "MASS"
> going to do that for me or would "S Programming" be more appropriate?
> In my cursory look through the sample chapter from Nolan and Speed I
> saw no S-plus/S/R code whatsoever.
> 
> One thing I'm trying to do right now is certainly trivial, but I can't
> quite get it going.  Hopefully I'm not sounding too much like I'm
> asking you to do my homework.
> 
> In a perception study, I've got three within-subject conditions, A, B,
> and C.  Each condition has 4 trials with 2 times and an angle
> (actually an error measurement between the actual angle and the one
> the subjects pointed to).  All I want is to get the stuff that
> summary() gives split out by condition.  It might also be nice to
> split it out between subjects as well to look at, and possibly correct
> for individual differences, (which might be difficult with so few
> trials?).  My data columns are as follows:
> 
> A B C (with 0 or 1 to indicate condition, would a single column with
> 1-3 be better?)
> 
> t1, t2, angle-error
> 
> Surely fewer than 10 lines of R could yield me these results and maybe
> a couple pretty graphs.
> 
> In another study where I'm looking at motivation and hobbies, which I
> have almost no idea how to analyze (which suggests I might have chosen
> a bad design & that a problem like this probably doesn't belong in my
> "cookbook") I've had people rank a set of 25 characteristics of their
> activities or motivations (5 in each of 5 categories) and would like
> to see if any patterns are emerging there.  My data start out as an
> ordered list of these cards (1-25); I futzed in a spreadsheet to get
> two columns, the motivation number and its rank.  If I could avoid
> using the spreadsheet, that'd be nice.  
> 
> Thanks.
> 
> -- 
> Jay Pfaffman                           pfaffman at relaxpc.com
> +1-415-821-7507 (H)                    +1-415-810-2238 (M)
> http://relax.ltc.vanderbilt.edu/~pfaffman/
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
> 
