[R] Getting started with R
Andrew Perrin
andrew_perrin at unc.edu
Tue Jan 15 20:02:51 CET 2002
If I understand what you're asking, it's essentially the same thing I
asked the list for a week or so ago.
First, if the A, B, and C conditions are mutually exclusive, then yes, I
would suggest working with a single variable with three values. As a rule
of thumb (more about database theory than statistics) you should avoid
designing data structures that can hold invalid data. With three separate
0/1 columns, nothing stops a row from flagging two conditions at once, or
none at all.
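If the data are already entered as three 0/1 columns, they can be collapsed into one factor without retyping anything. What follows is only a minimal sketch: the data frame d and its values are made up, and I am assuming the columns are literally named A, B, and C.

```r
# Hypothetical toy data: three 0/1 indicator columns, one condition per row
d <- data.frame(A = c(1, 0, 0, 1),
                B = c(0, 1, 0, 0),
                C = c(0, 0, 1, 0))

# Refuse to continue if any row flags no condition or more than one
stopifnot(all(rowSums(d[, c("A", "B", "C")]) == 1))

# max.col() gives, for each row, the index of the column holding the 1;
# use that index to pick the matching label and store it as a factor
ind <- as.matrix(d[, c("A", "B", "C")])
d$condition <- factor(c("A", "B", "C")[max.col(ind, ties.method = "first")],
                      levels = c("A", "B", "C"))
```

The stopifnot() line is the point of the exercise: it catches exactly the invalid rows that the three-column layout allows.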
I quote below the responses from Rossini and Lumley to my original query:
On 9 Jan 2002, A.J. Rossini wrote:
> >>>>> "AP" == Andrew Perrin <andrew_perrin at unc.edu> writes:
>
> AP> I'd like to get summary statistics (really just a mean would
> AP> be fine) for a vector in a data frame, but split based on the
> AP> value of another vector. That is, I have a data frame
> AP> (hcd.df) with variables datecat (which is always 1 or 2) and
> AP> auth.sum (-8..+8). I've used xtabs to get chi-square
> AP> comparisons, but what I need now is a simple mean of auth.sum
> AP> where datecat is 1 and another where datecat is 2. Thanks for
> AP> any advice.
>
> Something like :
>
> lapply(split(hcd.df$auth.sum,hcd.df$datecat),mean)
>
Or
tapply(hcd.df$auth.sum, hcd.df$datecat, mean)
or (in 1.4.0)
with(hcd.df, tapply(auth.sum, datecat, mean))
-thomas
Thomas Lumley Asst. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle
-----
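For what it's worth, the three suggestions above compute the same group means and differ only in the shape of what they return. Here is a small sketch to show that; the values in this stand-in for hcd.df are invented, since the real data never appeared in the thread.

```r
# Made-up stand-in for hcd.df (the real data are not shown in this thread)
hcd.df <- data.frame(datecat  = c(1, 1, 2, 2, 2),
                     auth.sum = c(-2, 4, 0, 3, 6))

# split() + lapply(): returns a list with one element per datecat value
m1 <- lapply(split(hcd.df$auth.sum, hcd.df$datecat), mean)

# tapply(): returns a one-dimensional array named by datecat value
m2 <- tapply(hcd.df$auth.sum, hcd.df$datecat, mean)

# with(): same call as m2, but the columns are looked up inside hcd.df
m3 <- with(hcd.df, tapply(auth.sum, datecat, mean))
```

I would usually reach for the tapply() form, since a named array prints more compactly than a list.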
In your case, I'd say something like:
tapply(df$angle, df$condition, summary)
is probably right.
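To make that concrete, here is a rough sketch on simulated data. Everything in it is an assumption on my part: the column names subject, condition, and angle, the two subjects, and the random angle errors are placeholders for whatever your file actually contains.

```r
# Hypothetical data: 2 subjects x 3 conditions x 4 trials, simulated errors
set.seed(1)
df <- data.frame(
  subject   = factor(rep(1:2, each = 12)),
  condition = factor(rep(rep(c("A", "B", "C"), each = 4), times = 2)),
  angle     = round(rnorm(24, mean = 0, sd = 10), 1)
)

# summary() of the angle error, one result per condition
by.cond <- tapply(df$angle, df$condition, summary)

# Mean angle error per subject within each condition (a 2 x 3 table)
by.subj <- tapply(df$angle, list(df$subject, df$condition), mean)

# A quick picture, one box per condition:
# boxplot(angle ~ condition, data = df)
```

Printing by.cond shows the usual summary() output for each condition, and by.subj gives a first look at individual differences, though with only four trials per cell I would not read much into it.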
----------------------------------------------------------------------
Andrew J Perrin - andrew_perrin at unc.edu - http://www.unc.edu/~aperrin
Assistant Professor of Sociology, U of North Carolina, Chapel Hill
269 Hamilton Hall, CB#3210, Chapel Hill, NC 27599-3210 USA
On Tue, 15 Jan 2002, Jay Pfaffman wrote:
> I've got a background in computer science & have been using Linux for
> nearly a decade. I'm working on a Ph.D. in education and technology
> and I essentially live in emacs and do all of my writing in LaTeX.
> To me R seems like the perfect stats package. Unfortunately, the
> learning curve is killing me. I feel that if I'd waded through the
> pull-down menus in SPSS or SAS I could have gotten a bit more done
> by now, but I don't want to use those programs.
>
> What I'd like is a cookbook of a few basic procedures. I think I'm
> more interested in the R code than I am statistical explication,
> though I don't object to the latter. Is Venables and Ripley "MASS"
> going to do that for me or would "S Programming" be more appropriate?
> In my cursory look through the sample chapter from Nolan and Speed I
> saw no S-plus/S/R code whatsoever.
>
> One thing I'm trying to do right now is certainly trivial, but I can't
> quite get it going. Hopefully I'm not sounding too much like I'm
> asking you to do my homework.
>
> In a perception study, I've got three within-subject conditions, A, B,
> and C. Each condition has 4 trials with 2 times and an angle
> (actually an error measurement between the actual angle and the one
> the subjects pointed to). All I want is to get the stuff that
> summary() gives split out by condition. It might also be nice to
> split it out between subjects as well to look at, and possibly correct
> for individual differences, (which might be difficult with so few
> trials?). My data columns are as follows:
>
> A B C (with 0 or 1 to indicate condition, would a single column with
> 1-3 be better?)
>
> t1, t2, angle-error
>
> Surely fewer than 10 lines of R could yield me these results and maybe
> a couple pretty graphs.
>
> In another study where I'm looking at motivation and hobbies, which I
> have almost no idea how to analyze (which suggests I might have chosen
> a bad design & that a problem like this probably doesn't belong in my
> "cookbook") I've had people rank a set of 25 characteristics of their
> activities or motivations (5 in each of 5 categories) and would like
> to see if any patterns are emerging there. My data start out as an
> ordered list of these cards (1-25); I futzed in a spreadsheet to get
> two columns, the motivation number and its rank. If I could avoid
> using the spreadsheet, that'd be nice.
>
> Thanks.
>
> --
> Jay Pfaffman pfaffman at relaxpc.com
> +1-415-821-7507 (H) +1-415-810-2238 (M)
> http://relax.ltc.vanderbilt.edu/~pfaffman/
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
>