[R] Array help

Joshua Wiley jwiley.psych at gmail.com
Mon Nov 29 18:07:42 CET 2010


Hi Brian,

I believe the earlier miscommunication came from the difference between
R's 'array' class for objects and the colloquial usage of array (the
idea that 'array' is used colloquially is a bit odd, but I digress).
In any case, here are some steps I take (certainly not the only ones)
when exploring a dataset that I am not familiar with:


## load the package
library(PASWR)

## look at the str()ucture of the object of interest
str(StatTemps)

## Hmm, it is a 'data.frame' with 3 variables
## one variable is 'num' and the other two are 'Factor'
## let's see if we can find out more about those data classes
## (pull up the documentation on each, it can be hard to know at first
##  that 'num' stands for numeric and 'Factor' needs to be lowercase)
?data.frame
?numeric
?factor

## in this case, it is easy to print the whole data set so
StatTemps # print to screen
## but you can also get a nice little summary
summary(StatTemps)

## For the documentation on extraction/indexing
?Extract

## and some examples
StatTemps$temperature
StatTemps$gender
StatTemps$class
## now using a different operator than '$', namely '['
## you can select a column by its quoted name
StatTemps[ , "temperature"]
## or since we know it is column 1
StatTemps[ , 1]
## conversely, we can get row 1
StatTemps[1, ]
## or some combination of rows
StatTemps[c(1:7, 22:34), ]
## or rows and columns
StatTemps[c(1:7, 22:34), c(1, 3)]

## But since you have a factor, there may be an easier way
subset(StatTemps, gender == "Male")
subset(StatTemps, gender == "Female")

subset(StatTemps, class == "8 a.m.")
subset(StatTemps, class == "9 a.m.")

## on more than one variable
subset(StatTemps, class == "8 a.m." & gender == "Male")

## with a continuous variable
subset(StatTemps, temperature < 94)
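
When a subset() comparison on a factor returns nothing, it is worth checking the exact level labels first. Here is a minimal sketch of that check. The data frame 'temps' is a hypothetical stand-in with made-up values that only mimics the three columns of StatTemps, so it runs without PASWR:

```r
## hypothetical stand-in for StatTemps (made-up values, same column layout)
temps <- data.frame(
  temperature = c(97.2, 98.6, 98.1, 99.0),
  gender = factor(c("Male", "Female", "Female", "Male")),
  class = factor(c("8 a.m.", "8 a.m.", "9 a.m.", "9 a.m."))
)

## the labels a comparison has to match exactly (spaces and periods included)
class_levels <- levels(temps$class)
class_levels

## a label that is not one of the levels matches nothing:
## you get zero rows back rather than an error
n_bad <- nrow(subset(temps, class == "8am"))
n_good <- nrow(subset(temps, class == "8 a.m."))
```

With the real data, levels(StatTemps$class) and levels(StatTemps$gender) would show the labels to use.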

## and we can do calculations by() groups
by(data = StatTemps$temperature, INDICES = StatTemps$gender, FUN = mean)
## but typing the data frame name over and over is annoying, so use with()
with(StatTemps, by(data = temperature, INDICES = gender, FUN = mean))
## even more detailed (but leaving off the explicit argument names)
with(StatTemps, by(temperature, list(gender, class), mean))
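
If the printed by() output is awkward to work with afterwards, two other base R functions give the same group means in different shapes. This is only a sketch of one alternative, not a replacement for the calls above. Again 'temps' is a hypothetical stand-in with made-up values in place of StatTemps:

```r
## hypothetical stand-in for StatTemps (made-up values, same column layout)
temps <- data.frame(
  temperature = c(97.2, 98.6, 98.1, 99.0),
  gender = factor(c("Male", "Female", "Female", "Male")),
  class = factor(c("8 a.m.", "8 a.m.", "9 a.m.", "9 a.m."))
)

## tapply() returns a plain named vector, one element per level
gender_means <- with(temps, tapply(temperature, gender, mean))
gender_means

## aggregate() with a formula returns a data.frame,
## one row per gender/class combination present in the data
cell_means <- aggregate(temperature ~ gender + class, data = temps, FUN = mean)
cell_means
```

The data.frame from aggregate() can itself be indexed or passed to subset() like any other.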

## A couple visual summaries
boxplot(temperature ~ gender, data = StatTemps)
boxplot(temperature ~ class, data = StatTemps)
## or hop on over to lattice for something a little more advanced
## (bwplot() lives in the lattice package, so load it first)
library(lattice)
bwplot(temperature ~ gender | class, data = StatTemps)

## and you can select certain parts without subset()
## first let's see what happens with
StatTemps$gender == "Female"
## now if you pass a logical vector to the extraction operator, '['
StatTemps[StatTemps$gender == "Female", ]
## same thing but just the first column
StatTemps[StatTemps$gender == "Female", 1]
## That came out as a vector, but drop = FALSE keeps it as a data.frame
StatTemps[StatTemps$gender == "Female", 1, drop = FALSE]
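
To see the effect of drop for yourself, compare the class() of the two results. A small sketch, once more on a hypothetical stand-in 'temps' with made-up values rather than the real StatTemps:

```r
## hypothetical stand-in for StatTemps (made-up values, same column layout)
temps <- data.frame(
  temperature = c(97.2, 98.6, 98.1, 99.0),
  gender = factor(c("Male", "Female", "Female", "Male")),
  class = factor(c("8 a.m.", "8 a.m.", "9 a.m.", "9 a.m."))
)

## the logical vector can be stored and reused
is_female <- temps$gender == "Female"

## a single column is simplified to a vector by default ...
as_vector <- temps[is_female, 1]
class(as_vector)

## ... unless drop = FALSE asks '[' to keep the data.frame structure
as_frame <- temps[is_female, 1, drop = FALSE]
class(as_frame)

## summing a logical vector counts the TRUEs, i.e. the rows selected
sum(is_female)
```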


HTH,

Josh


On Mon, Nov 29, 2010 at 5:01 AM, bfhancock <brianfhancock at gmail.com> wrote:
>
> if you can load the PASWR package and pull up StatTemps you will see what I
> am talking about.  Otherwise I fear that my question will just be confusing.
> --
> View this message in context: http://r.789695.n4.nabble.com/Array-help-tp3062992p3063535.html
> Sent from the R help mailing list archive at Nabble.com.
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


