[R] Problem with subsets and xyplot

Bert Gunter gunter.berton at gene.com
Wed Feb 7 22:03:02 CET 2007


?aggregate says:

"... the result is reformatted into a data frame containing the variables in
by and x. The ones arising from by contain the unique combinations of
grouping values used for determining the subsets, and the ones arising from
x the corresponding summary statistics for the subset of the respective
variables in x. "

so meanbymsa does not have the same number of rows as your original data
frame, which it must for subsetting to work properly. (meanbymsa[,2] < 3.1
was recycled to the right length by default, which produces the nonsense
you got. See ?xyplot.)
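
One possible way around this, sketched on made-up data: compute the MSA mean
once per row rather than once per MSA, so the logical vector passed to subset
lines up with the rows being plotted. The toy data frame and the column name
msamean below are assumptions for illustration only, and ave() is just one of
several ways to do this (merge() or match() against the aggregate() result
would also work).

```r
library(lattice)

# Hypothetical toy data standing in for the poster's 'hivest' data frame
hivest <- data.frame(
  msa    = c(200, 520, 720, 720, 720, 720, 720, 875),
  HIVEST = c(0.50, 13.00, 29.10, 13.00, 3.68, 9.00, 11.00, 51.80),
  YEAR   = c(1996, 1997, 1994, 1995, 1996, 1997, 1998, 1990)
)

# aggregate() returns one row per MSA, so it is shorter than the data
meanbymsa <- aggregate(hivest$HIVEST, by = list(msa = hivest$msa),
                       FUN = mean, na.rm = TRUE)

# ave() returns the group mean repeated on every row of that group,
# so the result has the same length as the original data
# ('msamean' is an assumed column name)
hivest$msamean <- ave(hivest$HIVEST, hivest$msa,
                      FUN = function(v) mean(v, na.rm = TRUE))

# The subset condition is now evaluated row by row
plot1 <- xyplot(HIVEST ~ YEAR | factor(msa), data = hivest,
                subset = msamean < 3.1)
```

Comparing nrow(meanbymsa) with nrow(hivest) shows the length mismatch
directly. Passing data = hivest instead of using attach() also keeps the
formula and the subset expression looking at the same rows.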


Bert Gunter
Genentech Nonclinical Statistics
South San Francisco, CA 94404
650-467-7374


-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Peter Flom
Sent: Wednesday, February 07, 2007 12:10 PM
To: r-help at r-project.org
Subject: [R] Problem with subsets and xyplot

Hello

I have a dataframe that looks like this

     MSA                          CITY HIVEST YEAR   YR CAT
1   0200  Albuquerque                     0.50 1996 1996   5
2   0520  Atlanta                        13.00 1997 1997   5
3   0720  Baltimore                      29.10 1994 1994   1
4   0720  Baltimore                      13.00 1995 1995   5
5   0720  Baltimore                       3.68 1996 1996   3
6   0720  Baltimore                       9.00 1997 1997   5
7   0720  Baltimore                      11.00 1998 1998   5
8   0875  Bergen-Passaic                 51.80 1990 1990   5


many more rows....

I would like to create some xyplots, but separately for MSAs that are
high, moderate or low on HIVEST.  Here's what I tried

#### READ IN DATA AND RECODE SOME VARIABLES
attach(hivest)

cat <- CAT
cat[cat > 5] <- 6


msa <- as.numeric(MSA)
msa[msa == 7361] <- 7360
msa[msa == 7362] <- 7360
msa[msa == 7363] <- 7360

msa[msa == 5601] <- 5600
msa[msa == 5602] <- 5600

msa[msa == 6484] <- 6483


####   FIND MEANS FOR EACH MSA, FOR SUBSETTING LATER
meanbymsa <- aggregate(HIVEST, by = list(msa), FUN = mean, na.rm = T)

#### meanbymsa[,2] gives me the column I want; the 25%tile of this
#### column is about 3.1.

but when I try

plot1 <- xyplot(HIVEST~YEAR|as.factor(msa),  pch = LETTERS[cat],
                subset = (meanbymsa[,2] < 3.1))
plot1


I don't get what I expect.  No errors, and it is a subset, but the
subset is NOT MSAs with low values of HIVEST.


Any help appreciated.


Peter




Peter L. Flom, PhD
Assistant Director, Statistics and Data Analysis Core
Center for Drug Use and HIV Research
National Development and Research Institutes
71 W. 23rd St
http://cduhr.ndri.org
www.peterflom.com
New York, NY 10010
(212) 845-4485 (voice)
(917) 438-0894 (fax)
