[R] Manage an unknown and variable number of data frames

Mark Knecht markknecht at gmail.com
Sun Sep 13 04:13:00 CEST 2009


Hi,
   In the code below I create a small data.frame (dat) and then cut it
into different groups using CutList. The lists in CutList allow me to
choose whatever columns I want from dat and allow me to cut it into
any number of groups by changing the lists. It seems to work OK, but
when I'm done I have a variable number of data frames that I need to
do further operations on, and I don't know how to manage them as a
collection.

   How do experienced R coders handle keeping all this straight so that
if I add another column from dat and more groups in the cuts it all
stays straight? I need to send each data frame to another function to
add columns of specific data calculations to each of them.

   Best for me (I think) would be to enumerate each data frame using
the row.name number from CutTable if possible, but that's just my
thought. If each data frame became an element of CutTable then I'd
always know where they are. Really I'm needing to get a handle on
keeping a variable and unknown number of these things straight.

Thanks,
Mark






dat = data.frame(
	a=round(runif(100,-20,30),2),
	b=round(runif(100,-40,50),2)
	)

# Give each cut list a name matching the column in dat that you
# want to use as criteria for making the cut.
# Create any number of cuts in each row.

CutList = list(
	a=c(-Inf,-10,10,Inf),
	b=c(-Inf,0,20,Inf)
	)

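# dat[names(CutList)] stays a data frame even if CutList has only one entry
# (dat[,names(CutList)] would collapse to a plain vector in that case).
CutResults = mapply(cut,x=dat[names(CutList)],breaks=CutList,SIMPLIFY=FALSE)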
CutTable = as.data.frame(table(CutResults))

CutResultsDF = as.data.frame(CutResults)
head(CutResultsDF, n=15)

dat$aRange = CutResultsDF$a
dat$bRange = CutResultsDF$b
head(dat, 15)
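
The two assignments above name columns a and b explicitly, so they have to be edited whenever CutList changes. A minimal sketch of one way to derive the range columns from names(CutList) instead is shown here. The "<column>Range" naming rule, the RangeNames variable and the set.seed() call are assumptions added for illustration, and the example data is rebuilt so the block runs on its own.

```r
# Rebuild the example data so this sketch is self-contained.
set.seed(1)  # assumption: seed added only to make the run repeatable
dat <- data.frame(
  a = round(runif(100, -20, 30), 2),
  b = round(runif(100, -40, 50), 2)
)
CutList <- list(
  a = c(-Inf, -10, 10, Inf),
  b = c(-Inf, 0, 20, Inf)
)

# One factor per entry in CutList. dat[names(CutList)] is always a
# data frame, so this also works when CutList has a single entry.
CutResults <- Map(cut, dat[names(CutList)], CutList)

# Hypothetical naming rule: "<column>Range", matching aRange/bRange.
RangeNames <- paste0(names(CutList), "Range")
dat[RangeNames] <- CutResults

head(dat, 15)
```

With this form, adding a third entry to CutList (for a third column of dat) would add the matching range column without further edits.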


# I don't want to do the following as it doesn't
# get managed automatically.

Subset1 = subset(subset(dat, aRange==CutTable$a[1]), bRange==CutTable$b[1])[1:2]
Subset2 = subset(subset(dat, aRange==CutTable$a[2]), bRange==CutTable$b[2])[1:2]
Subset3 = subset(subset(dat, aRange==CutTable$a[3]), bRange==CutTable$b[3])[1:2]
Subset4 = subset(subset(dat, aRange==CutTable$a[4]), bRange==CutTable$b[4])[1:2]

Subset1
Subset2
Subset3
Subset4

CutTable
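
One possible way to keep a variable number of subsets together is to hold them in a single list rather than in separately named variables. The sketch below assumes split() is acceptable for this; it is given the list of factors in CutResults, and drop = FALSE keeps empty groups so that element i of the list corresponds to row i of CutTable. Groups, AddCalcs and the total column are hypothetical names used only for illustration, and the example data is rebuilt so the block runs on its own.

```r
# Rebuild the example data so this sketch is self-contained.
set.seed(1)  # assumption: seed added only to make the run repeatable
dat <- data.frame(
  a = round(runif(100, -20, 30), 2),
  b = round(runif(100, -40, 50), 2)
)
CutList <- list(
  a = c(-Inf, -10, 10, Inf),
  b = c(-Inf, 0, 20, Inf)
)
CutResults <- Map(cut, dat[names(CutList)], CutList)
CutTable <- as.data.frame(table(CutResults))

# One list element per combination of ranges. Both table() and split()
# vary the first factor fastest, so the order matches the rows of CutTable.
Groups <- split(dat[names(CutList)], CutResults, drop = FALSE)

# Sanity check: same number of groups and same counts as CutTable.
stopifnot(length(Groups) == nrow(CutTable),
          all(sapply(Groups, nrow) == CutTable$Freq))

# Hypothetical per-group function that adds a calculated column.
AddCalcs <- function(df) {
  df$total <- df$a + df$b
  df
}

# Apply it to every group, however many there are.
Groups <- lapply(Groups, AddCalcs)

names(Groups)   # labels built from the range of each factor
Groups[[1]]     # the group described by CutTable[1, ]
```

Because everything lives in one list, changing CutList changes length(Groups) but not the code that loops over it.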



