[R] Complicated For Loop (to me)

agm. amurray at vt.edu
Tue Nov 10 03:51:55 CET 2009


Sorry, I've been trying to work around this and just got back to check my
email.

dput wasn't working too well for me because the data set also has 450
variables and I needed more time to figure out how to properly show you all
what you needed to know.

But to show you the idea, a very simple data set would be:

NWEIGHT  ETHNIC  RACE  SLUNCH  DIVISION  ...
1234     0       1     1       1
2345     1       1     0       5
3243     0       3     1       3
.        .       .     .       .
   

So basically, I already have the data subset by division and race. (I did
that the inefficient way, by coding it by hand.)

But now I need to calculate the percentage of each division (by race) that
participates in SLUNCH (a 0/1 variable).

So I am trying to avoid writing out code such as:

w.cd1.s <- sum(ifelse(white.cd1$SLUNCH == 1, white.cd1$NWEIGHT, 0)) /
           sum(white.cd1$NWEIGHT)
w.cd2.s <- sum(ifelse(white.cd2$SLUNCH == 1, white.cd2$NWEIGHT, 0)) /
           sum(white.cd2$NWEIGHT)
... and so on for every race/division subset.
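
As a hedged sketch of one way to avoid writing those lines by hand: keep everything in a single data frame, split() it by race and division, and apply one weighted-share function to each piece. The toy data frame `dat` and the helper `wshare` are made up for illustration; only the column names come from the example data above.

```r
# Toy data (made up); only the column names follow the example above.
dat <- data.frame(
  NWEIGHT  = c(1234, 2345, 3243, 1500, 2000, 900),
  RACE     = c(1, 1, 3, 1, 3, 3),
  SLUNCH   = c(1, 0, 1, 1, 0, 1),
  DIVISION = c(1, 5, 3, 1, 3, 3)
)

# Weighted share of rows where the 0/1 column `var` equals 1.
# (Hypothetical helper; assumes no missing values in `var` or NWEIGHT.)
wshare <- function(d, var = "SLUNCH") {
  sum(d$NWEIGHT[d[[var]] == 1]) / sum(d$NWEIGHT)
}

# One piece per RACE x DIVISION combination that actually occurs in the data.
groups <- split(dat, list(dat$RACE, dat$DIVISION), drop = TRUE)

# Named numeric vector; the names have the form "RACE.DIVISION".
pct <- sapply(groups, wshare)
print(pct)
```

This computes the same quantity as the sum(ifelse(...))/sum(...) lines, but the subsets live in a list rather than in separately named objects such as white.cd1.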

One other method that I tried, which gets me the "names" I need but doesn't
put them into a data frame (which I am currently trying to fix), is this
code:


names <- c("white","black","hispanic","asian")
regions <- c("cd1","cd2","cd3","cd4","cd5","cd6","cd7","cd8","cd9")
type <- c("l", "p", "r")
name.region <- c()
for (j in 1:length(names)) {
  for (i in 1:length(regions)) {
    for (k in 1:length(type)) {
      # builds e.g. "white.cd1.l"
      name.holder <- paste(names[j], regions[i], type[k], sep = ".")
      name.region <- c(name.region, name.holder)
    }
  }
}

(The "l", "p", and "r" represent other variables for which I am trying to do
the same thing as for SLUNCH.)
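
For comparison, a minimal sketch of building the same labels without nested loops, using expand.grid() and a vectorized paste(). The vector is called `races` here instead of `names` only to avoid masking base::names(); that rename is an assumption of this sketch, not part of the code above.

```r
races   <- c("white", "black", "hispanic", "asian")
regions <- c("cd1", "cd2", "cd3", "cd4", "cd5", "cd6", "cd7", "cd8", "cd9")
type    <- c("l", "p", "r")

# expand.grid() varies its first argument fastest, so listing type first
# reproduces the order of the nested loops (type innermost, race outermost).
grid <- expand.grid(type = type, region = regions, race = races,
                    stringsAsFactors = FALSE)

# paste() is vectorized, so all labels are built in one call.
name.region <- paste(grid$race, grid$region, grid$type, sep = ".")
head(name.region)
```

The `grid` data frame also keeps race, region, and type as separate columns, which tends to be easier to work with than parsing them back out of the pasted labels.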

From here I've been trouble-shooting how to switch these named variables
back into a data.frame context.  
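
One possible way to land in a data frame directly, sketched under assumptions: compute the weighted shares per race/division piece for several 0/1 variables at once and rbind() the one-row results. The toy data and the column OTHER are hypothetical stand-ins; the real names of the "l", "p", and "r" variables are not given here.

```r
# Toy data (made up). OTHER is a hypothetical second 0/1 variable.
dat <- data.frame(
  NWEIGHT  = c(10, 20, 30, 40, 50, 60),
  RACE     = c("white", "white", "black", "black", "white", "black"),
  DIVISION = c("cd1", "cd1", "cd1", "cd2", "cd2", "cd2"),
  SLUNCH   = c(1, 0, 1, 1, 1, 0),
  OTHER    = c(0, 0, 1, 0, 1, 1)
)
vars <- c("SLUNCH", "OTHER")

# One piece per RACE x DIVISION combination present in the data.
groups <- split(dat, list(dat$RACE, dat$DIVISION), drop = TRUE)

# For each piece, build a one-row data frame: the group labels plus the
# weighted share of each 0/1 variable (assumes no missing values).
rows <- lapply(groups, function(d) {
  shares <- sapply(vars, function(v) sum(d$NWEIGHT[d[[v]] == 1]) / sum(d$NWEIGHT))
  data.frame(RACE = d$RACE[1], DIVISION = d$DIVISION[1], t(shares))
})

# Stack the one-row pieces into a single data frame.
result <- do.call(rbind, rows)
print(result)
```

Each row of `result` is one race/division combination and each variable gets its own column, so no pasted object names are needed to look a value up.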

Everyone's help has been really appreciated!  I've learned a lot today that
will hopefully move me slowly from using for loops to more efficient
functions.  I unfortunately am still learning those and have some knowledge
about how to use loops compared to almost no knowledge of the more powerful
functions like sapply, lapply, etc.  (I'm waiting on MASS4 to be returned to
the library to read it.)


Thanks!


John Kane-2 wrote:
> 
> I think that we probably need a sample database of your original data.  
> A few lines of the dataset would probably be enough as long as it was
> fairly representative of the overall data set.  See ?dput for a way of
> conveniently supplying a sample data set.
> 
> Otherwise off the top of my head, I would think that you could just put
> all your subsets into a list and use lapply  but I'm simply guessing
> without seeing the data.
> 
> --- On Mon, 11/9/09, agm. <amurray at vt.edu> wrote:
> 
>> From: agm. <amurray at vt.edu>
>> Subject: Re: [R] Complicated For Loop (to me)
>> To: r-help at r-project.org
>> Received: Monday, November 9, 2009, 3:18 PM
>> 
>> I've looked through ?split and run all of the code, but I am not sure that I
>> can use it in such a way to make it do what I need.  Another suggestion was
>> using "lists", but again, I am sure that the process can do what I need, but
>> I am not sure it would work with so many observations.
>> 
>> I might have been too simple in my code.  Let me try to explain it more
>> clearly:
>> 
>> I've got a data set of 4500 observations.  I have already subset it into
>> race/ethnicity (which I did by simple code).  Now I needed to subset each
>> race/ethnicity again into 9 separate regions.  I again did this by simple
>> code.
>> 
>> The problem is now, I need to calculate a percentage for three different
>> variables for all 9 regions for each race.  I was trying to do this through
>> a loop command.
>> 
>> So a snippet of my code is :
>> 
>> names <- c("white", "black", "asian", "hispanic")
>> for(j in 1:length(names)){
>> for(i in 1:9){
>> names[j].cd[i].es.wash <- subset(names[j].cd[i], SLUNCH==1)
>> es.cd[i].names.w <- sum(names.cd[i].es.wash$NWEIGHT)/sum(names.cd[i]$NWEIGHT)
>> }
>> }
>> 
>> 
>> Maybe that makes it clearer.  If not, I apologize.  Thanks for the help that
>> I have already received.  It is greatly appreciated.
>> 
>> Tony
>> 
>> -- 
>> View this message in context:
>> http://old.nabble.com/Complicated-For-Loop-%28to-me%29-tp26269479p26272994.html
>> Sent from the R help mailing list archive at Nabble.com.
>> 
>> ______________________________________________
>> R-help at r-project.org
>> mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained,
>> reproducible code.
>> 
> 

-- 
View this message in context: http://old.nabble.com/Complicated-For-Loop-%28to-me%29-tp26269479p26277512.html
Sent from the R help mailing list archive at Nabble.com.



