[R] Complicated For Loop (to me)
Petr PIKAL
petr.pikal at precheza.cz
Tue Nov 10 08:27:06 CET 2009
Hi
You can probably use one of the aggregating functions (by, tapply,
aggregate), for example:
aggregate(some.columns.of.data.frame, list(SLUNCH, ETHNIC, RACE,
DIVISION), function(x) x/sum(x))
Untested on your data.
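For the weighted percentage described in the quoted message, a minimal
sketch along the following lines might work. The toy data frame dat, its
values, and the helper column W_SLUNCH are made up for illustration; only
the column names NWEIGHT, RACE, DIVISION and SLUNCH come from the thread.
It sums the weights of the SLUNCH == 1 rows and the total weights within
each RACE/DIVISION group using aggregate(), then divides one by the other.

```r
# Toy data: column names follow the thread, the values are invented.
dat <- data.frame(
  NWEIGHT  = c(100, 300, 200, 200, 50, 150),
  RACE     = c(1, 1, 1, 3, 3, 3),
  DIVISION = c(1, 1, 5, 5, 5, 5),
  SLUNCH   = c(1, 0, 1, 1, 0, 1)
)

# Hypothetical helper column: the weight if the row participates, else 0.
dat$W_SLUNCH <- dat$NWEIGHT * (dat$SLUNCH == 1)

# Sum participating weight and total weight within each RACE x DIVISION group.
sums <- aggregate(dat[c("W_SLUNCH", "NWEIGHT")],
                  by = list(RACE = dat$RACE, DIVISION = dat$DIVISION),
                  FUN = sum)

# Weighted share of each group that participates in SLUNCH.
sums$PCT_SLUNCH <- sums$W_SLUNCH / sums$NWEIGHT
print(sums)
```

The same pattern should extend to the other 0/1 variables by adding one
weighted helper column per variable before the aggregate() call, which
avoids creating a separate data frame for every race and division.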
Regards
Petr
r-help-bounces at r-project.org wrote on 10.11.2009 03:51:55:
>
> Sorry, I've been trying to work around this and just got back to check my
> email.
>
> dput wasn't working too well for me because the data set also has 450
> variables and I needed more time to figure out how to properly show you all
> what you needed to know.
>
> But to show you the idea, a very simple data set would be:
>
> NWEIGHT  ETHNIC  RACE  SLUNCH  DIVISION  ...
>    1234       0     1       1         1
>    2345       1     1       0         5
>    3243       0     3       1         3
>       .       .     .       .         .
>       .       .     .       .         .
>
>
> So basically, I already have the data subset by division and race. (I did
> that the inefficient way, by coding it by hand.)
>
> But now I need to calculate the percentage of each division (by race) that
> participates in SLUNCH (a 0/1 variable).
>
> So I am trying to avoid writing out code such as:
>
> w.cd1.s <- sum(ifelse(white.cd1$SLUNCH==1, white.cd1$NWEIGHT,
> 0))/sum(white.cd1$NWEIGHT)
> w.cd2.s <- sum(ifelse(white.cd2$SLUNCH==1, white.cd2$NWEIGHT,
> 0))/sum(white.cd2$NWEIGHT)
> .... for all the variables.
>
> One other method that I tried, which gets me the "names" I need but doesn't
> put them into a data frame (which I am currently trying to fix), is by using
> this code:
>
>
> names <- c("white","black","hispanic","asian")
> regions <- c("cd1","cd2","cd3","cd4","cd5","cd6","cd7","cd8","cd9")
> type <- c("l", "p", "r")
> name.region <- c()
> for (j in 1:length(names)){
>   for(i in 1:length(regions)){
>     for(k in 1:length(type)){
>       name.holder <- paste(names[j], ".",
>                            paste(regions[i], ".", type[k], sep=""),
>                            sep="")
>       name.region <- c(name.region, name.holder)
>     }
>   }
> }
>
> (The "l", "p", "r" represent other variables for which I am trying to do
> the same thing as for SLUNCH.)
>
> From here I've been trouble-shooting how to switch these named variables
> back into a data.frame context.
>
> Everyone's help has been really appreciated! I've learned a lot today that
> will hopefully move me slowly from using for loops to more efficient
> functions. I unfortunately am still learning those and have some knowledge
> about how to use loops compared to almost no knowledge of the more powerful
> functions like sapply, lapply, etc. (I'm waiting on MASS4 to be returned to
> the library to read it.)
>
>
> Thanks!
>
>
> John Kane-2 wrote:
> >
> > I think that we probably need a sample database of your original data.
> > A few lines of the dataset would probably be enough as long as it was
> > fairly representative of the overall data set. See ?dput for a way of
> > conveniently supplying a sample data set.
> >
> > Otherwise, off the top of my head, I would think that you could just put
> > all your subsets into a list and use lapply, but I'm simply guessing
> > without seeing the data.
> >
> > --- On Mon, 11/9/09, agm. <amurray at vt.edu> wrote:
> >
> >> From: agm. <amurray at vt.edu>
> >> Subject: Re: [R] Complicated For Loop (to me)
> >> To: r-help at r-project.org
> >> Received: Monday, November 9, 2009, 3:18 PM
> >>
> >> I've looked through ?split and run all of the code, but I am not sure
> >> that I can use it in such a way to make it do what I need. Another
> >> suggestion was using "lists", but again, I am sure that the process can
> >> do what I need, but I am not sure it would work with so many
> >> observations.
> >>
> >> I might have been too simple in my code. Let me try to explain it more
> >> clearly:
> >>
> >> I've got a data set of 4500 observations. I have already subset it into
> >> race/ethnicity (which I did by simple code). Now I needed to subset each
> >> race/ethnicity again into 9 separate regions. I again did this by simple
> >> code.
> >>
> >> The problem is now, I need to calculate a percentage for three different
> >> variables for all 9 regions for each race. I was trying to do this
> >> through a loop command.
> >>
> >> So a snippet of my code is :
> >>
> >> names <- c("white", "black", "asian", "hispanic")
> >> for(j in 1:length(names)){
> >>   for(i in 1:9){
> >>     names[j].cd[i].es.wash <- subset(names[j].cd[i], SLUNCH==1)
> >>     es.cd[i].names.w <-
> >>       sum(names.cd[i].es.wash$NWEIGHT)/sum(names.cd[i]$NWEIGHT)
> >>   }
> >> }
> >>
> >>
> >> Maybe that makes it clearer. If not, I apologize. Thanks for the help
> >> that I have already received. It is greatly appreciated.
> >>
> >> Tony
> >>
> >> --
> >> View this message in context:
> >> http://old.nabble.com/Complicated-For-Loop-%28to-me%29-tp26269479p26272994.html
> >> Sent from the R help mailing list archive at Nabble.com.
> >>
> >> ______________________________________________
> >> R-help at r-project.org
> >> mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained,
> >> reproducible code.
> >>
> >
> >
>
> --
> View this message in context:
> http://old.nabble.com/Complicated-For-Loop-%28to-me%29-tp26269479p26277512.html
> Sent from the R help mailing list archive at Nabble.com.