[R] A. Mani : Avoiding loops

Mon Aug 22 06:40:45 CEST 2005

On 20 Aug 2005 at 3:26, A. Mani wrote:

> On Friday 19 August 2005 11:54, Sean O'Riordain wrote:
> > Hi,
> > I'm not sure what you actually want from your email (following the
> > posting guide is a good way of helping you explain things to the
> > rest of us in a way we understand - it might even answer your
> > question!)
> >
> > I'm only a beginner at R so no doubt one of our expert colleagues
> > will help me...
> >
> > > fred <- data.frame()
> > > fred <- edit(fred)
> > > fred
> >
> >   A B C D E
> > 1 1 2 X Y 1
> > 2 2 3 G L 1
> > 3 3 1 G L 5
> >
> > > fred[,3]
> >
> > [1] X G G
> > Levels: G X
> >
> > > fred[fred[,3]=="G",]
> >
> >   A B C D E
> > 2 2 3 G L 1
> > 3 3 1 G L 5
> >
> > so at this point I can create a new dataframe with column 3 (C) ==
> > "G"; either explicitly or implicitly...
> >
> > and if I want to calculate the sum() of column E, then I just say
> > something like...
> >
> > > sum(fred[fred[,3]=="G",][,5])
> >
> > [1] 6
> >
> >
> > now naturally being a bit clueless at manipulating stuff in R, I
> > didn't know how to do this before I started... and you guys only get
> > to see the lines that I typed in and got a "successful" result...
> >
> > according to section 6 of the "Introduction to R" manual which comes
> > with R, I could also have said
> >
> > > sum(fred[fred$C=="G",]$E)
> >
> > [1] 6
> >
> > Hmmm.... I wonder would it be reasonable to put an example of this
> > type into section 2.7 of the "Introduction to R"?
> >
> >
> > cheers!
> > Sean
> >
> > On 18/08/05, A. Mani <a_mani_sc_gs at vsnl.net> wrote:
> > > Hello,
> > > I want to avoid loops in the following situation. There is a 5-col
> > > dataframe with col headers alone. Two of the columns are
> > > non-numeric. The problem is to calculate statistics (scores) for
> > > each element of one column. The functions depend on matching in
> > > the other non-numeric column.
> > >
> > > A  B  C  E  F
> > > 1  2  X  Y  1
> > > 2  3  G  L  1
> > > 3  1  G  L  5
> > > and so on ...30000+ entries.
> > >
> > > I need scores for col E entries which depend on conditional
> > > implications.
> > >
> > >
> > > Thanks,
> > >
> Hello,
> Sorry about the incomplete problem. Here is a better version of the
> problem (the measure is not simple):
> The data frame is like
>   col1    col2      col3      col4    col5
>   <num>   <nonum>   <nonum>   <num>   <num>
>   A       B         C         E       F
> There are repeated strings in col3, col2. Problem: Calculate
> Measure(Ci) = [No. of repeats of Ci * 100] + [if (Bi, Ci) is the same as
> (Bj, Cj) and 6 >= Ej - Ei >= 3 then add 100, else 10].

Hi

I am not sure exactly what you would like to compute; a
**working** example would help. But if you want to do some
computation for row "i" that depends on row "j", I suppose that
you cannot avoid loops.

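In general you can use one of aggregate, tapply, by or ave to run a
computation separately for each level of a factor. tapply and ave
work on a vector, while aggregate and by accept a data frame. See
their help pages (e.g. ?tapply).

tapply(vector, list(factors), function)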

is the standard form.
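
As a rough illustration of that form, here is a minimal sketch using the
small data frame Sean showed earlier in the thread. The object names
(sum_by_C, n_by_C, sum_by_CD, the "repeats" column) are made up for this
example, and it only covers the per-group part (sums and the "number of
repeats"), not the row "i" versus row "j" comparison in the full measure.

```r
# Hypothetical data, copied from the example quoted earlier in the thread.
fred <- data.frame(
  A = c(1, 2, 3),
  B = c(2, 3, 1),
  C = c("X", "G", "G"),
  D = c("Y", "L", "L"),
  E = c(1, 1, 5),
  stringsAsFactors = TRUE
)

# Sum of E within each level of C; the result is indexed by level name.
sum_by_C <- tapply(fred$E, fred$C, sum)

# Number of rows in each level of C (the "number of repeats").
n_by_C <- tapply(fred$E, fred$C, length)

# ave() returns a vector as long as its input, so the per-group count
# can be attached to every row without an explicit loop.
fred$repeats <- ave(fred$E, fred$C, FUN = length)

# aggregate() returns a data frame; here E is summed for each
# combination of C and D.
sum_by_CD <- aggregate(E ~ C + D, data = fred, FUN = sum)
```

Here sum_by_C["G"] is 6, the same value Sean got from
sum(fred[fred$C=="G",]$E), but computed for every level of C in one call.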

HTH
Petr

> 
> 
> Actually it is to be stretched further by adding similar blocks.
> 
> How do we use *apply or
> something else in this situation?
> 
> 
> In prolog it is extremely easy, but here it is not quite...
> 
> 
> A. Mani
> Member, Cal. Math. Soc
> 

Petr Pikal
petr.pikal at precheza.cz