[R] group by

Alex Brown alex at transitive.com
Fri Dec 1 19:06:40 CET 2006


Hi Hans,

The short answer is yes.  I suspect you need to look at the 'by',
'tapply' or 'aggregate' functions, depending on exactly what type of
data you have.
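
As a rough sketch of how the three differ, here is each of them applied to a
small data frame typed in by hand from the sample rows quoted below. The name
'ship' and the column names come from the original post. The four rows and the
grouping labels 'Site' and 'Week' are assumptions made for illustration only.

```r
# Toy data: an assumption, copied by hand from the four quoted sample rows.
ship <- data.frame(
  TransactionWeek = c(47, 47, 40, 40),
  Testside        = c("A", "A", "B", "B"),
  Yieldnorm       = c(17.231, NA, 3.146, 9.536),
  Chipnorm        = c(60, NA, -12, 29)
)

# tapply: apply one function to one vector, split by a grouping vector.
# Here: number of wafers (rows) per test site.
wafers <- tapply(ship$Chipnorm, ship$Testside, length)

# aggregate: apply one function to one or more columns and get a data frame
# back, with one row per combination of the grouping variables.
good_die <- aggregate(ship["Chipnorm"],
                      by = list(Site = ship$Testside,
                                Week = ship$TransactionWeek),
                      FUN = sum, na.rm = TRUE)

# by: hand each sub-data-frame to an arbitrary function.
# Here: mean yield per site and week, ignoring unsorted (NA) wafers.
yield <- by(ship,
            list(Site = ship$Testside, Week = ship$TransactionWeek),
            function(d) mean(d$Yieldnorm, na.rm = TRUE))

print(wafers)
print(good_die)
print(yield)
```

Roughly speaking, 'tapply' suits a single statistic on a single column,
'aggregate' suits the same statistic on several columns, and 'by' suits
anything that needs the whole sub-table at once.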

In general, it's best to come up with a really simple example which  
illustrates the part you don't know how to do.  If you can do that,  
someone will be able to come up with a simple solution.
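
For instance, a cut-down example in that spirit might look like the sketch
below. It is not a drop-in replacement for your loop, only one possible way to
get the same summary columns per Site, Week, Partnumber and Lot using 'split'
and 'lapply'. The data frame is typed in from your four sample rows, and the
helper name 'summarise_group' is made up for this illustration.

```r
# Toy data: an assumption, copied by hand from the four quoted sample rows.
ship <- data.frame(
  TransactionWeek = c(47, 47, 40, 40),
  Partnumber      = c("SWN3", "SWN3", "WN30", "WN30"),
  Testside        = c("A", "A", "B", "B"),
  Lot             = c("12WAC00", "12WAC00", "0ZQNC00", "0ZQNC00"),
  Yieldnorm       = c(17.231, NA, 3.146, 9.536),
  Chipnorm        = c(60, NA, -12, 29),
  stringsAsFactors = FALSE
)

# Hypothetical helper: turn the rows of one group into one summary row.
summarise_group <- function(d) {
  sorted <- !is.na(d$Yieldnorm)          # wafers that have a yield value
  data.frame(
    Location      = d$Testside[1],
    Week          = d$TransactionWeek[1],
    Partnumber    = d$Partnumber[1],
    Lot           = d$Lot[1],
    ShippedWafer  = nrow(d),
    SortedWafer   = sum(sorted),
    UnsortedWafer = sum(!sorted),
    # mean yield over sorted wafers with non-zero yield, as in the loop
    WaferYield    = mean(d$Yieldnorm[sorted & d$Yieldnorm != 0]),
    GoodDie       = sum(d$Chipnorm, na.rm = TRUE)
  )
}

# Split into one data frame per existing combination of the grouping columns,
# summarise each piece, and stack the one-row results.
groups <- split(ship,
                list(ship$Testside, ship$TransactionWeek,
                     ship$Partnumber, ship$Lot),
                drop = TRUE)
summary_table <- do.call(rbind, lapply(groups, summarise_group))
rownames(summary_table) <- NULL

print(summary_table)
```

Adding or removing a grouping level then only means changing the list passed
to 'split', rather than writing another loop per site.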

-Alex Brown

On 1 Dec 2006, at 15:22, Hans-Juergen Eickelmann wrote:

>
> Dear R-community,
>
>
> I started using R to control yield and output from different factories by
> production week. A typical example is below.
>
> Location  Week  ShippedWafer  SortedWafer  UnsortedWafer  WaferYield  GoodDie
> A         47    9             4            5              0.476       -12
> B         40    5             5            0              -0.3262     -9
> B         48    2             1            1              5.092       18
>
>
> This output was generated from the following sample data. The complete list
> can have more than 5K rows.
>
> TransactionWeek  Shipdate    Partnumber  Testside  Lot      Wafer1      Wafer2      Yieldnorm  Chipnorm
> 47               11/20/2006  SWN3        A         12WAC00  3LU105SOG6  3LU105SOG6  17.231     60
> 47               11/20/2006  SWN3        A         12WAC00  3LU108SOE6  NA          NA         NA
> 40               10/3/2006   WN30        B         0ZQNC00  3XM063SOA1  3XM063SOA1  3.146      -12
> 40               10/3/2006   WN30        B         0ZQNC00  3XM072SOA3  3XM072SOA3  9.536      29
>
> I'm a newbie so I'm doing this step by step: first site A, then site B,
> and then I combine them with rbind: C <- rbind(A, B);
> This code works, but ultimately I would like to break up the data even more
> and split it by Site, Week, Partnumber and Lot, and here I'm lost.
>
> Is there a 'group by' function in R which makes this operation much easier,
> without 'hardwiring' the parameters like I did?
>
>
> Code siteA
> Weekmin <- min(ship$TransactionWeek);
> Weekmax <- max(ship$TransactionWeek);
> Week <-Weekmin -1;
>
> repeat{
> Week <- Week +1;
> ship1 <- subset(ship, ship$TransactionWeek == Week & ship$Testside %in% c("A"));
> ship2 <- subset(ship1, ship1$Yieldnorm != 0);
> ship3 <- subset(ship1, is.na(ship1$Yieldnorm));
>
> Location <- "A";
> ShippedWafer <- nrow(ship1);
> SortedWafer <- nrow(ship1)-nrow(ship3);
> UnsortedWafer <- nrow(ship3);
> WaferYield <- mean(ship2$Yieldnorm, na.rm=TRUE);
> GoodDie <- sum(ship1$Chipnorm, na.rm=TRUE);
> assign(paste("week", Week, sep="."), data.frame(Location, Week, ShippedWafer,
>       SortedWafer, UnsortedWafer, WaferYield, GoodDie))
> if (Week == Weekmin) next
> line <- rbind(get(paste("week", Week-1, sep=".")),
>               get(paste("week", Week, sep=".")))
> assign(paste("week", Week, sep="."), data.frame(line))
>
> if (Week < Weekmax)next
> if (Week == Weekmax) break
> }
> A <- data.frame(get(paste("week", Week, sep=".")));
>
> Hans
>
> Hans-J Eickelmann
> ISC Technology Procurement Center Mainz, Germany
> email : Eickelma at de.ibm.com
> phone : +49-(0)6131-84-2516
> mobile: +49-(0)170-632-5596



