[R] Deleting certain observations (and their imprint?)

Stodola, Kirk kstodola at illinois.edu
Thu Nov 29 17:32:18 CET 2012


I'm manipulating a large dataset and need to eliminate some observations based on specific identifiers.  This isn't a problem in and of itself (using which() or subset()), but an imprint of the deleted observations seems to remain, even though they have 0 observations.  This is causing me problems later on.  I'll use the dataset warpbreaks to illustrate; I apologize if this isn't in the best format.

##Summary of warpbreaks suggests three tension levels (H, M, L)
> summary(warpbreaks)

     breaks      wool   tension
 Min.   :10.00   A:27   L:18   
 1st Qu.:18.25   B:27   M:18   
 Median :26.00          H:18   
 Mean   :28.15                 
 3rd Qu.:34.00                 
 Max.   :70.00
       
## Subset the dataset and keep only those observations with "L"
> wb.subset <- warpbreaks[which(warpbreaks$tension=="L"),]


##Summary of the subsetted data shows: L=18, M=0, H=0. Why are M and H still included?
> summary(wb.subset)

     breaks      wool  tension
 Min.   :14.00   A:9   L:18   
 1st Qu.:26.00   B:9   M: 0   
 Median :29.50         H: 0   
 Mean   :36.39                
 3rd Qu.:49.25                
 Max.   :70.00     

##The subsetted dataset does not show M or H           
> wb.subset
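
One way to see where the imprint lives, sketched on the assumption that tension is stored as a factor (as it is in the built-in warpbreaks data): a factor keeps its set of levels separately from its values, so the levels survive row subsetting even when no rows use them.

```r
# Same subset as above, using the built-in warpbreaks data
wb.subset <- warpbreaks[which(warpbreaks$tension == "L"), ]

# The level set is still attached to the column, including unused levels
levels(wb.subset$tension)

# Counts per level; levels with no rows are reported with a count of 0
table(wb.subset$tension)

# Only the values actually present in the rows
unique(as.character(wb.subset$tension))
```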

Is there a way that M & H can be completely eliminated (i.e. they don't show up in summary)? The only way I found was to export the dataset and then reimport, which seems pretty cumbersome.  Thanks in advance for any help.  -Kirk
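
A possible approach that avoids the export/reimport round trip, given as a sketch rather than a definitive answer: it assumes the leftover entries are unused factor levels and that base R's droplevels() is available (R 2.12.0 or later). The name wb.clean is only illustrative.

```r
# Same subset as in the example above
wb.subset <- warpbreaks[which(warpbreaks$tension == "L"), ]

# droplevels() removes unused levels from every factor column of a data frame
wb.clean <- droplevels(wb.subset)
levels(wb.clean$tension)
summary(wb.clean)

# Alternative for a single column: rebuilding the factor keeps only
# the levels that actually occur
wb.subset$tension <- factor(wb.subset$tension)
levels(wb.subset$tension)
```

droplevels() touches all factor columns at once, while the factor() call leaves the other columns alone; which one fits depends on whether empty levels elsewhere in the data still carry meaning.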


