I'm manipulating a large dataset and need to eliminate some observations based 
on specific identifiers.  This isn't a problem in and of itself (using which() 
or subset()), but an imprint of the deleted observations seems to remain, even 
though they have 0 observations.  This is causing me problems later on.  I'll 
use the warpbreaks dataset to illustrate; I apologize if this isn't in the best 
format.

## Summary of warpbreaks shows three tension levels (L, M, H)
> summary(warpbreaks)

     breaks      wool   tension
 Min.   :10.00   A:27   L:18   
 1st Qu.:18.25   B:27   M:18   
 Median :26.00          H:18   
 Mean   :28.15                 
 3rd Qu.:34.00                 
 Max.   :70.00
       
## Subset the dataset and keep only those observations with "L"
> wb.subset <- warpbreaks[which(warpbreaks$tension=="L"),]


## Summary of the subsetted data shows L=18, M=0, H=0.  Why are M and H still
## included?
> summary(wb.subset)

     breaks      wool  tension
 Min.   :14.00   A:9   L:18   
 1st Qu.:26.00   B:9   M: 0   
 Median :29.50         H: 0   
 Mean   :36.39                
 3rd Qu.:49.25                
 Max.   :70.00     

## Printing the subsetted dataset itself shows no rows with M or H
> wb.subset
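
To make the leftover levels visible directly, here is a small check. It is a sketch for illustration only, rebuilding the same wb.subset as above and using only base R (levels(), nlevels(), table()):

```r
## Illustration only: inspect the tension factor of the subset directly
wb.subset <- warpbreaks[which(warpbreaks$tension == "L"), ]

levels(wb.subset$tension)    # levels still attached to the factor
nlevels(wb.subset$tension)   # number of levels, empty ones included
table(wb.subset$tension)     # counts per level, empty ones included
```

So the rows for M and H are gone, but the tension column still seems to carry all of its original levels.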

Is there a way to eliminate M and H completely (i.e., so that they don't show 
up in summary())? The only way I found was to export the dataset and then 
re-import it, which seems pretty cumbersome.  Thanks in advance for any help.  -Kirk
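
For reference, the export/re-import workaround looks roughly like the following. This is only a sketch: the temporary CSV file and the name wb.reimport are arbitrary choices for illustration, and stringsAsFactors = TRUE is set on the assumption that the text columns should come back as factors whatever the default is in a given R version.

```r
## Sketch of the export/re-import workaround (file and object names are arbitrary)
wb.subset <- warpbreaks[which(warpbreaks$tension == "L"), ]

## Write the subset out to a temporary CSV file
tmp <- tempfile(fileext = ".csv")
write.csv(wb.subset, tmp, row.names = FALSE)

## Read it back in; stringsAsFactors = TRUE is an assumption so that wool and
## tension are read as factors again
wb.reimport <- read.csv(tmp, stringsAsFactors = TRUE)
unlink(tmp)   # remove the temporary file

## tension is rebuilt from the values actually present in the file
summary(wb.reimport)
```

It works, but writing to disk and reading back just to clean up one column is what I would like to avoid.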

______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.