[R] Help with subset

Peter Alspach Peter.Alspach at plantandfood.co.nz
Thu Jan 21 23:31:51 CET 2010


Tena koe Jerry 

myVars would be the unique values of Anlysis_Soil.  I guess Anlysis_Soil
is a factor, in which case

myVars <- levels(Anlysis_Soil)

The for() loop then steps through each of these in turn.
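
As a rough sketch of how the pieces might fit together (the data frame `soil` and its toy values are made up for illustration; only the column names Anlysis_Soil and Results come from your message, and the threshold of 9 is taken from your description):

```r
# Hypothetical toy data standing in for the real spreadsheet
soil <- data.frame(
  Anlysis_Soil = factor(c(rep("pH-2009-101", 10),
                          rep("EC-2009-101", 3),
                          rep("KCl-2008-116", 9))),
  Results = c(seq(6.1, 7.0, length.out = 10), 0.4, 0.5, 0.6, 1:9)
)

# One entry per distinct level, not one per row
myVars <- levels(soil$Anlysis_Soil)

# Number of Results recorded for each level
counts <- table(soil$Anlysis_Soil)

# Keep only the levels with at least 9 results
keep <- names(counts)[as.vector(counts) >= 9]

# myV takes the name of each retained level in turn
means <- numeric(0)
for (myV in keep) {
  oneSet <- subset(soil, Anlysis_Soil == myV)
  means[myV] <- mean(oneSet$Results)
}
print(means)
```

Here myV is a single level name on each pass through the loop, and myVars (or the filtered vector keep) is the full set of names, not the whole column. The mean() call is only a placeholder for whatever you need to do with each subset.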

HTH ....

Peter Alspach

PS  If you'd like more details it might be better to contact me off
list.

P
> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of Jerry Floren
> Sent: Friday, 22 January 2010 9:53 a.m.
> To: r-help at r-project.org
> Subject: Re: [R] Help with subset
> 
> 
> Thank you Peter. I am really new to this. The spreadsheet I 
> am working with has 12,379 rows: the first row holds the 
> variable names, followed by 12,378 rows of data. There are 
> seven columns, and the 7th column ("Results") is the only one 
> with numerical data. 
> 
> I need to match up the variable Results with the variable 
> "Anlysis_Soil", which is the type of test performed by the 
> labs on one of 20 different soil samples. Here are some 
> examples of the Anlysis_Soil variable:
> 
> Anlysis_Soil
> Bases-Aluminum KCL Extr-2008-116
> Bases-Aluminum KCL Extr-2008-116
> Bases-Aluminum KCL Extr-2008-117
> Bases-Aluminum KCL Extr-2008-118
> Bases-Aluminum KCL Extr-2008-118
> Bases-Aluminum KCL Extr-2008-119
> Bases-Aluminum KCL Extr-2008-120
> Bases-Aluminum KCL Extr-2008-120
> Bases-Aluminum KCL Extr-2009-101
> 
> Actually, I am not interested in any of the above, because 
> each has too few results (fewer than 9). 
> 
> I think I need to first identify the unique Anlysis_Soil values 
> in the entire column, and I thought using "list" might work:
> 
> > anlyses <- list(Anlysis_Soil)
> > str(anlyses)
> List of 1
>  $ : Factor w/ 1695 levels "Bases-Aluminum KCL Extr-2008-116",..: 1 1 2 3 3 4 5 5 6 6 ...
> > 
> 
> It does correctly identify that there are 1,695 unique 
> "Anlysis_Soil" values. However, "anlyses" contains all 12,378 
> "Anlysis_Soil" entries. For example:
> 
> print(anlyses)
> ...
> ...
> ...
> [12374] Soil pH & EC-Soil EC (1to2)-2009-115
> [12375] Soil pH & EC-Soil EC (1to2)-2009-115
> [12376] Soil pH & EC-Soil EC (1to2)-2009-115
> [12377] Soil pH & EC-Soil EC (1to2)-2009-115
> [12378] Soil pH & EC-Soil EC (1to2)-2009-115
> 1695 Levels: Bases-Aluminum KCL Extr-2008-116 ...
> 
> And once again it correctly shows that there are 1,695 unique 
> "Anlysis_Soil" values.
> 
> Once the unique Anlysis_Soil values are identified, I need 
> to determine the ones with more than 8 results, and I see how 
> that could be done with your code.
> 
> I am not clear what you mean by "for (myV in myVars)". Is 
> myV the name of one of the unique values that has at least 
> 9 Results? Is myVars the entire column of "Anlysis_Soil"?
> 
> I am not sure if this is any clearer.
> 
> Thanks,
> 
> Jerry Floren
> Minnesota Department of Agriculture 


