[R] Subset() within function: logical error

Steve Taylor steve.taylor at aut.ac.nz
Tue Jun 30 02:26:12 CEST 2015


Calling return() inside a for loop makes no sense: the function exits on the first pass through the loop, so only the first result is ever returned.
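
Here is a minimal sketch of that behaviour. The function names first_only() and collect_all() are made up for illustration and are not part of the original script:

```r
# Hypothetical toy function: return() ends the function on the first pass.
first_only <- function(x) {
  for (i in x) {
    return(i)  # exits immediately; later elements are never visited
  }
}

# Hypothetical alternative: accumulate inside the loop, return once after it.
collect_all <- function(x) {
  out <- character(0)
  for (i in x) {
    out <- c(out, i)
  }
  out
}

first_only(c("B", "J", "S"))   # "B"
collect_all(c("B", "J", "S"))  # "B" "J" "S"
```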

How about:
alldf.B = subset(alldf, stream=='B')  # etc...
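
On the error itself: the second argument of subset() has to evaluate to a logical vector, one TRUE/FALSE per row. alldf$stream is a factor, hence "'subset' must be logical". The select argument expects the data frame's own column names (sampdate, param, quant), not local copies such as sdate, comp and value, which is most likely where the "undefined columns" error came from. A sketch, using a four-row stand-in copied from the posted data rather than the full set:

```r
# Small stand-in for the posted data (rows 1, 2, 20 and 41 of testset.dput)
alldf <- data.frame(
  stream   = factor(c("B", "B", "J", "S")),
  sampdate = as.Date(c(8121, 8121, 8785, 8121), origin = "1970-01-01"),
  param    = factor(c("Cl", "SO4", "Ca", "Cl")),
  quant    = c(4, 33, 59.9, 3)
)

# Second argument: a logical condition evaluated on the rows.
# select: the data frame's own column names, unquoted.
alldf.B <- subset(alldf, stream == "B", select = c(sampdate, param, quant))

nrow(alldf.B)   # 2
names(alldf.B)  # "sampdate" "param" "quant"
```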

Also, have a look at unique(alldf$stream) or levels(alldf$stream) if you want to use a for loop on each unique value.
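
If you do want the loop inside a function, one possible shape is to collect the subsets in a named list and return the list once, after the loop. This is only a sketch: it reuses the name extstream() from the original post, the four-row alldf is a stand-in copied from the posted data, and split() is shown as an alternative I am swapping in, not something from the original script.

```r
# Small stand-in for the posted data (rows 1, 2, 20 and 41 of testset.dput)
alldf <- data.frame(
  stream   = factor(c("B", "B", "J", "S")),
  sampdate = as.Date(c(8121, 8121, 8785, 8121), origin = "1970-01-01"),
  param    = factor(c("Cl", "SO4", "Ca", "Cl")),
  quant    = c(4, 33, 59.9, 3)
)

# Sketch of a reworked extstream(): one data frame per stream level
extstream <- function(alldf) {
  out <- list()
  for (s in levels(alldf$stream)) {
    out[[s]] <- subset(alldf, stream == s,
                       select = c(sampdate, param, quant))
  }
  out  # returned once, after every level has been processed
}

by_stream <- extstream(alldf)
names(by_stream)  # "B" "J" "S"
by_stream$B       # the rows for stream B

# split() produces the same kind of grouping without an explicit loop
by_stream2 <- split(alldf[c("sampdate", "param", "quant")], alldf$stream)
```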

cheers,
    Steve

-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Rich Shepard
Sent: Tuesday, 30 June 2015 12:04p
To: r-help at r-project.org
Subject: [R] Subset() within function: logical error

   I am moving from interactive use of R to scripts and functions and have bumped
into what I believe is a problem with variable names. I did not find a solution
in the two R programming books I have or in my Web searches, and inexperience
with ess-tracebug keeps me from refining my bug tracking.

   Here's a test data set (cleverly called 'testset.dput'):

structure(list(stream = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L), .Label = c("B", "J", "S"), class = "factor"),
     sampdate = structure(c(8121, 8121, 8121, 8155, 8155, 8155,
     8185, 8185, 8185, 8205, 8205, 8205, 8236, 8236, 8236, 8257,
     8257, 8257, 8308, 8785, 8785, 8785, 8785, 8785, 8785, 8785,
     8847, 8847, 8847, 8847, 8847, 8847, 8847, 8875, 8875, 8875,
     8875, 8875, 8875, 8875, 8121, 8121, 8121, 8155, 8155, 8155,
     8185, 8185, 8185, 8205, 8205, 8205, 8236, 8236, 8236, 8257,
     8257, 8257, 8301, 8301, 8301), class = "Date"), param = structure(c(2L,
     6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L, 2L,
     6L, 7L, 2L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L,
     6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L,
     2L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L
     ), .Label = c("Ca", "Cl", "K", "Mg", "Na", "SO4", "pH"), class = "factor"),
     quant = c(4, 33, 8.43, 4, 32, 8.46, 4, 31, 8.43, 6, 33, 8.32,
     5, 33, 8.5, 5, 32, 8.5, 5, 59.9, 3.46, 1.48, 29, 7.54, 64.6,
     7.36, 46, 2.95, 1.34, 21.8, 5.76, 48.8, 7.72, 74.2, 5.36,
     2.33, 38.4, 8.27, 141, 7.8, 3, 76, 6.64, 4, 74, 7.46, 2,
     82, 7.58, 5, 106, 7.91, 3, 56, 7.83, 3, 51, 7.6, 6, 149,
     7.73)), .Names = c("stream", "sampdate", "param", "quant"
), row.names = c(NA, -61L), class = "data.frame")

   I want to subset that data.frame on each of the stream names: B, J, and S.
This is the function that has the naming error (eda.R):

extstream = function(alldf) {
     sname = alldf$stream
     sdate = alldf$sampdate
     comp = alldf$param
     value = alldf$quant
     for (i in sname) {
         sname <- subset(alldf, alldf$stream, select = c(sdate, comp, value))
         return(sname)
     }
}

   This is the result of running source('eda.R') followed by

> extstream(testset)
Error in subset.data.frame(alldf, alldf$stream, select = c(sdate, comp,  :
   'subset' must be logical

   I've tried using sname for the rows to select, but that produces a
different error of trying to select undefined columns.

   A pointer to the correct syntax for subset() is needed.

Rich
