[R] Subset() within function: logical error
Rolf Turner
r.turner at auckland.ac.nz
Tue Jun 30 02:52:20 CEST 2015
If you want a pointer to the correct syntax for subset(), try
help("subset")!!!
The syntax of your "extstream" function is totally screwed up,
convoluted and over-complicated. Note that even if you had your "subset"
argument specified correctly, the return() call will give you only the
result from the *first* pass through the for loop.
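To see the return() point in isolation, here is a minimal sketch using a made-up toy function (first_only is a hypothetical name, not something from the original code). It only illustrates that return() inside a for loop ends the function on the first iteration:

```r
# Hypothetical toy function: return() inside the loop exits immediately,
# so only the first element is ever seen.
first_only <- function(x) {
  for (i in x) {
    return(i)  # leaves the function on the first pass
  }
}

res <- first_only(c("B", "J", "S"))
print(res)
```

Collecting results in a list and returning that list after the loop avoids the problem.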
That aside, the error message is perfectly clear: 'subset' must be
logical. Your "subset" argument is "stream" which is a factor.
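As a rough illustration of what subset() expects, the sketch below uses a small invented data frame (df and its values are assumptions for the example, not the poster's data). The second argument has to be a logical condition such as stream == "B", not the factor itself:

```r
# Small made-up data frame, just for illustration
df <- data.frame(stream = factor(c("B", "J", "S", "B")),
                 quant  = c(4, 59.9, 3, 33))

# The factor itself is not logical, but a comparison on it is
fac_is_logical  <- is.logical(df$stream)
cond_is_logical <- is.logical(df$stream == "B")

# Passing the logical condition selects the matching rows
b_rows <- subset(df, stream == "B")
print(b_rows)
```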
You *could* redefine your "extstream" function as follows:
extstream <- function(alldf) {
    sname <- levels(alldf$stream)
    rslt <- vector("list", length(sname))
    names(rslt) <- sname
    for (i in sname) {
        rslt[[i]] <- subset(alldf, alldf$stream == i, sampdate:quant)
    }
    rslt
}
However you don't need to go through such contortions:
split(testset,testset$stream)
will give essentially what you want. If you wish to strip out the
redundant "stream" column from the data frames in the resulting list,
you could do that using lapply().
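One way this might look, as a sketch only: the data frame below (testset_small, an assumed name) is a four-row hand-picked excerpt of the test set so that the example is self-contained, and the anonymous function passed to lapply() is just one possible way to drop the column.

```r
# Four rows picked from the posted test set, for a self-contained example
testset_small <- data.frame(
    stream   = factor(c("B", "B", "J", "S")),
    sampdate = as.Date(c(8121, 8121, 8785, 8121), origin = "1970-01-01"),
    param    = factor(c("Cl", "SO4", "Ca", "Cl")),
    quant    = c(4, 33, 59.9, 3)
)

# One data frame per stream level
by_stream <- split(testset_small, testset_small$stream)

# Drop the now-redundant "stream" column from each piece
by_stream <- lapply(by_stream, function(d) d[, names(d) != "stream", drop = FALSE])

print(names(by_stream))
print(by_stream$B)
```

The same two lines should apply unchanged to the full testset.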
cheers,
Rolf Turner
--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
On 30/06/15 12:03, Rich Shepard wrote:
> Moving from interactive use of R to scripts and functions and have bumped
> into what I believe is a problem with variable names. Did not see a solution
> in the two R programming books I have or from my Web searches. Inexperience
> with ess-tracebug keeps me from refining my bug tracking.
>
> Here's a test data set (cleverly called 'testset.dput'):
>
> structure(list(stream = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L,
> 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label
> = c("B", "J", "S"), class = "factor"),
> sampdate = structure(c(8121, 8121, 8121, 8155, 8155, 8155,
> 8185, 8185, 8185, 8205, 8205, 8205, 8236, 8236, 8236, 8257,
> 8257, 8257, 8308, 8785, 8785, 8785, 8785, 8785, 8785, 8785,
> 8847, 8847, 8847, 8847, 8847, 8847, 8847, 8875, 8875, 8875,
> 8875, 8875, 8875, 8875, 8121, 8121, 8121, 8155, 8155, 8155,
> 8185, 8185, 8185, 8205, 8205, 8205, 8236, 8236, 8236, 8257,
> 8257, 8257, 8301, 8301, 8301), class = "Date"), param =
> structure(c(2L,
> 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L, 2L,
> 6L, 7L, 2L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L,
> 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L,
> 2L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L, 2L, 6L, 7L
> ), .Label = c("Ca", "Cl", "K", "Mg", "Na", "SO4", "pH"), class =
> "factor"),
> quant = c(4, 33, 8.43, 4, 32, 8.46, 4, 31, 8.43, 6, 33, 8.32,
> 5, 33, 8.5, 5, 32, 8.5, 5, 59.9, 3.46, 1.48, 29, 7.54, 64.6,
> 7.36, 46, 2.95, 1.34, 21.8, 5.76, 48.8, 7.72, 74.2, 5.36,
> 2.33, 38.4, 8.27, 141, 7.8, 3, 76, 6.64, 4, 74, 7.46, 2,
> 82, 7.58, 5, 106, 7.91, 3, 56, 7.83, 3, 51, 7.6, 6, 149,
> 7.73)), .Names = c("stream", "sampdate", "param", "quant"
> ), row.names = c(NA, -61L), class = "data.frame")
>
> I want to subset that data.frame on each of the stream names: B, J, and S.
> This is the function that has the naming error (eda.R):
>
> extstream = function(alldf) {
> sname = alldf$stream
> sdate = alldf$sampdate
> comp = alldf$param
> value = alldf$quant
> for (i in sname) {
> sname <- subset(alldf, alldf$stream, select = c(sdate, comp,
> value))
> return(sname)
> }
> }
>
> This is the result of running source('eda.R') followed by
>
>> extstream(testset)
> Error in subset.data.frame(alldf, alldf$stream, select = c(sdate, comp, :
> 'subset' must be logical
>
> I've tried using sname for the rows to select, but that produces a
> different error of trying to select undefined columns.
>
> A pointer to the correct syntax for subset() is needed.