[R] What am I doing wrong with sapply ?

Steve Lianoglou mailinglist.honeypot at gmail.com
Thu May 26 08:13:49 CEST 2011


Hi,

On Thu, May 26, 2011 at 12:49 AM, eric <ericstrom at aol.com> wrote:
> Statement 9 using sapply does not seem to give the correct answer (or at
> least to me). Yet I do what I think is the same thing with statement 11 and
> I get the answer I'm looking for.
>
> 9 : s <-sapply(unlist(v[c(1:length(v))]), max)
> 11: for(i in 1 :length(v)) v1[i] <- max(unlist(v[i]))
>
> Shouldn't I get the same answer ?
>
>
> library(XML)
> rm(list=ls())
> url <-
> "http://webapp.montcopa.org/sherreal/salelist.asp?saledate=05/25/2011"
> tbl <-data.frame(readHTMLTable(url))[2:404, c(3,5,6,8,9)]
> names(tbl) <- c("Address", "Township", "Parcel", "SaleDate", "Costs");
> rownames(tbl) <- c(1:length(tbl[,1]))
> x <-tbl
> v <- gregexpr("( aka )|( AKA )",x$Address)
> s <-sapply(unlist(v[c(1:length(v))]), max)
> v1 <-numeric(length(v))
> for(i in 1 :length(v)) v1[i] <- max(unlist(v[i]))

Two elements of your list v have length 2, which is what is hosing you:

R> table(sapply(v, length))

  1   2
401   2

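As a result, unlist(v) flattens into a vector of 405 values, two more
than the 403 elements of your list. sapply then calls max on each of
those single values, which just returns the value itself, so `s` is
longer than `v1`.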
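To see the same effect on something small, here is a minimal sketch. The addresses in `addr` are made up for illustration (they are not from the sale list), and the second one deliberately contains two " aka " matches:

```r
# Hypothetical toy addresses; the second has two " aka " matches.
addr <- c("12 Main St aka 14 Main St",
          "5 Oak Ave aka 7 Oak Ave aka 9 Oak Ave",
          "33 Elm Rd")

v <- gregexpr("( aka )|( AKA )", addr)

# One list element per address, but the second element holds two positions.
per_element <- sapply(v, length)

# unlist() flattens the list, so there is one value per match (or -1).
flat <- unlist(v)

# max() of a single number is that number, so this only copies `flat`.
s <- sapply(flat, max)

# The loop takes one max per list element, so it has length(v) entries.
v1 <- numeric(length(v))
for (i in seq_along(v)) v1[i] <- max(unlist(v[i]))

length(s)   # 4
length(v1)  # 3
```

The extra entry in `s` comes from the address with two matches, which is the same thing that happens with the two length-2 elements in your data.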

Another way you might have spotted the problem is by comparing s and v1:

R> all(s == v1)
[1] FALSE
Warning message:
In s == v1 :
  longer object length is not a multiple of shorter object length

To "fix" the problem, you could use regexpr instead of gregexpr, which
finds only the first match in each string, and not all of them.

If you do that substitution, all(s == v1) will evaluate to TRUE.
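A minimal sketch of that substitution, again on made-up addresses (`addr` is hypothetical, not your data). The last two lines show an alternative that is not the suggestion above: if what you want is one maximum per address, as in your statement 11, you can keep gregexpr and hand the list itself to sapply without unlist:

```r
# Hypothetical toy addresses; the second has two " aka " matches.
addr <- c("12 Main St aka 14 Main St",
          "5 Oak Ave aka 7 Oak Ave aka 9 Oak Ave",
          "33 Elm Rd")

# regexpr returns a plain integer vector: first match per string, or -1.
v <- regexpr("( aka )|( AKA )", addr)

s <- sapply(unlist(v[c(1:length(v))]), max)
v1 <- numeric(length(v))
for (i in 1:length(v)) v1[i] <- max(unlist(v[i]))

same_with_regexpr <- all(s == v1)   # TRUE

# Alternative (an assumption about intent): keep every match, and let
# sapply walk the list so max sees all positions for one address at once.
vg <- gregexpr("( aka )|( AKA )", addr)
last_pos <- sapply(vg, max)         # one value per address
```

Note that the two approaches answer slightly different questions: regexpr gives the position of the first match, while sapply(vg, max) gives the position of the last one.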

HTH,
-steve
-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


