[R] What am I doing wrong with sapply ?
Steve Lianoglou
mailinglist.honeypot at gmail.com
Thu May 26 08:13:49 CEST 2011
Hi,
On Thu, May 26, 2011 at 12:49 AM, eric <ericstrom at aol.com> wrote:
> Statement 9 using sapply does not seem to give the correct answer (or at
> least to me). Yet I do what I think is the same thing with statement 11 and
> I get the answer I'm looking for.
>
> 9 : s <-sapply(unlist(v[c(1:length(v))]), max)
> 11: for(i in 1 :length(v)) v1[i] <- max(unlist(v[i]))
>
> Shouldn't I get the same answer ?
>
>
> library(XML)
> rm(list=ls())
> url <-
> "http://webapp.montcopa.org/sherreal/salelist.asp?saledate=05/25/2011"
> tbl <-data.frame(readHTMLTable(url))[2:404, c(3,5,6,8,9)]
> names(tbl) <- c("Address", "Township", "Parcel", "SaleDate", "Costs");
> rownames(tbl) <- c(1:length(tbl[,1]))
> x <-tbl
> v <- gregexpr("( aka )|( AKA )",x$Address)
> s <-sapply(unlist(v[c(1:length(v))]), max)
> v1 <-numeric(length(v))
> for(i in 1 :length(v)) v1[i] <- max(unlist(v[i]))
Two elements of your list v have length 2, which is hosing you:
R> table(sapply(v, length))

  1   2
401   2
As a result, unlist(v) flattens into a vector of 405 elements (401 + 2*2),
which is longer than your 403-element list, so `s` is longer than `v1`.
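Here is a small sketch of the same effect on made-up data. The `addr` vector is hypothetical and only stands in for x$Address; its second entry contains two matches, so unlist() yields more values than there are strings:

```r
# Hypothetical addresses (not from the real table); the second one matches twice
addr <- c("12 Main St aka 14 Main St",
          "5 Oak Ave AKA 7 Oak Ave aka 9 Oak Ave",
          "30 Elm Rd aka 32 Elm Rd")

v <- gregexpr("( aka )|( AKA )", addr)

# Number of match positions stored in each list element
lengths_per_element <- sapply(v, length)

# unlist() flattens the list first, so sapply() sees one value per match
s <- sapply(unlist(v), max)

# The loop works per list element, so it yields one value per string
v1 <- numeric(length(v))
for (i in seq_along(v)) v1[i] <- max(unlist(v[i]))

length(s)   # one entry per match
length(v1)  # one entry per string
```

The two lengths differ as soon as any string contains more than one match.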
Another way you might have stumbled on the problem is when you compare s and v1:
R> all(s == v1)
[1] FALSE
Warning message:
In s == v1 :
  longer object length is not a multiple of shorter object length
To "fix" the problem, you could use regexpr instead of gregexpr. regexpr
reports only the first match in each string rather than all of them, so it
returns one position per string.
If you do that substitution, all(s == v1) will evaluate to TRUE.
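For illustration, a sketch of two ways to get one value per string, again on a hypothetical `addr` vector rather than the real data. The first option, calling sapply() on the list itself instead of on unlist(v), is an alternative I'm adding here, not something from the original code; the second is the regexpr substitution:

```r
# Hypothetical addresses (not from the real table); the second one matches twice
addr <- c("12 Main St aka 14 Main St",
          "5 Oak Ave AKA 7 Oak Ave aka 9 Oak Ave",
          "30 Elm Rd aka 32 Elm Rd")

# Option 1: keep gregexpr, but apply max() to each list element
v <- gregexpr("( aka )|( AKA )", addr)
s_list <- sapply(v, max)

# The original loop, for comparison
v1 <- numeric(length(v))
for (i in seq_along(v)) v1[i] <- max(unlist(v[i]))

same_as_loop <- all(s_list == v1)

# Option 2: regexpr gives a plain integer vector, one position per string
first <- as.vector(regexpr("( aka )|( AKA )", addr))
```

One thing to watch: max() over the gregexpr result gives the position of the last match in a string, while regexpr gives the first, so the values differ for any string with more than one match.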
HTH,
-steve
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact