[R-sig-Geo] Problem with %IN% ==FALSE when data frames are equal

Roger Bivand Roger.Bivand at nhh.no
Tue Jul 14 07:21:37 CEST 2009


On Mon, 13 Jul 2009, Jim Burke wrote:

> Hi everyone
>
> I have an issue with %IN% using FALSE deep within my
> R code processing. It errors if the comparison data
> frames are equal.
>
> DESIRED WORK AROUND: I would like to examine my %IN% data
> frame fields before submitting them to the erroring
> block of code. Then appropriately trap this error.
> Any suggestions?

Exactly. Compute the condition vector first, then test it, for example 
with if (any(...)), and subset with "[" only if the condition is met. 
You'll have to decide what to do if it isn't.
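
A minimal sketch of that guard, assuming plain data frames rather than 
the sp objects in the original post. The names blocks_df, found_df, 
keep and remaining are hypothetical stand-ins, and returning NULL in 
the empty case is only one possible choice:

```r
# Hypothetical stand-ins for the poster's objects (plain data frames,
# not sp objects); here every block ID has already been found.
blocks_df <- data.frame(BLKID = c("b1", "b2", "b3"), stringsAsFactors = FALSE)
found_df  <- data.frame(ID = c("b1", "b2", "b3"), stringsAsFactors = FALSE)

# Compute the condition vector on its own, before any subsetting.
keep <- !(blocks_df$BLKID %in% found_df$ID)

if (any(keep)) {
  # At least one row remains, so subsetting with "[" is safe.
  remaining <- blocks_df[keep, , drop = FALSE]
} else {
  # Nothing is left: handle this case explicitly instead of subsetting.
  remaining <- NULL
  message("No blocks remain after removing matched IDs.")
}
```

With an sp object the same test would sit in front of the "[" call, so 
the empty selection never reaches the bounding-box code.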

Always try small examples interactively before writing complicated and (in 
this case) logically flawed scripts - you were assuming that at least one 
logical vector value would be true.

Note that %in% is built on match(), so you can also call match() directly 
(check the order of the arguments first!) and look for NAs in the returned 
index vector. Alternatively, use which() on the condition and check the 
length of its output; this is what "[" does internally.

match(c("a", "c", "e"), letters)       # 1 3 5
match(c("A"), letters)                 # NA: no match
which(letters %in% c("a", "c", "e"))   # 1 3 5
which(letters %in% c("A"))             # integer(0)
length(which(letters %in% c("A")))     # 0: nothing to subset
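
The same checks could be wrapped around a lookup roughly as follows. 
This is only a sketch: ids, idx, not_found, hits and selected are 
hypothetical names, and letters stands in for the real ID column:

```r
ids <- c("a", "c", "A")        # hypothetical values to look up

# match(x, table): positions of x in table, NA where there is no match.
idx <- match(ids, letters)
not_found <- ids[is.na(idx)]   # values with no match in the table

# which() returns the indices where the condition is TRUE; a
# zero-length result means there is nothing to select.
hits <- which(letters %in% ids)
if (length(hits) > 0) {
  selected <- letters[hits]
} else {
  selected <- character(0)
}
```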

Hope this helps,

Roger

>
> OVERVIEW: seems [sp$block %IN% df$ID==FALSE] chokes when
> sp and df contain equal variables. My normal processing is
> to take the larger sp and get all larger sp blocks that
> are not in the smaller sp. As I process the list, it
> goes well until the last item when both sets of data
> frame variables are equal.
>
>
> PROBLEM:
>
> ###############################################
> ## remove blocks that we just found in pct_blk_df
> ###############################################
> tmp_hd_census_blk_sp <- hd_census_blk_sp [ (hd_census_blk_sp$BLKIDFP00 %in% 
> pct_blk_df$ID==FALSE),]
>
> Error in lst[[i]] : subscript out of bounds
>> traceback()
> 2: .bboxCalcR(x@polygons)
> 1: hd_census_blk_sp[(hd_census_blk_sp$BLKIDFP00 %in% pct_blk_df$ID ==
>      FALSE), ]
>
> INPUTS:
>> hd_census_blk_sp$BLKIDFP00
> [1] 481130089001000 481130086032001 481130034002043 481130086032002
> [5] 481130034002046 481130086032004 481130034002047 481130086032000
> [9] 481130086032009 481130034002044 481130086032005 481130034002048
> [13] 481130089001013 481130086032007 481130086032010 481130089001014
> [17] 481130086032003
> 28480 Levels: 481130001001000 481130001001001 481130001001002 ... 
> 481130199004013
>> pct_blk_df$ID
> [1] 481130089001000 481130086032001 481130034002043 481130086032002
> [5] 481130034002046 481130086032004 481130034002047 481130086032000
> [9] 481130086032009 481130034002044 481130086032005 481130034002048
> [13] 481130089001013 481130086032007 481130086032010 481130089001014
> [17] 481130086032003
> 28480 Levels: 481130001001000 481130001001001 481130001001002 ... 
> 481130199004013
>
> Thanks,
> Jim Burke
>
> PS My thanks to Gledson Luiz Picharski for his help initially with this 
> "outside of %IN%" logic.
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no


