[R] delete rows with duplicate numbers of opposite signs in the same column

arun smartpink111 at yahoo.com
Sat Nov 16 00:32:00 CET 2013


Hi,
Try:


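library(plyr)

# fun2 drops every value of x whose absolute value also occurs with the opposite sign.
# fun1 applies fun2 within each Customer, one value column at a time.
fun1 <- function(dat) {
  fun2 <- function(x) {
    indx <- x < 0
    x1 <- x[!indx] %in% abs(x[indx])    # non-negative values with a negative counterpart
    x2 <- abs(x[indx]) %in% x[!indx]    # negative values with a positive counterpart
    x3 <- c(x[!indx][x1], x[indx][x2])  # c() instead of rbind(): no recycling of unequal lengths
    x[!x %in% x3]
  }
  if (length(colnames(dat)) > 2) {
    # several value columns: handle each one separately, because each
    # column may keep a different number of rows per customer
    lapply(colnames(dat)[-1], function(x) {
      dat1 <- cbind(dat[1], dat[x])
      ddply(dat1, .(Customer), colwise(fun2))
    })
  } else {
    ddply(dat, .(Customer), colwise(fun2))
  }
}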
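
To see what the inner filter does on one customer's values, here is a minimal standalone sketch in base R. The name drop_opposites is hypothetical (it is not part of the code above), and the sketch assumes a numeric vector without NA values. Like the helper above, it removes every occurrence of a value that appears with both signs, so it is not a strict one-to-one cancellation.

```r
# Hypothetical standalone helper, for illustration only.
# Assumes x is numeric and contains no NA values.
drop_opposites <- function(x) {
  neg <- x < 0
  # magnitudes that occur with both a non-negative and a negative sign
  paired <- intersect(x[!neg], abs(x[neg]))
  # keep only the values whose magnitude is not in that set
  x[!(abs(x) %in% paired)]
}

drop_opposites(c(100, -100, 150))  # customer A: 150
drop_opposites(c(20, 30, -30))     # customer B: 20
drop_opposites(c(20, 20, -20))     # both 20s are dropped: numeric(0)
```

The last call shows the design choice: a single -20 removes every 20 in the group. If the data need one-to-one matching instead, the filter would have to count occurrences rather than test membership.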

dat1 <- structure(list(Customer = c("A", "A", "A", "B", "B", "B"), Consumption = c(100L, 
-100L, 150L, 20L, 30L, -30L)), .Names = c("Customer", "Consumption"
), class = "data.frame", row.names = c(NA, -6L))

dat2 <- structure(list(Customer = c("A", "A", "A", "B", "B", "B"), Consumption = c(100L, 
-100L, 150L, 20L, 30L, -30L), Column2 = c(30, -30, 40, 80, -40, 
40)), .Names = c("Customer", "Consumption", "Column2"), row.names = c(NA, 
-6L), class = "data.frame")

dat3 <- structure(list(Customer = c("A", "A", "A", "B", "B", "B", "B", 
"B"), Consumption = c(100, -100, 150, 20, 30, -30, 20, -40), 
    Column2 = c(30, 40, -30, -40, 80, 40, 20, -60)), .Names = c("Customer", 
"Consumption", "Column2"), row.names = c(NA, 8L), class = "data.frame")



fun1(dat1)
fun1(dat2)
fun1(dat3)
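
If whole rows should be dropped for a single column, a base R sketch without plyr could look like the following. This is an alternative under stated assumptions, not the approach used above: the helper name no_opposite and the object names dat and keep are hypothetical, the data are the six rows from the question, and NA values are assumed to be absent.

```r
# Hypothetical helper: TRUE where a value has no opposite-signed
# counterpart within the same group.
no_opposite <- function(x) {
  !(abs(x) %in% intersect(x[x >= 0], abs(x[x < 0])))
}

# The example data from the question.
dat <- data.frame(Customer = c("A", "A", "A", "B", "B", "B"),
                  Consumption = c(100, -100, 150, 20, 30, -30))

# ave() applies the helper within each Customer and returns the result
# in the original row order (as 0/1, hence as.logical()).
keep <- as.logical(ave(dat$Consumption, dat$Customer, FUN = no_opposite))

dat[keep, ]  # keeps row 3 (A, 150) and row 4 (B, 20)
```

Because the filter is computed as a logical vector over the original rows, any other columns of the data frame stay aligned with the rows that are kept.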

A.K.



Hi guys,

I am working on a dataset in which some values are duplicated with
opposite signs in the same column. Those pairs of opposites are
errors, and I have to delete them. For example:

Customer    Consumption 
A                  100 
A                 -100 
A                  150 
B                   20 
B                   30 
B                  -30 

I have to get rid of those opposites for each customer
(Consumption is one of the 13 variables in the dataset). This
question has troubled me for a long time, and I really have no idea
how to solve it. Can anyone help me out or give me a hint? I would
really appreciate it.


