[BioC] Selecting Unique rows in multiple column data frames

Tue Nov 7 00:26:15 CET 2006

Hello Matjaž and hello Alex ;) It was nice to meet you in Goettingen!

> I have data frames with 2 columns of normalised microarray data (more than 10k
> rows, custom-made array)
> with the following layout (not real data):
> ID        M
> ID1      -4.60138
> ID2      -3.28832
> ID is the oligo ID (spot-ID), M is the corresponding M-value.
> 
> Only one spot per block is present in replicates (4). Therefore I would like
> to use one of the following 2 options:
> 

You can get the rows that share an ID like this (x is a data.frame):

## get IDs (R is case sensitive: the column is ID, not id)
id <- as.character(x$ID)
## unique IDs
uId <- unique(id)
## loop over unique IDs
for(i in uId) {
  ## do whatever you want with rows that have the same ID
  x[id == i, ]
}
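
If you prefer to avoid the explicit loop, split() can do the grouping in one step. This is only a minimal sketch on made-up numbers; the column names ID and M follow your layout, and the object name byId is just my choice.

```r
## toy data in the layout from the question (values are invented)
x <- data.frame(ID = c("ID1", "ID2", "ID1"),
                M  = c(-4.5, -3.0, -3.5),
                stringsAsFactors = FALSE)

## list with one data.frame per distinct ID
byId <- split(x, x$ID)

## all rows for a single ID
byId[["ID1"]]
```

Each element of byId can then be handled with lapply() or sapply() instead of indexing inside a for loop.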

> 1. Average the M-values in rows that have the same ID and extract the data
> table with both columns.

If I understand correctly, this should work:

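avg <- numeric(0)
for(i in unique(as.character(x$ID))) {
  ## rows that share this ID
  tmp <- x[x$ID == i, ]
  print(tmp) # or anything else
  ## store the mean, otherwise it is discarded inside the loop
  avg[i] <- mean(tmp$M, na.rm=TRUE)
}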
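
To get the averaged values back as a table with both columns, aggregate() is probably the shortest route. Again just a sketch on invented numbers, assuming the columns are really called ID and M; avgM is a name I made up.

```r
## toy data in the layout from the question (values are invented)
x <- data.frame(ID = c("ID1", "ID2", "ID1", "ID2", "ID3"),
                M  = c(-4.5, -3.0, -3.5, NA, 1.0),
                stringsAsFactors = FALSE)

## one row per ID; na.rm = TRUE is passed on to mean()
avgM <- aggregate(x["M"], by = list(ID = x$ID), FUN = mean, na.rm = TRUE)
avgM
```

The result is a data.frame with columns ID and M, one row per distinct ID, so it can be used directly in place of the original table.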

-- 
Lep pozdrav / With regards, 
    Gregor Gorjanc
----------------------------------------------------------------------
University of Ljubljana     PhD student 
Biotechnical Faculty 
Zootechnical Department     URI: http://www.bfro.uni-lj.si/MR/ggorjan
Groblje 3                   mail: gregor.gorjanc <at> bfro.uni-lj.si
SI-1230 Domzale             tel: +386 (0)1 72 17 861
Slovenia, Europe            fax: +386 (0)1 72 17 888
----------------------------------------------------------------------
"One must learn by doing the thing; for though you think you know it, 
 you have no certainty until you try." Sophocles ~ 450 B.C.