[R] Help reading table rows into lists

Gabor Grothendieck ggrothendieck at gmail.com
Sun Oct 10 19:19:27 CEST 2010


On Sun, Oct 10, 2010 at 11:40 AM, Alison Waller <alison.waller at embl.de> wrote:
> Hi all,
>
> I have a large table mapping thousands of COGs (groups of genes) to pathways.
> # Ex
> COG0001 patha   pathb   pathc
> COG0002 pathd   pathe
> COG0003 pathe   pathf   pathg   pathh
> ##
>
> I would like to combine this information into a big list such as below
> COG2PATHWAY<-list(COG0001=c("patha","pathb","pathc"),COG0002=c("pathd","pathe"),COG0003=c("pathe","pathf","pathg","pathh"))
>
> I am stuck and have tried various methods involving (probably mangled)
> versions of lapply and loops.
>
> Any suggestions on the most efficient way to do this would be great.
>

Try this:


Lines <- "COG0001 patha   pathb   pathc
COG0002 pathd   pathe
COG0003 pathe   pathf   pathg   pathh"
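# fill = TRUE pads the short rows; na.strings = "" turns that padding into NA
DF <- read.table(textConnection(Lines), header = FALSE,
         fill = TRUE, as.is = TRUE, na.strings = "")

library(reshape2)
# melt to long form with column 1 (V1, the COG id) as the id, then drop the NA padding
m <- na.omit(melt(DF, 1))
# split the pathway values by COG id; unequal group lengths give a list
result <- unstack(m, value ~ V1)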

giving

> result
$COG0001
[1] "patha" "pathb" "pathc"

$COG0002
[1] "pathd" "pathe"

$COG0003
[1] "pathe" "pathf" "pathg" "pathh"


or, as a matrix with one column per COG (using the melted data frame m from above):

> acast(m, value ~ V1)
      COG0001 COG0002 COG0003
patha patha   <NA>    <NA>
pathb pathb   <NA>    <NA>
pathc pathc   <NA>    <NA>
pathd <NA>    pathd   <NA>
pathe <NA>    pathe   pathe
pathf <NA>    <NA>    pathf
pathg <NA>    <NA>    pathg
pathh <NA>    <NA>    pathh
Levels: patha pathb pathc pathd pathe pathf pathg pathh
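
A base R alternative, offered only as a sketch and not as part of the reshape2 approach above: split each line on whitespace and use the first field as the list name. It assumes fields are separated by spaces or tabs and that the first field of every line is the COG id. The names rows, fields and COG2PATHWAY are illustrative; for a real file, readLines() on the file path would replace the strsplit() on Lines.

```r
Lines <- "COG0001 patha   pathb   pathc
COG0002 pathd   pathe
COG0003 pathe   pathf   pathg   pathh"

# One element per input line (for a file, use readLines() on its path instead).
rows <- strsplit(Lines, "\n", fixed = TRUE)[[1]]

# Split every line on runs of whitespace.
fields <- strsplit(rows, "[[:space:]]+")

# First field is the COG id, the remaining fields are its pathways.
ids <- vapply(fields, function(x) x[1], character(1))
COG2PATHWAY <- setNames(lapply(fields, function(x) x[-1]), ids)

str(COG2PATHWAY)
```

Because every line is split on its own, this does not depend on how many columns read.table detects. The read.table documentation notes that the column count is determined from the first five lines of input unless col.names is given, which matters with fill = TRUE on a large table whose widest row appears later in the file.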

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


