[R] Help reading table rows into lists

Jeffrey Spies jspies at virginia.edu
Sun Oct 10 20:59:08 CEST 2010


To get just the list you wanted, Gabor's solution is more elegant, but
here's another using the apply family.  First, your data:

dat <- scan(file="/g/bork8/waller/test_COGtoPath.txt",what="character",sep="\n")

I expect dat to be a character vector in which each element is one
line of tab-separated values; judging by your other code, that is
what you get.

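sapply(dat, function(x){
    # split one line into its fields on literal tabs
    tmp <- unlist(strsplit(x, '\t', fixed=TRUE))
    out <- list(tmp[-1])    # the pathways: everything after the first field
    names(out) <- tmp[1]    # the COG id becomes the element name
    out
}, USE.NAMES=FALSE)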
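
If you want to try this without the file, here is a small self-contained
sketch. The three strings in dat are a hypothetical stand-in I typed in
from your example table, and cog2path is just a name I made up for the
result; with your real data, dat would come from the scan() call above.

```r
# Hypothetical stand-in for the file contents: one tab-separated line per COG.
dat <- c("COG0001\tpatha\tpathb\tpathc",
         "COG0002\tpathd\tpathe",
         "COG0003\tpathe\tpathf\tpathg\tpathh")

cog2path <- sapply(dat, function(x) {
    tmp <- unlist(strsplit(x, "\t", fixed = TRUE))
    out <- list(tmp[-1])    # everything after the COG id
    names(out) <- tmp[1]    # the COG id becomes the element name
    out
}, USE.NAMES = FALSE)

str(cog2path)       # a named list with one character vector per COG
cog2path$COG0002    # the pathways for a single COG
```

Each call returns a one-element named list, so sapply can flatten the
results one level into a single list named by COG.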

The one difference between the two shows up if you have a COG with no
pathways (which might not be realistic): this solution keeps the COG
name in the list with a value of character(0), where Gabor's omits the
COG completely. Probably not a big deal either way.
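
To make that concrete, here is a rough sketch with a made-up input in
which COG0002 has no pathways. The data and the names res and
res_nonempty are hypothetical; the last step is only one possible way to
drop the empty entries if you would rather not keep them.

```r
# Hypothetical input: COG0002 has no pathways listed.
dat <- c("COG0001\tpatha\tpathb", "COG0002", "COG0003\tpathe")

res <- sapply(dat, function(x) {
    tmp <- unlist(strsplit(x, "\t", fixed = TRUE))
    out <- list(tmp[-1])
    names(out) <- tmp[1]
    out
}, USE.NAMES = FALSE)

res$COG0002    # character(0): the COG is kept, with an empty vector

# Optionally keep only the COGs that have at least one pathway.
res_nonempty <- res[lengths(res) > 0]
names(res_nonempty)
```

Note that lengths() needs a reasonably recent R; on an older version,
sapply(res, length) > 0 should do the same job.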

Cheers,

Jeff.

On Sun, Oct 10, 2010 at 11:40 AM, Alison Waller <alison.waller at embl.de> wrote:
> Hi all,
>
> I have a large table mapping thousands of COGs(groups of genes) to pathways.
> # Ex
> COG0001 patha   pathb   pathc
> COG0002 pathd   pathe
> COG0003 pathe   pathf   pathg   pathh
> ##
>
> I would like to combine this information into a big list such as below
> COG2PATHWAY<-list(COG0001=c("patha","pathb","pathc"),COG0002=c("pathd","pathe"),COG0003=c("pathe","pathf","pathg","pathh"))
>
> I am stuck and have tried various methods involving (probably mangled)
> versions of lapply and loops.
>
> Any suggestions on the most efficient way to do this would be great.
>
> Thanks,
>
> Alison
>
> Here is my latest attempt.
>
> #####
>
> line_num<-length(scan(file="/g/bork8/waller/test_COGtoPath.txt",what="character",sep="\n"))
> COG2Path<-vector("list",line_num)
> COG2Path<-lapply(1:(line_num-1),function(x)
> scan(file="/g/bork8/waller/test_COGtopath.txt",skip=x,nlines=1,quiet=T,what='character',sep="\t"))
>
> #####
>
> I am getting an error
>
> #####
>
>>COG2Path<-lapply(1:(line_num-1),function(x)
>> scan(file="/g/bork8/waller/test_COGtopath.txt",skip=x,nlines=1,quiet=T,what='character',sep="\t"))
> Error in file(file, "r") : cannot open the connection
> In addition: Warning message:
> In file(file, "r") :
>
> But if I do scan alone I don't get an error
>
> # then I suppose it looks like the easiest way to name the list variables
> # is using unix to cut the first column out and then read that in.
> names(COG2Path)<-scan(file="/g/bork8/waller/test_col_names.txt",sep="\t",what="character")