[R] reading data
arun
smartpink111 at yahoo.com
Fri Feb 15 19:05:48 CET 2013
Hi,
No problem.
See ?c: c() concatenates its arguments into a vector or a list.
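As a small illustration with made-up data (the objects `nested` and `flat` are hypothetical, not from your files): a lapply() inside a lapply() returns a list of one-element lists, and do.call(c, ...) joins those inner lists into a single named list.

```r
# Toy stand-in for the result of the nested lapply(): each outer element
# is itself a one-element named list holding a data frame.
nested <- list(
  list(a1 = data.frame(x = 1:2)),
  list(a2 = data.frame(x = 3:4)),
  list(b1 = data.frame(x = 5:6))
)

# do.call(c, nested) is the same as c(nested[[1]], nested[[2]], nested[[3]]),
# which concatenates the inner lists into one flat list.
flat <- do.call(c, nested)

length(nested)   # 3 elements, each a list of length 1
names(flat)      # "a1" "a2" "b1"
class(flat$a1)   # "data.frame"
```

The data frames survive intact because c() is applied to the lists that wrap them, not to the data frames themselves.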
If I use do.call(cbind, ...) or do.call(rbind, ...) instead, I get a matrix whose cells are lists:
do.call(cbind,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x); lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))
# [,1] [,2] [,3] [,4] [,5] [,6]
#a1 List,11 List,11 List,11 List,11 List,11 List,11
do.call(rbind,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x); lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))
# a1
#[1,] List,11
#[2,] List,11
#[3,] List,11
#[4,] List,11
#[5,] List,11
#[6,] List,11
i.e., a list within a list:
restrial<-lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x); lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))})
str(restrial)
#List of 6
# $ :List of 1
#  ..$ a1:'data.frame': 6 obs. of 11 variables:
#  .. ..$ Id: chr [1:6] "aAA" "aAAAA" "aA" "aAA" ...
#  .. ..$ M : chr [1:6] "1" "1" "2" "1" ...
#  .. ..$ mm: int [1:6] 2 2 1 2 3 2
#  .. ..$ x : int [1:6] 739 2263 1 1965 3660 1972
-----------------------------------------------------------------
str(res)
#List of 6
# $ a1:'data.frame': 6 obs. of 11 variables:
# ..$ Id: chr [1:6] "aAA" "aAAAA" "aA" "aAA" ...
# ..$ M : chr [1:6] "1" "1" "2" "1" ...
# ..$ mm: int [1:6] 2 2 1 2 3 2
# ..$ x : int [1:6] 739 2263 1 1965 3660 1972
-----------------------------------------------------------------
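For readability, the same flat, folder-named list can also be built step by step. This is only a sketch under assumptions: the temporary folder tree, the file name "mmmmm11kk.txt", the toy columns and the object `res_sketch` are all made up so that the code runs on its own; with your data you would point `root` at your "data" folder.

```r
# Build a throwaway directory tree so the sketch is self-contained:
# <tmp>/data_sketch/a1/mmmmm11kk.txt, .../a2/..., .../b1/...
root <- file.path(tempdir(), "data_sketch")
for (d in c("a1", "a2", "b1")) {
  dir.create(file.path(root, d), recursive = TRUE, showWarnings = FALSE)
  write.table(data.frame(Id = c("aA", "aAA"), mm = 1:2),
              file.path(root, d, "mmmmm11kk.txt"),
              row.names = FALSE, quote = FALSE)
}

# Find the target file in every subfolder (list.files() is called once).
files <- list.files(root, pattern = "mmmmm11kk", recursive = TRUE,
                    full.names = TRUE)

# Read each file; fill = TRUE pads rows that have too few fields.
res_sketch <- lapply(files, read.table, header = TRUE,
                     stringsAsFactors = FALSE, fill = TRUE)

# Name each data frame after the folder that contains its file.
names(res_sketch) <- basename(dirname(files))

str(res_sketch, max.level = 1)
```

Here basename(dirname(files)) plays the role of the gsub() call above: it takes the folder name from each path without a regular expression.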
You mentioned naming these "group_a", "group_b", etc.:
names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
res2<-split(res,names(res))
res3<- lapply(res2,function(x) {names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
res3$group_a
#$a1
# Id M mm x b u k j y p v
#1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
#2 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
#3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
#4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
#5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
#6 AA na 2 1972 0.0007000 11 3 AR 25 509 734
#$a2
# Id M mm x b u k j y p v
#1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
#2 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
#3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
#4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
#5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
#6 AA na 2 1972 0.0007000 11 3 AR 25 509 734
#$a3
# Id M mm x b u k j y p v
#1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
#2 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
#3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
#4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
#5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
#6 AA na 2 1972 0.0007000 11 3 AR 25 509 734
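A slightly different way to get the same grouping, shown on a toy list (`res_toy`, `grp` and `by_group` are hypothetical names, and the numbers are placeholders for your data frames): keep the folder names on the list and pass the group labels to split() as a separate vector, so the a1, a2, ... names do not have to be rebuilt afterwards.

```r
# Toy flat list named after folders; the contents are placeholders.
res_toy <- list(a1 = 1, a2 = 2, a3 = 3, b1 = 4, b2 = 5, c1 = 6)

# Drop the digits to get the group letter, then add the "group_" prefix.
grp <- paste0("group_", gsub("\\d+", "", names(res_toy)))

# split() on a list returns one sub-list per group; because the original
# names are left untouched, each element keeps its folder name.
by_group <- split(res_toy, grp)

names(by_group)          # "group_a" "group_b" "group_c"
names(by_group$group_a)  # "a1" "a2" "a3"
```

Unlike the res3 step above, which numbers the elements 1, 2, ... within each group, this keeps whatever digits the folders actually had.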
A.K.
________________________________
From: Vera Costa <veracosta.rt at gmail.com>
To: arun <smartpink111 at yahoo.com>
Sent: Friday, February 15, 2013 12:39 PM
Subject: Re: reading data
Thank you very much, and sorry for all my questions.
But this code isn't grouping by letter, is it? I mean, a1, a2, a3 are the same group (the first letter gives me the name of the group).
Another question: in do.call, you used do.call(c, ...). What is c?
Sorry
2013/2/15 arun <smartpink111 at yahoo.com>
Hi,
>
>Just to add:
>
>
>res<-do.call(c,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x); lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))})) #it seems like one of the rows of your file doesn't have 6 elements, so added fill=TRUE
>
> names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
>res[grep("group_b",names(res))]
>
>I am not sure how you want the grouped data to look. If you want something like this:
>res1<-do.call(rbind,res)
>res2<-lapply(split(res1,gsub("[.0-9]","",row.names(res1))),function(x) {row.names(x)<-1:nrow(x);x})
>res2
>#$group_a
>
> # Id M mm x b u k j y p v
>#1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>#2 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>#3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>#4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>#5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>#6 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>#7 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>#8 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>#9 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>#10 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>#11 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>#12 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>#13 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>#14 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>#15 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>#16 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>#17 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>#18 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>
>
>#$group_b
> # Id M mm x b u k j y p v
>#1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>#2 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>#3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>#4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>#5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>#6 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>#7 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>#8 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>#9 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>#10 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>#11 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>#12 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>
>#$group_c
>
> # Id M mm x b u k j y p v
>#1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>#2 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>#3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>#4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>#5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>#6 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>
>
>#or if you want it like this:
>res2<-split(res,names(res))
>
>res2[["group_b"]]
>
>#$group_b
># Id M mm x b u k j y p v
>#1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>#2 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>#3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>#4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>#5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>#6 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>
>#$group_b
> # Id M mm x b u k j y p v
>#1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926
>#2 aAAAA 1 2 2263 0.0004000 2 2 AR 4 7640 8926
>#3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA
>#4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926
>#5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496
>#6 AA na 2 1972 0.0007000 11 3 AR 25 509 734
>
>Hope this helps.
>
>A.K.
>
>
>
>----- Original Message -----
>From: "veracosta.rt at gmail.com" <veracosta.rt at gmail.com>
>To: smartpink111 at yahoo.com
>Cc:
>Sent: Friday, February 15, 2013 9:15 AM
>Subject: reading data
>
>Hi,
>I posted yesterday and you helped me. I have a small problem.
>
>First of all, I have never worked with regular expressions...
>
>The code that you gave me is OK, but my files are inside the folders a1, a2, a3. Let me try to explain better.
>
>I have one folder named "data". Inside this folder I have some other folders named "a1", "a2", "b1", "b2", ... and inside each of them I have some files. I want only the file "mmmmmm.txt" (every folder has one file with this name).
>The name of the folder gives me the name of the group, but I need to read the file inside. Afterwards, I want "group_a", "group_b", ... because I need to work with this data grouped (and know the name of each group).
>
>Thank you.
>