[R] reading data

arun smartpink111 at yahoo.com
Fri Feb 15 19:05:48 CET 2013


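Hi,
No problem.
See ?c: c() concatenates its arguments into a vector or a list. Here, do.call(c, ...) flattens the list of one-element lists returned by lapply() into a single list of data frames.
If I use do.call(cbind, ...) or do.call(rbind, ...) instead, I get a matrix whose cells are lists: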

do.call(cbind,
        lapply(list.files(recursive=TRUE)[grep("mmmmm11kk", list.files(recursive=TRUE))],
               function(x) {
                 names(x) <- gsub("^(.*)\\/.*", "\\1", x)
                 lapply(x, function(y) read.table(y, header=TRUE, stringsAsFactors=FALSE, fill=TRUE))
               }))
#   [,1]    [,2]    [,3]    [,4]    [,5]    [,6]   
#a1 List,11 List,11 List,11 List,11 List,11 List,11


do.call(rbind,
        lapply(list.files(recursive=TRUE)[grep("mmmmm11kk", list.files(recursive=TRUE))],
               function(x) {
                 names(x) <- gsub("^(.*)\\/.*", "\\1", x)
                 lapply(x, function(y) read.table(y, header=TRUE, stringsAsFactors=FALSE, fill=TRUE))
               }))
#     a1     
#[1,] List,11
#[2,] List,11
#[3,] List,11
#[4,] List,11
#[5,] List,11
#[6,] List,11
i.e., each cell is a list within a list.
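
To make the difference concrete, here is a minimal sketch on made-up data (the list `nested` and its contents are hypothetical, not your files). It should show that c() flattens the inner lists, while rbind() keeps them as cells of a matrix:

```r
# Toy stand-in for the result of the outer lapply(): each element is itself
# a one-element named list holding a data frame (hypothetical data).
nested <- list(
  list(a1 = data.frame(x = 1:2)),
  list(a2 = data.frame(x = 3:4)),
  list(b1 = data.frame(x = 5:6))
)

flat  <- do.call(c, nested)      # c() concatenates the inner lists into one flat list
bound <- do.call(rbind, nested)  # rbind() builds a 3 x 1 matrix whose cells are lists

names(flat)      # "a1" "a2" "b1"
class(flat$a1)   # "data.frame"
dim(bound)       # 3 1
```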

restrial <- lapply(list.files(recursive=TRUE)[grep("mmmmm11kk", list.files(recursive=TRUE))],
                   function(x) {
                     names(x) <- gsub("^(.*)\\/.*", "\\1", x)
                     lapply(x, function(y) read.table(y, header=TRUE, stringsAsFactors=FALSE, fill=TRUE))
                   })
str(restrial)
#List of 6
# $ :List of 1
#  ..$ a1:'data.frame':    6 obs. of  11 variables:
#  .. ..$ Id: chr [1:6] "aAA" "aAAAA" "aA" "aAA" ...
#  .. ..$ M : chr [1:6] "1" "1" "2" "1" ...
#  .. ..$ mm: int [1:6] 2 2 1 2 3 2
#  .. ..$ x : int [1:6] 739 2263 1 1965 3660 1972
  -----------------------------------------------------------------
str(res)
#List of 6
# $ a1:'data.frame':    6 obs. of  11 variables:
#  ..$ Id: chr [1:6] "aAA" "aAAAA" "aA" "aAA" ...
#  ..$ M : chr [1:6] "1" "1" "2" "1" ...
#  ..$ mm: int [1:6] 2 2 1 2 3 2
#  ..$ x : int [1:6] 739 2263 1 1965 3660 1972
-----------------------------------------------------------------
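
As a side note, the folder name can also be taken with dirname()/basename() instead of a regular expression, and list.files() accepts a pattern directly. A hedged sketch, assuming one matching file per subfolder; the temporary folder `data_demo`, the file name "mmmmm11kk.txt" and the tiny data frame are made up for illustration:

```r
# Build a throwaway directory tree that mimics data/a1, data/a2, data/b1,
# each holding one file with the same (hypothetical) name.
root <- file.path(tempdir(), "data_demo")
unlink(root, recursive = TRUE)   # start clean if the demo was run before
for (d in c("a1", "a2", "b1")) {
  dir.create(file.path(root, d), recursive = TRUE, showWarnings = FALSE)
  write.table(data.frame(Id = c("aA", "aAA"), x = 1:2),
              file.path(root, d, "mmmmm11kk.txt"),
              row.names = FALSE, quote = FALSE)
}

# Find the target file in every subfolder; full.names = TRUE keeps the folder in the path.
files <- list.files(root, pattern = "mmmmm11kk", recursive = TRUE, full.names = TRUE)

# Name each path by its parent folder, then read; lapply() carries the names over.
names(files) <- basename(dirname(files))
res_demo <- lapply(files, read.table, header = TRUE,
                   stringsAsFactors = FALSE, fill = TRUE)

names(res_demo)   # "a1" "a2" "b1"
```

Because lapply() is applied to a named character vector here, the result is already a flat named list, so no do.call(c, ...) step is needed in this variant.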

You mentioned naming the groups "group_a", "group_b", etc.:
names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
res2<-split(res,names(res))

# within each group, rename the elements back to a1, a2, ...
res3<- lapply(res2,function(x) {names(x)<-paste(gsub(".*_","",names(x)),seq_along(x),sep="");x})
res3$group_a
#$a1
#     Id  M mm    x         b  u  k  j    y        p    v
#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734

#$a2
#     Id  M mm    x         b  u  k  j    y        p    v
#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734

#$a3
#     Id  M mm    x         b  u  k  j    y        p    v
#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
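
If the original names (a1, a2, ...) should survive the grouping, one option is to pass the group labels to split() as a separate vector rather than overwriting names(res) first. A small sketch on toy data (`res_toy` is hypothetical, standing in for `res`):

```r
# Hypothetical flat list named by folder, standing in for `res`.
res_toy <- list(a1 = data.frame(x = 1), a2 = data.frame(x = 2),
                b1 = data.frame(x = 3))

# Drop the digits from each name to get the group label.
grp <- paste0("group_", gsub("\\d+", "", names(res_toy)))

# split() on a list returns one sub-list per group and keeps the element names.
by_group <- split(res_toy, grp)

names(by_group)           # "group_a" "group_b"
names(by_group$group_a)   # "a1" "a2"
```

From there, do.call(rbind, by_group$group_a) would stack that group's data frames into a single one, if that layout is preferred.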
A.K.
________________________________
From: Vera Costa <veracosta.rt at gmail.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Friday, February 15, 2013 12:39 PM
Subject: Re: reading data


Thank you very much, and sorry for all my questions.

But this code isn't grouping by letter, is it? I mean, a1, a2, a3 are the same group (the first letter gives me the name of the group).

Another question: in do.call, you used do.call(c, ...). What is c?

Sorry.



2013/2/15 arun <smartpink111 at yahoo.com>

>Hi,
>
>Just to add:
>
>
>#it seems like one of the rows of your file doesn't have 6 elements, so added fill=TRUE
>res <- do.call(c,
>  lapply(list.files(recursive=TRUE)[grep("mmmmm11kk", list.files(recursive=TRUE))],
>         function(x) {
>           names(x) <- gsub("^(.*)\\/.*", "\\1", x)
>           lapply(x, function(y) read.table(y, header=TRUE, stringsAsFactors=FALSE, fill=TRUE))
>         }))
>
> names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
>res[grep("group_b",names(res))]
>
>I am not sure how you want the grouped data to look. If you want one data frame per group, something like this:
>res1<-do.call(rbind,res)
>res2<-lapply(split(res1,gsub("[.0-9]","",row.names(res1))),function(x) {row.names(x)<-1:nrow(x);x})
>res2
>#$group_a
>
> #     Id  M mm    x         b  u  k  j    y        p    v
>#1    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>#2  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>#3     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>#4    aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>#5   aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>#6     AA na  2 1972 0.0007000 11  3 AR   25      509  734
>#7    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>#8  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>#9     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>#10   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>#11  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>#12    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>#13   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>#14 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>#15    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>#16   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>#17  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>#18    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>
>
>#$group_b
> #     Id  M mm    x         b  u  k  j    y        p    v
>#1    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>#2  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>#3     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>#4    aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>#5   aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>#6     AA na  2 1972 0.0007000 11  3 AR   25      509  734
>#7    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>#8  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>#9     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>#10   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>#11  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>#12    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>
>#$group_c
>
> #    Id  M mm    x         b  u  k  j    y        p    v
>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>
>
>#or, if you want to keep each file as a separate data frame within its group:
>res2<-split(res,names(res))
>
>res2[["group_b"]]
>
>#$group_b
>#     Id  M mm    x         b  u  k  j    y        p    v
>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>
>#$group_b
> #    Id  M mm    x         b  u  k  j    y        p    v
>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>
>Hope this helps.
>
>A.K.
>
>
>
>----- Original Message -----
>From: "veracosta.rt at gmail.com" <veracosta.rt at gmail.com>
>To: smartpink111 at yahoo.com
>Cc:
>Sent: Friday, February 15, 2013 9:15 AM
>Subject: reading data
>
>Hi,
>I posted yesterday and you helped me. I have a small problem.
>
>First, I have never worked with regular expressions...
>
>The code that you gave me is fine, but my files are inside the folders a1, a2, a3. Let me try to explain better.
>
>I have one folder named "data". Inside this folder I have other folders named "a1", "a2", "b1", "b2", ... and inside each of them I have some files. I want only the file "mmmmmm.txt" (every folder has one file with this name).
>The name of the folder gives me the name of the group, but I need to read the file inside it and then have "group_a", "group_b", ... because I need to work with this data grouped (and know the name of each group).
>
>Thank you.
>   


