[R] Extracting columns
arun
smartpink111 at yahoo.com
Thu Nov 8 22:12:54 CET 2012
Hi Silvano,
I was using sample() to create the column names without setting a seed, so the same column name could end up in more than one list element. In your case, the number of columns is fixed and I believe no two columns across the 22 files share a name. The code will pick out every column whose name matches one of the 100 column names listed in the other file.
For example:
set.seed(423)
list1<-lapply(1:5,function(x) data.frame(matrix(sample(1:100,50,replace=TRUE),ncol=5)))
colnames(list1[[1]])<-sample(paste0("A",1:10),5,replace=FALSE)
colnames(list1[[2]])<-sample(paste0("A",11:20),5,replace=FALSE)
colnames(list1[[3]])<-sample(paste0("A",21:30),5,replace=FALSE)
colnames(list1[[4]])<-sample(paste0("A",31:40),5,replace=FALSE)
colnames(list1[[5]])<-sample(paste0("A",41:50),5,replace=FALSE)
coldat<-data.frame(col1=c("A21","A31","A47"))
res <- do.call(cbind, lapply(list1, function(x) x[colnames(x) %in%
coldat[ , 1]]))
res
# A21 A31 A47
#1 66 52 60
#2 78 89 32
#3 28 45 83
#4 45 33 94
#5 6 35 41
#6 89 31 32
#7 91 91 46
#8 30 73 12
#9 89 67 8
#10 11 8 97
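For files on disk, the same idea could be sketched roughly as below. This is only a sketch under assumptions: the temporary files, the use of 3 files instead of 22, the column names c1 to c12, and the helper name extract_cols are all made up so that the example runs on its own.

```r
# Hypothetical setup: write three small files so the example is self-contained.
tmpdir <- tempfile("cromo")
dir.create(tmpdir)
set.seed(1)
for (i in 1:3) {
  d <- data.frame(matrix(sample(1:100, 20, replace = TRUE), ncol = 4))
  colnames(d) <- paste0("c", (4 * (i - 1) + 1):(4 * i))  # c1..c4, c5..c8, c9..c12
  write.table(d, file.path(tmpdir, paste0("cromo", i, ".raw")),
              row.names = FALSE, quote = FALSE)
}

# Names of the columns to extract (in practice, read from the other file).
wanted <- c("c2", "c7", "c12")

# extract_cols is a hypothetical helper: read one file, keep matching columns.
extract_cols <- function(f, wanted) {
  x <- read.table(f, header = TRUE)
  x[, colnames(x) %in% wanted, drop = FALSE]
}

files <- file.path(tmpdir, paste0("cromo", 1:3, ".raw"))
res_files <- do.call(cbind, lapply(files, extract_cols, wanted = wanted))

colnames(res_files)
setdiff(wanted, colnames(res_files))  # requested names that were not found
```

Using drop = FALSE keeps a data frame even when a file contributes one column or none, and the setdiff() call is a quick way to see whether any requested name was missing from every file.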
A.K.
----- Original Message -----
From: Silvano Cesar da Costa <silvano at uel.br>
To: arun <smartpink111 at yahoo.com>
Cc:
Sent: Thursday, November 8, 2012 3:22 PM
Subject: Re: [R] Extracting columns
Thanks Arun, but the columns need to be unique.
> # Arun:
> list1 <- lapply(1:5, function(x)
+ data.frame(matrix(sample(1:100, 50, replace=TRUE), ncol=5)))
>
> list1 <- lapply(list1, function(x) {
+ colnames(x) <- sample(paste0("A", 1:50), 5, replace=FALSE)
+ return(x)})
>
> coldat <- data.frame(col1=c("A9", "A35", "A7", "A30")) # column names to extract
> res <- do.call(cbind, lapply(list1, function(x) x[colnames(x) %in%
+ coldat[ , 1]]))
> res
A9 A35 A35 A7
1 100 97 65 85
2 92 4 11 53
3 99 18 75 73
4 69 15 58 89
5 65 80 72 71
6 81 92 18 23
7 95 65 60 1
8 92 66 29 88
9 61 59 9 88
10 45 54 7 80
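The repeated A35 in the output above arises when the same name is assigned to columns in two different list elements. One possible way to detect this and keep only the first occurrence of each name is sketched below; the small data frames d1 and d2 are made up for illustration and are not from the actual files.

```r
# Made-up data: two data frames that both contain a column named A35.
d1 <- data.frame(A9 = 1:3, A35 = 4:6)
d2 <- data.frame(A35 = 7:9, A7 = 10:12)
coldat <- data.frame(col1 = c("A9", "A35", "A7", "A30"))

res <- do.call(cbind, lapply(list(d1, d2), function(x)
  x[, colnames(x) %in% coldat[, 1], drop = FALSE]))

# Column names that occur more than once across the list elements.
dup_names <- unique(colnames(res)[duplicated(colnames(res))])

# Keep only the first occurrence of each column name.
res_unique <- res[, !duplicated(colnames(res)), drop = FALSE]
colnames(res_unique)
```

Whether the first occurrence is the right one to keep depends on the data; if the files really have no names in common, dup_names will simply be empty.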
> Hi,
>
> Maybe this helps:
>
> list1<-lapply(1:5,function(x)
> data.frame(matrix(sample(1:100,50,replace=TRUE),ncol=5)))
> list1<-lapply(list1,function(x)
> {colnames(x)<-sample(paste0("A",1:50),5,replace=FALSE)
> return(x)})
> coldat<-data.frame(col1=c("A9","A35","A7","A30")) # column names to extract
> res<-do.call(cbind,lapply(list1,function(x) x[colnames(x)%in%
> coldat[,1]]))
> res
> # A9 A35 A7 A30 A9 A7
> #1 42 56 10 67 14 3
> #2 98 42 49 38 6 97
> #3 7 67 10 15 15 80
> #4 85 82 24 97 2 95
> #5 64 8 49 77 17 9
> #6 57 60 4 39 4 89
> #7 86 41 90 50 80 61
> #8 70 84 23 46 32 61
> #9 11 29 42 76 100 100
> #10 92 19 28 38 72 87
>
> In your case, you have 22 files or dataframes. You can create a list of
> 22 dataframes and do the same step as above.
> list1<-list(A1,A2,A3,....,A22)
>
> A.K.
>
>
>
>
> ----- Original Message -----
> From: Silvano Cesar da Costa <silvano at uel.br>
> To: r-help at r-project.org
> Cc:
> Sent: Thursday, November 8, 2012 10:50 AM
> Subject: [R] Extracting columns
>
> Hi,
>
> I have 22 files (A1, A2, ..., A22) with different numbers of columns,
> totaling 10,000 columns: c1, c2, c3, ..., c10000
>
> I have another file with a list of 100 columns that I need to extract.
> These 100 columns are distributed across the 22 files.
>
> How can I extract the 100 columns from the 22 files?
>
> I have done it "manually" with the following commands, for example:
>
> cromo1 = read.table("~/cromo1.raw", header = TRUE)
> c1 = subset(cromo1, select = c('c1', 'c50', 'c750'))
>
> In this case, I know that the columns c1, c50 and c750 are in cromo1.raw.
> As you can see, I would need to apply the commands above 22 times.
>
> Is there a way to automate these operations?
>
>
> ---------------------------------------------
> Silvano Cesar da Costa
>
> Universidade Estadual de Londrina
> Centro de Ciências Exatas
> Departamento de Estatística
>
> Fone: (43) 3371-4346
>
---------------------------------------------
Silvano Cesar da Costa
Universidade Estadual de Londrina
Centro de Ciências Exatas
Departamento de Estatística
Fone: (43) 3371-4346
---------------------------------------------