[R] Replacing values in dataframes

Petr PIKAL petr.pikal at precheza.cz
Mon Sep 21 12:08:30 CEST 2009


Hi

I believe you are quite close.

r-help-bounces at r-project.org napsal dne 21.09.2009 11:38:29:

> 
> Thank you so much for trying to help me.
> However, I still can't get it to work.
> I will clarify a bit. If you somehow have time to help me, I would much
> appreciate it.
> NAD and Prot.amount are both data.frames. There are several different
> data.frames called NAD, NAD1, NAD2 etc. that I would like to run the loop
> over.
> Prot.amount has all the sample names as its row.names and the correct
> values as Prot.amount[,1]. The NADs are data.frames with values for the
> samples; they DO NOT contain all the samples.
> > NAD[1:3,1:3]
>     Sample.Id Main.abs..1 Main.abs..2
> 148       10a     0.04836     0.04994
> 167  11a_1109     0.32245     0.36541
> 173  11b_1109     0.29293     0.32815
> > Prot.amount[1:3,1]
>      10a 11a_1109 11b_1109
>   15.516   38.248   42.297

First copy the row names into a column named Sample.Id, spelled exactly
as in NAD so that merge can match on it:

Prot.amount$Sample.Id <- row.names(Prot.amount)
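
A small sketch of why the column name matters, using made-up stand-ins for
NAD and Prot.amount. The values come from the printouts in this thread; the
column name "amount" is an assumption, since the real name is not shown.
When the two data frames share no column name, merge() returns every
combination of rows, which may be what made the earlier attempt look as if
the first sample had been attached to all rows.

```r
# Toy stand-ins for the real data; "amount" is a hypothetical column name.
NAD <- data.frame(
  Sample.Id   = c("10a", "11a_1109", "11b_1109"),
  Main.abs..1 = c(0.04836, 0.32245, 0.29293),
  stringsAsFactors = FALSE
)
Prot.amount <- data.frame(
  amount    = c(15.516, 38.248, 42.297, 36.134),
  row.names = c("10a", "11a_1109", "11b_1109", "12a_1109")
)

# No shared column name yet: merge() has no key to match on and
# returns all combinations of rows (3 * 4 here).
no.key <- merge(NAD, Prot.amount)
nrow(no.key)

# Copy the row names into a column spelled exactly like the one in NAD.
Prot.amount$Sample.Id <- row.names(Prot.amount)

# By default merge() joins on the column names both data frames share.
shared <- intersect(names(NAD), names(Prot.amount))
shared
```

Column names in R are case sensitive, so Sample.id and Sample.Id count as
two different columns.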

> > dim(Prot.amount)
> [1] 30  1
> > dim(NAD)
> [1] 23 12
> The thing I want to do is to replace the sample name with the correct
> Prot.amount, so that in all data.frames where I have, for instance, sample
> 10a, I have a column with its corresponding Prot.amount, here 15.516.
> 
> Sample.id=row.names(Prot.amount)
> 
> gives a vector with all the correct sample.id's
> new.id=Prot.amount[1:nrow(Prot.amount),] gives all the samples and their
> correct prot.amounts.
> > new.id[1:8]
>      10a 11a_1109 11b_1109 12a_1109 12b_1109  1a_1109       2a  2a_1009 
>   15.516   38.248   42.297   36.134   25.467   28.184    9.927    2.242 
>  iddf <- data.frame((Sample.id=row.names(Prot.amount)), new.id=Prot.amount[1:nrow(Prot.amount),])
> Thus, 
> 
> newNAD <- merge( NAD, iddf) 

newNAD <- merge(NAD, Prot.amount, all.x = TRUE)

This should keep every row of NAD and add the matching value from
Prot.amount to each of them.
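
For illustration, here is how the merge might look on toy data, and how it
could be applied to several data frames in one go. This is only a sketch:
the column name "amount", the helper add.prot, and the way NAD1 is built
here are assumptions, not taken from the real data.

```r
# Toy lookup table; "amount" is a hypothetical column name.
Prot.amount <- data.frame(
  amount    = c(15.516, 38.248, 42.297, 36.134),
  row.names = c("10a", "11a_1109", "11b_1109", "12a_1109")
)
Prot.amount$Sample.Id <- row.names(Prot.amount)

# Toy measurement tables; NAD1 is just a made-up second example.
NAD <- data.frame(
  Sample.Id   = c("10a", "11a_1109", "11b_1109"),
  Main.abs..1 = c(0.04836, 0.32245, 0.29293),
  stringsAsFactors = FALSE
)
NAD1 <- NAD[c(3, 1), ]

# Hypothetical helper: keep all rows of df, add the matching amount.
add.prot <- function(df) {
  merge(df, Prot.amount, by = "Sample.Id", all.x = TRUE)
}

# Put the data frames in a named list and merge each one.
nad.list <- list(NAD = NAD, NAD1 = NAD1)
merged <- lapply(nad.list, add.prot)
merged$NAD
```

Note that merge() sorts the result by the key column by default, so the
row order may differ from the input.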

Regards
Petr

> 
> > newNAD[4,]
>   Sample.Id Main.abs..1 Main.abs..2 Main.abs..3 Main.abs..4 Main.abs..5 Main.abs..6 Main.abs..7 Main.abs..8 Main.abs..9
> 4  12a_1109     0.26291     0.26794     0.27809     0.28948     0.29654      0.3051     0.31388      0.3223     0.33066
>   Main.abs..10 Main.abs..11 X.Sample.id...row.names.Prot.amount.. new.id
> 4      0.33806      0.34577                                   10a 15.516
> 
> 
> 
> This produces a correct-looking data.frame, except that the first Sample.id
> and its corresponding Prot.amount get merged onto all rows.
> 
> 
> 
> So what I need to do is a loop. This is my suggestion: 
> 
> 
> 
> Changing<-function(A){
> 
> tmp<-mat.or.vec(nr=nrow(A),nc=1)
> newNAD <-mat.or.vec(nr=nrow(A),nc=1)
> 
> for (j in 1:nrow(A)) {
> tmp[j] <- data.frame((A$Sample.id=row.names(Prot.amount)), (A$new.id=Prot.amount[j,]))
>  newNAD <- merge(A[j,], tmp[j])
>  }
>  newNAD
> }
> 
> 
> 
> However, of course it doesn't work that easily, because
> 
> > Changing(NAD)
> Error in `$<-.data.frame`(`*tmp*`, "Sample.id", value = c("10a", "11a_1109",  :
>   replacement has 30 rows, data has 23
> 
> 
> 
> meaning the data.frame NAD has 23 samples, while the Prot.amount
> file has 30 different samples and their corresponding values. How do I
> tell the loop that I only want the Prot.amount values for the samples
> found in the NAD data.frame?
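
One loop-free way to pick out only the samples present in NAD might be a
lookup with match(). A minimal sketch on toy data; the column names
"amount" and "Prot.value" are made up for the example:

```r
# Toy lookup table with more samples than NAD contains.
Prot.amount <- data.frame(
  amount    = c(15.516, 38.248, 42.297, 36.134),
  row.names = c("10a", "11a_1109", "11b_1109", "12a_1109")
)

# Toy measurement table, deliberately in a different order.
NAD <- data.frame(
  Sample.Id   = c("11b_1109", "10a", "11a_1109"),
  Main.abs..1 = c(0.29293, 0.04836, 0.32245),
  stringsAsFactors = FALSE
)

# For each sample in NAD, find the position of its row in Prot.amount.
idx <- match(NAD$Sample.Id, row.names(Prot.amount))

# Look up the values by position and store them in a new column.
NAD$Prot.value <- Prot.amount[idx, 1]
NAD
```

Unlike merge(), this keeps the rows of NAD in their original order. A
sample that does not occur in Prot.amount gets NA.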
> 
> 
> 
> Thank you so much!
> 
> 
> 
> > Date: Sat, 19 Sep 2009 21:35:31 -0700
> > To: monnire at hotmail.com; r-help at r-project.org
> > From: macq at llnl.gov
> > Subject: Re: [R] Replacing values in dataframes
> > 
> > What I would probably do is along these lines:
> > 
> > iddf <- data.frame(Sample.id=names(Prot.amount), new.id=Prot.amount[1,])
> > 
> > newNAD <- merge( NAD, iddf)
> > 
> > This is not tested, but it looks right to me, 
> > assuming I understand the structure of what 
> > you're trying to do.
> > 
> > I'm also assuming that NAD has more than three 
> > rows, and that Prot.amount has as many columns as 
> > NAD has rows. And that you just showed us the 
> > first three rows of NAD and first three columns 
> > of Prot.amount in order to keep the email simple.
> > 
> > One final note ... if Prot.amount is an object 
> > within R, it is *not* a file. You may have read 
> > it in from a file, of course, but it isn't a file 
> > inside R. I'm assuming it's a dataframe.
> > 
> > -Don
> > 
> > At 1:18 PM +0300 9/19/09, Monna Nygård wrote:
> > >Hi,
> > >
> > >
> > >
> > >This is a question of a newbie getting into the exciting world of R.
> > >
> > >
> > >
> > >I have several dataframes in the same format as NAD:
> > >
> > >
> > >
> > >
> > >
> > >> NAD[1:3,1:3]
> > >
> > >    Sample.Id Main.abs..1 Main.abs..2
> > >148       10a     0.04836     0.04994
> > >167  11a_1109     0.32245     0.36541
> > >173  11b_1109     0.29293     0.32815
> > >
> > >
> > >What I want to do is to replace the Sample.Id
> > >with a corresponding number. The number I have in
> > >another file, called Prot.amount
> > >
> > >
> > >
> > >> Prot.amount[1:3,1]
> > >     10a 11a_1109 11b_1109
> > >  15.516   38.248   42.297
> > >
> > >
> > >
> > >
> > >
> > >> row.names(NAD)<-(NAD[,1])
> > >> NAD$Sample.Id <- replace(NAD$Sample.Id, 
> > >>NAD$Sample.Id=="10a",Prot.amount["10a",1])
> > >
> > >> NAD[1:3,1:3]
> > >         Sample.Id Main.abs..1 Main.abs..2
> > >10a         15.516     0.04836     0.04994
> > >11a_1109  11a_1109     0.32245     0.36541
> > >11b_1109  11b_1109     0.29293     0.32815
> > >
> > >
> > >
> > >So what I have tried to do is to write a 
> > >function that would allow me to replace the 
> > >values automatically of all dataframes. This I 
> > >just can't get to work. 
> > >
> > >
> > >
> > >Thank you so much in advance!
> > > 
> > >_________________________________________________________________
> > >[[elided Hotmail spam]]
> > >
> > > [[alternative HTML version deleted]]
> > >
> > >______________________________________________
> > >R-help at r-project.org mailing list
> > >https://stat.ethz.ch/mailman/listinfo/r-help
> > >PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > >and provide commented, minimal, self-contained, reproducible code.
> > 
> > 
> > -- 
> > ---------------------------------
> > Don MacQueen
> > Lawrence Livermore National Laboratory
> > Livermore, CA, USA
> > 925-423-1062
> > macq at llnl.gov
> > ---------------------------------
> 



