[R] Find and replace all the elements in a data frame

Sarah Goslee sarah.goslee at gmail.com
Thu Feb 17 18:25:47 CET 2011


Josh, you've made it far too complicated. Here's one simpler way. Note
that I added as.is=TRUE to your read.table statement so the values are
read as character strings and NOT factors, since I wouldn't think you
want factors here.

> x <- read.table(textConnection("locus1 locus2 locus3
+ A T C
+ A T NA
+ T C C
+ A T G"), header = TRUE, as.is=TRUE)
> closeAllConnections()
>
> x2 <- x
> x2[x2 == "A"] <- "A/A"
> x2[x2 == "T"] <- "T/T"
> x2[x2 == "G"] <- "G/G"
> x2[x2 == "C"] <- "C/C"
> x2
  locus1 locus2 locus3
1    A/A    T/T    C/C
2    A/A    T/T   <NA>
3    T/T    C/C    C/C
4    A/A    T/T    G/G
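
As for the error in the original loop: comparing NA with anything gives
NA rather than TRUE or FALSE, and if() cannot branch on NA. The indexed
assignment above does not hit this, and the NA cell stays NA, as the
output shows. If a cell-by-cell test is really wanted, a minimal sketch
of a guard might look like this (recode_cell is a made-up helper name,
not something from the original post):

```r
# NA compared with a string is NA, which is what breaks if().
cmp <- NA == "A"

# Hypothetical helper: check for NA first, then double the allele.
recode_cell <- function(cell) {
  if (is.na(cell)) return(NA_character_)
  paste(cell, cell, sep = "/")
}

recode_cell("A")
recode_cell(NA)
```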

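The four assignments can probably also be collapsed into a single pass
with a named lookup vector. This is only a sketch: the data frame is
rebuilt by hand here so the example is self-contained, and the name
recode is my own choice, not part of the original code.

```r
# Same data as above, built directly as character columns.
x <- data.frame(
  locus1 = c("A", "A", "T", "A"),
  locus2 = c("T", "T", "C", "T"),
  locus3 = c("C", NA, "C", "G"),
  stringsAsFactors = FALSE
)

# Names are the old values, elements are the replacements.
recode <- c(A = "A/A", T = "T/T", G = "G/G", C = "C/C")

x2 <- x
# x2[] <- keeps the data frame structure; looking up an NA name
# returns NA, so missing values carry over unchanged.
x2[] <- lapply(x, function(col) unname(recode[col]))
x2
```

Any value that is not in the lookup vector would also come back as NA,
so this assumes the four bases are the only non-missing values.
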
If you do for some reason want a factor, you'll need to adjust the
levels for each column before doing this.
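
One possible way to handle the factor case, offered as a sketch rather
than the only approach, is to rename the levels of each column directly
instead of replacing values. The data frame below is built by hand with
factor columns purely for illustration.

```r
# Same data, but stored as factors.
x <- data.frame(
  locus1 = factor(c("A", "A", "T", "A")),
  locus2 = factor(c("T", "T", "C", "T")),
  locus3 = factor(c("C", NA, "C", "G"))
)

x2 <- x
# Rename each column's levels from "A" to "A/A" and so on.
# NA is not a level, so missing values are left alone.
x2[] <- lapply(x, function(col) {
  levels(col) <- paste(levels(col), levels(col), sep = "/")
  col
})
x2
```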

Sarah

On Thu, Feb 17, 2011 at 11:54 AM, Josh B <joshb41 at yahoo.com> wrote:
> Hi all,
>
> I'm having a problem once again, trying to do something very simple. Consider
> the following data frame:
>
> x <- read.table(textConnection("locus1 locus2 locus3
> A T C
> A T NA
> T C C
> A T G"), header = TRUE)
> closeAllConnections()
>
> I am trying to make a new data frame, replacing "A" with "A/A", "T" with "T/T",
> "G" with "G/G", and "C" with "C/C". Note also the presence of an "NA" (missing
> data) in the data frame, which should be carried over to the new data frame.
>
> Here is what I am trying, which fails miserably:
>
> x2 <- data.frame(matrix(nrow = nrow(x), ncol = ncol(x)))
>
> for (i in 1:nrow(x)){
>    for (j in 1:ncol(x)){
>        if(x[i, j] == 'A') {x2[i, j] <- 'A/A'} else{
>            if(x[i, j] == 'T') {x2[i, j] <- 'T/T'} else{
>                 if(x[i, j] == 'G') {x2[i, j] <- 'G/G'} else{
>                    if(x[i, j] == 'C') {x2[i, j] <- 'C/C'} else{x2[i, j] <- NA}
>                }
>           }
>       }
>    }
> }
>
> I get the following error message:
> Error in if (x[i, j] == "A") { : missing value where TRUE/FALSE needed
>
> So what am I doing wrong? If you can provide me with specific code that fixes
> the problem and gets the job done, that would be the most useful.
>
>
> Thanks very much in advance for your help!
>
> Sincerely,
> -----------------------------------
> Josh Banta, Ph.D
> Center for Genomics and Systems Biology
> New York University
> 100 Washington Square East
> New York, NY 10003
> Tel: (212) 998-8465
> http://plantevolutionaryecology.org




-- 
Sarah Goslee
http://www.functionaldiversity.org


