[R] Comparison of two files with multiple arguments

Michael Bedward michael.bedward at gmail.com
Tue Oct 12 05:19:15 CEST 2010


Hello,

Here's one way to do it. It assumes dat holds character values, not factors,
and that ref is a character vector with one element per row of dat.

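# start with 0 everywhere (single letter that matches the reference)
dat2 <- matrix(0, nrow(dat), ncol(dat))
dat2[ is.na(dat) ] <- NA
# mismatches get 1; this must come before the comma step, because
# a value such as "T,G" also differs from ref and would overwrite the 2
dat2[ apply(dat, 2, function(x) x != ref) ] <- 1
dat2[ apply(dat, 2, function(x) grepl(",", x)) ] <- 2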
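For anyone who wants to run this end to end, here is a self-contained sketch
of the same recoding. It is only an illustration under a few assumptions: ref
is typed in by hand from the reference file in the question, the data are read
with stringsAsFactors = FALSE so the values stay character, and the names m
and out are made up for this example. The mismatch step is applied before the
comma step, since a value like "T,G" also differs from its reference letter.

```r
# Example data from the question, kept as character rather than factor
dat <- read.table(text = "T T,G G T
C NA G G
A,T A A NA", stringsAsFactors = FALSE)

# Reference letters, one per row of dat (assumed to be already loaded)
ref <- c("G", "C", "A")

m <- as.matrix(dat)                  # character matrix, NA preserved
out <- matrix(0, nrow(m), ncol(m))   # 0 = single letter matching ref
out[is.na(m)] <- NA                  # rule 4: NA stays NA
out[which(m != ref)] <- 1            # rule 3: ref is recycled down each column
out[grepl(",", m)] <- 2              # rule 1: comma pairs override the 1s
out
```

Because ref has as many elements as dat has rows, the comparison m != ref
lines up ref[i] with row i in every column. On the example data, out
reproduces the desired output shown in the question.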

Michael


On 12 October 2010 13:24, burgundy <sauburn at yahoo.com> wrote:
>
> Hello,
>
> I have an example file which can be generated using:
>
> dat <- read.table(tc <- textConnection(
> 'T T,G G T
> C NA G G
> A,T A A NA'), sep="")
>
> I also have a reference file with the same number of rows, for example:
> G
> C
> A
>
> I would like to transform the file to numerical values using the following
> arguments:
> 1) Where data points have two letters separated by a comma, e.g. "T,G",
> replace with a "2"
> 2) Where single letter data points match the data point in the corresponding
> row of the reference file, replace with a "0"
> 3) Where single letter data points do not match the reference file, replace
> with a "1"
> 4) NA is left as NA
>
> In the example, the output file would look like:
>
> 1 2 0 1
> 0 NA 1 1
> 2 0 0 NA
>
> Any advice would be very much appreciated. Also, if you know of any good
> books or online courses that could help me learn how to deal with these
> sorts of data handling queries, that would also be great!
>
> Thank you