[R] recoding genetic information using gsub

William Dunlap wdunlap at tibco.com
Fri Dec 5 21:10:29 CET 2014


Does the following do what you want?
> raw <- c("A/B", " /B", "A/", "/ ")
> tmp <- sub("^ */", "./", raw)
> cleaned <- sub("/ *$", "/.", tmp)
> cleaned
[1] "A/B" "./B" "A/." "./."

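(The "^" and "$" anchor each pattern to the start or end of the string,
so complete genotypes such as "A/B" are left alone. The " *" is to allow
optional spaces before or after the slash.)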
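
If it helps, the same two steps can be wrapped in a small function and applied to a whole column. This is only a sketch: the name fill_missing_alleles and the toy data frame GT with its genotype column are made up for illustration (the original post only mentions GT[,6]), and it assumes a missing allele shows up as nothing, or only spaces, on one side of the slash.

```r
# Hypothetical helper (not from the original post): replace a missing
# allele on either side of "/" with ".".
fill_missing_alleles <- function(x) {
  x <- sub("^ */", "./", x)  # nothing (or only spaces) before the slash
  sub("/ *$", "/.", x)       # nothing (or only spaces) after the slash
}

# Toy stand-in for the poster's data; the column name is an assumption.
GT <- data.frame(genotype = c("A/T", "T/", "/G", "/", "C/G"),
                 stringsAsFactors = FALSE)
GT$genotype <- fill_missing_alleles(GT$genotype)
GT$genotype
# [1] "A/T" "T/." "./G" "./." "C/G"
```

sub() is enough here, rather than gsub(), because each anchored pattern can match at most once per string.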


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Fri, Dec 5, 2014 at 11:24 AM, Kate Ignatius <kate.ignatius at gmail.com>
wrote:

> I have genetic information for several thousand individuals:
>
> A/T
> T/G
> C/G  etc
>
> For some individuals there are some genotypes that are like this:  A/,
> C/, T/, G/ or even just / which represents missing and I want to
> change these to the following:
>
> A/ A/.
> C/ C/.
> G/ G/.
> T/ T/.
> / ./.
> /A ./A
> /C ./C
> /G ./G
> /T ./T
>
> I've tried to use gsub with a command like the following:
>
> gsub("A/","[A/.]", GT[,6])
>
> but if genotypes aren't like the above, the command will change them to
> look something like:
>
> A/.T
> T/.G
> C/.G
>
> Is there any way to be more specific in gsub?
>
> Thanks!
>
