[R] recoding genetic information using gsub

Sarah Goslee sarah.goslee at gmail.com
Fri Dec 5 20:30:49 CET 2014


Hi,

Briefly, you need to read about regular expressions. It's possible to
be incredibly specific, and even to do what you want with a single
line of code.

It's hard to be certain of exactly what you need, though, without a
reproducible example. See inline for one possibility.

On Fri, Dec 5, 2014 at 2:24 PM, Kate Ignatius <kate.ignatius at gmail.com> wrote:
> I have genetic information for several thousand individuals:
>
> A/T
> T/G
> C/G  etc
>
> For some individuals there are some genotypes that are like this:  A/,
> C/, T/, G/ or even just / which represents missing and I want to
> change these to the following:
>
> A/ A/.
> C/ C/.
> G/ G/.
> T/ T/.
> / ./.
> /A ./A
> /C ./C
> /G ./G
> /T ./T
>
> I've tried to use gsub with a command like the following:
>
> gsub("A/","[A/.]", GT[,6])

I don't understand why you put square brackets in the replacement;
they are inserted literally there. You probably also want the
end-of-string anchor, $, to distinguish
A/
from
A/A

gsub("A/$","A/.", GT[,6])
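
Here is a small sketch of the difference the anchor makes. The vector gt
is made up to stand in for GT[,6], since I haven't seen your data, so
treat the values as an assumption.

```r
# Hypothetical sample values standing in for GT[,6]
gt <- c("A/T", "A/", "A/A")

# Unanchored: "A/" matches anywhere, so complete genotypes are altered too
unanchored <- gsub("A/", "A/.", gt)

# Anchored with $: matches only when "A/" is at the end of the string
anchored <- gsub("A/$", "A/.", gt)

print(unanchored)
print(anchored)
```

With the anchor, only the incomplete genotype should change; A/T and A/A
are left alone.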


> but if genotypes arent like the above, the command will change it to
> look something like:
>
> A/.T
> T/.G
> C/.G
>
> Is there anyway to be more specific in gsub?
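
Yes. As one possibility, anchoring on the slash itself rather than on
each letter covers all nine cases in your table at once. This is only a
sketch: gt is a made-up vector in place of GT[,6], and I'm assuming the
values are character strings with no leading or trailing whitespace.

```r
# Hypothetical sample values standing in for GT[,6]
gt <- c("A/T", "T/G", "A/", "C/", "/", "/A", "/T")

# Inner sub(): a string starting with "/" gets "." put in front of it.
# Outer sub(): a string ending with "/" gets "." appended.
recoded <- sub("/$", "/.", sub("^/", "./", gt))

print(recoded)
```

sub() is enough here because each anchored pattern can match at most
once per string. Complete genotypes such as A/T match neither pattern
and pass through unchanged.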


Sarah

-- 
Sarah Goslee
http://www.functionaldiversity.org


