[R] recoding genetic information using gsub
Sarah Goslee
sarah.goslee at gmail.com
Fri Dec 5 20:30:49 CET 2014
Hi,
Briefly, you need to read about regular expressions. It's possible to
be incredibly specific, and even to do what you want with a single
line of code.
It's hard to be certain of exactly what you need, though, without a
reproducible example. See inline for one possibility.
On Fri, Dec 5, 2014 at 2:24 PM, Kate Ignatius <kate.ignatius at gmail.com> wrote:
> I have genetic information for several thousand individuals:
>
> A/T
> T/G
> C/G etc
>
> For some individuals there are some genotypes that are like this: A/,
> C/, T/, G/ or even just / which represents missing and I want to
> change these to the following:
>
> A/ A/.
> C/ C/.
> G/ G/.
> T/ T/.
> / ./.
> /A ./A
> /C ./C
> /G ./G
> /T ./T
>
> I've tried to use gsub with a command like the following:
>
> gsub("A/","[A/.]", GT[,6])
I don't understand why you put square brackets in; in the replacement
string they are inserted literally. You probably also want the
end-of-string anchor ($) to distinguish
A/
from
A/A
gsub("A/$","A/.", GT[,6])
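To see what the anchor changes, here is a small sketch on made-up values (the vector x is hypothetical, not your data):

```r
# Hypothetical genotypes: one incomplete, two complete
x <- c("A/", "A/T", "A/A")

# Unanchored: "A/" is matched anywhere, so complete genotypes are altered too
unanchored <- gsub("A/", "A/.", x)
unanchored
# "A/."  "A/.T" "A/.A"

# Anchored with $: only a string that ends right after the slash is changed
anchored <- gsub("A/$", "A/.", x)
anchored
# "A/."  "A/T"  "A/A"
```

The unanchored result is the same kind of output you describe below.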
> but if genotypes aren't like the above, the command will change it to
> look something like:
>
> A/.T
> T/.G
> C/.G
>
> Is there any way to be more specific in gsub?
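Since the allele itself doesn't matter, one possible way to cover all nine cases at once is to anchor on the slash alone. This is only a sketch: gt is a made-up vector standing in for GT[,6], and it assumes every value contains exactly one slash and no stray whitespace.

```r
# Made-up genotypes standing in for GT[, 6]
gt <- c("A/T", "A/", "/", "/G", "C/G", "T/")

# Fill in a missing allele before the slash, then a missing one after it.
# ^ and $ tie each pattern to the start or end of the string, so complete
# genotypes such as "A/T" are left alone.
recoded <- sub("/$", "/.", sub("^/", "./", gt))
recoded
# "A/T" "A/." "./." "./G" "C/G" "T/."
```

sub() is enough here because an anchored pattern can match at most once per string.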
Sarah
--
Sarah Goslee
http://www.functionaldiversity.org