[R] how to Subset based on partial matching of columns?
David L Carlson
dcarlson at tamu.edu
Thu Apr 9 21:56:29 CEST 2015
From Sarah's data frame you can get what you want directly with the table() function, which creates a table object, mydf.tbl. If you want a data frame, convert the table with as.data.frame.matrix() to make mydf.df. Finally, combine the two data frames to make mydf.all. This last step only lines up correctly if your x column consists of unique values in ascending order, because table() sorts its rows by the values of x.
> mydf.tbl <- table(mydf$x, mydf$code)
> mydf.tbl
    LGTY MY GM+ RS TY
  1    0      1  0  0
  2    1      0  0  0
  3    0      0  1  0
  4    0      0  0  1
> mydf.df <- as.data.frame.matrix(mydf.tbl)
> mydf.df
  LGTY MY GM+ RS TY
1    0      1  0  0
2    1      0  0  0
3    0      0  1  0
4    0      0  0  1
> mydf.all <- data.frame(mydf, mydf.df)
> mydf.all
  x   code LGTY MY.GM. RS TY
1 1 MY GM+    0      1  0  0
2 2   LGTY    1      0  0  0
3 3     RS    0      0  1  0
4 4     TY    0      0  0  1
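
For the partial matching in the subject line, where a code such as LGTY should set both LG and TY, a minimal sketch along the lines of the grepl() suggestion quoted below might look like this. It assumes the search strings from the original question are stored in a vector, here given the hypothetical name patterns (the question called it list, which masks a base R function). It loops over that vector rather than over the code column, and it uses fixed = TRUE on the assumption that "GM+" is meant literally rather than as a regular expression.

```r
# Data as reconstructed in the quoted reply below.
mydf <- data.frame(x = 1:4,
                   code = c("MY GM+", "LGTY", "RS", "TY"),
                   stringsAsFactors = FALSE)

# Substrings to search for. `patterns` is a hypothetical name; the original
# question stored these in a vector called `list`.
patterns <- c("MY", "GM+", "TY", "RS", "LG")

# One column per pattern: 1 if the pattern occurs anywhere in code, else 0.
# fixed = TRUE treats "GM+" as a literal string, not a regular expression.
flags <- sapply(patterns,
                function(p) as.integer(grepl(p, mydf$code, fixed = TRUE)))

# check.names = FALSE keeps "GM+" as a column name instead of mangling it.
mydf.flags <- data.frame(mydf, flags, check.names = FALSE,
                         stringsAsFactors = FALSE)
mydf.flags
```

Because grepl() works row by row on mydf$code, this does not depend on x being unique or sorted. With the four codes above, the LGTY row should get a 1 in both the TY and LG columns.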
-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of samarvir singh
Sent: Thursday, April 9, 2015 8:50 AM
To: Sarah Goslee
Cc: r-help
Subject: Re: [R] how to Subset based on partial matching of columns?
Thank you, Sarah Goslee. I am rather new to R, so people like you
are a great support. I really appreciate you taking the time to correct my
mistakes. Thanks.
On Thu 9 Apr, 2015 6:54 pm Sarah Goslee <sarah.goslee at gmail.com> wrote:
> Hi,
>
> Please don't put quotes around your code. It makes it hard to copy and
> paste. Alternatively, don't post in HTML, because it screws up your
> code.
>
> On Wed, Apr 8, 2015 at 8:57 PM, samarvir singh <samarvir1996 at gmail.com>
> wrote:
> > So I have a list that contains certain characters as shown below
> >
> > `list <- c("MY","GM+" ,"TY","RS","LG")`
>
> That's a character vector, not a list. A list is a specific type of object
> in R.
>
> > And I have a variable named "CODE" in the data frame as follows
> >
> > `code <- c("MY GM+", ,"LGTY", "RS","TY")`
>
> That doesn't work, and I have no idea what you expect to have there,
> so I'm deleting the extra comma. Also, your vector is named code, not
> CODE.
>
> code <- c("MY GM+", "LGTY", "RS","TY")
> x <- c(1:4)
>
> > 'x <- c(1:5)
> > `df <- data.frame(x,code)`
>
> You probably want
> mydf <- data.frame(x, code, stringsAsFactors=FALSE)
>
> Note I changed the name, because df() is a base R function.
>
>
> > Now I want to create 5 new variables named "MY","GM+","TY","RS","LG"
> >
> > Which takes binary value, 1 if there's a match case in the CODE variable
> >
> > df
> > x code MY GM+ TY RS LG
> > 1 MY GM+ 1 1 0 0 0
> > 2 0 0 0 0 0
> > 3 LGTY 0 0 1 0 1
> > 4 RS 0 0 0 1 0
> > 5 TY 0 0 1 0 0
>
> grepl() will give you a logical match
>
> data.frame(mydf, sapply(code, function(x)grepl(x, mydf$code)),
> stringsAsFactors=FALSE, check.names=FALSE)
>
> Sarah
>
>
> --
> Sarah Goslee
> http://www.functionaldiversity.org
>