[R] R help

Sun Aug 7 01:19:53 CEST 2016

Hi Vladimir,
Do you want something like this?

vdat<-read.table(text="numberoftweet,tweet,locations,badwords
1,My cat is asleep,London,glum
2,My cat is flying,Paris,dashed
3,My cat is dancing,Berlin,mopey
4,My cat is singing,Rome,ill
5,My cat is reading,Budapest,sad
6,My cat is eating,Amsterdam,annoyed
7,My cat is hiding,Copenhagen,crazy
8,My cat is fluffy,Vilnius,terrified
9,My cat is annoyed,Athens,sick
10,My cat is exercising,Ankara,mortified
11,My cat is dreaming,Kracow,irked
12,My cat is mopey,Vienna,uneasy
13,My cat is glum,Brussels,upset",
sep=",",header=TRUE,stringsAsFactors=FALSE)

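# Collapse the bad words into one alternation pattern: "glum|dashed|mopey|..."
badwords<-paste(vdat$badwords,collapse="|")

# sapply names each result by its tweet; unlist drops the tweets with no match,
# so names() returns the text of every tweet containing at least one bad word
names(unlist(sapply(vdat$tweet,grep,pattern=badwords)))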
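
If you want the matching rows themselves rather than only the tweet text, a logical index from grepl may be closer to what you asked for. This is only a rough sketch under some assumptions: the small data frame is a made-up stand-in for your data, the names words, pattern and hits are just for illustration, and I am guessing that the unused cells of your badwords column are NA or empty strings and that the words contain no regular-expression metacharacters.

```r
# Small stand-in for the real data (hypothetical values)
vdat <- data.frame(
  numberoftweet = 1:4,
  tweet = c("My cat is asleep", "My cat is annoyed",
            "My cat is still here", "My cat is glum"),
  locations = c("London", "Athens", "Rome", "Brussels"),
  badwords = c("glum", "ill", "annoyed", ""),
  stringsAsFactors = FALSE
)

# Drop NA and empty cells first: an empty alternative in the
# pattern would match every tweet
words <- vdat$badwords[!is.na(vdat$badwords) & nzchar(vdat$badwords)]

# \\b anchors each word at word boundaries, so "ill" does not match "still"
pattern <- paste0("\\b(", paste(words, collapse = "|"), ")\\b")

# grepl returns one TRUE/FALSE per tweet, which can index the rows directly
hits <- vdat[grepl(pattern, vdat$tweet, ignore.case = TRUE, perl = TRUE), ]
hits
```

Without the word boundaries the pattern matches substrings as well, which may or may not be what you want for a list of 80 words.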

Jim

On Sat, Aug 6, 2016 at 12:07 AM, Вова Грабарник <v.grabarnik at gmail.com> wrote:
> Dear R command,
>
> I was wondering if I could ask for your recommendations on my problem, if
> that is fine with you.
> Basically, I have a data frame with 5 columns and 10 000 recorded tweets
> (rows). Those columns are: numberofatweet (number), tweet (actual
> textual tweet), locations (where the tweet was sent from), and badwords
> (words that should not be used on Twitter; this is just a column
> independent of the tweet number, and it contains only 80 rows with one
> word recorded per cell).
> My question is whether it is possible to select only the rows where the
> "tweet" column (actual text) contains one of the words from the badwords
> column. I tried to use grep and grepl, but nothing seems to work.
>
> Thank you in advance,
> Vladimir