[R] piping in only specific parts of a certain column
Drake Gossi
dr@ke@go@@| @end|ng |rom gm@||@com
Thu Jul 2 00:47:42 CEST 2020
Hello!
Question. I'm dealing with a large Excel sheet that I'm trying to tidy
and then visualize, and I'm wondering how I might specify the data I'm
visualizing.
Here's the data frame I'm working with:
> str(unclean_data)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 1909 obs. of 9 variables:
$ unique identifier: num 1 1 1 1 1 1 1 1 1 1 ...
$ question : num 1 2 2 2 2 2 2 3 3 3 ...
$ grid text : chr "******* and his family have lived and
worked in ******* for 6 years." "******* contributes to public safety
while also organizing community events. He said he hosts Trunk or
Treat, en"| __truncated__ "******* did not know the origin or history
of ******* PD, but he said it is integral to the safety of the area."
"The ******* PD ensures safety, he said, while also familiarizing
themselves with the town’s people. He said ev"| __truncated__ ...
>
The most important column is the $grid text one, and I know how to extract that:
> text_df_APPLIED <- tibble(line = 1:1909, text = unclean_data$`grid text`)
But my question is, what if I only wanted to extract the entries in the
$grid text column whose rows have the number 3 in the $question column?
So, instead of tidying (and later visualizing) the whole $grid text
column, I want to tidy only a smaller portion of it: only the rows
where the $question column is 3.
Is there a way to do that in this line of code:
> text_df_APPLIED <- tibble(line = 1:1909, text = unclean_data$`grid text`)
Or do I have to FIRST shorten the $`grid text` column (shorten it to
only that which is indexed to 3 in the $question column) BEFORE I even
begin to tidy it?
I'm working with these libraries right now, if it helps:
library(tidytext)
library(dplyr)
library(stringr)
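
For reference, here is a rough sketch of the two routes described above:
subsetting first and then building the line/text tibble, or subsetting
inside the tibble() call itself. The small unclean_data below is a made-up
stand-in (the real sheet has 1909 rows and 9 variables), and the names
text_df_q3, q3_text and text_df_q3_base are hypothetical. It is only meant
to illustrate the idea under those assumptions, not to be the definitive way
to do it.

```r
library(dplyr)

# Made-up stand-in for unclean_data; only the three columns shown in str()
unclean_data <- tibble(
  `unique identifier` = c(1, 1, 1, 1, 1),
  question            = c(1, 2, 3, 3, 3),
  `grid text`         = c("intro", "answer to q2", "first q3 answer",
                          "second q3 answer", "third q3 answer")
)

# Route 1: shorten the data first with filter(), then build line/text.
# row_number() renumbers the kept rows from 1, so no hard-coded 1:1909.
text_df_q3 <- unclean_data %>%
  filter(question == 3) %>%
  transmute(line = row_number(), text = `grid text`)

# Route 2: subset while extracting the column, using base indexing.
# which() skips rows where question is NA instead of returning NA entries.
q3_text <- unclean_data$`grid text`[which(unclean_data$question == 3)]
text_df_q3_base <- tibble(line = seq_along(q3_text), text = q3_text)
```

Either result has one row per $question == 3 entry and could then be passed
on to the tidying step in place of text_df_APPLIED.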
D