[R] Data Structure to Unnest_tokens in tidytext package
Eric Berger
ericjberger at gmail.com
Wed Dec 11 16:22:56 CET 2019
Hi Sarah,
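I looked at the documentation that you linked to. Before it calls
unnest_tokens, it first builds a data frame with a column named text:
  text_df <- tibble(line = 1:4, text = text)
and only then does the step
  text_df %>%
    unnest_tokens(word, text)
So you may be missing that step. Your str() output shows a single column
named `FOR E. S. I.`, so there is no text column for unnest_tokens to read.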
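For a single novel, that step might look roughly like the sketch below. The temporary file and its four lines are made up here as a stand-in for quicksandr.txt, and importing with readLines is an assumption on my part rather than something the tutorial prescribes.

```r
library(dplyr)
library(tidytext)

# Hypothetical stand-in for the novel: write a few lines to a temporary
# file so the example runs without quicksandr.txt.
path <- tempfile(fileext = ".txt")
writeLines(c("FOR E. S. I.",
             "My old man died in a fine big house.",
             "My ma died in a shack.",
             "LANGSTON HUGHES"),
           path)

# readLines() returns a plain character vector, one element per line.
text <- readLines(path)

# Wrap the vector in a tibble whose column is actually named `text`.
text_df <- tibble(line = seq_along(text), text = text)

# One row per word; unnest_tokens() lowercases the tokens by default.
tidy_novel <- text_df %>%
  unnest_tokens(word, text)
```

Reading the file line by line also avoids treating the first line as a column header and splitting lines into extra columns, which is what the "expected 1 columns, actual 2 columns" entries in your str() output point to.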
Best,
Eric
On Tue, Dec 10, 2019 at 9:05 PM Sarah Payne <spaynebu at gmail.com> wrote:
>
> Hi--I'm fairly new to R and trying to do a text mining project on a novel
> using the tidytext package. The novel is saved as a plain text document and
> I can import it into RStudio just fine. For reference I'm trying to do
> something similar to section 1.3 of this tidy text tutorial
> <https://www.tidytextmining.com/tidytext.html>, except I'm working with one
> novel instead of many. So I import the novel and then run:
>
> tidy_novel <- quicksandr %>%
>   unnest_tokens(word, text)
>
> I get the following error:
>
> Error in check_input(x) :
> Input must be a character vector of any length or a list of character
> vectors, each of which has a length of 1.
>
> typeof(novel) returns "list" and str(novel) returns
>
> Classes ‘spec_tbl_df’, ‘tbl_df’, ‘tbl’ and 'data.frame': 955 obs. of 1 variable:
>  $ FOR E. S. I.: chr "FOR E. S. I." "My old man died in a fine big house. My ma died in a shack. I wonder where I'm gonna die, Being neither white nor black?'" "LANGSTON HUGHES" "ONE" ...
>  - attr(*, "problems")=Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 8 obs. of 5 variables:
>   ..$ row     : int 530 726 733 836 853 886 889 942
>   ..$ col     : chr NA NA NA NA ...
>   ..$ expected: chr "1 columns" "1 columns" "1 columns" "1 columns" ...
>   ..$ actual  : chr "2 columns" "2 columns" "2 columns" "2 columns" ...
>   ..$ file    : chr "'quicksandr.txt'" "'quicksandr.txt'" "'quicksandr.txt'" "'quicksandr.txt'" ...
>  - attr(*, "spec")=
>   .. cols(
>   ..   `FOR E. S. I.` = col_character()
>   .. )
>
> I'm just importing the text file and then trying to run the unnest_tokens
> function, so maybe I'm missing a step in between? I seem to need my text
> file in a different format, so would appreciate answers on how to do that.
> Thanks, and let me know if I need to provide more info!