[R] Unlisting a nested dataset

Nathan Parsons nathan.f.parsons using gmail.com
Tue Oct 16 21:50:20 CEST 2018


Ista - I provided data, code, and the error being returned, as the guidelines for reproducible R examples ask. I did not name the packages, however: unnest_tokens is from the tidytext package, map/map_chr are from purrr, and everything else is from the tidyverse (dplyr, tidyr, etc.).

I'm not sure what else I can provide to make this clearer.
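
For reference, here is a minimal self-contained sketch of the same cleaning step with the library calls included. It is not my exact pipeline, and several things in it are assumptions made for illustration: the two-row tweets_small table is made up, it uses the default word tokenizer rather than token = 'tweets', and it rebuilds the text with group_by()/summarise() instead of nest()/map(), which assumes status_id uniquely identifies a tweet.

```r
library(dplyr)
library(tidytext)

# Hypothetical two-row stand-in for the real tweets data
tweets_small <- tibble::tibble(
  status_id = c("x1", "x2"),
  text = c("Technique is everything with olympic lifts",
           "The weather is great in Portland today")
)

cleaned <- tweets_small %>%
  # one row per word; lowercases and drops the original text column
  unnest_tokens(word, text) %>%
  # drop stop words using the stop_words table shipped with tidytext
  filter(!word %in% stop_words$word) %>%
  # glue the remaining words back together, one row per tweet
  group_by(status_id) %>%
  summarise(text = paste(word, collapse = " "))

print(cleaned)
```

Because this version never creates a list column, it does not depend on what nest() names its output. One caveat: a tweet made up entirely of stop words would disappear from the result.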

--

Nate Parsons
Pronouns: He, Him, His
Graduate Teaching Assistant
Department of Sociology
Portland State University
Portland, Oregon

503-725-9025
503-725-3957 FAX
On Oct 16, 2018, 12:35 PM -0700, Ista Zahn <istazahn using gmail.com>, wrote:
> Hi Nate,
>
> You've made it pretty difficult to answer your question. Please see
> https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
> and follow some of the suggestions you find there to make it easier on
> those who want to help you.
>
> Best,
> Ista
> On Mon, Oct 15, 2018 at 10:56 PM Nathan Parsons
> <nathan.f.parsons using gmail.com> wrote:
> >
> > I’m attempting to do some content analysis on a few million tweets, but I can’t seem to get them cleaned correctly.
> >
> > I’m trying to replicate the process outlined here: https://stackoverflow.com/questions/46734501/opposite-of-unnest-tokens
> >
> > My code:
> >
> > tweets %>%
> >   unnest_tokens(word, text, token = 'tweets') %>%
> >   filter(!word %in% stop_words$word) %>%
> >   nest(word) %>%
> >   mutate(text = map(data, unlist),
> >          text = map_chr(text, paste, collapse = " ")) -> tweets
> >
> > Unfortunately, I keep getting:
> >
> > Error in mutate_impl(.data, dots) :
> > Evaluation error: cannot coerce type 'closure' to vector of type 'character'.
> >
> > What am I doing wrong?
> >
> > Here’s what the dataset looks like:
> >
> > > glimpse(tweets)
> > Observations: 389,253
> > Variables: 12
> > $ status_id "x1047841705729306624", "x1046966595610927105", "x104709...
> > $ created_at "2018-10-04T13:31:45Z", "2018-10-02T03:34:22Z", "2018-10...
> > $ text "Technique is everything with olympic lifts ! @ Body By ...
> > $ lat 43.68359, 40.28412, 37.77066, 40.43139, 31.16889, 33.937...
> > $ lng -70.32841, -83.07859, -122.43598, -79.98069, -100.07689,...
> > $ county_name "Cumberland County", "Delaware County", "San Francisco C...
> > $ fips 23005, 39041, 6075, 42003, 48095, 6037, 6037, 55073, 482...
> > $ state_name "Maine", "Ohio", "California", "Pennsylvania", "Texas", ...
> > $ state_abb "ME", "OH", "CA", "PA", "TX", "CA", "CA", "WI", "TX", "A...
> > $ urban_level "Medium Metro", "Large Fringe Metro", "Large Central Met...
> > $ urban_code 3, 2, 1, 1, 6, 1, 1, 4, 1, 3, 2, 2, 1, 3, 6, 1, 1, 2, 3,...
> > $ population 277308, 184029, 830781, 1160433, 4160, 9509611, 9509611,...
