[R] create a heatmap for findAssocs results based on time

Elahe chalabi ch@l@bi@el@he @ending from y@hoo@de
Thu Nov 15 16:14:02 CET 2018


Hi all, 

I have the following data, from which I first create a document-term matrix (DTM) and then add the available time (the year of each document) to the DTM. To see how the correlations with the term "updat" differ between years, I would like a heatmap of the findAssocs results with time on the x-axis.


  
library(tm)
library(ggplot2)

# Output of dput(df), assigned back to df so the example can be re-run
df <- structure(list(Description = structure(c(5L, 8L, 6L, 4L, 1L, 
2L, 7L, 9L, 10L, 3L), .Label = c("general topics done", "keep the general topics updated", 
"rejected topic ", "several topics in hand", "this is a genetal topic", 
"topic 333555 needs to be updated", "topic 5647 is handed over", 
"topic is updated", "update the topic ", "updating the topic is done " 
), class = "factor")), class = "data.frame", row.names = c(NA, 
-10L))

# Build and clean the corpus (Description is a factor, so convert it first)
corpus <- Corpus(VectorSource(as.character(df$Description)))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stemDocument, "english")

frequenciescontrol <- DocumentTermMatrix(corpus)
# Year of each document, in row order
frequenciescontrol$time <- c("2015","2015","2015","2015","2015","2016","2016","2016","2016","2016")
findAssocs(frequenciescontrol, "updat", 0.01)


The heatmap should look like this:
  y-axis: all the words correlated with "updat"
  x-axis: years
  legend: correlation
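
One direction this could take, as a minimal sketch and not a tested solution: split the documents by year, run findAssocs on each yearly subset of the DTM, stack the results into one data frame and draw it with geom_tile from ggplot2. The names descriptions, years, dtm and assoc_by_year are illustrative assumptions and not part of the code above. The years are kept in a separate vector here instead of being attached to the DTM. Note that findAssocs only returns terms whose correlation is at or above the limit, so negative correlations would not appear in such a plot.

```r
library(tm)
library(ggplot2)

# Same ten descriptions as df$Description, written out as character strings
descriptions <- c("this is a genetal topic", "topic is updated",
                  "topic 333555 needs to be updated", "several topics in hand",
                  "general topics done", "keep the general topics updated",
                  "topic 5647 is handed over", "update the topic ",
                  "updating the topic is done ", "rejected topic ")
# Year of each document, in the same order as descriptions
years <- rep(c("2015", "2016"), each = 5)

corpus <- Corpus(VectorSource(descriptions))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stemDocument, "english")
dtm <- DocumentTermMatrix(corpus)

# Run findAssocs separately on the documents of each year
assoc_by_year <- do.call(rbind, lapply(unique(years), function(yr) {
  a <- findAssocs(dtm[which(years == yr), ], "updat", 0.01)[["updat"]]
  if (length(a) == 0) return(NULL)  # no associated terms in this year
  data.frame(year = yr, term = names(a), correlation = unname(a),
             stringsAsFactors = FALSE)
}))

# One tile per (year, term) pair, filled by correlation
p <- ggplot(assoc_by_year, aes(x = year, y = term, fill = correlation)) +
  geom_tile() +
  labs(x = "Year", y = "Terms correlated with \"updat\"", fill = "Correlation")
print(p)
```

Terms that are associated with "updat" in only one of the years would simply have no tile in the other year's column.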

Thanks for any help.
Elahe!


