Accepted Solution

Wordcloud with r

Datamijn, Frequent Visitor

Re: Wordcloud with r

I got the same error when working on my own word cloud. I solved it by adding a 'require("package")' call for each package the script uses. Note that the end of my script differs slightly from the example in the original post.






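# Load every package the script uses (assumed set: tm, wordcloud, RColorBrewer)
require("tm")
require("wordcloud")
require("RColorBrewer")

docs <- Corpus(VectorSource(dataset$Column1))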

# Convert the text to lower case
docs <- tm_map(docs, content_transformer(tolower))
# Remove numbers
docs <- tm_map(docs, removeNumbers)
# Remove common Dutch stopwords (disabled)
#docs <- tm_map(docs, removeWords, stopwords("dutch"))
# Remove your own stopwords, specified as a character vector
docs <- tm_map(docs, removeWords, c("het", "met", "ons", "dit", "hem", "als", "dat", "heb"))
# Remove punctuation
docs <- tm_map(docs, removePunctuation)
# Eliminate extra white spaces
#docs <- tm_map(docs, stripWhitespace)
# Text stemming
#docs <- tm_map(docs, stemDocument)

# Build a term-document matrix and a word frequency table
dtm <- TermDocumentMatrix(docs)
m <- as.matrix(dtm)
v <- sort(rowSums(m), decreasing = TRUE)
d <- data.frame(word = names(v), freq = v)
head(d, 10)

# Draw the word cloud from the frequency table
wordcloud(words = d$word, freq = d$freq, min.freq = 1,
          max.words = 200, random.order = FALSE, rot.per = 0.35,
          colors = brewer.pal(8, "Dark2"))



Super User

Re: Wordcloud with r

When I tried the original code, it worked with an older version of R (3.2.3) but not with newer versions (3.3.1 and 3.4.2). In the newer versions, the Corpus construction returned only numbers, which the rest of the code then stripped out, causing the failure. I put together a variation of the original code based on the comments in this thread, and have a working version here:



datain =[,1])))
words <- Corpus(VectorSource(dataset$text))

words <- tm_map(words, stripWhitespace)
words <- tm_map(words, content_transformer(tolower))
words <- tm_map(words, removeNumbers)
words <- tm_map(words, removePunctuation)
words <- tm_map(words, removeWords, stopwords("english"))
words <- tm_map(words, stemDocument)

wordcloud(words, scale=c(5,0.75), max.words=50, random.order=FALSE, rot.per=0.35, use.r.layout=FALSE, colors=brewer.pal(8, "Dark2"))

The main change is in the construction of the Corpus: VectorSource(dataset) becomes VectorSource(dataset$text).
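
To illustrate why the column matters, here is a minimal base R sketch. The tm documentation describes VectorSource as treating each element of its input as one document, and the elements of a data frame are its columns, not its rows. The data frame below is a hypothetical stand-in for the one Power BI passes to the R visual. Only the column name `text` is taken from the code above.

```r
# Hypothetical stand-in for the data frame Power BI passes to the R visual
dataset <- data.frame(
  text = c("first comment", "second comment", "third comment"),
  stringsAsFactors = FALSE
)

# A data frame is a list of columns, so its elements are whole columns
n_elements_frame <- length(dataset)        # number of columns

# The column itself is a vector with one element per row
n_elements_column <- length(dataset$text)  # number of rows

# as.character() turns a factor column into its labels and leaves
# a character column unchanged, so the result is always plain text
text_vector <- as.character(dataset$text)
str(text_vector)
```

Running str(dataset) inside the visual before building the Corpus is a quick way to check which column holds the text and what type it has.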


This is based upon the original PBIX file included in the original post.


Moderator boefraty

Re: Wordcloud with r

Thanks a lot, we will review the PBIX, especially because the R engine in the service is going to be upgraded soon.