Wordcloud with R
08-19-2016 01:15 AM - last edited 12-13-2017 00:06 AM by boefraty
Although there is now a custom visualisation for word clouds, I built this one with R in Power BI Desktop because it gives you a greater degree of control:
library(tm)
library(wordcloud)
words <- Corpus(VectorSource(dataset))
words <- tm_map(words, stripWhitespace)
words <- tm_map(words, content_transformer(tolower))
words <- tm_map(words, removeNumbers)
words <- tm_map(words, removePunctuation)
words <- tm_map(words, removeWords, stopwords("english"))
words <- tm_map(words, stemDocument)
wordcloud(words, scale=c(5,0.5), max.words=50, random.order=FALSE, rot.per=0.35, use.r.layout=FALSE, colors=brewer.pal(8, "Dark2"))
I have attached a sample .pbix file with the text from Wikipedia's 'big data' article and the R visualisation.
12-05-2017 06:00 AM
When I tried the original code, it worked with an older version of R (3.2.3) but not with newer versions (3.3.1 and 3.4.2). The Corpus construction returned only numbers, which the rest of the code then stripped out, and that caused problems. So I hacked together a variation of the original code based on the comments here, and have a working version:
require(tm)
require(wordcloud)
require(RColorBrewer)
datain = as.data.frame(table(as.character(dataset[,1])))
words <- Corpus(VectorSource(dataset$text))
words <- tm_map(words, stripWhitespace)
words <- tm_map(words, content_transformer(tolower))
words <- tm_map(words, removeNumbers)
words <- tm_map(words, removePunctuation)
words <- tm_map(words, removeWords, stopwords("english"))
words <- tm_map(words, stemDocument)
wordcloud(words, scale=c(5,0.75), max.words=50, random.order=FALSE, rot.per=0.35, use.r.layout=FALSE, colors=brewer.pal(8, "Dark2"))
The main change is in the construction of the Corpus: VectorSource(dataset) becomes VectorSource(dataset$text).
This is based on the PBIX file attached to the original post.
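A minimal sketch of why selecting the column matters, assuming the text column reaches the script as a factor (a guess that would fit the "only numbers" symptom; the post does not confirm it). The helper name extract_text and the toy data frame are made up for illustration:

```r
# Hypothetical helper: return one character string per row of a text column.
# `column` defaults to "text", the column name used in the reply above.
extract_text <- function(dataset, column = "text") {
  stopifnot(is.data.frame(dataset), column %in% names(dataset))
  # Converting the column itself (not the whole data frame) maps factor
  # values back to their labels rather than to their integer codes.
  as.character(dataset[[column]])
}

# Toy stand-in for the data frame that Power BI hands to the script.
toy <- data.frame(text = factor(c("big data", "data volume")))
docs <- extract_text(toy)

# With tm installed, the corpus would then hold one document per row:
# words <- tm::Corpus(tm::VectorSource(docs))
```

Passing a plain character vector to VectorSource avoids depending on how a whole data frame gets coerced, which may differ between versions.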
08-23-2016 03:11 PM
Getting an error when trying to render in PowerBI.com. Works fine in PowerBI Desktop...
[Monitor] Loaded provider: [WindowsLibOsProvider]
[Monitor] Loaded provider: [SocketStreamProvider]
[Monitor] Loaded provider: [DnsStreamProvider]
[Monitor] Loaded provider: [HttpServerStreamProvider]
[Monitor] Loaded provider: [ClockStreamProvider]
[Monitor] Loaded provider: [ResourceManager]
[Monitor] Loaded Security Monitor Profile: [C:\Sandboxing\Sandboxes\5a4800e1-dd61-41e3-9bce-3fb922b5f659\Profile\0.dbconfig]
[Monitor] Writable folder: [file:///Users] => [C:\Sandboxing\Sandboxes\5a4800e1-dd61-41e3-9bce-3fb922b5f659\Users]
[Monitor] Writable folder: [file:///Results] => [C:\Sandboxing\Sandboxes\5a4800e1-dd61-41e3-9bce-3fb922b5f659\Results]
[Monitor] Writable folder: [file:///Work] => [C:\Sandboxing\Sandboxes\5a4800e1-dd61-41e3-9bce-3fb922b5f659\Work]
[Monitor] R/O folder: [file:///ThirdParty] => [C:\WFRoot\SBRole.0\Fabric\work\Applications\ASAzureApp_App0\temp\R_Root_13.0.1605.329]
[Monitor] R/O folder: [file:///Script] => [C:\Sandboxing\Sandboxes\5a4800e1-dd61-41e3-9bce-3fb922b5f659\Script]
[Monitor] R/O folder: [file:///InputData] => [C:\Sandboxing\Sandboxes\5a4800e1-dd61-41e3-9bce-3fb922b5f659\InputData]
[Monitor] Scratch folder: [C:\Sandboxing\Sandboxes\5a4800e1-dd61-41e3-9bce-3fb922b5f659\Scratch]
[Monitor:WARNING] Cannot find dependency: "Windows.0.0.0.0.0" x64
[Monitor:WARNING] Unable to resolve package dependencies: 0x80070490
[Monitor:ERROR] Unable to create application sandbox environment; the most likely cause of this error is a missing package dependency. Please confirm that all application dependencies are present. (Status=0x80070490)
[Monitor:ERROR] Failed to launch application: 0x80070490
[Monitor:WARNING] Failed to retrieve application's exit code!
[Monitor] Done.
Please try again later or contact support. If you contact support, please provide these details.
09-04-2016 05:24 AM
Thank you very much for the input, and sorry for the late response.
Could I please get the PBIX with the data and R code that is not working for you on the server? It worked for me on a toy data sample.
09-21-2016 07:07 PM
@wbob - this seems to be an incomplete sample, which is perhaps why it is generating confusion. I don't think that code alone can be used in PBI Desktop without a source dataset.
Can you please post your PBIX file?
09-26-2016 02:06 AM
Hi, Thanks for sharing.
I have a problem with special characters breaking the visualisation.
Can you help me work out what to do about it?
The R code works when no special characters are present.
This is the error message I get:
R script error.
Error in type.convert(data[[i]], as.is = as.is[i], dec = dec, numerals = numerals, :
invalid input '@MSEmpresasLatam @DiegoBek exactamente! Es la era de la inteligencia #AzureML #SQLEnterprise #PowerBI #IoT ðŸ˜' in 'utf8towcs'
Calls: read.csv -> read.table -> type.convert
This is the R code I have:
dataset <- tm_map(dataset, removePunctuation)
dataset <-tm_map(dataset, removeWords, c('powerbi','PowerBI',stopwords('english')))
dataset <-tm_map(dataset, stemDocument)
wordcloud(dataset, max.words = 100, random.order = FALSE)
Thanks in advance if anyone could help.
09-26-2016 05:13 AM
The reason I posted here is that this is the R script showcase forum and this error is related to the R integration in Power BI.
Kávási Mihály (in English the correct order is Mihály Kávási; I will use that from now on)
11-30-2016 03:14 PM
Nice post :-)
Do you know how to make a word cloud out of a column (its entire content)? For example, I have one column with company names and another with the number of cases.
The problem is that after altering the dataset I cannot use the code again in Power BI (I had to write colClasses = c("character", "numeric") to tell R that my dataset has two types of columns).
> `dataset` = read.csv('C:/Users/PatricioLobos/AppData/Local/Radio/REditorWrapper_8fe60428-ecc0-478f-ab77-56803f9407f4/input_df_227d85ab-962e-4701-b1ac-ff6e53817c56.csv', colClasses = c("character", "numeric"), check.names = FALSE, encoding = "UTF-8", blank.lines.skip = FALSE);
+ # Original Script. Please update your script content here and once completed copy below section back to the original editing window #
> pal2 <- brewer.pal(8, 'Dark2')
> plot(wordcloud(dataset$Kundenavn, dataset$`Nr. Cases`, scale = c(8, .4), min.freq = 5, max.words = Inf, random.order = FALSE, rot.per = .15, colors = pal2))
12-05-2016 02:10 AM - edited 12-12-2016 07:48 AM
require(wordcloud)
require(RColorBrewer)
datain = as.data.frame(table(as.character(dataset[,2])))
pal2 <- brewer.pal(4,"Dark2")
wordcloud(datain[,1],datain$Freq, scale=c(4,.3), min.freq=1, max.words=Inf, random.order=FALSE, rot.per=.15, colors=pal2)
The dataset has two columns: "ID" and "Term". If you have "Frequency" instead of "ID", there is no need to call the "table" operator.
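As a rough illustration of the two cases, here is a sketch in base R that counts terms when only a "Term" column is available. The function name term_frequencies and the sample terms are made up for this example:

```r
# Hypothetical helper: count how often each term occurs.
# Returns a data frame with columns `term` (character) and `freq` (integer),
# ordered alphabetically by term, as table() sorts its levels.
term_frequencies <- function(terms) {
  counts <- table(term = as.character(terms))
  as.data.frame(counts, responseName = "freq", stringsAsFactors = FALSE)
}

# Case 1: one row per occurrence ("ID", "Term"), so the terms must be counted.
terms <- c("contoso", "fabrikam", "contoso", "northwind", "contoso")
freq_table <- term_frequencies(terms)

# Case 2: the data already has a frequency column, so it can be passed
# to wordcloud() directly without calling table():
# wordcloud::wordcloud(dataset$Term, dataset$Frequency)
```

The resulting term and freq columns correspond to the first two arguments of wordcloud() in the snippet above.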
11-13-2017 02:40 AM
I love the idea of a word cloud that you have more control over. However, when I open the .pbix file to try the visual I get the message "Can't display visuals", with the following details:
R script error.
Loading required package: NLP
Loading required package: methods
Loading required package: RColorBrewer
Error in strwidth(words[i], cex = size[i], ...) : invalid 'cex' value
Calls: wordcloud -> strwidth
In addition: Warning messages:
1: In max(freq) : no non-missing arguments to max; returning -Inf
2: In max(freq) : no non-missing arguments to max; returning -Inf
I have the packages installed. Any ideas?
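The two max(freq) warnings indicate that wordcloud received an empty frequency vector, so no words survived by the time the plot was drawn. A minimal sketch of a guard for that case follows; the helper names can_draw_cloud and safe_wordcloud are hypothetical, and the sketch does not explain why the input ended up empty:

```r
# Hypothetical check: TRUE only when there is at least one word to draw.
can_draw_cloud <- function(words, freq) {
  length(words) > 0 &&
    length(words) == length(freq) &&
    any(is.finite(freq) & freq > 0)
}

# Hypothetical wrapper: draw a placeholder instead of failing on empty input.
safe_wordcloud <- function(words, freq, ...) {
  if (!can_draw_cloud(words, freq)) {
    plot.new()
    text(0.5, 0.5, "No words to display")
    return(invisible(FALSE))
  }
  # Requires the wordcloud package to be installed.
  wordcloud::wordcloud(words, freq, ...)
  invisible(TRUE)
}

empty_ok <- can_draw_cloud(character(0), numeric(0))
some_ok  <- can_draw_cloud(c("big", "data"), c(3, 5))
```

If the placeholder appears, the likely place to look is the step that builds the words, for example the Corpus construction discussed earlier in the thread.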