Wordcloud with r

Frequent Visitor
Posts: 2
Registered: ‎06-20-2016

Wordcloud with r


Although there is now a custom visualisation for word clouds, I built this one with R in Power BI Desktop because it gives you a greater degree of control:

 

 

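library(tm)         # text-mining helpers: Corpus, tm_map, stopwords
library(wordcloud)  # wordcloud(); also loads RColorBrewer for brewer.pal()

# Build a corpus from the data frame Power BI passes in as `dataset`
words <- Corpus(VectorSource(dataset))
# Normalise the text before counting words
words <- tm_map(words, stripWhitespace)
words <- tm_map(words, content_transformer(tolower))
words <- tm_map(words, removeNumbers)
words <- tm_map(words, removePunctuation)
words <- tm_map(words, removeWords, stopwords("english"))
# Stemming needs the SnowballC package to be installed
words <- tm_map(words, stemDocument)
# Draw at most 50 words, most frequent first, with 35% of them rotated
wordcloud(words, scale=c(5,0.5), max.words=50, random.order=FALSE,
          rot.per=0.35, use.r.layout=FALSE, colors=brewer.pal(8, "Dark2"))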

 

I have attached a sample .pbix file with the text from Wikipedia's 'big data' article and the R visualisation.
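
For anyone who wants to see what the cleaning steps in the script above do before the words are counted, here is a rough base-R sketch. It does not use tm, it skips stop-word removal and stemming, and the sample sentence is made up for illustration, so treat it as an approximation of the pipeline rather than the exact behaviour of tm_map.

```r
# Hypothetical sample text; in Power BI the text would come from `dataset`.
txt <- "Big  data is BIG: 3 Vs, and data keeps growing in 2016!"

clean <- tolower(txt)                      # roughly content_transformer(tolower)
clean <- gsub("[[:digit:]]+", "", clean)   # roughly removeNumbers
clean <- gsub("[[:punct:]]+", "", clean)   # roughly removePunctuation
clean <- gsub("\\s+", " ", trimws(clean))  # roughly stripWhitespace

# Split into words and count them; a word cloud sizes words by these counts.
tokens <- strsplit(clean, " ", fixed = TRUE)[[1]]
freq <- sort(table(tokens), decreasing = TRUE)
print(freq)
```

The counts in `freq` are the kind of input that wordcloud() turns into font sizes.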

Regular Visitor
Posts: 23
Registered: ‎07-31-2015

Re: Wordcloud with r

I'm getting an error when trying to render this in PowerBI.com. It works fine in Power BI Desktop.

ERROR:

[Monitor] Loaded provider: [WindowsLibOsProvider]
[Monitor] Loaded provider: [SocketStreamProvider]
[Monitor] Loaded provider: [DnsStreamProvider]
[Monitor] Loaded provider: [HttpServerStreamProvider]
[Monitor] Loaded provider: [ClockStreamProvider]
[Monitor] Loaded provider: [ResourceManager]
[Monitor] Loaded Security Monitor Profile: [C:\Sandboxing\Sandboxes\5a4800e1-dd61-41e3-9bce-3fb922b5f659\Profile\0.dbconfig]
[Monitor] Writable folder: [file:///Users] => [C:\Sandboxing\Sandboxes\5a4800e1-dd61-41e3-9bce-3fb922b5f659\Users]
[Monitor] Writable folder: [file:///Results] => [C:\Sandboxing\Sandboxes\5a4800e1-dd61-41e3-9bce-3fb922b5f659\Results]
[Monitor] Writable folder: [file:///Work] => [C:\Sandboxing\Sandboxes\5a4800e1-dd61-41e3-9bce-3fb922b5f659\Work]
[Monitor] R/O folder: [file:///ThirdParty] => [C:\WFRoot\SBRole.0\Fabric\work\Applications\ASAzureApp_App0\temp\R_Root_13.0.1605.329]
[Monitor] R/O folder: [file:///Script] => [C:\Sandboxing\Sandboxes\5a4800e1-dd61-41e3-9bce-3fb922b5f659\Script]
[Monitor] R/O folder: [file:///InputData] => [C:\Sandboxing\Sandboxes\5a4800e1-dd61-41e3-9bce-3fb922b5f659\InputData]
[Monitor] Scratch folder: [C:\Sandboxing\Sandboxes\5a4800e1-dd61-41e3-9bce-3fb922b5f659\Scratch]
[Monitor:WARNING] Cannot find dependency: "Windows.0.0.0.0.0" x64
[Monitor:WARNING] Unable to resolve package dependencies: 0x80070490
[Monitor:ERROR] Unable to create application sandbox environment; the most likely cause of this error is a missing package dependency. Please confirm that all application dependencies are present. (Status=0x80070490)
[Monitor:ERROR] Failed to launch application: 0x80070490
[Monitor:WARNING] Failed to retrieve application's exit code!
[Monitor] Done.
Please try again later or contact support. If you contact support, please provide these details.

Moderator
Posts: 52
Registered: ‎08-10-2016

Re: Wordcloud with r

Dear erikskov,

Thank you very much for the input, and sorry for the late response.

Could you please send me the PBIX, with data and R code, that is not working for you on the server? It worked for me on a toy data sample.

 

Best Regards,

B. (boefraty@microsoft.com)

 

 

 

 

Established Member
Posts: 165
Registered: ‎06-28-2015

Re: Wordcloud with r

@wbob - this seems to be an incomplete sample, which is perhaps why it is causing confusion. I don't think that code alone can be used in Power BI Desktop without a source dataset.

 

Can you please post your PBIX file?

Frequent Visitor
Posts: 4
Registered: ‎05-10-2016

Re: Wordcloud with r

Hi, thanks for sharing.

 

I have a problem with special characters breaking the visualisation.

 

Can you help me work out what to do about it?

 

The R code works when no special characters are present.

 

This is the error message I get:

Error Message:

R script error.
Error in type.convert(data[[i]], as.is = as.is[i], dec = dec, numerals = numerals, :
invalid input '@MSEmpresasLatam @DiegoBek exactamente! Es la era de la inteligencia #AzureML #SQLEnterprise #PowerBI #IoT 😍' in 'utf8towcs'
Calls: read.csv -> read.table -> type.convert
Execution halted

 

 

This is the R code I have:

library(tm)
library(SnowballC)
library(wordcloud)
dataset <- Corpus(VectorSource(dataset))
dataset <- tm_map(dataset, removePunctuation)
dataset <- tm_map(dataset, removeWords, c('powerbi', 'PowerBI', stopwords('english')))
dataset <- tm_map(dataset, stemDocument)
wordcloud(dataset, max.words = 100, random.order = FALSE)

 

Thanks in advance if anyone could help.

 

Best Regards,

Kávási Mihály
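
A note on the error above: the traceback ends in read.csv, which the script itself never calls, so the failure appears to happen while Power BI loads the data, before any of the tm lines run. If that is right, cleaning inside the script would come too late and the characters would have to be removed earlier, for example in the query that feeds the visual. Purely as a sketch of what such a cleanup looks like in R, assuming made-up sample strings and a hypothetical helper name `strip_non_ascii`:

```r
# Hypothetical tweets standing in for the text column; the second has no special characters.
tweets <- c("Es la era de la inteligencia #PowerBI \U0001F60D",
            "plain ascii text")

# Hypothetical helper: convert to ASCII and drop anything that cannot be represented.
strip_non_ascii <- function(x) {
  iconv(x, from = "UTF-8", to = "ASCII", sub = "")
}

cleaned <- strip_non_ascii(tweets)
print(cleaned)
```

This also strips accented letters, so it is a blunt tool; it only illustrates the idea and is not a confirmed fix for the Power BI error.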

Established Member
Posts: 165
Registered: ‎06-28-2015

Re: Wordcloud with r

Hi Kávási,

 

It looks like you are using the R word cloud, not the Power BI custom visual. If that is the case, I suggest you re-post this elsewhere, e.g. on stackoverflow.com, tagged for R.

 

Frequent Visitor
Posts: 4
Registered: ‎05-10-2016

Re: Wordcloud with r

Hi,

 

The reason I posted here is that this is the R script showcase forum, and this error is related to the R integration in Power BI.

 

Best regards, 

Kávási Mihály (in English the correct order is Mihály Kávási :-) I will use that from now on)

Frequent Visitor
Posts: 6
Registered: ‎12-23-2015

Re: Wordcloud with r

Hi,

 

Nice post :-)

Do you know how to make a word cloud out of whole columns (the entire content)? For example, I have one column with company names and another with the number of cases.

 

The problem is that, having altered the dataset, I cannot use the code again in Power BI (I had to write colClasses = c("character", "numeric") to tell R that my dataset has two types of columns).

 

Thanks!

 

> `dataset` = read.csv('C:/Users/PatricioLobos/AppData/Local/Radio/REditorWrapper_8fe60428-ecc0-478f-ab77-56803f9407f4/input_df_227d85ab-962e-4701-b1ac-ff6e53817c56.csv', colClasses = c("character", "numeric"), check.names = FALSE, encoding = "UTF-8", blank.lines.skip = FALSE);
+ # Original Script. Please update your script content here and once completed copy below section back to the original editing window #
> require(wordcloud)
> require(RColorBrewer)
> pal2 <- brewer.pal(8, 'Dark2')
> plot(wordcloud(dataset$Kundenavn, dataset$`Nr. Cases`, scale = c(8, .4), min.freq = 5, max.words = Inf, random.order = FALSE, rot.per = .15, colors = pal2))

Moderator
Posts: 52
Registered: ‎08-10-2016

Terms instead of words

require(wordcloud)
require(RColorBrewer)

datain = as.data.frame(table(as.character(dataset[,2])))

pal2 <- brewer.pal(4,"Dark2")

wordcloud(datain[,1],datain$Freq, 
          scale=c(4,.3),
          min.freq=1, max.words=Inf, 
          random.order=FALSE, 
          rot.per=.15, 
          colors=pal2)

The dataset has two columns: "ID" and "Term". If you have "Frequency" instead of "ID", there is no need to call table().
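
To make the two cases concrete, here is a small base-R sketch with made-up data. The company names, the `counted` data frame and its column names are assumptions for illustration only; the wordcloud() calls are left as comments so the snippet runs without the package installed.

```r
# Case 1: hypothetical data in the "ID", "Term" layout, one row per occurrence.
dataset <- data.frame(ID = 1:5,
                      Term = c("Contoso", "Fabrikam", "Contoso", "Contoso", "Fabrikam"),
                      stringsAsFactors = FALSE)

# table() counts how often each term occurs; the result has columns Var1 and Freq.
datain <- as.data.frame(table(as.character(dataset[, 2])), stringsAsFactors = FALSE)
print(datain)
# wordcloud(datain[, 1], datain$Freq, min.freq = 1)

# Case 2: hypothetical data that already carries a count per term, so table() is skipped.
counted <- data.frame(Term = c("Contoso", "Fabrikam"),
                      Frequency = c(3, 2),
                      stringsAsFactors = FALSE)
# wordcloud(counted$Term, counted$Frequency, min.freq = 1)
```

Both cases end up passing the same words and counts to wordcloud(); the only difference is whether the counting happens in R or upstream.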

 

 
