Text Analytics in Power BI - Extraction of key phrases from Facebook messages (part two)

by DataChant Member ‎11-10-2016 11:00 AM - edited ‎11-10-2016 11:15 AM

In the first part of the tutorial here we imported Clinton's and Trump's Facebook messages and their number of shares. In this part, we will extract key phrases from those messages using Microsoft Cognitive Services Text Analytics (AKA Azure Machine Learning).

 

When we are done, we will be able to create stunning reports that will correlate key phrases with our metrics. In our example: We will be able to see which key phrases triggered the highest number of shares, and slice them by positive, neutral or negative sentiment, as shown in this report (click here to view it on PowerBI.com):

 

Screenshot_59.png

 

 

Before we start

 

Make sure you have an account on Microsoft Cognitive Services (Signup here), and a (free) subscription to Text Analytics Preview. We'll need the API key that can be copied from here (as shown in the following screenshot).

 

Screenshot_26.png

 

Note: If you didn't follow the Part 1 here, and you wish to try this tutorial on your own data, please prepare a table with text messages and unique keys (e.g. Index). Make sure you don't have duplicate keys, and that you don't have any blank values as messages.  

 

Preparing the API queries - A parameter and a query function

 

In this section we'll prepare the advanced Power Query code that we'll need for the Key Phrase API.

 

Continue with the Power BI Desktop file that was created in Part 1. If the Query Editor is not open, click Edit Queries.

 

In the Query Editor's Home tab, click Manager Parameters, then select New Parameter.

 

In the Parameters dialog, set APIKey as Name and paste the API Key that you obtained from Microsoft Cognitive Services into the text box Current Value.

When you are done, click OK.

 

Screenshot_28.png

  

 

Note: Using a parameter here, will allow you to export the Power BI Desktop file as a template and share with colleagues and friends without sharing your API Key. Read here how to create Power BI Templates.

 

In the Query Editor's Home tab, click the drop down menu below the icon of New Source and select Blank Query. Still in Home tab, click Advanced Query and paste the following code:

 

(Source) =>
let
    JsonRecords = Text.FromBinary(Json.FromValue(Source)),
    JsonRequest = "{""documents"": " & JsonRecords & "}",
 
    JsonContent = Text.ToBinary(JsonRequest, TextEncoding.Ascii),
    Response =
        Web.Contents("https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/keyPhrases?",
            [
                Headers = [#"Ocp-Apim-Subscription-Key"= APIKey,
                           #"Content-Type"="application/json", Accept="application/json"],
                Content=JsonContent
            ]),
    JsonResponse = Json.Document(Response,1252)
in
    JsonResponse

 

 

Click Done in the Advanced Editor, and rename the new query to GetKeyPhrases.

 Screenshot_31.png

 

 

Note: Now in Queries pane, you will notice that the new query GetKeyPhrases has an fx icon. The query above is not a normal query. It is a query function that receives a table as an input, converts it into a JSON format, sends it to Microsoft Cognitive Services and returns the JSON response.

 

Screenshot_32.png

 

Preparing your text for analysis

 

In this section, we'll prepare the text messages in the right format that will be accepted by the API.

 

Note: This is a crucial step that you'll need to reuse whenever you run text analytics on your own messages.

 

In Queries pane, right click on query Posts and select Reference. Change the new query to KeyPhrases.

 

Screenshot_33.png 

 

In Home tab, click Choose Columns, and select the columns message and id. When you are done click OK.

 

Screenshot_35.png

 

Important tips: If you run this tutorial on your own data. Follow these guidelines: Make sure you have two columns here: A column with unique keys for each message, and another column with the actual text. Make sure that the you use unique keys. Use the Query Editor's Remove Duplicates if needed. Also, make sure that you don't have empty values.

 

 

Rename the column message to text. This is crucial. Later on we convert the table into a JSON format with the name text.

 

Note: If you are running this tutorial on your own data, ensure that the column with the unique keys is named id.

 

In Add Column tab, click Custom Column.

Set language as New column name, and = "en" as Custom column formula. Then click OK.

 

Screenshot_44.png

 

Note: The Text Analytics Key Phrases API support the following languages: English (en), German (de), Spanish (es) and Japanese (ja). You can use any of these codes if your text is in those languages.

 

 

Group data to sub-tables of 1000 messages each

 

Microsoft Cognitive Services allow us to send 1000 different messages on each API call as part of the Key Phrases API. In this part, we'll group the table into sub-tables of 1000 rows.

 

In Add Column click Index Column.

 

Screenshot_37.png

 

Select the new column, Index, and in Transform tab, click Standard, then select Integer-Divide.

 

Screenshot_38.png

 

In Integer-Divide dialog, set 1000 as Value and click OK.

 

Screenshot_39.png

 

Now the values in column Index are all 0 in the first 1000 rows, all 1 in the next 1000 rows, and so forth.

Select the column Index, and click Group By in Transform tab.

 

Screenshot_40.png

 

In the Group By dialog, ensure that Index is selected in the drop down menu Group By.

Set sub-table as the text in New column name.

Select All Rows as Operation, and click OK.

 

Screenshot_41.png

  

After the grouping step, we have a new column sub-table that contains our tables in bulks of 1000 rows each (excluding the last row that has the remainders).

 

 

Screenshot_66.png

 

 

We can delete the column Index, and click Invoke Custom Function in Add Column tab.

 

In Invoke Custom Function dialog, set GetKeyPhrases as Function Query and select sub-table as the column. When you are done, click OK.

 

Screenshot_67.png

 

Delete the column sub-table, and click the expand button in the left side of the header GetKeyPhrases

Ensure all columns are selected in the expand pane, uncheck Use original column name as prefix, and click OK

 

Screenshot_69.png

 

Delete the column errors.

 

Note: For troubleshooting purposes, you can create a new query that will look into the column errors. In that column, you will find errors that are returned by Microsoft Cognitive Services. For example, if you have a row that contains a text message which is too big, you can learn about it in this column, and truncate the messages. The maximal allowed length for text messages is 10240 characters. To truncate the messages, select the text column, right click on the header and select Split Column, then select By Number of Characters...

 

Click on the expand button in the header of column documents, select all columns and click OK.

 

Screenshot_49.png

Click on the expand button in the header of column KeyPhrases.

 

Screenshot_50.png

 

Finally, we have a new column with a key phrase for each message id. You should note that we will usually have multiple key phrases per message, so our table is now expended to multiple rows per message id.

 

For a final cosmetic touch, let's capitalize each word in column KeyPhrases:

Select the column KeyPhrases, right click on its header, select Transform and then select Capitalize Each Word.

 

Screenshot_70.png

 

We have completed the extraction of key messages. It's time to move to the modeling and visualization.

In the Query Editor window, click Close & Apply.

 

 

Feeding Key Phrases to a WordCloud Custom Visual

 

We are almost done. It's time to move to the visualization part. We'll now learn how to create a simple WordCloud visual with our key phrases.

 

After completing the last two parts, we now have two tables Posts and KeyPhrases. We can create a relationship between column id in both tables (if Power BI Desktop didn't detect them automatically). To do it, go to Relationships view, and drag and drop the column id from Posts to KeyPhrases as shown here.

 

Screenshot_71.png

 

Download WordCloud (from here), and import it to your report (here is how to import custom visuals).

 

Click the Report view, and add a WordCloud to your report. Drag and drop the column KeyPhrases to Category and message to Value as shown in this screenshot:

 

 

 

Screenshot_56.png

 

In Format tab in Visualizations, turn off Stop Words and Rotate Text.

Note: Turning off Stop Words is crucial here. Without it, you will not be able to resize the WordCloud visual.

 

Screenshot_57.png

 Under section General, turn off Word-breaking.

 

Screenshot_58.png

You can now resize the WordCloud visual, and see the size of the key phrases is defined by the number of posts.

 

Next Steps:

 

  • You can now create a second WordCloud visual by copying and pasting the first one, and assign shares to Values instead of message. Now, in the second WordCloud the key phrases will be bigger if their posts got higher shares.
  • You can add Sentiment Analysis (Learn how here), and slice your WordClouds by sentiment as demonstrated in this report.
  • To get the actual report above, and apply both Sentiment Analysis and Key Phrases, subscribe to DataChant here.

 

Screenshot_59.png

 

 

Hope you enjoyed this tutorial.

 

Follow me on Twitter for updates and new Power BI tops & tricks.

Comments
by yoshihirok Member
on ‎11-13-2016 10:55 AM

Nice.

I like to use Cognitive Service with Power BI Desktop.

 

by extri13 Frequent Visitor
on ‎11-15-2016 12:38 AM

First of all, thanks for sharing this very good application of Power BI.

 

I got problem on the step of INVOKE CUSTOM FUNCTION : GetKeyPhrases.

The error message is like below. It seems I incorrectly input the credential of Text Analytics API.

Could you please give me any advice for fixing this problem and follow the extra step of this practice?

 

ERROR MESSAGE

"Error on the query. Web.Contents with DataSource.Error: Content option is only available in the case of anonymous connect.
Details:
DataSourceKind=Web
DataSourcePath=https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/

(Translated from Korean into English)

 

by DataChant Member
on ‎11-15-2016 05:24 AM

Hi Extri13,

Did you select the Anonymous option when you connected to Microsoft Cognitive Services? I suggest editing the credentials in the Data Source Settings. Set it to anonymous.

by mcolb88 Frequent Visitor
on ‎12-08-2016 12:12 PM

Trying to use your example above with for extracting key phrases on Lync (Skype for Business) chats. Followed all the steps but when I try to invoke the function, I get the folowing error:

 

Formula.Firewall: Query 'KeyPhrases' (step 'Invoked Custom Function') references other queries or steps, so it may not directly access a data source. Please rebuild this data combination.

 

Here's the add column code :

= Table.AddColumn(#"Grouped Rows", "GetKeyPhrases", each GetKeyPhrases([#"sub-table"]))

 

If you have any ideas of where I may have gone wrong it would be greatly appreciated. This will be huge for our managers monitoring chatrooms to extreact common threads to suggest enhancements to our internal software.

by DataChant Member
on ‎12-08-2016 12:17 PM

Hi @mcolb88

 

Please enable Fast Combine in File -> Options & Settings -> Options -> Privacy.

After you do, click Refresh.

 

BTW, how did you extract the data from Skype?

 

Best regards,

Gil

by mcolb88 Frequent Visitor
on ‎12-08-2016 02:03 PM

Thanks for the quick reply @DataChant.

 

Tried that by enabliong Ignore the Privacy Level settings and potentially improve performance option and now I get the error:

 

An error occurred in the ‘’ query. DataSource.Error: Web.Contents with the Content option is only supported when connecting anonymously.
Details:
    DataSourceKind=Web
    DataSourcePath=https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/keyPhrases

 

I tried pasting the key directly inthe GetKeyPhrases function to no avail.

 

Sorry, this is new and sooo close, thanks to your help.

 

Mike

 

 

 

by mcolb88 Frequent Visitor
on ‎12-08-2016 02:08 PM

@DataChant,

My bad, missed the response above about anonymous on teh data source, it now works!

 

by pinikr Frequent Visitor
on ‎12-25-2016 04:58 AM

Thanks for that

After i make the group by 1000 i get this error

 

DataSource.Error: DataMarket: Request failed: The remote server returned an error: (400) Bad Request. (Bad Request)
Details:
DataSourceKind=DataMarket
DataSourcePath=https://api.datamarket.azure.com/Data.ashx/Bing/MicrosoftTranslator/v1/Translate
Url=https://api.datamarket.azure.com/Data.ashx/Bing/MicrosoftTranslator/v1/Translate?Text='hihihihih...;

 

 

What can i do?

by DataChant Member
on ‎12-25-2016 05:24 AM

Hi @pinikr,

 

From the error message it seems you are using a different API (Translator, and not Key Pharses).

If you are not following my exact scenario above, can you provide more context?

Can you share the M expression?

 

Thank you,

Gil

 

 

by pinikr Frequent Visitor
on ‎12-25-2016 05:32 AM

Thanks - this is the translator - not the text anlytics - sentiment... do we have translate from cognetive too?

by DataChant Member
on ‎12-25-2016 05:41 AM

Hi @pinikr,

 

Microsoft Cognitive Services has it: https://www.microsoft.com/en-us/translator/translatorapi.aspx

 

But it is more complicated to implement.

If you manage to implement it, I would love to host you as a guest author on my blog Smiley Happy

by pinikr Frequent Visitor
on ‎12-26-2016 04:50 AM

Hi Gil 

Currently i Uses the old one api.datamarket.azure.com - and its give me error, i will let you know.

Pini Krisher

www.sqlazure.co.il

by mrobson1975 Visitor
on ‎01-04-2017 09:42 AM

Hi,  Great solution (though it was a lot easier when some of these were part of the Data Marketplace).  One issue though; when you upload the file to the Power BI service you can't schedule a refresh as it refuses to connect to the Cognitive Services Anonymously and it throws this error:

 

by mrobson1975 Visitor
on ‎01-04-2017 09:43 AM

Hi,  Great solution (though it was a lot easier when some of these were part of the Data Marketplace).  One issue though; when you upload the file to the Power BI service you can't schedule a refresh as it refuses to connect to the Cognitive Services Anonymously and it throws this error:CognitiveError.PNG

 

 

 

 

by mrobson1975 Visitor
on ‎01-04-2017 09:44 AM

Hi,  Great solution (though it was a lot easier when some of these were part of the Data Marketplace).  One issue though; when you upload the file to the Power BI service you can't schedule a refresh as it refuses to connect to the Cognitive Services Anonymously and it throws the error below.  Were you able to get this to refresh from the service?CognitiveError.PNG

 

 

 

by DataChant Member
on ‎01-04-2017 02:22 PM

Hi @mrobson1975

 

The Web connector (which was used for the Text Analytics part) is not supported As-Is. You can try deploying the gateway, if you are a Power BI Pro user. I think it will work.

by mrobson1975 Visitor
on ‎01-05-2017 03:21 AM

Unfortunately not.  The Service and Gateway, although they support Web connections, they only appear to support Anonymous, Basic or Windows authentication.  Setting Anonymous leads to a 404 error when connecting to the Cognitive Service endpoint (presumably because the API Key is not in the header and no way to supply it as part of the gateway).

by DataChant Member
on ‎01-05-2017 05:42 AM

Hi @mrobson1975,

 

Could you please elaborate on the API key? In my article I explicitly added it to the header.

 

Thank you,

Gil

by 2hunters Occasional Visitor
on ‎01-27-2017 12:36 PM

any way to skip the text analytics call,if i just make a few corrections in original data, after I already waited 2 hours to get all the key phrases, when close and apply the query editor?

thanks

by andynz Frequent Visitor
on ‎02-12-2017 08:31 PM

HI Gil

 

I have the same message as mclob.  I have ensured I have ensured Privy settings are correct. Just cant get the last piece of the puzzle. Any other ideas which could be causing this?  Have been enjoying your posts on powerpivotpro and following your website too.

 

Thanks

 

Andy

 

 

Formula.Firewall: Query 'KeyPhrases' (step 'Invoked Custom Function') references other queries or steps, so it may not directly access a data source. Please rebuild this data combination.

 

Here's the add column code :

= Table.AddColumn(#"Grouped Rows", "GetKeyPhrases", each GetKeyPhrases([#"sub-table"]))

by DataChant Member
on ‎02-12-2017 10:05 PM

Hi @andynz,

 

You can avoid the error above by enabling Fast Combine:

In File-->Options and Settings-->Options-->Current Workbook-->Privacy

Check the checkbox that starts with "Ignore...", then refresh again.

 

by andynz Frequent Visitor
‎02-12-2017 11:48 PM - edited ‎02-12-2017 11:49 PM

Thanks Gil, Finally got there - Great way to analyse our survey data and social media channels. Thanks again

by PowerBIArtist Frequent Visitor
on ‎02-27-2017 07:08 PM

Great Post! I need some wisdom here Smiley Happy

 

Even with the KeyPhrases API I am getting below error. Any suggestions?

 

Failed to save modifications to the server. Error returned: 'OLE DB or ODBC error: [DataSource.Error] Web.Contents failed to get contents from 'https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/keyPhrases' (400): Bad Request.
'.

by PowerBIArtist Frequent Visitor
on ‎03-02-2017 11:56 AM

Never mind - I was using Divide instead of Divide-Integer.

by mrbajana Frequent Visitor
on ‎03-29-2017 01:54 PM

I got problem on the step of INVOKE CUSTOM FUNCTION : GetSentiment.

The error message is like below. It seems I incorrectly input the credential of Text Analytics API.

Could you please give me any advice for fixing this problem and follow the extra step of this practice?

 

The problem es when I used :

 

https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment

 

ERROR MESSAGE

"Error on the query. Web.Contents with DataSource.Error: Content option is only available in the case of anonymous connect.
 

by DataChant Member
on ‎04-22-2017 01:54 PM

@mrobson1975, you are right. It seems that Scheduled Data Refresh is not supported. I contacted Microsoft team for help. Will keep this thread updated. 

by Betsy Member
‎04-26-2017 11:21 AM - edited ‎04-26-2017 11:38 AM

@DataChant

 

Not sure if you still check this. Following the steps you outlined on my own data. When I get down to the step of expanding the getkeyphrases column, my documents and errors columns have List and Error in each row, instead of Record, as your document column screenshot shows. I can't then expand my documents column without getting an error (and it's not the standard expand, my choices are expand to new rows or extract values).

 

Any sense of where I went wrong?

 

Here are my steps:

 

let
Source = #"Response Table",
#"Filtered Rows" = Table.SelectRows(Source, each ([question type] = "Open ended answer")),
#"Filtered Rows1" = Table.SelectRows(#"Filtered Rows", each [text] <> null and [text] <> ""),
#"Added Index1" = Table.AddIndexColumn(#"Filtered Rows1", "Index", 1, 1),
#"Removed Other Columns" = Table.SelectColumns(#"Added Index1",{"text", "Index"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Other Columns",{{"Index", "id"}}),
#"Added Custom" = Table.AddColumn(#"Renamed Columns", "language", each "en"),
#"Added Index" = Table.AddIndexColumn(#"Added Custom", "Index", 1, 1),
#"Integer-Divided Column" = Table.TransformColumns(#"Added Index", {{"Index", each Number.IntegerDivide(_, 1000), Int64.Type}}),
#"Grouped Rows" = Table.Group(#"Integer-Divided Column", {"Index"}, {{"sub-table", each _, type table}}),
#"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"Index"}),
#"Invoked Custom Function" = Table.AddColumn(#"Removed Columns", "GetKeyPhrases", each GetKeyPhrases([#"sub-table"])),
#"Removed Columns1" = Table.RemoveColumns(#"Invoked Custom Function",{"sub-table"}),
#"Expanded GetKeyPhrases" = Table.ExpandRecordColumn(#"Removed Columns1", "GetKeyPhrases", {"documents", "errors"}, {"documents", "errors"}),
#"Removed Columns2" = Table.RemoveColumns(#"Expanded GetKeyPhrases",{"errors"})
in
#"Removed Columns2"

 

Thanks,

Betsy

by DataChant Member
on ‎04-27-2017 06:54 AM

Hi @Betsy

 

Can you remove duplicates on column "Id"?

Before you apply the grouping?

 

Hope it helps,

Gil

 

by JMulloy Visitor
on ‎05-18-2017 08:53 AM

Need some help here. All of my key phrases have the same count.

by DataChant Member
on ‎05-18-2017 05:25 PM

Hi @JMulloy,

 

Can you share more details? Is the problem in the results from the service, or in feeding the visual?

by JMulloy Visitor
on ‎05-24-2017 02:44 PM

It was a problem with my data model and relationships. Changed the cardinality filter to bi-directional and it seemed to fix it.

by JMulloy Visitor
on ‎05-24-2017 02:46 PM

How do you format the GetKeyPhrases querry to account for split text fields to solve truncation errors?

by ksax Visitor
on ‎05-30-2017 08:00 PM

Hi Gil, great post and great effort from you indeed. I have followed all the steps and get all results as shown in your illustrative snapshot above until "close&apply" which will bring me to the main Power BI desktop screen. Nevertheless, below error message occurs then nothing happens....any advice please?

 

Thank you in advance

 

 

s,  image.png

by ksax Visitor
on ‎05-31-2017 01:01 AM

 Hello Gil, not sure what happened but the error occurred in my previous comment was fixed and I moved to next step. Now when I try to create a create a relationship between column id in both tables, it's unfeasible and following error message occurs. My dataset called word cloud and I suspect something is wrong with its "id" since it has this  "summation" sign beside it. please see below screenshot.

 

Thank you again and looking forwards for your advice

 

image.png

 

 

 

 

by letitiabiagi Frequent Visitor
on ‎08-03-2017 01:46 PM

I only receive Errors (no Records) when i run GetKeyPhrases on my text. 

error.PNG

 

"An error occurred in the ‘’ query. DataSource.Error: Web.Contents failed to get contents from 'https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/keyPhrases' (400): Bad Request
Details:
DataSourceKind=Web
DataSourcePath=https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/keyPhrases
Url=https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/keyPhrases"

 

Thanks

by hengsworld Frequent Visitor
‎09-12-2017 02:53 PM - edited ‎09-12-2017 03:15 PM

I get this issue shown below... I am using Anonymous source but it is giving me this error.

 

Capture.PNG

by Bamak Frequent Visitor
on ‎10-08-2017 11:10 PM

Hi @DataChant,

 

It seems that the new Service IURL is --> https://westcentralus.api.cognitive.microsoft.com/text/analytics/v2.0

But when I use it (with the good key) -, I observe this message : 

 

An error occurred in the ‘’ query. DataSource.Error: Web.Contents failed to get contents from 'https://westcentralus.api.cognitive.microsoft.com/text/analytics/v2.0/keyPhrases' (400): Bad Request
Details:
DataSourceKind=Web
DataSourcePath=https://westcentralus.api.cognitive.microsoft.com/text/analytics/v2.0/keyPhrases
Url=https://westcentralus.api.cognitive.microsoft.com/text/analytics/v2.0/keyPhrases

 

Old URL (https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/) generates anonymous connection issue 

 

Regards

by perkses Frequent Visitor
‎10-26-2017 12:12 PM - edited ‎10-26-2017 12:18 PM
by Bamak Frequent Visitor
on ‎11-04-2017 06:13 AM

Hi parkes

 

Unfortunately, I still have the issue

message

 

An error occurred in the ‘’ query. DataSource.Error: Web.Contents failed to get contents from 'https://westcentralus.api.cognitive.microsoft.com/text/analytics/v2.0/keyPhrases' (400): Bad Request
Details:
DataSourceKind=Web
DataSourcePath=https://westcentralus.api.cognitive.microsoft.com/text/analytics/v2.0/keyPhrases
Url=https://westcentralus.api.cognitive.microsoft.com/text/analytics/v2.0/keyPhrases

 

query function


Response =
Web.Contents("https://westcentralus.api.cognitive.microsoft.com/text/analytics/v2.0/keyPhrases?",
[
Headers = [#"Ocp-Apim-Subscription-Key"= APIKey,
#"Content-Type"="application/json", Accept="application/json"],
Content=JsonContent
]),

....

 

Regards