In my last community blog here we learn how to apply sentiment analysis in Power BI with Microsoft Cognitive Services Text Analytics API (aka Azure Machine Learning) and with no help from IT, developers or data scientists. In the follow-up articles we drilled down into different aspects of the Sentiment Analysis implementation in Power BI, and shared the technique in several community events here and here.
In today's blog post we will take text analytics to its next level, and extract key phrases from text messages. To make it interesting, we will pull Facebook messages from the public pages of Clinton and Trump, extract their key phrases, and correlate it with engagement levels and sentiment in cool visualizations.
Following the tutorial, you will be able to create the following report (click here to view it on PowerBI.com):
Before we start
Make sure you have an account on Microsoft Cognitive Services (Signup here), and a (free) subscription to Text Analytics Preview. We'll later need the API key that can be copied from here (as shown in the following screenshot).
In today's article we'll extract key phrases from text messages using the Key Phrase API that can be tested here. In the example below we applied the API on the text "I had a wonderful experience! The rooms were wonderful and the staff were helpful." and got the key phrases "wonderful experience", "rooms", and "staff" in return.
Import Data from Facebook
In the first part of our tutorial we'll import all of Trump's and Clinton's Facebook posts as our external data source for analysis. If you prefer to skip this part and try this tutorial on your own data skip to Part 2.
Ready to start?
Open Power BI Desktop and click File, then select Options and settings, and select Options.
In the Options dialog, select Privacy under CURRENT FILE, and select Ignore the Privacy Levels and potentially improve performance. Then click OK.
In Home tab click on the Get Data icon in Home tab.
In Get Data dialog, select Online Services, then select Facebook and click Connect.
If you see the Connecting to a Third-Part Service dialog, click Continue.
In Facebook dialog, edit the first text box and change it from Meto DonaldTrump. Then click OK.
If this is your first time using the Facebook connector in Power BI Desktop, proceed with the login and permission approval steps.
In the Query Editor window, make sure that the Formula Bar is visible (To activate it check Formula Bar in View tab).
Inside the Formula Bar, add the following suffix after the text "DonaldTrump" and before the ending double quotes: /Posts?fields=message,created_time,shares&limit=100
Here is the entire formula line you should have now:
Press Enter when you are done, and make sure you see the preview of the posts, as shown here:
Note: You may have noticed that we used the parameter limit=100 above. Don't worry, we will extract all the posts. By default, Power BI Desktop tries to extract the data in pages of 250 rows. This will usually be rejected by Facebook platform with an error, so the workaround is to explicitly override the page limit to 100.
Now, let's remove all the empty text messages, as we focus today on text analytics.
Click on the filter icon in the header of column message and select Remove Empty.
Now let's remove all duplicates rows (Yes, there are few duplicate rows in the data for some strange reason, and if we keep those rows, we'll get an error in the API response).
Right click on the header of column id and select Remove Duplicates.
Next step is to extract the number of shares for each post.
Click on the expand icon in the header of column shares, and click OK in the expand pane.
Rename the column to shares.
Click on the left icon in the header of column shares and select Whole Number.
Click on the left icon in the header of column created_time and select Date/Time/Timezone.
Next, we'll create a new column with a link to the actual post on Facebook. This is a practical trick that will allow us to navigate later from Power BI to Facebook on specific posts that are shown in a sliced table.
In the tab Add Column, click Custom Column.
In the Add Custom Column dialog, set facebook link as the New column name, and set the following formula in Custom column formula:
= "https://facebook.com/" & [id]
When you are done, click OK.
Now, let's add a new column. In the Query Editor, Add Column tab click Custom Column again.
Set Candidateas New column name, and the following formula in Custom column formula:
When you are done, click OK.
It's time to apply the same logic on Clinton's Facebook page. We'll duplicate the query above and modify it to get the data from Clinton's page, and have a Candidate column with Clinton as the value. Here are the steps:
Rename the query to Trump.
In the Queries pane, right click on the query Trump and select Duplicate.
Rename the duplicated query to Clinton, and select it. Click on the wheel cog icon of the last step in APPLIED STEPS. The Add Custom Column dialog will appear.
Change the Custom column formula from = "Trump" to = "Clinton". Click OK when you are done.
When you are done, click OK.
Click the first step (Source) in APPLIED STEPS, and modify the URL inside Facebook.Graph("...") from DonaldTrumpto HillaryClinton.
In the next steps, we'll append Trumpand Clintonin a new query.
Right click on the query Clinton in Queries pane, and select Reference.
Rename the new query to Posts.
Select Posts iand in Home tab, click Append Queries, then click Append Queries.
In the Append dialog, select Trump as the Table to append, and click OK. (Trumpwas selected here because we started with a reference to Clinton, so we already have her data).
Now, since we no longer need the base queries Trump& Clinton, we can disable them from loading in the report as standalone tables.
In the Queries pane, right click on Trump, and select Enable Load. This operation will remove the check mark from Enable Load. Now loading is disabled to that query. It will still be used in query Postswhich has a reference to Trump. Repeat the last step on query Clinton.
We have finished part 1 of the tutorial and learned how to extract Facebook Posts and number of shares. In the next part of the tutorial we'll handle the Key Phrase extraction.