Text Analytics in Power BI - Extraction of key phrases from Facebook messages (part one)

by DataChant Member on ‎11-10-2016 10:51 AM - last edited on ‎11-10-2016 11:45 AM by Power BI Admin Jessica

 

In my last community blog here we learn how to apply sentiment analysis in Power BI with Microsoft Cognitive Services Text Analytics API (aka Azure Machine Learning) and with no help from IT, developers or data scientists. In the follow-up articles we drilled down into different aspects of the Sentiment Analysis implementation in Power BI, and shared the technique in several community events here and here.

 

In today's blog post we will take text analytics to its next level, and extract key phrases from text messages. To make it interesting, we will pull Facebook messages from the public pages of Clinton and Trump, extract their key phrases, and correlate it with engagement levels and sentiment in cool visualizations.

 

Following the tutorial, you will be able to create the following report (click here to view it on PowerBI.com):

 

Screenshot_59.png

 

 

 

Before we start

Make sure you have an account on Microsoft Cognitive Services (Signup here), and a (free) subscription to Text Analytics Preview. We'll later need the API key that can be copied from here (as shown in the following screenshot).

 

Screenshot_26.png

In today's article we'll extract key phrases from text messages using the Key Phrase API that can be tested here. In the example below we applied the API on the text "I had a wonderful experience! The rooms were wonderful and the staff were helpful.and got the key phrases "wonderful experience", "rooms", and "staff" in return. 

 

Screenshot_61.png

 

Import Data from Facebook

 

In the first part of our tutorial we'll import all of Trump's and Clinton's Facebook posts as our external data source for analysis. If you prefer to skip this part and try this tutorial on your own data skip to Part 2.

 

Ready to start?

 

Open Power BI Desktop and click File, then select Options and settings, and select Options.

In the Options dialog, select Privacy under CURRENT FILE, and select Ignore the Privacy Levels and potentially improve performance. Then click OK.

 

In Home tab click on the Get Data icon in Home tab.

In Get Data dialog, select Online Services, then select Facebook and click Connect.

If you see the Connecting to a Third-Part Service dialog, click Continue.

In Facebook dialog, edit the first text box and change it from Me to DonaldTrump. Then click OK.

 

Screenshot_4.png

 

 

If this is your first time using the Facebook connector in Power BI Desktop, proceed with the login and permission approval steps.

 

In the Query Editor window, make sure that the Formula Bar is visible (To activate it check Formula Bar in View tab).

Inside the Formula Bar, add the following suffix after the text "DonaldTrump" and before the ending double quotes: /Posts?fields=message,created_time,shares&limit=100

 

Here is the entire formula line you should have now:

 

= Facebook.Graph("https://graph.facebook.com/v2.2/DonaldTrump/Posts?fields=message,created_time,shares&limit=100")

 

Press Enter when you are done, and make sure you see the preview of the posts, as shown here:

 

Screenshot_6.png

 

Note: You may have noticed that we used the parameter limit=100 above. Don't worry, we will extract all the posts. By default, Power BI Desktop tries to extract the data in pages of 250 rows. This will usually be rejected by Facebook platform with an error, so the workaround is to explicitly override the page limit to 100.

 

Now, let's remove all the empty text messages, as we focus today on text analytics.

Click on the filter icon in the header of column message and select Remove Empty.

 

Screenshot_7.png

 

 

Now let's remove all duplicates rows (Yes, there are few duplicate rows in the data for some strange reason, and if we keep those rows, we'll get an error in the API response).

 

Right click on the header of column id and select Remove Duplicates.

 

 

Screenshot_8.png

 

Next step is to extract the number of shares for each post.

Click on the expand icon in the header of column shares, and click OK in the expand pane.

 

 

Screenshot_9.png

 

Rename the column to shares.

Click on the left icon in the header of column shares and select Whole Number.  

 

Screenshot_10.png

 

Click on the left icon in the header of column created_time and select Date/Time/Timezone

 

Screenshot_11.png

 

Next, we'll create a new column with a link to the actual post on Facebook. This is a practical trick that will allow us to navigate later from Power BI to Facebook on specific posts that are shown in a sliced table.

 

In the tab Add Column, click Custom Column.

 

Screenshot_12.png

 

In the Add Custom Column dialog, set facebook link as the New column name, and set the following formula in Custom column formula:

 

 

= "https://facebook.com/" & [id]

 

When you are done, click OK.

 

Screenshot_13.png

 

Now, let's add a new column. In the Query EditorAdd Column tab click Custom Column again.

Set Candidate as New column name, and the following formula in Custom column formula:

 

 

= "Trump"

 

When you are done, click OK.

 

Screenshot_14.png

 

It's time to apply the same logic on Clinton's Facebook page. We'll duplicate the query above and modify it to get the data from Clinton's page, and have a Candidate column with Clinton as the value. Here are the steps:

  

Rename the query to Trump.

In the Queries pane, right click on the query Trump and select Duplicate.

 

Screenshot_16.png

 

Rename the duplicated query to Clintonand select it. Click on the wheel cog icon of the last step in APPLIED STEPS. The Add Custom Column dialog will appear.

Change the Custom column formula from = "Trump" to = "Clinton". Click OK when you are done.

 

= "Clinton" 

When you are done, click OK.

 

Click the first step (Source) in APPLIED STEPS, and modify the URL inside Facebook.Graph("...") from DonaldTrump to HillaryClinton. 

 

= Facebook.Graph("https://graph.facebook.com/v2.2/HillaryClinton/Posts?fields=message,created_time,shares&limit=100")

When you are done, press Enter.

 

In the next steps, we'll append Trump and Clinton in a new query.

Right click on the query Clinton in Queries pane, and select Reference.

 

Screenshot_22.png

Rename the new query to Posts.

Select Posts iand in Home tab, click Append Queries, then click Append Queries.

 

Screenshot_24_01.png

 

In the Append dialog, select Trump as the Table to append, and click OK. (Trump was selected here because we started with a reference to Clinton, so we already have her data).

 

Screenshot_24_02.png

 

 

Now, since we no longer need the base queries Trump Clinton, we can disable them from loading in the report as standalone tables.

In the Queries pane, right click on Trump, and select Enable Load. This operation will remove the check mark from Enable Load. Now loading is disabled to that query. It will still be used in query Posts which has a reference to Trump. Repeat the last step on query Clinton.

 

Screenshot_24.png

 

 

We have finished part 1 of the tutorial and learned how to extract Facebook Posts and number of shares. In the next part of the tutorial we'll handle the Key Phrase extraction. 

 

Continue reading here.

Comments
by ramonviaceli Visitor
on ‎07-04-2017 12:50 PM

Hello,

 

I'm Getting the following error:

 

DataSource.Error: Facebook: (#17) User request limit reached

 

How can I solve this?

 

Thanks in advance.