cancel
Showing results for 
Search instead for 
Did you mean: 

Geocoding using R and Google Maps in Power BI

 

Maps in Power Bi have improved significantly over past few releases and work fine across desktop, service and mobile app. However I recently saw a post by user where he had several addresses listed in the dataset and when plotted on map he realised number of plottings on map seemed little less than actual records in the dataset. Upon investigation it was found some addresses where not listed on Bing Maps but they were on Google Maps. If there was latitude longitude values it would have been easier since they would come up on Bing maps regardless of the address. That prompted me to explore the idea of geocoding and reverse geocoding in Power BI.

 

For those who don’t know, Geocoding is process of converting address into latitude and longitude coordinates and as name suggests reverse geocoding is process of converting latitude and longitude coordinates into address. So in this post we are going to look at using R and Google Maps API to perform geocoding in power bi. We will use sample dataset containing points of interest within city of Melbourne and obtain geographic coordinates for those. First we will use R script integration in Power BI to create and call R function that queries Google maps API and returns back with the coordinates for supplied point of interest. We will then convert it to power bi function which will be used to create additional columns of coordinates in our original data source of places of interest.

 

 

a.png

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Prerequisites and Limitations

 

 

Prerequisite for this technique is to have R environment setup with ‘HTTR’ package installed which we will use to send API requests over http. If you are new to R in Power BI you should follow this article to get started.

 

 

Google Maps geocoding API doesn’t require API key to query them and so no action is required there to sign up to obtain API key. Google’s free geocoding API currently allows 2,500 requests per day and 50 requests per second. That is why I’ve used dataset that is small in size and fits in criteria but higher limits are available with paid versions.

 

 

Getting Started

 

 

1. Supress warning for running native database queries. This is required to supress warnings that power bi generates each time power query calls R function. If you don’t follow this step is also fine but then for each time call is made (ie for each row of places of interest) you will have to allow query to be executed. To supress warnings go to File -> Options and settings -> Options and untick Native Database Queries under Security tab as shown below.

 

 

1.png

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

2. Import and load contents of csv file into query editor.

 

 

 3.PNG

 

 

 

 

 

 

 

 

 

 

 

3. Create R Script data source by going to New Source -> Other -> R Script. Copy and paste below in Execute R Script dialog box. As it can be seen script uses HTTR package within R to make URL calls and uses GET (case sensitive) method to retrieve information back. We create a function called geocode in R that execute GET method of HTTR package and passes in address as parameter appended to the API calls URL. Result returned back contains series of information so we filter out only what we’re interested in which is latitude and longitude coordinates, convert it to a data frame and then at the end of function return back that data frame.

 

 

library(httr)
google_api <- "https://maps.googleapis.com/maps/api/geocode/json"
geocode <- function(address, verbose=FALSE) {
r <- GET(google_api, query=list(address=address))
stop_for_status(r)
result <- content(r)
first <- result$results[[1]]
df <- as.data.frame(list(lat=first$geometry$location$lat, lon=first$geometry$location$lng))
return(df)
}
result <- geocode("Melbourne")

 

 

4. Click OK to load resulted data frame from R function which is table into Power BI. Here we have used city of Melbourne as an example to execute geocode function.

 

 

4.PNG

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

5. Next we have to create a function in power bi (not R function) that can be invoked for each of addresses in our original dataset. For the resulting query in power bi from above steps, right click and click Create Function. At this stage you will get a warning saying No Parameters Found. Just ignore that and click Create. Give appropriate name to the function such as ‘geocoding’ and click OK.

 

 

5.png

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

6.PNG

 

 

 

 

 

 

 

 

 

 

 

 

 

6. At this stage you will have two queries and function called “geocoding” in power bi, one with your original dataset and query from R script. You can now right click and delete R script query as we no longer need to use, we will use power bi☻ function instead.

 

 

7. Geocoding function currently has hardcoded value “Melbourne” for the address and so each time you invoke that function it will return same coordinates. We need to parameterise it to be able to pass addresses from our original dataset. So select function and select Advanced Editor under Query section on Home tab.

 

 

7.png

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

8. Make following edits in M code to make function require parameter and replace part of the function that passes in address to R function with that parameter as shown below. Ensure you enter three double quotes before and after '&paramter&' as shown in second screenshot bleow. This is to ensure addresses are wrapped in double quotes during the function call.

 

 

 

8.png

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

9.png

 

 

 

 

 

 

 

 

9. We are now ready with function that will take in address from our dataset and return back with coordinates. Switch back to POISmall dataset and under Add Column tab click Add Custom Column.

 

 

10. We will use custom column to invoke geocoding function for each row in FeatureName column and supply value of FeatureName column as parameter to that function using below code. To ensure addresses are matched with Australian addresses we will append text “Melbourne, Australia” at the end of each point of interest.

 

 

10.png

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

11. As a result of above step provided everything works fine there will be a new column added. Click on icon next to column name to expand and tick both lat and long column and click OK.

 

 

11.png

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

12. Ensure new columns are defined correctly as latitude and longitudes under Data Category to be ready for plotting onto map.

 

 

13. Finally if you publish this to power bi service it will all work fine except when you try refresh data source you may get error such as ‘Unable to combine data due to privacy levels for data source cannot be combined together. Please rebuild this data combination’. This is known issue about Privacy Levels and you need to follow steps described in this article by Adam Saxton to enable Fast Combine feature within Power BI Personal Gateway in order for Refresh to work. Once done refresh will work fine and any changes in your data source ie new addresses being added will be returned back with their geographic coordinates.

Comments

Nice tips for R and API in Power BI.

Thank you.

I try to get geocode by Google Map Api with M without R.

let
api = () => Json.Document(Web.Contents("https://maps.googleapis.com/maps/api/geocode/json?address=" & address & "&sensor=false")),
source = Function.InvokeAfter(api, #duration(0, 0, 0, 0.2), results = source[results], results1 = results{0}, geometry = results1[geometry], location = geometry[location], #"Converted to Table" = Record.ToTable(location), #"Pivoted Column" = Table.Pivot(#"Converted to Table", List.Distinct(#"Converted to Table"[Name]), "Name", "Value", List.Sum) in #"Pivoted Column"

@yoshihirok Nice. Reason I choose R is since Google maps free API has a limit on number of queries you can send over a period of time, using R you can build smarts that can wait for that period of time and start sending queries again. I didn't do a demo for that here because it was too complex and out of context for this blog post.

Hi, @ankitpatira
Yes, Google maps free API need to wait.
So, I use Function.InvokeAfter function for wait.


For example: for waiting under 5 api call per second.
Function.InvokeAfter( api, #duration(0, 0, 0, 0.2) )

 

I think some caution is needed to avoid inadvertently breaching Google's Terms of Service for their API.  For example "As long as your site is generally accessible to consumers without charge, you may use the Google Maps API." (from https://developers.google.com/maps/faq#tos ).

 

I'm no lawyer, but it seems likely that you are only complying with that if you use Power BI's Publish to Web feature.

 

There are alternative APIs that allow non-public use. I've used OpenCage succesfully on a couple of projects:  https://geocoder.opencagedata.com/

Absolutely. Purpose of this article is to just show technical capability. Anyone wanting to make use for any purpose should definately check terms and conditions.

Awesome article, @ankitpatira. Noob question here (I'm a zero in anything R, API or programming at all): would it be possible, given records of two addresses (home address, work address), to get the commuting time between one another the way Google/Bing Map do? I can manually enter them one by one, but the trick would be to get it for the whole list. I imagine columns similar to your "lat" and "lon" but something like "HomeToWorkCar", "HomeToWorkPublic", "WorkToHomeCar", "WorkToHomePublic", displaying hours. I read about the daily limitations on the free API. I imagine the above is doable, just would like to know if it's way harder than what you posted. Thanks and regards!

Hi there, 

 

I am trying to perform log transformation on the some of my columns. I could do this via R script but not sure how to use it. Any suggestions or is there any documentation available for this?

 

@ankitpatiraIt worked perfectly but created a line of error in the query. Any idea how to fix it? Thank you for help!

An error occurred in the 'geocoding' query. DataSource.Error: ADO.NET: R. Script Error Error in result $ results [[1]]: Index out of bounds Calls: geocode Execution stopped Details:     DataSourceKind = R     DataSourcePath = R     Message = R. Script Error Error in result $ results [[1]]: Out of bounds Calls: geocode Execution interrupted     ErrorCode = -2147467259     ExceptionType = Microsoft.PowerBI.Radio.RScriptRuntimeException

 

Hi,

 

Can anyone help me to replace Google Maps Api with any open source geocoding api(eg. Gisgraphy or Mapzen)?

 

I need to do it to overcome Google Maps limitation issue.

 

My tabel has > 250 ths of records to find latitude and longitude.

 

thank you for your help.

 

pawel 

Good job. Malayalam Calendar