tracyng0905
Frequent Visitor

Configure Power BI Data Gateway to Data Lake Gen 2


Hi all,

 

I would like to set up a data source in the Power BI Service to connect to Data Lake Gen2. The on-premises gateway is installed in a VM and the connection tested OK. However, what should I enter for Server and Full Path in the screen below? Thanks!

 

PS: The data lake has its firewall turned on, so Power BI can't connect to it directly. What I am trying to do is install the on-premises data gateway inside a VM that sits in a VNet, and have Power BI connect to the data lake via this gateway.

 

tracyng0905_0-1604382234457.png

 

Regards,

Tracy

 

1 ACCEPTED SOLUTION
kmca
Regular Visitor

With some help from colleagues we made this work today. 

1) So the first thing was the data lake firewall settings.

I had set the data lake firewall on the private VNet to allow access from selected networks.
My VNet firewall allows host-to-host connections, so Azure Storage Explorer on my gateway VM connects to the data lake fine, but that is not enough. On the data lake firewall, make sure you have added the subnet (of your private VNet) and not left this blank. It should look something like this:

kmca_1-1604961888412.png

2) My gateway was installed with a local service account. For now I left this alone, but I believe a service account and SPN registration are the way forward to support delegation.

 

3) Adding a data source under Manage Gateways in Power BI brings up the screen in your post.

Server is the DFS endpoint URL of the storage account: https://<storage_account>.dfs.core.windows.net

Full Path is the resource part of the URI https://<storage_account>.dfs.core.windows.net/<path>, that is, the <path> portion.

My container is named test, so I entered /test and left it at that; this brings back a list of files at the Get Data stage.
For Authentication I used the Key option and copied the access key from the Azure portal: Storage Account -> Access keys (not SAS tokens).
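
If it helps, here is a minimal Python sketch of how a full container URL splits into the Server and Full Path values described above. The function name `split_adls_url` and the account name `mystorageaccount` are made up for illustration; this only does string handling and does not talk to Power BI or Azure.

```python
from typing import Tuple
from urllib.parse import urlsplit


def split_adls_url(url: str) -> Tuple[str, str]:
    """Split a Data Lake Gen2 URL into (Server, Full Path) for the gateway form."""
    parts = urlsplit(url)
    # The gateway data source expects the dfs endpoint, not the blob endpoint.
    if parts.scheme != "https" or not parts.netloc.endswith(".dfs.core.windows.net"):
        raise ValueError(f"not an https dfs endpoint URL: {url!r}")
    server = f"{parts.scheme}://{parts.netloc}"
    # Normalize to a single leading slash and no trailing slash, e.g. "/test".
    full_path = "/" + parts.path.strip("/")
    return server, full_path


# Hypothetical account name; "test" is the container from the post above.
server, full_path = split_adls_url("https://mystorageaccount.dfs.core.windows.net/test")
print("Server:   ", server)
print("Full Path:", full_path)
```

With the example URL this gives https://mystorageaccount.dfs.core.windows.net for Server and /test for Full Path, which matches what I typed into the form.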

 

At this point my first successful save worked.

Click on your cluster name and it expands to list your data sources; "Test all" should report success above each form.

 

Capture.PNG

 

4) In Power BI, in your workspace, click New and choose Dataflow.

Choose Add new entities.

Scroll or search the data sources and select Azure Data Lake Storage Gen2.

This form appears, and it does not work too well.

The URL is https://<storage_account>.dfs.core.windows.net/<container> (copy it from Azure Storage Explorer).

The drop-down lists your gateway.

The authentication step throws lots of strange errors, but despite that it works if you click Next.

 Capture.PNG

 

I think this is all still in preview so good luck with support. 

 

Regards

Keith

 

 


9 REPLIES 9
Does it work with Power BI Pro or Power BI Premium (not by capacity)?

kmca
Regular Visitor

Power BI Pro and Power BI Premium work the same with this pattern.

@kmca ,

Your solution works for my case. Thanks for sharing!

 

Best Regards,

Tracy

Just one more thing to note:

Creating a data lake on a VNet will automatically add a private endpoint for <storage_account>.blob.core.windows.net

But it does not create a private endpoint for <storage_account>.dfs.core.windows.net, and that will remain a public service endpoint.

To fix that in the portal, select the storage account you created and added to the VNet, select "Private endpoint connections", and add a new private endpoint for DFS. Make sure that you select the same region as your data lake, or the subnet drop-down will not find anything useful. Choose your data lake as the resource and "dfs" as the sub-resource. That will move the DFS URL onto your private VNet.
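
One rough way to check the result from the gateway VM is to see whether the DFS hostname now resolves to a private IP address. The sketch below uses only the Python standard library; the function names are made up, and it assumes the VM's DNS is set up so that the private endpoint name resolves inside the VNet, which this code cannot verify for you.

```python
import ipaddress
import socket


def is_private_address(ip: str) -> bool:
    """Return True if the IP is in a private (non-public) range, e.g. 10.x.x.x."""
    return ipaddress.ip_address(ip).is_private


def dfs_resolves_privately(storage_account: str) -> bool:
    """Resolve <storage_account>.dfs.core.windows.net and report whether the
    address is private. Needs working DNS, so run it on the gateway VM."""
    host = f"{storage_account}.dfs.core.windows.net"
    ip = socket.gethostbyname(host)
    return is_private_address(ip)


# Example call (hypothetical account name), to be run on the gateway VM:
#   dfs_resolves_privately("mystorageaccount")
```

If this returns False on the gateway VM, the DFS URL is most likely still resolving to the public endpoint.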

kmca
Regular Visitor

Did you get this to work?

I am trying to do the same: connect to a datalake on a vnet using a data gateway.

I get 400 errors (unable to connect to the data source), and it is not clear what information the configuration form requires:
Data Source Name: I assume this is just a label, so any name will do (a-z).

Data Source Type: This is a drop-down list, and I select Data Lake Gen2.

Server: What is this? I have tried the storage URL https://.... with and without the container ending.

Full path: I assume this is the URL including the container, https:......../test

Authentication Method: I am using OAuth; it times out and I have to log in again.

 

Shame there is no documentation for this screen. 

 

Regards

Keith
 

No solution so far. I can't find any documentation about this either. I hope someone can help here. Or should I raise a support ticket?

v-shex-msft
Community Support

Hi @tracyng0905,

AFAIK, Azure Data Lake Gen2 currently does not seem to require a gateway to refresh. You can take a look at the following document to learn more:

Power BI data sources 

If you are working with a report that mixes different types of data sources, you only need to turn on the 'allow gateway to manage cloud Datasource' option to have the gateway access this data source, instead of adding it to the gateway data sources.

Merge or append on-premises and cloud data sources 

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accepting it as the solution to help other members find it more quickly.

@v-shex-msft ,

Since the data lake is in a VNet and its firewall is turned on, Power BI is not allowed to connect to the data lake, because Power BI is not a trusted Microsoft service. I tried installing a data gateway on a VM and setting up an on-premises data source in the Power BI Service to solve it. However, I am not sure what information is required for the Server and Full Path fields.

tracyng0905_0-1604855967106.png

tracyng0905_2-1604856345069.png

 

 
