Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Grow your Fabric skills and prepare for the DP-600 certification exam by completing the latest Microsoft Fabric challenge.

Reply
Dhruv_Ranpura
Frequent Visitor

Getting On-Prem data from SQL Server to Microsoft Fabric using Dataflow (gen2)

Hi,
I am trying to get the On-prem data from  SQL Server to fabric using data factory-> Dataflows(gen2). Below are the steps which I am following:
1. Hvae installed Gateway with latest version which is working fine.
2. Created one lakehouse.

3. Creating Dataflows(gen2) ->selected Get Data->SQL server database -> connecting with server and database using credentials->selected one table-> chose lakehouse in data destination -> go with Lakehouse(none) option.(attached in snapshot)

Dhruv_Ranpura_1-1710141444045.jpeg

 

4.In choose destination target pane I am selecting the lakehouse which I have created earlier in step 2. 

And when I click on next I am getting error : Something went wrong while retrieving the list of tables.(atached in snapshot).

Dhruv_Ranpura_2-1710141474360.jpeg

 


So why I am getting this error?  It would be great if you guide.

 

1 ACCEPTED SOLUTION

@Dhruv_Ranpura so guys, ok so here's the deal (I am learning a lot about all this firewall stuff, hence why all the back and forth).  

 

port 1434 is irrelevant. 

 

One more thing for future readers:  The endpoint listed on the MS doc page (*.datawarehouse.pbidedicated.windows.net*.datawarehouse.fabric.microsoft.com,  *.dfs.fabric.microsoft.com) are wildcard FQDNs.  So it depends on the firewall what you can or cannot setup from what I've been told.  

 

In our case, we specified the 0.0.0.0/TCP/1433 for outgoing traffic and that did the trick.

 

Oh man, what confusion... finally we got to an understanding. 😥  The Microsoft documentation should seriously be rewritten on this point, because the people who are going to read it are Fabric users, not network admins.

View solution in original post

17 REPLIES 17
anros
Employee
Employee

Hi @Dhruv_Ranpura@Element115,

 

Could you please reproduce the issue and provide me through a private message the Session Id of the authoring session ?

You can find this information in the Power Query editor under the Options dialog, Diagnostics section, Session ID (you can right click > Copy).


This way I can dig deeper on this issue.

 

Thanks,

Antoine

Hi @Dhruv_Ranpura 

 

We haven’t heard from you on the last response and was just checking back to see if you have a resolution yet.
In case if you have any resolution please do share that same with the community as it can be helpful to others.
Otherwise, will respond back with the more details and we will try to help.


Thanks.

@v-cboorla-msft Element115 shared information with me through PM as requested.

I've not yet been through it and will very soon. @Dhruv_Ranpura sharing this information can help us troubleshoot what's happening in your specific case.

I'll update the thread with the relevant information as I discover them.

@Dhruv_Ranpura so guys, ok so here's the deal (I am learning a lot about all this firewall stuff, hence why all the back and forth).  

 

port 1434 is irrelevant. 

 

One more thing for future readers:  The endpoint listed on the MS doc page (*.datawarehouse.pbidedicated.windows.net*.datawarehouse.fabric.microsoft.com,  *.dfs.fabric.microsoft.com) are wildcard FQDNs.  So it depends on the firewall what you can or cannot setup from what I've been told.  

 

In our case, we specified the 0.0.0.0/TCP/1433 for outgoing traffic and that did the trick.

 

Oh man, what confusion... finally we got to an understanding. 😥  The Microsoft documentation should seriously be rewritten on this point, because the people who are going to read it are Fabric users, not network admins.

@Element115 If I want to check this firewall rule where I can check ? Can you please guide?

I'm the wrong guy to ask tbh.  Would be better to ask your sysadmin.  Besides we don't all use the same brands of firewalls and there seems to be many differences between them when it comes to UI and settings etc.

@Dhruv_Ranpura I'm gonna have an update on that info this PM.... waiting for my sysadmin... sounds like there might be an additinal port other than 1433 that needs to be open.

v-cboorla-msft
Community Support
Community Support

Hi @Dhruv_Ranpura 

 

Thanks for using Microsoft Fabric Community.

Apologies for the inconvenience that you are facing here.

The error you are encountering when trying to retrieve the list of tables in Microsoft Fabric Dataflow Gen2 could be due to a due to Gateway Configuration.

Ensure that the gateway is correctly configured with an open port 1433, as this is necessary for the gateway to send data to the Lakehouse this might help you.

For details please refer : Network issues with port 1433

If the issue still persists please do let us know. Glad to help.

 

I hope this information helps.

 

Thanks

I got a question re this GW config business.  In another thread, @pqian_MSFT explained how the data flows between a datasource, the GW, and the destination when the source is an on-prem DB. 

Now, I remember reading somewhere that there might be issued arising because of some protocol called TLS.  I didn't see anything in the GW to address potential TLS issues.  

 

Is there something we need to configure re the TLS protocol either at the GW or firewall level?  Or the TLS payload is simply encapsulated in a TCP/IP packet and all we need worry about (firewall-wise) is the TCP rule?

@Dhruv_Ranpura @v-cboorla-msft 

That's not it. I had the same exact issue for days.  Everything was working fine before. And when it was working fine, my gateway had already been configured according to the Microsoft doc.  As a matter of fact, the issue still happens today when using only ONE DFg2. So that cannot be the answer. 

Screenshot 2024-03-11 152704.jpg

 

 

However, if @Dhruv_Ranpura  you use 2 chained DFg2, that is, one extracts the data from the on-prem DB, and the second one uses the first DF as a source to load the data into the lakehouse, then this works. The chaining is setup as follows:

 

DF1 needs staging enabled AND no data destination AND you need to publish it

DF2 needs staging disabled AND a data destination, ie your lakehouse AND you need to publish it. (I say 'needs staging disabled' because if enabled you would duplicate your data and use twice the Fabric storage, that is, once when the data is copied into the default staging lakehouse automatically provisioned by Fabric, and twice when storing the data into your user created lakehouse.)

 

@v-cboorla-msft Here is the case# 2401190040000098 I have with support for this goddamn issue that hasn't been fixed in months now.  I guess we must hope that the release of the new OPGW (on-prem gateway) will fix all this and allow on-prem data extraction to be performed with one DFg2 instead of 2.

 

Thanks @Element115  for your response.
I have been trying the method suggested by you. I have created one Dataflow(Gen2) with no data destination selection and published it. Now, I created another Dataflow(Gen2) and choose source as Dataflows (Fabric) and selected Dataflow1 and its table and when click on next I am getting this error.

Dhruv_Ranpura_0-1710236751214.jpeg

I have checked that my SQL server is working fine on port 1433. So, why I am getting this error? Please guide.

You might find this excellent explanation from @pqian_MSFT about how data flows back and forth when using a data GW to an on-prem DB useful:  Re: Dataflow refresh started failing today MashupE... - Microsoft Fabric Community

@Dhruv_Ranpura Also, in your firewall rules, do you have this:

 

  • Protocol: TCP
  • Endpoints: *.datawarehouse.pbidedicated.windows.net, *.datawarehouse.fabric.microsoft.com, *.dfs.fabric.microsoft.com
  • Port: 1433

On-premises data gateway considerations for data destinations in Dataflow Gen2 - Microsoft Fabric | ...

@Dhruv_Ranpura @anros Please ignore this previous msg.  I just found out that by deleting all the endpoints from the firewall, and replacing the respective rules with only one rule opening TCP/1433 for MS-SQL Server service, it fixed all this connectivity nightmare. (DATAFLOW and ON-PREM DB connectivity solution - Microsoft Fabric Community)

Why does the doc say to add the endpoints, I have no idea.  Delete is your friend in this case.  

 

 

@Dhruv_Ranpura I see... so it issues the same error msg + now the handshake error. The latter then could mean it's not able to see your on-prem DB through the gateway.

 

When you are inside Power Query/Dataflow Gen2, in your first dataflow, can you see the data come in, that is, does Power Query show you a table of your data?

Yes, @Element115 when I created first DFGen2 I am able to see the data . It is showing the tables.

Dhruv_Ranpura_0-1710309142782.jpeg

Then I created another DFGen2 and while selecting the Dataflow 1 as a input option getting the error of pre-login hanshake error.

OK so the DF is able to talk to the DB successfully via the GW, otherwise there wouldn't be any data coming through to the first DF.  So it's when it goes back to the GW and to the data destination that something funky happens with the TLS protocol.  In other words, 

 

DF2--->GW--->LH

 

which is weird because it doesn't do that for me, and I use the exact same architecture, which in my case got rid of all these weird errors:

 

on-prem DB--->GW--->DF1 ---> DF2--->GW--->LH

 

Also, from what I have seen, there is no real tweaking available at the GW UI for TLS.  Only the firewall rules.  Speaking of which, my sysadmin couldn't get this endpoint to respond: *.dfs.fabric.microsoft.com, as if it were misconfigured.  But ETL still works, so not sure how essential it is to have this endpoint in the firewall rules.

 

I sure wish Microsoft would provide some in detail and specific guidance on this, if only with a detailed and cleared doc write-up, ie more than just what can be found here:  On-premises data gateway considerations for data destinations in Dataflow Gen2 - Microsoft Fabric | ...  I mean, we are in the weeds here, and if this product is meant for BI or data people but NOT network engineers, well then... we got a problem now, don't we?

Helpful resources

Announcements
RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

MayFabricCarousel

Fabric Monthly Update - May 2024

Check out the May 2024 Fabric update to learn about new features.