Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
khift363
Frequent Visitor

A Troubled Union -- difficulties working with multiple data sources with identical formats

I am currently working on a report tracking items as they pass through fulfillment going from a set of shippers to a set of receivers and then ultimately to their final destination. All shippers have the same, standardized data format -- and the same with all receivers -- so operating a union across their collective data to turn them into one table is logically very easy after some basic formatting and filtering is done. The problem is, each shipper and receiver has their data on their own distinct servers, and as a result from what I'm reading PowerQuery does not appear to be able to execute a Union across these distinct data sources and still maintain the ability to fold queries. And to make matters worse, the data is simply too large to be able to practically import all of it with every import.

 

The data structure is as below. There's no logic or pattern to the shipments -- all shippers ship to all receivers, so there isn't a way to arrange that data without unioning them logically.

khift363_0-1677593863113.png

 

If DirectQuery is right out because it can't union across distinct sources, and full Import is out because the data is impractically large, is there some way to get Incremental Refresh set up for this data set even though the final table is not DirectQuery compatible? Each individual source is DirectQuery compatible -- can Incremental Refresh just query each source one at a time with a folded query including current date/time info, then do the union within Power BI? Or, alternatively and more brutally and surely worse performance -- can I just have each source exist as its own Incremental Refresh table within Power BI and do the unions in DAX on the report side? I'm sure the performance would be terrible, but I need something here.

 

Long term I really want to work with the folks in my company's tech department to try and see if logical views containing these individual buildings can be created for reporting services -- they're all under the same company umbrella and exist within the same system, just under different servers/instances -- but short term I need a report made sooner rather than later so I've got to work with the hand I have been dealt.

1 ACCEPTED SOLUTION
Greg_Deckler
Super User
Super User

@khift363 Couple of thoughts, First, yes you could use UNION in DAX to create calculated tables. But, this would effectively kind of double the size of your data model which sounds like it would be bad in this case. You could potentially use Dataflows to each of the source systems. You could turn on the Advanced compute engine so that you could use DirectQuery with those dataflows. You can also turn on incremental refresh for dataflows (Premium, PPU, etc). Then perhaps you could either create a new dataflow that unions together the base dataflows or perhaps use DirectQuery to connect to the dataflows and then use Aggregation tables to aggregate the data. Or, you use the dataflows to create a datamart where you can union the tables together and such. Lots of possibilities I think if you start leveraging the features of the Service like datamarts, dataflows, etc.


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

View solution in original post

1 REPLY 1
Greg_Deckler
Super User
Super User

@khift363 Couple of thoughts, First, yes you could use UNION in DAX to create calculated tables. But, this would effectively kind of double the size of your data model which sounds like it would be bad in this case. You could potentially use Dataflows to each of the source systems. You could turn on the Advanced compute engine so that you could use DirectQuery with those dataflows. You can also turn on incremental refresh for dataflows (Premium, PPU, etc). Then perhaps you could either create a new dataflow that unions together the base dataflows or perhaps use DirectQuery to connect to the dataflows and then use Aggregation tables to aggregate the data. Or, you use the dataflows to create a datamart where you can union the tables together and such. Lots of possibilities I think if you start leveraging the features of the Service like datamarts, dataflows, etc.


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors