cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
wikap
New Member

Performance challenge: get rid of unnessecary rows ...

Hi Community.

I have build a model, which is performing really bad and would ask for you help to optimize the challenge. Data source is Dynamics 365 Sales / Dataverse (I cannot use the December 2020 updated connector, since it excludes multiple text lines - so I'm on the next newest CDS connector).

 

🎓Situation :

  • I have two tables: Accounts And Contracts and would like to do some nice reporting on the Contracts table. Contract table is big - i.e. 100.000 rows. Account table is also big - i.e. 50.000 rows. And each contract is done with an Account (lookup field/guid).
  • We are only using a small subset of Accounts for contracts, and I would like the Power BI report Account table to contain ONLY the Accounts we are doing Contracts with. Why: because users in dropdowns should be able to see/select only the Accounts we are engaged with and I don't wan' all other accounts also due to GDPR considerations.
  • I already considered bi-directional relations, but can't use this, since I use bi-directional relations on other table relations

🐢Here is what I've done, that performs bad - hope you can follow:

  • I'm pulling out all Contracts from source in a Contract Table. Keeping only the columns I need for reporting.
  • I'm pulling out all Accounts in a Temp_account table
  • Then I'm creating a new Account report table in this tricky way:
    • Create a copy og the Contract table (using 'reference')
    • Remove all columns except the single Account GUID column (wik_accountid)
    • Remove duplicates across the GUID column ('select distinct' behind the scene I assume)
    • Removed rows with null and ''
    • Then I MERGED with the Temp_account table in a "Left-outer" merge to get this final Account table containing only rows matching the Contracts' Account lookup field (need to expand it to get the Columns after the merge)

And the problem is this way puts huge performance effort on my Power BI Desktop:

Maybe the problem is that it needs to fetch the full Contract table twice because I use the reference copy?

Maybe merge is just slow?

Maybe the problem is that I always fetch all Columns from Dynamics Sales / Dataverse only to remove most of them afterwards - I wish I could specify only the needed columns in the source query instead of reading all.

 

😷So my question is:

Could there could be another way getting this reduced row table, that performs better in a Power BI context?

 

I know I'ts probably hard to follow my situation and help me improve performance. But I would really appreciate help, since it is more of a general problem to me I often face, and maybe also to other PowerBI'ers in this comunity.

7 REPLIES 7
V-pazhen-msft
Community Support
Community Support

@wikap 

It is unlikely to come out with a solution your requirement especailly for someone without deeper understanding of the model. I would suggest you to read the doc: Data reduction techniques for Import modeling - Power BI | Microsoft Docs. There are just a few techniques but really a good start for data reduction. You can use Google or Bing to find more videos and blogs about model optimization.


Paul Zheng _ Community Support Team
If this post helps, please Accept it as the solution to help the other members find it more quickly.

Dear Paul. Thanks for your suggestion - I understand it's difficult, I think it's the merge of copied tables, that gives the model the headache, and I was hoping that someone had run in to the same issue, and could say that instead of the merge you should use this better performing feature in PowerBI. But I understand it was a long shot, but appreciate you read it and answered. /Henrik

AlB
Super User III
Super User III

Hi @wikap 

Can you share a pbix with dummy data, with the tables you use keeping their structure but with just a few rows? So that the process you are following is easier to understand and we can try to modify it

Please mark the question solved when done and consider giving a thumbs up if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

 

SU18_powerbi_badge

 

Dear AIB, unfortunately I cannot give you the model, since it's a customer case. But thanks for reading my post and for trying to help. Stefano below gives an intresting suggestion, but of course it does not clean out all the nessacary rows. Maybe you have an idea on my column question, in the reply I offered Stefano? 

Thanks!

StefanoGrimaldi
Solution Sage
Solution Sage

hey, actually almost 1m rows ist that big of a dataset as to make the report to bad performce, defently a first step would be reducing the columns called from the  source as not the same 1m row x 30 columns vs 1m row x 20 columns, it does a good diference (its a best practice to just bring from source column that actually are being use for something on the report), merge step consume a good deal of resource on the report depending on the extend of the merging its doing. 

to avoid all this merging and creations do the follow: 

- load the accound and contract table to your data model without merging creating new table etc, 

- do  a relantionship normal between those two table

- on the account table do a new column that count the amount of times that account numbers appears in the contract table if its mayor to 0 assing 1 if 0 assign 0

- use that new column as a report filter and select for it to appear only the account with values on that new column of 1. this will delete a lot of that work around you did and should help a great deal with your performance. 

Wow ... thats a intresting suggestion Stefano. I think I will try to remodel and implement the count feature. Of course the rows will still be there, but there is a chance it will simplify data load and remove the performance headache from especially dataload.

 

Do you know if it is possible to query ONLY selected columns from Dataverse/Dynamics 365 Sales, since this is of course a huge headache also as you mention. The standard way to connect to a table does not open for column selection ... here you see the connection lines from advanced editor:

 

CLASSIC CDS CONNECTOR:
let
Source = Cds.Entities("https://HIDDEN.api.crm4.dynamics.com/api/data/v9.1/", null),
Navigation = Source{[Group = "entities"]}[Data],
#"Navigation 1" = Navigation{[EntitySetName = "wik_akkrediterings"]}[Data],
#"Removed other columns" = Table.SelectColumns(#"Navigation 1",{"createdon", "wik_akkrediteringid", "wik_andredeltagendekollegaer", "wik_antalinvolveredestuderende", "wik_by", "wik_egnetpraktiksted", "wik_egnetpraktiksted_display"})

NEW DATAVERSE CONNECTOR:
let
Source = CommonDataService.Database("HIDDEN.crm4.dynamics.com"),
dbo_incident = Source{[Schema="dbo",Item="incident"]}[Data],
#"Removed Other Columns" = Table.SelectColumns(dbo_incident,{"incidentid", "subjectid", "title"})

 

Thanks for your contribution 🙂

havent try on dynamic 365 to bring only specific columns from the source, so wont be able to tell that much sorry. 

Helpful resources

Announcements
PBI User Groups

Welcome to the User Group Public Preview

Check out new user group experience and if you are a leader please create your group!

MBAS Attendee Badge

Claim Your Badge & Digital Swag!

Check out how to claim yours today!

secondImage

Are You Ready?

Test your skills now with the Cloud Skills Challenge.

Top Solution Authors