Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
jnickell
Helper V
Helper V

Help with optimizing PowerQuery XML handling

First I'll admit this is a personal project to hopefully improve my own understanding of Power BI at work.  I have a personal finance software (GnuCash) that stores it's data in an XML format.

So far my attempts to extract format the data into usable tables for PowerBI have resulted in some very slow performing PowerQuery queries, and what I'm most concerned about is that the queries seem to loop, calling themselves multiple times while it tries to descend through the heirarchy of the XML file. 

 

I realize this might not be a whole lot to go off of, but would appreciate if someone could give me pointers if I'm bad choices.

Below are two of the queries.  The first is the "root" query, then I reference that query to dig into the transactions, accounts, transaction splits, etc.  Transactions is the one I've posted, I have a separate for each of these "sections" of the book.

The source GnuCash file is 45 MB (~11 yrs of data), but PowerQuery consumes much more than this trying to parse through the file.

let
    Source = Xml.Tables(File.Contents("C:\Users\me\OneDrive\Documents\Finances\PowerBI\GnuCash Source\my_Finances.gnucash")),
    Table0 = Source{0}[Table],
    Table1 = Table0{1}[Table],
    #"Changed Type" = Table.TransformColumnTypes(Table1,{{"Attribute:version", type text}})
in
    #"Changed Type"
let
    Source = book,
    #"Removed Other Columns" = Table.SelectColumns(Source,{"transaction"}),
    #"Expanded transaction" = Table.ExpandTableColumn(#"Removed Other Columns", "transaction", {"http://www.gnucash.org/XML/trn"}, {"http://www.gnucash.org/XML/trn"}),
    #"Expanded http://www.gnucash.org/XML/trn" = Table.ExpandTableColumn(#"Expanded transaction", "http://www.gnucash.org/XML/trn", {"id", "currency", "date-posted", "date-entered", "description", "slots", "splits", "num"}, {"id", "currency", "date-posted", "date-entered", "description", "slots", "splits", "num"}),
    #"Expanded id" = Table.ExpandTableColumn(#"Expanded http://www.gnucash.org/XML/trn", "id", {"Element:Text"}, {"Element:Text"}),
    #"Renamed Columns" = Table.RenameColumns(#"Expanded id",{{"Element:Text", "Transaction ID"}}),
    #"Expanded currency" = Table.ExpandTableColumn(#"Renamed Columns", "currency", {"http://www.gnucash.org/XML/cmdty"}, {"http://www.gnucash.org/XML/cmdty"}),
    #"Expanded http://www.gnucash.org/XML/cmdty" = Table.ExpandTableColumn(#"Expanded currency", "http://www.gnucash.org/XML/cmdty", {"space", "id"}, {"space", "id"}),
    #"Expanded date-posted" = Table.ExpandTableColumn(#"Expanded http://www.gnucash.org/XML/cmdty", "date-posted", {"http://www.gnucash.org/XML/ts"}, {"http://www.gnucash.org/XML/ts"}),
    #"Expanded http://www.gnucash.org/XML/ts" = Table.ExpandTableColumn(#"Expanded date-posted", "http://www.gnucash.org/XML/ts", {"date"}, {"date"}),
    #"Expanded date-entered" = Table.ExpandTableColumn(#"Expanded http://www.gnucash.org/XML/ts", "date-entered", {"http://www.gnucash.org/XML/ts"}, {"http://www.gnucash.org/XML/ts"}),
    #"Expanded http://www.gnucash.org/XML/ts1" = Table.ExpandTableColumn(#"Expanded date-entered", "http://www.gnucash.org/XML/ts", {"date"}, {"date.1"}),
    #"Renamed Columns1" = Table.RenameColumns(#"Expanded http://www.gnucash.org/XML/ts1",{{"date", "Posted Date"}, {"date.1", "Entered Date"}, {"id", "Currency"}}),
    #"Expanded slots" = Table.ExpandTableColumn(#"Renamed Columns1", "slots", {"Namespace:"}, {"Namespace:"}),
    #"Expanded Namespace:" = Table.ExpandTableColumn(#"Expanded slots", "Namespace:", {"slot"}, {"slot"}),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded Namespace:",{"slot", "splits"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Removed Columns",{{"Posted Date", type datetimezone}, {"Entered Date", type datetimezone}, {"description", type text}, {"num", type text}, {"Currency", type text}, {"space", type text}}),
    #"Replaced Errors" = Table.ReplaceErrorValues(#"Changed Type", {{"description", ""}})
in
    #"Replaced Errors"
1 REPLY 1
v-yingjl
Community Support
Community Support

Hi @jnickell ,

Based on the query, I have noticed that it was all about expanding / removing / renaming columns , replacing values etc. 

Maybe you can refer these blogs and video about optimizing query in power query which could help you:

  1. Optimising The Performance Of Power Query Merges In Power BI, Part 1: Removing Columns 
  2. Power Query/M Optimisation: Getting The Maximum Value From A Column, Part 1 
  3. Power Query Performance Optimization 

 

Best Regards,
Community Support Team _ Yingjie Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors