Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
First I'll admit this is a personal project to hopefully improve my own understanding of Power BI at work. I have a personal finance software (GnuCash) that stores it's data in an XML format.
So far my attempts to extract format the data into usable tables for PowerBI have resulted in some very slow performing PowerQuery queries, and what I'm most concerned about is that the queries seem to loop, calling themselves multiple times while it tries to descend through the heirarchy of the XML file.
I realize this might not be a whole lot to go off of, but would appreciate if someone could give me pointers if I'm bad choices.
Below are two of the queries. The first is the "root" query, then I reference that query to dig into the transactions, accounts, transaction splits, etc. Transactions is the one I've posted, I have a separate for each of these "sections" of the book.
The source GnuCash file is 45 MB (~11 yrs of data), but PowerQuery consumes much more than this trying to parse through the file.
let
Source = Xml.Tables(File.Contents("C:\Users\me\OneDrive\Documents\Finances\PowerBI\GnuCash Source\my_Finances.gnucash")),
Table0 = Source{0}[Table],
Table1 = Table0{1}[Table],
#"Changed Type" = Table.TransformColumnTypes(Table1,{{"Attribute:version", type text}})
in
#"Changed Type"
let
Source = book,
#"Removed Other Columns" = Table.SelectColumns(Source,{"transaction"}),
#"Expanded transaction" = Table.ExpandTableColumn(#"Removed Other Columns", "transaction", {"http://www.gnucash.org/XML/trn"}, {"http://www.gnucash.org/XML/trn"}),
#"Expanded http://www.gnucash.org/XML/trn" = Table.ExpandTableColumn(#"Expanded transaction", "http://www.gnucash.org/XML/trn", {"id", "currency", "date-posted", "date-entered", "description", "slots", "splits", "num"}, {"id", "currency", "date-posted", "date-entered", "description", "slots", "splits", "num"}),
#"Expanded id" = Table.ExpandTableColumn(#"Expanded http://www.gnucash.org/XML/trn", "id", {"Element:Text"}, {"Element:Text"}),
#"Renamed Columns" = Table.RenameColumns(#"Expanded id",{{"Element:Text", "Transaction ID"}}),
#"Expanded currency" = Table.ExpandTableColumn(#"Renamed Columns", "currency", {"http://www.gnucash.org/XML/cmdty"}, {"http://www.gnucash.org/XML/cmdty"}),
#"Expanded http://www.gnucash.org/XML/cmdty" = Table.ExpandTableColumn(#"Expanded currency", "http://www.gnucash.org/XML/cmdty", {"space", "id"}, {"space", "id"}),
#"Expanded date-posted" = Table.ExpandTableColumn(#"Expanded http://www.gnucash.org/XML/cmdty", "date-posted", {"http://www.gnucash.org/XML/ts"}, {"http://www.gnucash.org/XML/ts"}),
#"Expanded http://www.gnucash.org/XML/ts" = Table.ExpandTableColumn(#"Expanded date-posted", "http://www.gnucash.org/XML/ts", {"date"}, {"date"}),
#"Expanded date-entered" = Table.ExpandTableColumn(#"Expanded http://www.gnucash.org/XML/ts", "date-entered", {"http://www.gnucash.org/XML/ts"}, {"http://www.gnucash.org/XML/ts"}),
#"Expanded http://www.gnucash.org/XML/ts1" = Table.ExpandTableColumn(#"Expanded date-entered", "http://www.gnucash.org/XML/ts", {"date"}, {"date.1"}),
#"Renamed Columns1" = Table.RenameColumns(#"Expanded http://www.gnucash.org/XML/ts1",{{"date", "Posted Date"}, {"date.1", "Entered Date"}, {"id", "Currency"}}),
#"Expanded slots" = Table.ExpandTableColumn(#"Renamed Columns1", "slots", {"Namespace:"}, {"Namespace:"}),
#"Expanded Namespace:" = Table.ExpandTableColumn(#"Expanded slots", "Namespace:", {"slot"}, {"slot"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Namespace:",{"slot", "splits"}),
#"Changed Type" = Table.TransformColumnTypes(#"Removed Columns",{{"Posted Date", type datetimezone}, {"Entered Date", type datetimezone}, {"description", type text}, {"num", type text}, {"Currency", type text}, {"space", type text}}),
#"Replaced Errors" = Table.ReplaceErrorValues(#"Changed Type", {{"description", ""}})
in
#"Replaced Errors"
Hi @jnickell ,
Based on the query, I have noticed that it was all about expanding / removing / renaming columns , replacing values etc.
Maybe you can refer these blogs and video about optimizing query in power query which could help you:
Best Regards,
Community Support Team _ Yingjie Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
102 | |
48 | |
19 | |
13 | |
11 |