Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
simonbernick
New Member

Covid-19 Vaccination Effort How to use table.buffer PLEASE HELP

I am working on the Covid-19 vaccination effort and I could really use some help as I do not normally write code. I am using power query to load data from a set of CSV files, but the data refreshes are taking too long.  I have turned off privacy settings, stopped background refresh, and combined all my queries into one query (below) to speed things up,  but its still taking 4 minutes to load 600k lines (500MB of CSV data).  

I want to use table.buffer to speed this up, but I can not figure out where to put it.  If I try putting it in front of the source like this table.buffer(Source = Folder.Files("T:\Restricted\2_Vaccination_Project\Reporting\Simon\Power Pivot Import"))  then I get an error message "Token Equal Expected".  

Please tell me where I am supposed to put table.buffer (please be explicit, I have not used power query before today)!

 

let
WithAddedKey =
Table.AddKey(
#"Names",
{"User Name"},
true),
Source = Folder.Files("T:\Restricted\2_Vaccination_Project\Reporting\Simon\Power Pivot Import"),
#"Filtered Rows" = Table.SelectRows(Source, each Text.StartsWith([Extension], ".csv")),
#"Filtered Hidden Files2" = Table.SelectRows(#"Filtered Rows", each [Attributes]?[Hidden]? <> true),
#"Invoke Custom Function2" = Table.AddColumn(#"Filtered Hidden Files2", "Transform File from Grab&Clean Data", each #"Transform File from Grab&Clean Data"([Content])),
#"Renamed Columns2" = Table.RenameColumns(#"Invoke Custom Function2", {"Name", "Source.Name"}),
#"Removed Other Columns2" = Table.SelectColumns(#"Renamed Columns2", {"Source.Name", "Transform File from Grab&Clean Data"}),
#"Expanded Table Column2" = Table.ExpandTableColumn(#"Removed Other Columns2", "Transform File from Grab&Clean Data", Table.ColumnNames(#"Transform File from Grab&Clean Data"(#"Sample File (2)"))),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Table Column2",{"Team", "Source.Name", "Invitee Name", "Invitee Time Zone", "Invitee accepted marketing emails", "Event Type Name", "Location", "Canceled By", "Cancellation reason", "Question 1", "Question 2", "Question 6", "Response 6", "Question 7", "Response 7", "UTM Campaign", "UTM Source", "UTM Medium", "UTM Term", "UTM Content", "Salesforce UUID", "Event Price", "Payment Currency", "Guest Email(s)", "Response 2", "Question 3", "Response 3", "Question 4", "Response 4", "Question 5", "Response 5"}),
#"Split Start Date/Time" = Table.SplitColumn(Table.TransformColumnTypes(#"Removed Columns", {{"Start Date & Time", type text}}, "en-US"), "Start Date & Time", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, false), {"Start Date & Time.1", "Start Date & Time.2"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Start Date/Time",{{"Start Date & Time.1", type date}, {"Start Date & Time.2", type time}}),
#"Split Created Date Time" = Table.SplitColumn(Table.TransformColumnTypes(#"Changed Type1", {{"Event Created Date & Time", type text}}, "en-US"), "Event Created Date & Time", Splitter.SplitTextByEachDelimiter({" "}, QuoteStyle.Csv, false), {"Event Created Date & Time.1", "Event Created Date & Time.2"}),
#"Changed Type2" = Table.TransformColumnTypes(#"Split Created Date Time",{{"Event Created Date & Time.1", type date}, {"Event Created Date & Time.2", type time}}),
#"Renamed Columns" = Table.RenameColumns(#"Changed Type2",{{"Response 1", "Phone 2"}, {"Start Date & Time.1", "Start Date"}, {"Start Date & Time.2", "Start Time"}, {"Event Created Date & Time.1", "Created Date"}, {"Event Created Date & Time.2", "Created Time"}}),
#"Removed Duplicates" = Table.Distinct(#"Renamed Columns"),
#"Merged Queries" = Table.NestedJoin(#"Removed Duplicates", {"User Name"}, Names, {"User Name"}, "Names", JoinKind.LeftOuter),
#"Expanded Names" = Table.ExpandTableColumn(#"Merged Queries", "Names", {"Clean Name"}, {"Clean Name"}),
#"Remove Unclean Name" = Table.RemoveColumns(#"Expanded Names",{"User Name"}),
#"Reordered Columns" = Table.ReorderColumns(#"Remove Unclean Name",{"Clean Name", "Invitee First Name", "Invitee Last Name", "Invitee Email", "Text Reminder Number", "Phone 2", "Start Date", "Start Time", "End Date & Time", "Created Date", "Created Time", "Canceled"}),
#"Uppercased Text" = Table.TransformColumns(#"Reordered Columns",{{"Canceled", Text.Upper, type text}}),
#"Filtered Rows1" = Table.SelectRows(#"Uppercased Text", each [Canceled] = "FALSE"),
#"Filtered Rows2" = Table.SelectRows(#"Filtered Rows1", each [Clean Name] <> "Test"),
#"Grouped Rows" = Table.Group(#"Filtered Rows2", {"Clean Name", "Start Date", "Created Date"}, {{"Count", each Table.RowCount(_), type number}})
in
#"Grouped Rows"
4 REPLIES 4
CNENFRNL
Community Champion
Community Champion

@simonbernick , I ran into a similar situation not long time ago and here's my solution for your reference.

  1. in the first place imported all dataset into a database.
  2. then use Value.NativeQuery() to execute as much manipulation of dataset as possible at the db end, especially operations like row/column selection, table joins, etc;
  3. finally use PQ for complex data aggregation or format cleansing

even a simplest Access also worked like a charm in my scenario; some filtering operations took 3 minutes without db but only seconds while incorporating with db! It's worth trying.


Thanks to the great efforts by MS engineers to simplify syntax of DAX! Most beginners are SUCCESSFULLY MISLED to think that they could easily master DAX; but it turns out that the intricacy of the most frequently used RANKX() is still way beyond their comprehension!

DAX is simple, but NOT EASY!

For reasons I can't explain, I can't move the data to a database (I wish I could) so I am looking for other solutions.  I really wish there was a way I could just buffer the csv files in my RAM.

HotChilli
Super User
Super User

table.buffer receives a table as a parameter so:

thebufferedTable = table.buffer(Source)     should work. Place that after the "Source =" step.

You can then refer to  thebufferedTable in the next step instead of Source.

It can sometimes be trial and error to use it though so be warned.

Good luck.

 

Also Merge can be slow in Power Query.  You can test this by duplicating the query and running it without the Merge onwards

 

It looks like buffering the source won't help me because its a folder not a file.  How do I get it to buffer each file in the folder, not the folder itself? Also I tried as you suggested, buffering a few different steps in the query (see below for an example) but when I did that the query would never finish refreshing (it loaded all the data, then just sat there doing nothing).

 

thebufferedTable = table.buffer(#"Invoke Custom Function2")

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors