Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Grow your Fabric skills and prepare for the DP-600 certification exam by completing the latest Microsoft Fabric challenge.

Reply
DebbieE
Community Champion
Community Champion

Power Query Editor Transform file helper Query folders

I have lots of files I have created from folders and now I have lots of these cluttering up power Query Editor

transform file.png

I really dont like it. Is there any way I can actually get rid of all these and get them down to just one, after its all been done by tweaking some code somewhere?

12 REPLIES 12
lbendlin
Super User
Super User

These are meant for beginners. Once you checked how they work you will realize how inefficient they are, for example they use the "first"  file as the blueprint for the column list of all the files - which may not be what you want. The also read that "first"  file twice, for no good reason.

 

Replace them with your own, more streamlined code (via Table.AddColumn and Table.ExpandTableColumn) and then you can delete the groups.

DebbieE
Community Champion
Community Champion

I wouldnt know how to do this. is there any documentation or anything I can use to replace the code and then get rid of these?

here's the general process:

 

- connect to a folder type data source (file share or sharepoint folder)

- filter for the files you want to ingest

- click the "Binary" link for one of the files, and run the minimum required transform (usually promote headers)

- examine the resulting Power Query code in Advanced Editor and extract the transform step as a function

- remove the steps for the single file

- add a column that calls the function for each of the files

- expand that column

- apply column type changes as needed

 

You can even specify the function inside the Power Query to keep everything in the same firewall partition.

 

What is your data source? File folder or SharePoint Document Library?

DebbieE
Community Champion
Community Champion

Oh my word. im just as confused now. they are in sharepoint but I have used Folder to get the data in

 

Ive tried going through this list but Im struggling . I just clicked Binary on a file and it just shows this. 

Sheet1.png

I guessed next to click the arrows on Table

But it hasnt brought through the file name. Which I really need and happened when I just used Folder. 

 

"and extract the transform step as a function

- remove the steps for the single file

- add a column that calls the function for each of the files

- expand that column

- apply column type changes as needed"

 

And this section, I havent got a clue what to do here..

 

 

This is the Advanced Editor Code from the original Get Data Folder that uses these helper files

 

let
Source = Folder.Files("C:\Users\Me\tenant\Project\Data"),
#"Filtered Rows" = Table.SelectRows(Source, each Text.Contains([Name], "Metric")),
#"Filtered Hidden Files1" = Table.SelectRows(#"Filtered Rows", each [Attributes]?[Hidden]? <> true),
#"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File (11)", each #"Transform File (11)"([Content])),
#"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
#"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File (11)"}),
#"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File (11)", Table.ColumnNames(#"Transform File (11)"(#"Sample File (11)"))),

 

Folder listing:

lbendlin_0-1683040079904.png

Let's collect all ExecutionAggregation files

lbendlin_1-1683040132944.png

Click the first Binary link. That ingests the CSV and already promotes the headers

lbendlin_2-1683040222600.png

We can simplify this to 

Table.PromoteHeaders(Csv.Document(Binary), [PromoteAllScalars=true])

and stuff that into a function.

 

 

let
    Source = Folder.Files(Folder),
    #"Filtered Rows" = Table.SelectRows(Source, each Text.Contains([Name], "ExecutionAggregation")),
    Ingest = (Binary)=> Table.PromoteHeaders(Csv.Document(Binary), [PromoteAllScalars=true])
in
    #"Filtered Rows"

 

Now we add a custom column that ingests each file

lbendlin_3-1683040469768.png

 

 

let
    Source = Folder.Files(Folder),
    #"Filtered Rows" = Table.SelectRows(Source, each Text.Contains([Name], "ExecutionAggregation")),
    Ingest = (Binary)=> Table.PromoteHeaders(Csv.Document(Binary), [PromoteAllScalars=true]),
    #"Added Custom" = Table.AddColumn(#"Filtered Rows", "Custom", each Ingest([Content]))
in
    #"Added Custom"

 

Then we can throw away all the meta data columns we don't need any more

lbendlin_4-1683040685062.pnglbendlin_5-1683040724628.png

 

And lastly we expand the Custom column to append all file contents.

lbendlin_6-1683040783668.png

 

That's it. Here is the entire code:

 

let
    Source = Folder.Files(Folder),
    #"Filtered Rows" = Table.SelectRows(Source, each Text.Contains([Name], "ExecutionAggregation")),
    Ingest = (Binary)=> Table.PromoteHeaders(Csv.Document(Binary), [PromoteAllScalars=true]),
    #"Added Custom" = Table.AddColumn(#"Filtered Rows", "Custom", each Ingest([Content])),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Name", "Custom"}),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Removed Other Columns", "Custom", {"GatewayObjectId", "AggregationStartTimeUTC", "AggregationEndTimeUTC", "DataSource", "Success", "AverageQueryExecutionDuration(ms)", "MaxQueryExecutionDuration(ms)", "MinQueryExecutionDuration(ms)", "QueryType", "AverageDataProcessingDuration(ms)", "MaxDataProcessingDuration(ms)", "MinDataProcessingDuration(ms)", "Count"})
in
    #"Expanded Custom"

 

DebbieE
Community Champion
Community Champion

and stuff that into a function.

 

this is confusing me. i dont know what this part means. I know I can click a Query and Create function but I dont know how this all works with the given logic unfortunately

 

Im starting to wonder if this isnt too big and complex a job because I have a lot of queries to change

 

Ingest = (Binary)=> Table.PromoteHeaders(Csv.Document(Binary), [PromoteAllScalars=true])

 

As you know each step in Power Query has a name.  You can refer to the step in subsequent steps by using that name.

 

Here we create a step that is arbitrarily called "Ingest".  Then we provide an argument  (Binary) which transforms that step to a function.  "=>" completes the function header.  After that you can have any number of steps that can eventually return someting.  In this case what is returned is the CSV interpretation of the function argument, with the headers already promoted.  We need the headers later so we can then combine/append the results correctly. 

 

Instead of nesting the instructions you can go through the whole let ... in ...  syntax if that's more relatable. And you can also place the function outside this code.  There are many ways to do this, and ultimately it is down to personal preference.

DebbieE
Community Champion
Community Champion

Unfortunately this is not working for me

 

"Click the first Binary link. That ingests the CSV 

and already promotes the headers"

 

It doesnt do this. You have to click the arrows on the data column to bring through the data atfter clicking on Binary. And you have to do the column promotion youself.

 

the other thing it doesnt bring through because you are clicking on one file only is the actual file name.

 

Has anyone got another option I can use that will bring through the filr name. Which is what happens when you filter for the foles and then click on content to bring both files through?

Please provide a couple of sample CSV files (with the same structure).

DebbieE
Community Champion
Community Champion

Sorry i cant do that its not my data. 

 

Hopefully this walk through gives me what I need https://youtu.be/6cGou1-1FOo

 

DebbieE
Community Champion
Community Champion

Yes, this video works a treat

DebbieE
Community Champion
Community Champion

I will try and read through it all again but I fear Im not understanding this. Thats for trying though.

 

 

Helpful resources

Announcements
Europe Fabric Conference

Europe’s largest Microsoft Fabric Community Conference

Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.

RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

MayPowerBICarousel1

Power BI Monthly Update - May 2024

Check out the May 2024 Power BI update to learn about new features.