cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Brian_M
Responsive Resident
Responsive Resident

Retain file name column when using Folder as a data source - Power Query M

Hi,

 

Am hoping for some help.  I will outline three steps below where I take a folder as a source, combine the binaries and then import. It all works fine, however, at the final step after I have combined and imported each binary I would like there to be a column which displays which file (e.g. Filename1.csv) that each row originated from. When you do the combine and import in Steps 2 & Step 3, the file metadata from Step1 (e.g. Filename, DateCreated) is no longer visible.

 

Has anyone found a strategy to add column(s) for file metadata, particularly the originating filename as a new column in the imported dataset?

 

The steps are outlined below.

Kind regards

Brian

 

Step 1:

Source = Folder.Files("C:\Users\XXX\YYYY"),

 

Content   Name

Binary     Filename1.csv

Binary     Filename2.csv

Binary     Filename3.csv

Binary     Filename4.csv

Binary     Filename5.csv

Binary     Filename6.csv

 

 

Step 2:

CombinedBinaries = Binary.Combine(#"Removed Other Columns"[Content])

 

-- Displays an CSV file icon  

-- 180159607 bytes

 

Step 3:

Imported = Csv.Document(#"Combined Binaries",[Delimiter=";", Columns=16, Encoding=1252, QuoteStyle=QuoteStyle.None])

--All the rows for the appropriate columns from each file are imported and appended, but I cannot

--see how to append a column for the originating filename.

1 ACCEPTED SOLUTION
ImkeF
Super User
Super User

Actually, it's pretty easy with this line of code:

 

= Table.AddColumn(Source, "Custom", each Table.PromoteHeaders(Csv.Document([Content],[Delimiter=";", Encoding=1252])))

 

Replace your Step 2&3 with it.

It will retain all the metadata from the 1st step and add the content in an additional custom column.

If your csv's have all the same headers, you simply expand this column and the headers of the 1st file will be shown - and expand all other files on these cols as well.

 

If there are differnt cols - just come back & I'll post the code for the auto-expand of different headers.

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

24 REPLIES 24
sarava8304
Frequent Visitor

Hi,

Can you advise me on using the Table name in the new column of the table, when we use the data from websource with multiple tables

Joachim_B
Regular Visitor

What would be the code if you were connecting to a folder with fixed width text files rather than CSV files?

Sorry, no idea as I haven't done this already. But you can easily find out by using the UI: Start the import from folder process and you will see a table popping up with all files listed with different metadata-field. Choose "Content" and click into one of its fields. A further dialogue might start but at the end you will be able to see the full code in the advanced editor.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

ImkeF
Super User
Super User

Actually, it's pretty easy with this line of code:

 

= Table.AddColumn(Source, "Custom", each Table.PromoteHeaders(Csv.Document([Content],[Delimiter=";", Encoding=1252])))

 

Replace your Step 2&3 with it.

It will retain all the metadata from the 1st step and add the content in an additional custom column.

If your csv's have all the same headers, you simply expand this column and the headers of the 1st file will be shown - and expand all other files on these cols as well.

 

If there are differnt cols - just come back & I'll post the code for the auto-expand of different headers.

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

Anonymous
Not applicable

Hey Imke! 

 

Thanks for all the support. 

I have a slighly different problem and since I'm new to M I don't know how to proceed. 

 

In my case, I need to keep the creation date of the document, as I may benefit from it for some calculations.

How can I transform the creation date into a new column in Power Querry?

 

Thanks

The column should be/stay there, if you don't use Combine Binaries but use the method I've described and add a column to retrieve the data from the "Content"-column.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Anonymous
Not applicable

Hi @ImkeF,

 

I am doing something a little bit different and was wondering if you could help guide me in my use case. I have a SharePoint folder where different CSVs will be uploaded with different headers. Each csv will have at least one of the common merge headers, so there will always be "Key1", "Key2", or "Key3" as one of the headers - the rest might have extra data. I am trying to modify my power query to handle these cases, so that if the file only has "Key2" - just that column will be read in and the rest ignored. Or if it just has "Key1" and "Key2" - those will be read and the rest ignored.. etc

 

I created a sample file with 3 columns for the 3 keys for my default file, but I need help modifying my query because right now the new files just read the first 3 columns of data and doesn't search for "Key2" in the data. Any help on this use case?

 

Thanks in advance for the help!

You have to select the 3 columns in your function before you expand. Therefore you have to modify the Table.SelectColumns-command to cater for missing columns:

 

Table.SelectColumns(AppendToSource,{"Key1", "Key2", "Key3"}, MissingField.Ignore)

That way, any of the 3 key-columns will be selected and no error will be thrown if one or more of them are missing.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

jadams0304
Frequent Visitor

Hello, thanks for your help so far.  I'm curious if this solution works in PowerPivot?  

 

I've tried several variations of your solution and I can't seem to get it to work.

 

I want to retain on each row of the Binary Data CLE/CVO and Actual/Budget.

 

My folder paths:

C:\Users\User\OneDrive - Corporate\PowerPivot\TB\CLE\Actual\
C:\Users\User\OneDrive - Corporate\PowerPivot\TB\CVO\Actual\
C:\Users\User\OneDrive - Corporate\PowerPivot\TB\CVO\Budget\
C:\Users\User\OneDrive - Corporate\PowerPivot\TB\CVO\Budget\

 

 

My Binary Data:

Capture1.JPG

 

 

This is the code I tried:  

let
Source = Folder.Files("C:\Users\User\OneDrive - Corporate\PowerPivot\TB"),
#"Added Custom" = Table.AddColumn(Source, "Custom", each Table.PromoteHeaders(Csv.Document([Folder Path],[Delimiter="\", Encoding=1252])))
in
#"Added Custom"

 

I ended up solving this myself.  I finally decided to use my head and look up table.expandtablecolumn which made it immediately obvious what I was doing wrong.  I'm new to this so I had no reference point for what was actually happening when I used table.expandtablecolumn.  

 

 

let
    Source = Folder.Files("C:\Users\User\OneDrive - Corporate\PowerPivot\TB"),
    #"Added Custom" = Table.AddColumn(Source, "Custom", each Table.PromoteHeaders(Csv.Document([Content], [Delimiter=",", Encoding=1252]))),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Custom", "Folder Path"}),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Removed Other Columns", "Custom", {"Period", "No", "Current Debit/(Credit)", "YTD Debit/(Credit)"}, {"Custom.Period", "Custom.No", "Custom.Current Debit/(Credit)", "Custom.YTD Debit/(Credit)"})
in
    #"Expanded Custom"

 

From here I'll split the Folder Path on "\" and then remove all the Folder Path columns besides the two containing CLE/CVO and Actual/Budget.

When I try this solution for every entry in the csvs I get a different row in the data for every filepath in the folder regardless of whether the entry appeared in it or not. This makes it impossible to find entries by specifying the csv which is my ultimate goal.

Sorry, but I have no clue what you want and what the current problem is you're facing.

 

Any chance to refer more specificly to the examples given and:

1) Point out where your requirement differs from the examples given and

2) Provide some examples of your case?

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Hi the code for the auto expand for different headers would be awesome!

Brian_M
Responsive Resident
Responsive Resident

Perfect thanks Imke.  At all my data came through in one column but after noticing the [Delimiter=";" and changing it to [Delimiter="," then I was sorted.

For anyone else reading, I'll try to describe the difference between my first attempt and Imke's solution. Imke avoids steo where I combined the binaries (and lost the filename), instead she adds a new column that contains the data from each file.

 

My Attempt -  loose filename...

let 

 

    //Lists all the files in the folder

    Source = Folder.Files("C:\Users\Brian\Desktop\FolderName")

 

    /*Code generated for me by clicking "Combine binaries" double down arrow on Content column

    Combines all the files into one binary, (NB: here I lose the filename which I wanted to keep 😞 ...) */

    MyStep1 = Binary.Combine(Source[Content]),

 

    /*Code generate for me by right clicking binary object and choosing Csv.

    Converts single binary object from Step1 above, expands all the data - but I am missing the filename that I wanted to keep.

   */

    MyStep2 = Csv.Document(MyStep1,[Delimiter=",", Columns=12, Encoding=1252, QuoteStyle=QuoteStyle.None])

in 

   MyStep2

 

Imke's Solution - keep filename...

let

    //Lists all the files in the folder

    Source = Folder.Files("C:\Users\Brian\Desktop\FolderName")

 

    /*Instead of combining binaries into one single binary and losing all the file metadata (like filename), use the

      Table.AddColumn - best of both worlds, keep the file metadata plus add a new column containing the data from 

       each file.*/

    ImkeStep1 = Table.AddColumn(Source, "Custom", each Table.PromoteHeaders(Csv.Document([Content],[Delimiter=",", Encoding=1252])))

   

in

    ImkeStep1

 

pbakaric
Frequent Visitor

I am using the folder type for get data, how can I show the file name or file create date for each row of the merged data?

It is not clear to me what you are actually doing. Could you please share the M-code that has been created so far (from the advanced editor).

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

pbakaric
Frequent Visitor

Here is where I am at:

 

let
Source = Folder.Files("C:\Users\pbakaric\Google Drive\DFM Dashboard\Reports"),
#"Combined Binaries" = Binary.Combine(Source[Content]),
#"Imported CSV" = Csv.Document(#"Combined Binaries",[Delimiter=";", Columns=21, Encoding=1252, QuoteStyle=QuoteStyle.None]),
#"Changed Type" = Table.TransformColumnTypes(#"Imported CSV",{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type text}, {"Column6", type text}, {"Column7", type text}, {"Column8", type text}, {"Column9", type text}, {"Column10", type text}, {"Column11", type text}, {"Column12", type text}, {"Column13", type text}, {"Column14", type text}, {"Column15", type text}, {"Column16", type text}, {"Column17", type text}, {"Column18", type text}, {"Column19", type text}, {"Column20", type text}, {"Column21", type text}}),
#"Removed Top Rows" = Table.Skip(#"Changed Type",1),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Top Rows"),
#"Removed Blank Rows" = Table.SelectRows(#"Promoted Headers", each not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null})))
in
#"Removed Blank Rows"

 

I tried incorporating your script but I couldn't seem to get it to work. I appreciate the help!!! 

Your code should look like this:

 

 

let 
Source = Folder.Files("C:\Users\pbakaric\Google Drive\DFM Dashboard\Reports"),
#"Added Custom" = Table.AddColumn(Source, "Custom", each Table.PromoteHeaders(Table.Skip(Csv.Document([Content], [Delimiter=";", Columns=4, Encoding=1252, QuoteStyle=QuoteStyle.None]),1))), 
#"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Name", "Custom"}), 
#"Expanded Custom" = Table.ExpandTableColumn(#"Removed Other Columns", "Custom", {"Datum", "Menge", "Wert", "Kommentar"}, {"Datum", "Menge", "Wert", "Kommentar"}) 
in 
#"Expanded Custom"


Check out the video as well, as there you'll find another method, showing how to use an explicit function to do that. This might be more convenient if there are more steps to perform an the nesting into the #"Added Custom"-steps becomes too confusing.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Morvaryt
Frequent Visitor

Hi ImkeF,

 

I'm following the steps outlined in the video to retain the file names and I'm receiving an error at the very last step, when I add a custom column:

= Table.AddColumn(Source, "Custom", each Folder([Content]))

 

The error I am receiving is as follows: "Expression error: The name 'Folder" wasn't recognized. Make sure it's spelled correctly."

 

Here's the function I created:

 

= (Content) =>
let
// Source = Folder.Files("C:XXXContentDrilldown"),
// #"C:XXXContentDrilldown csv" = Source{[#"Folder Path"="C:\XXXContentDrilldown\",Name="XXXContent Drilldown 20160501-20160531.csv"]}[Content],
#"Imported CSV" = Csv.Document(Content,[Delimiter=",", Columns=7, Encoding=1252, QuoteStyle=QuoteStyle.None]),
#"Changed Type" = Table.TransformColumnTypes(#"Imported CSV",{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type text}, {"Column6", type text}, {"Column7", type text}}),
#"Removed Top Rows" = Table.Skip(#"Changed Type",6),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Top Rows")

in #"Promoted Headers"

 

I would appreciate any insight.

Thanks!

Margaryta

Hi Margaryta,

 The error not recognize the name "Folder" refers to the name of the function (the name of the query):

 

PBI_ImageFolder.png

So you either rename the function/query or edit your code to the name you gave it already.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Helpful resources

Announcements
UG GA Amplification 768x460.png

Launching new user group features

Learn how to create your own user groups today!

November Power BI Update 768x460.png

Check it Out!

Click here to read more about the November 2021 Updates!

M365 768x460.jpg

Microsoft 365 Collaboration Conference | December 7–9, 2021

Join us, in-person, December 7–9 in Las Vegas, for the largest gathering of the Microsoft community in the world.