Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
v3ntrue
Regular Visitor

Power Query Getting Header Names and File Names from Different CSV files

Hello, 

 

I have several different CSV files in a folder. I want to extract Header names and File Names from these files using Power Query.

 

End result should give me a table with 2 columns: FileName, HeaderName.

 

Also,

  • source files have different names
  • some header names in source files are different, some are identical
  • source files have different number of columns
  • source files are in tabular form (first row contains header names)

 

I thought I had it figured out using the code below. But it turns out some files have more than 21 columns in them and if I use the method below it cuts of at 21 and inludes only maximum of 21 headers per file. Also I am pretty sure there should be a better way to do this then DemotingHeaders to grab FirstRow.  Unfortunately I don't have access to SQL at the moment which would have simplified this task. Thank you! 

 

 

let 
 Source = Folder.Files("C:\Users\????\????\Desktop\?????\Data"), 
 #"Filtered Hidden Files1" = Table.SelectRows(Source, each [Attributes]?[Hidden]? <> true),
 #"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File from Crossphase_data", each #"Transform File from Crossphase_data"([Content])), 
 #"Removed Columns" = Table.RemoveColumns(#"Invoke Custom Function1",{"Extension", "Date accessed", "Date modified", "Date created", "Attributes", "Folder Path"}), #"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Transform File from Crossphase_data", "Custom"}}),
 #"Demote Headers" = Table.TransformColumns(#"Removed Columns", {{"Transform File from Crossphase_data", each Table.DemoteHeaders(_)}}), 
 #"First Row" = Table.TransformColumns(#"Demote Headers", {{"Transform File from Crossphase_data", each Table.First(_)}}), 
 #"Expanded Transform File from Crossphase_data" = Table.ExpandRecordColumn(#"First Row", "Transform File from Crossphase_data", {"Column1", "Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Column8", "Column9", "Column10", "Column11", "Column12", "Column13", "Column14", "Column15", "Column16", "Column17", "Column18", "Column19", "Column20", "Column21"}, {"Transform File from Crossphase_data.Column1", "Transform File from Crossphase_data.Column2", "Transform File from Crossphase_data.Column3", "Transform File from Crossphase_data.Column4", "Transform File from Crossphase_data.Column5", "Transform File from Crossphase_data.Column6", "Transform File from Crossphase_data.Column7", "Transform File from Crossphase_data.Column8", "Transform File from Crossphase_data.Column9", "Transform File from Crossphase_data.Column10", "Transform File from Crossphase_data.Column11", "Transform File from Crossphase_data.Column12", "Transform File from Crossphase_data.Column13", "Transform File from Crossphase_data.Column14", "Transform File from Crossphase_data.Column15", "Transform File from Crossphase_data.Column16", "Transform File from Crossphase_data.Column17", "Transform File from Crossphase_data.Column18", "Transform File from Crossphase_data.Column19", "Transform File from Crossphase_data.Column20", "Transform File from Crossphase_data.Column21"}) 
in 
#"Expanded Transform File from Crossphase_data"

 

 

1 ACCEPTED SOLUTION
Jimmy801
Community Champion
Community Champion

Hello @v3ntrue 

 

i saw i just yet (and i was also advised by @Anonymous - thanks for this) that I didn't realize that the column numbers were hardcoded in the function.

So you should use this code instead 

 

let
    Source = Folder.Files("YourPathGoesHere"),
    GetHeaderOfCSV = (data)=>
    let
        Source = Csv.Document(data,[Delimiter=";" Encoding=65001, QuoteStyle=QuoteStyle.None]),
        Promote = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
        Header = Table.ColumnNames(Promote)
    in
        Header,
    AddHeader = Table.AddColumn
    (
        Source,
        "HeaderNames",
        each GetHeaderOfCSV ([Content])
    ),
    RemoveOther = Table.SelectColumns(AddHeader,{"HeaderNames", "Name"}),
    ExpandList = Table.ExpandListColumn(RemoveOther, "HeaderNames")
in
    ExpandList

 

and in case change the CSV-function to your needs (different delimiter, encoding etc.)

image.png

Now it should work

Give it a try and let me know

 

All the best

 

Jimmy

View solution in original post

9 REPLIES 9
Jimmy801
Community Champion
Community Champion

Hello @v3ntrue  and @AnkitBI

 

did the development at the same time, so to codes to choose from 🙂

let
    Source = Folder.Files("yourfoldergoeshere"),
    GetHeaderOfCSV = (data)=>
    let
        Source = Csv.Document(data,[Delimiter=";", Columns=4, Encoding=65001, QuoteStyle=QuoteStyle.None]),
        Promote = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
        Header = Table.ColumnNames(Promote)
    in
        Header,
    AddHeader = Table.AddColumn
    (
        Source,
        "HeaderNames",
        each GetHeaderOfCSV ([Content])
    ),
    #"Removed Other Columns" = Table.SelectColumns(AddHeader,{"HeaderNames", "Name"}),
    #"Extracted Values" = Table.TransformColumns(#"Removed Other Columns", {"HeaderNames", each Text.Combine(List.Transform(_, Text.From), ", "), type text})
in
    #"Extracted Values"

 

result looks like this

image.png

 

Copy paste this code to the advanced editor in a new blank query to see how the solution works.

If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too

Have fun

Jimmy

 

Hi @Jimmy801 ,

 

Thank you very much for your reply. Your code works just like you said it would.

 

I think I was not very clear in my initial ask. I wanted to have only one column name per row. Your solution places all column names in a single row. Basically I was looking for something like this,

-------------------------------------

|    File Name   |   ColumnName   |

-------------------------------------

|    A.csv           |   abc_column      |

|    A.csv           |   bac_column      |

|    A.csv           |   cfe_column       |

|    B.csv           |   eda_column      |

|    B.csv           |   edb_column      |

-------------------------------------

 

While your solution is not exactly what I was looking for, it is the closest one to my desired result and it gives me enough to move forward. I can simply use Text-To-Columns Excel functionality and then UnPivot columns with PQ to achieve my desired result.

 

Thank you 🙂

 

Jimmy801
Community Champion
Community Champion

Hello @v3ntrue 

 

then just use this code. Instead of combine the result into one row, the list is expaanded to new rows.

let
    Source = Folder.Files("YourPathGoesHere"),
    GetHeaderOfCSV = (data)=>
    let
        Source = Csv.Document(data,[Delimiter=";", Columns=4, Encoding=65001, QuoteStyle=QuoteStyle.None]),
        Promote = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
        Header = Table.ColumnNames(Promote)
    in
        Header,
    AddHeader = Table.AddColumn
    (
        Source,
        "HeaderNames",
        each GetHeaderOfCSV ([Content])
    ),
    RemoveOther = Table.SelectColumns(AddHeader,{"HeaderNames", "Name"}),
    ExpandList = Table.ExpandListColumn(RemoveOther, "HeaderNames")
in
    ExpandList

 

Copy paste this code to the advanced editor in a new blank query to see how the solution works. If this solution fits your need, copy and past a part of it and implement it in your query.

If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too

Have fun

Jimmy

@Jimmy801 

 

I have tried updated solution and unfortunatelly the result looks different from what I expected.

 

This is what the result looks like with your new code,

 

------------------------------------------------------

|     Header Names               |     Name                  |

------------------------------------------------------

|          "A","B","C"...                |    A.csv                   |

|            (blank)                      |    A.csv                   | 

|          _1                               |    A.csv                    |    

|          _2                               |    A.csv                    | 

|          "E","F","G"...                |    B.csv                    |

|            (blank)                      |    B.csv                   | 

|          _1                               |    B.csv                    |    

|          _2                               |    B.csv                    | 

|          "H","J","K"...                |    C.csv                    |

|            (blank)                      |    C.csv                   | 

|          _1                               |    C.csv                    |    

|          _2                               |    C.csv                    | 

------------------------------------------------------

Jimmy801
Community Champion
Community Champion

Hello @v3ntrue 

 

i saw i just yet (and i was also advised by @Anonymous - thanks for this) that I didn't realize that the column numbers were hardcoded in the function.

So you should use this code instead 

 

let
    Source = Folder.Files("YourPathGoesHere"),
    GetHeaderOfCSV = (data)=>
    let
        Source = Csv.Document(data,[Delimiter=";" Encoding=65001, QuoteStyle=QuoteStyle.None]),
        Promote = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
        Header = Table.ColumnNames(Promote)
    in
        Header,
    AddHeader = Table.AddColumn
    (
        Source,
        "HeaderNames",
        each GetHeaderOfCSV ([Content])
    ),
    RemoveOther = Table.SelectColumns(AddHeader,{"HeaderNames", "Name"}),
    ExpandList = Table.ExpandListColumn(RemoveOther, "HeaderNames")
in
    ExpandList

 

and in case change the CSV-function to your needs (different delimiter, encoding etc.)

image.png

Now it should work

Give it a try and let me know

 

All the best

 

Jimmy

Thanks @Jimmy801 ! This time your solution worked exactly as expected. Perfect!

Jimmy801
Community Champion
Community Champion

Hello @v3ntrue 

 

I'm glad to hear that. Your feedback is highly appreciated.

 

BR

 

Jimmy

AnkitBI
Solution Sage
Solution Sage

Hi @v3ntrue 

Check if below code works for you. You may want to change Delimiter="," as per your csv ones.

 

let
    Source = Folder.Files("C:\Users\ankit_jain09\Desktop\Power BI\Power BI Community\New folder"),
    #"Filtered Rows" = Table.SelectRows(Source, each ([Extension] = ".csv")),
    #"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows",{"Content", "Name"}),
    GetColumnNames = Table.AddColumn(#"Removed Other Columns", "ColumnNames", each Table.ColumnNames(Csv.Document([Content],[Delimiter=",",Encoding=1252, QuoteStyle=QuoteStyle.None]))),
    #"Expanded ColumnNames" = Table.ExpandListColumn(GetColumnNames, "ColumnNames"),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded ColumnNames",{"Content"})
in
    #"Removed Columns"

 

Thanks
Ankit Jain
Do Mark it as solution if the response resolved your problem. Do Kudo the response if it seems good and helpful.

Hi @AnkitBI ,

 

Thank you for your prompt reply.

 

Unfortunately your code does not seem to work as expected. In "ColumnName" column it gives values such as "Column1", "Column2", "Column3" etc. instead of actual column names from csv files. Basically it looks like this,

------------------------------------------

|     File Name     |     ColumnName     |

------------------------------------------

|          A.csv        |     Column1             | 

|          A.csv        |     Column2             |       

|          A.csv        |     Column3             |       

------------------------------------------

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors