Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
JChris
Helper II
Helper II

Folder source with CSVs, how to remove headers from each before merge?

I have the following strucutre:

 

  1. Folder "Lorem Ipsum"
    1. Document "2017-07-13.csv"
    2. Document "2017-07-14.csv"
    3. Document "2017-07-17.csv"
    4. (and so on...)

 

I use this folder as source inside Power BI. Every CSV inside this folder has a header, but I don't want them because after I merge there are some headers in the middle of the rows with actual data. I just want the rows so I can later add one single header using Power BI.

 

How can I remove the headers rom all CSVs before they are merged or something like that that will have the same effect?

1 ACCEPTED SOLUTION

OK, my Transform query looks like this:

 

let
    Source = Csv.Document(#"Sample File Parameter1",[Delimiter=",", Columns=3, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true])
in
    #"Promoted Headers"

This is the "Transform Sample File from "folder"" query.

 

You should also have a function (fx) that looks like this:

 

let
    Source = (#"Sample File Parameter1") => let
        Source = Csv.Document(#"Sample File Parameter1",[Delimiter=",", Columns=3, Encoding=1252, QuoteStyle=QuoteStyle.None]),
        #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true])
    in
        #"Promoted Headers"
in
    Source

Then you should have your main query that creates your data table that looks like this:

 

let
    Source = Folder.Files("c:\temp\powerbi\csvs"),
    #"Invoke Custom Function1" = Table.AddColumn(Source, "Transform File from csvs", each #"Transform File from csvs"([Content])),
    #"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
    #"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File from csvs"}),
    #"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File from csvs", Table.ColumnNames(#"Transform File from csvs"(#"Sample File"))),
    #"Changed Type" = Table.TransformColumnTypes(#"Expanded Table Column1",{{"Source.Name", type text}, {"column1", Int64.Type}, {"column2", Int64.Type}, {"column3", Int64.Type}})
in
    #"Changed Type"

My CSV files are very simple, they look like this:

 

one.csv

column1,column2,column3
1,10,100
2,20,200
3,30,300

 

two.csv

column1,column2,column3
1,10,100
2,20,200
3,30,300

 

three.csv

column1,column2,column3
1,10,100
2,20,200
3,30,300

 

My data table looks like this:

Source.Namecolumn1column2column3

one.csv110100
one.csv220200
one.csv330300
three.csv110100
three.csv220200
three.csv330300
two.csv110100
two.csv220200
two.csv330300

 


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

View solution in original post

7 REPLIES 7
009co
Helper IV
Helper IV

I am using this method for exactly the same purpose as the OP.

 

However, my use case has slight variation.

 

My folder contains multiple different types of files. I only want one type of file which I can filter by the name containing a specific string.

 

I want to apply this method on the filtered selection of files.

 

The challenge is defining the "Sample File" which in this method is:

 

= Source{0}[Content]

 

But how can I change that to the filtered list? Do I have to modify the "Sample File" query to filter the files instead of using "Source{0}[Content]"?

 

I will already be filtering the data in the main query that uses this Sample File, parameter and function to get all files. Seems somehow redundant to use same code in Sample File and the main query.

 

Any ideas, strategies appreciated!

Greg_Deckler
Super User
Super User

It sounds like you are not using a Folder query, which is what you should be using. Let me know if I am incorrect.


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

I am using Folder query.

OK, well then it sounds like you aren't doing a Promoted Headers in your sample transform step. Go into Query Editor and look for the folder:

 

Transform File from "folder"

 

Within that folder should be a query called something like "Transform Sample File from "folder". Make sure that query does a "Promoted Headers" step.


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

So, I went to the place you said and added the "Promoted Headers" to the sample file that was there. I applied it and refreshed and now all data is null inside my query...

OK, my Transform query looks like this:

 

let
    Source = Csv.Document(#"Sample File Parameter1",[Delimiter=",", Columns=3, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true])
in
    #"Promoted Headers"

This is the "Transform Sample File from "folder"" query.

 

You should also have a function (fx) that looks like this:

 

let
    Source = (#"Sample File Parameter1") => let
        Source = Csv.Document(#"Sample File Parameter1",[Delimiter=",", Columns=3, Encoding=1252, QuoteStyle=QuoteStyle.None]),
        #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true])
    in
        #"Promoted Headers"
in
    Source

Then you should have your main query that creates your data table that looks like this:

 

let
    Source = Folder.Files("c:\temp\powerbi\csvs"),
    #"Invoke Custom Function1" = Table.AddColumn(Source, "Transform File from csvs", each #"Transform File from csvs"([Content])),
    #"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
    #"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File from csvs"}),
    #"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File from csvs", Table.ColumnNames(#"Transform File from csvs"(#"Sample File"))),
    #"Changed Type" = Table.TransformColumnTypes(#"Expanded Table Column1",{{"Source.Name", type text}, {"column1", Int64.Type}, {"column2", Int64.Type}, {"column3", Int64.Type}})
in
    #"Changed Type"

My CSV files are very simple, they look like this:

 

one.csv

column1,column2,column3
1,10,100
2,20,200
3,30,300

 

two.csv

column1,column2,column3
1,10,100
2,20,200
3,30,300

 

three.csv

column1,column2,column3
1,10,100
2,20,200
3,30,300

 

My data table looks like this:

Source.Namecolumn1column2column3

one.csv110100
one.csv220200
one.csv330300
three.csv110100
three.csv220200
three.csv330300
two.csv110100
two.csv220200
two.csv330300

 


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

It worked, I don't know why it was null before, but I returned from break and it was OK and my code was just like yours!

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.