cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Highlighted
Helper II
Helper II

Getting data from CSV files that have slightly different columns each week

I have a CSV file, when stripped down to the barest form, that looks like this:

 

WW33

 

Job,WW31,WW32,WW33
JobA,4,4,4
JobB,2,6,3
JobC,2,4,7

 

 

As you can probably guess, this is weekly data, with each Work Week (WW) as column heading. The last column is the most recent WW, and although data goes back only 3 weeks here, in my real data it goes back 12 weeks. So far so good. The next week (WW34 in this case), I will again get a similar file, but the oldest week column (WW31 in this case) eliminated, and the new week added. So, my WW34 and WW35 files would looks like this:

 

WW34

 

Job,WW32,WW33,WW34
JobA,4,4,3
JobB,6,3,4
JobC,4,7,8

 

 

WW35

 

Job,WW33,WW34,WW35
JobA,4,3,2
JobB,3,4,1
JobC,7,8,5

 

 

Do note that I have no control over the format of data and cannot be changed.

 

My goal is to set up a PowerBI project where it reads data from a certain folder, and whenever someone drops the new weeks' file, it reads data and combines them all into a single data table that should look like this:

 

Expected

 

Job,WW31,WW32,WW33,WW34,WW35
JobA,4,4,4,3,2
JobB,2,6,3,4,1
JobC,2,4,7,8,5

 

 

Simply put, it should have as many columns as there are weeks in all files combined, and it shouldn't have duplicate columns. I'm not quite sure how I can achieve this. If I simply set PowerBI to read data from a folder and do 'Combine and Transform', this is what I get, which obviously isn't what I want.

 

Capture.PNG

 

Is there a way to get and format data the way I want?

1 ACCEPTED SOLUTION

Accepted Solutions
Highlighted
Super User II
Super User II

Hi @Sach 

Try this. It's basically loading the files from the folder, transposing the tables, eliminating duplicates and transposing again

let
    Source = Folder.Files(FOLDER_WHERE_YOUR_DATA_IS),
    #"Added Custom" = Table.AddColumn(Source, "Custom", each Csv.Document([Content])),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each Table.Transpose([Custom])),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom1",{"Custom.1"}),
    #"Expanded Custom.1" = Table.ExpandTableColumn(#"Removed Other Columns", "Custom.1", {"Column1", "Column2", "Column3", "Column4"}, {"Column1", "Column2", "Column3", "Column4"}),
    #"Removed Duplicates" = Table.Distinct(#"Expanded Custom.1", {"Column1"}),
    #"Transposed Table" = Table.Transpose(#"Removed Duplicates"),
    #"Promoted Headers" = Table.PromoteHeaders(#"Transposed Table", [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Job", type text}, {"WW31", Int64.Type}, {"WW32", Int64.Type}, {"WW33", Int64.Type}, {"WW34", Int64.Type}, {"WW35", Int64.Type}})
in
    #"Changed Type"

 

Please mark the question solved when done and consider giving kudos if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

 

SU18_powerbi_badge

View solution in original post

9 REPLIES 9
Highlighted
Super User III
Super User III

I would create a function (or modify the Transform Example File query) to unpivot all the Week columns, append the files together, and then remove rows with duplicate Week and Job.  You could then pivot it back out, but I recommend you keep it unpivoted.

 

If this works for you, please mark it as the solution.  Kudos are appreciated too.  Please let me know if not.

Regards,

Pat





Did I answer your question? Please mark my post as a solution! Kudos are also appreciated!

Proud to be a Super User!




Highlighted

Can you perhaps show me with an example? I'm pretty new to this and not quite sure what you mean.

Highlighted
Super User II
Super User II

Hi @Sach 

Try this. It's basically loading the files from the folder, transposing the tables, eliminating duplicates and transposing again

let
    Source = Folder.Files(FOLDER_WHERE_YOUR_DATA_IS),
    #"Added Custom" = Table.AddColumn(Source, "Custom", each Csv.Document([Content])),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each Table.Transpose([Custom])),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom1",{"Custom.1"}),
    #"Expanded Custom.1" = Table.ExpandTableColumn(#"Removed Other Columns", "Custom.1", {"Column1", "Column2", "Column3", "Column4"}, {"Column1", "Column2", "Column3", "Column4"}),
    #"Removed Duplicates" = Table.Distinct(#"Expanded Custom.1", {"Column1"}),
    #"Transposed Table" = Table.Transpose(#"Removed Duplicates"),
    #"Promoted Headers" = Table.PromoteHeaders(#"Transposed Table", [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Job", type text}, {"WW31", Int64.Type}, {"WW32", Int64.Type}, {"WW33", Int64.Type}, {"WW34", Int64.Type}, {"WW35", Int64.Type}})
in
    #"Changed Type"

 

Please mark the question solved when done and consider giving kudos if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

 

SU18_powerbi_badge

View solution in original post

Highlighted

But if I did that, wouldn't I need to go and change the query every week whenever I add a new file to my folder?

Highlighted
Super User II
Super User II

@Sach 

Nope. It should work as you add more files. Try it out, first with the files you showed and then add another one and see. The only thing you might have to change is the last step in the code above, to make it a bit more flexible when changing the data types. Something like (for the last step above):

= Table.TransformColumnTypes(#"Promoted Headers", List.RemoveFirstN(List.Zip({Table.ColumnNames(#"Promoted Headers"), List.Repeat({Int64.Type},Table.ColumnCount(#"Promoted Headers"))}),1))

 This changes the column types by extracting their names rather than hardcoding those as we did earlier

 

Please mark the question solved when done and consider giving kudos if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

 

SU18_powerbi_badge

Highlighted

Yes I see that! I have one more request/concern - although I showed only 3 Jobs, in reality there's hundreds, so is it possible to repeast the following line the same way?

 

#"Expanded Custom.1" = Table.ExpandTableColumn(#"Removed Other Columns", "Custom.1", {"Column1", "Column2", "Column3", "Column4"}, {"Column1", "Column2", "Column3", "Column4"}),

 

 

Highlighted

OK, I do believe that I figured it out.

 

This will go through all the columns without having to specify their names.

 

#"Expanded Custom.1" = Table.ExpandTableColumn(#"Removed Other Columns", "Custom.1", List.Union(List.Transform(#"Removed Other Columns"[Custom.1], each Table.ColumnNames(_)))),
Highlighted
Super User II
Super User II

@Sach 

Yep. Elegant solution  🙂

 

 

 

Please mark the question solved when done and consider giving kudos if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

 

SU18_powerbi_badge

Highlighted

try if this (or a modified version)  function could be useful to your job.

 

 

 

 

let
    conc=(utab,ntab)=>
    let
    lstabs=List.Transform({1..ntab}, each "w"&Text.From(Number.From(Text.End(utab,2))-ntab+_-1)),
    Cws=Table.TransformRows(Expression.Evaluate(utab,#shared), (ru)=>Record.Combine(List.Transform(lstabs, each Expression.Evaluate(_,#shared){[Job=ru[Job]]})&{ru}))
    in
    Table.FromRecords(Cws)
in
   conc

 

 

 

 

the function takes in input a tablename (as text) and a number (the number of table you want to concat with the leader table. The names are supposed to be leadername+decreasing number as suffix. But if you have a list of table from other source there is no need to costruct the list names)

The result is a table which is a concatenation of all tables.

 

image.png

 

 

 

 

 

Helpful resources

Announcements
Community Conference

Power Platform Community Conference

Check out the on demand sessions that are available now!

Community Conference

Microsoft Power Platform Communities

Check out the Winners!

secondImage

Create an end-to-end data and analytics solution

Learn how Power BI works with the latest Azure data and analytics innovations at the digital event with Microsoft CEO Satya Nadella.

Top Solution Authors
Top Kudoed Authors