cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
dannyboc
Frequent Visitor

Power Query to repeat categories in rows for each date

Hi everyone,

 

I have a table of produced units by date, channel and type. I receive a daily excel file from a client, e.g:

 

plantas of date production date channel type 1type 2
michigan7/25/2021 7/20/2021 calls 12
ohio7/25/2021 7/20/2021 webpage 23
florida7/25/2021 7/20/2021 sales rep 23
Michigan7/25/2021 7/20/2021 webpage 12
Ohio7/25/2021 7/20/2021 sales rep 11
Florida7/25/2021 7/20/2021 calls 34
Michigan7/25/2021 7/20/2021 sales rep 12
Ohio7/25/2021 7/20/2021 calls 01
Florida7/25/2021 7/20/2021 webpage 21
Michigan7/25/2021 7/21/2021 calls 13
Ohio7/25/2021 7/22/2021 calls 23
Florida7/25/2021 7/22/2021 webpage 13

 

When there are no units producted type 1 and 2, the report does not specify that the values are 0, like in the example above. In the production date 7/21/2021, for example, Ohio and Florida didn't have any quantities from calls, sales reps and webpage.

 

How can I make a M query, so the channels "calls", "webpage", and "sales rep", are always shown for each one of the plants, and production dates, even if there are no quantities produced? The expected table would look like:

 

plantas of date production date channel type 1type 2
Michigan7/25/2021 7/20/2021 calls 12
Ohio7/25/2021 7/20/2021 webpage 23
Florida7/25/2021 7/20/2021 sales rep 23
Michigan7/25/2021 7/20/2021 webpage 12
Ohio7/25/2021 7/20/2021 sales rep 11
Florida7/25/2021 7/20/2021 calls 34
Michigan7/25/2021 7/20/2021 sales rep 12
Ohio7/25/2021 7/20/2021 calls 01
Florida7/25/2021 7/20/2021 webpage 21
Michigan7/25/2021 7/21/2021 calls 13
Ohio7/25/2021 7/21/2021 webpage 00
Florida7/25/2021 7/21/2021 sales rep 00
Michigan7/25/2021 7/21/2021 webpage 00
Ohio7/25/2021 7/21/2021 sales rep 00
Florida7/25/2021 7/21/2021 calls 00
Michigan7/25/2021 7/21/2021 sales rep 00
Ohio7/25/2021 7/21/2021 calls 00
Florida7/25/2021 7/21/2021 webpage 00
Michigan7/25/2021 7/22/2021 calls 00
Ohio7/25/2021 7/22/2021 webpage 00
Florida7/25/2021 7/22/2021 sales rep 00
Michigan7/25/2021 7/22/2021 webpage 00
Ohio7/25/2021 7/22/2021 sales rep 00
Florida7/25/2021 7/22/2021 calls 00
Michigan7/25/2021 7/22/2021 sales rep 00
Ohio7/25/2021 7/22/2021 calls 23
Florida7/25/2021 7/22/2021 webpage 13

 

Thank you

1 ACCEPTED SOLUTION

@dannyboc ,

 

Create a new blank query and paste this over the default code in Advanced Editor:

 

let
  // Define Date.Today
  Date.Today = Date.From(DateTime.LocalNow()),
  Source = { Number.From(Date.StartOfYear(Date.AddYears(Date.Today, -2)))..Number.From(Date.Today) },
  convToTable = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
  chgDateType = Table.TransformColumnTypes(convToTable, {{"Column1", type date}}),
  renCol = Table.RenameColumns(chgDateType,{{"Column1", "asOfDate"}}),
    addProductionDate = Table.AddColumn(renCol, "productionDate", each List.Transform({ Number.From(Date.StartOfYear([asOfDate]))..Number.From(Date.EndOfYear(Date.AddYears([asOfDate], 1))) }, each Date.From(_))),
    expandProductionDate = Table.ExpandListColumn(addProductionDate, "productionDate"),
    addPlantCJ = Table.AddColumn(expandProductionDate, "plant", each plant),
    expandPlantCJ = Table.ExpandTableColumn(addPlantCJ, "plant", {"plant"}, {"plant"}),
    addChannelCJ = Table.AddColumn(expandPlantCJ, "channel", each channel),
    expandChannelCJ = Table.ExpandTableColumn(addChannelCJ, "channel", {"channel"}, {"channel"}),
    addUniqueRowID = Table.AddColumn(expandChannelCJ, "uniqueRowID", each Text.Combine({Text.From(Number.From([asOfDate]), "en-GB"), Text.From(Number.From([productionDate]), "en-GB"), [plant], [channel]}, "-"), type text),
    chgTypes = Table.TransformColumnTypes(addUniqueRowID,{{"productionDate", type date}, {"plant", type text}, {"channel", type text}})
in
  chgTypes

 

 

You will need your plant unique list and channel unique list already set up in Power Query (and called exactly 'plant' & 'channel') for this to work.

 

You can now follow the steps I took to generate this template table.

 

Pete

View solution in original post

6 REPLIES 6
BA_Pete
Super User
Super User

Hi @dannyboc ,

 

I would start by building the template table before adding in the values via merge in Power Query, or relationships in the data model.

Start by identifying how you will dynamically create your [as of date] and [production date] columns. There's not enough information in your post for me to understand where these come from, but you should be able to use your calendar table or similar to create two columns that contain every combination of [as of date] and [production date] that you need in your table.

Once you have those, then you will want to crossjoin your plants and channels into this table. You will need a plants table that contains a unique list of all the plants you want to see, and the same for channels.

In your template table, add a new column, call it 'plants', and in the code window just type the name of the table that contains all the plant names. Expand this new column and you will see that it duplicates your date columns for each unique plant value.

Do the same for channels.

Now you have a template table that contains all possible combinations of these dimensions.

From here you would create a unique Id field for each row in both your template table and your fact table, maybe by combining the values of each field into a single field, then either merge on your fact table to get the values where they exist and replace nulls with 0, or use your unique ID field in the data model to relate this template table to your fact table for the same effect.

 

Pete

dannyboc
Frequent Visitor

Hi Pete,

 

Thank you for the idea you provided. I was hoping there was a query I could run to generate such table.

 

The data comes from an ERP as a daily excel file, unfortunately there is no other way to get the data at this time. I save the daily files in a SharePoint folder, so Power BI can merge them.

I will generate a template table first.

 

Do you know how to generate a combination of "as of date" and "production date", starting with as of date (January 1, 2019), until today? For each "as of date" there are production dates of the as of date year, up to the next year. E.g. the "as of date" of January 1st, 2019 contains production dates from January 1st, 2019 up to December 31, 2020.

 

"as of date" January 2, 2019 contains production dates from January 1,2019 up to December 31,2020.

 

Please let me know, I have done research but nothing comes close.

 

Thank you

@dannyboc ,

 

Create a new blank query and paste this over the default code in Advanced Editor:

 

let
  // Define Date.Today
  Date.Today = Date.From(DateTime.LocalNow()),
  Source = { Number.From(Date.StartOfYear(Date.AddYears(Date.Today, -2)))..Number.From(Date.Today) },
  convToTable = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
  chgDateType = Table.TransformColumnTypes(convToTable, {{"Column1", type date}}),
  renCol = Table.RenameColumns(chgDateType,{{"Column1", "asOfDate"}}),
    addProductionDate = Table.AddColumn(renCol, "productionDate", each List.Transform({ Number.From(Date.StartOfYear([asOfDate]))..Number.From(Date.EndOfYear(Date.AddYears([asOfDate], 1))) }, each Date.From(_))),
    expandProductionDate = Table.ExpandListColumn(addProductionDate, "productionDate"),
    addPlantCJ = Table.AddColumn(expandProductionDate, "plant", each plant),
    expandPlantCJ = Table.ExpandTableColumn(addPlantCJ, "plant", {"plant"}, {"plant"}),
    addChannelCJ = Table.AddColumn(expandPlantCJ, "channel", each channel),
    expandChannelCJ = Table.ExpandTableColumn(addChannelCJ, "channel", {"channel"}, {"channel"}),
    addUniqueRowID = Table.AddColumn(expandChannelCJ, "uniqueRowID", each Text.Combine({Text.From(Number.From([asOfDate]), "en-GB"), Text.From(Number.From([productionDate]), "en-GB"), [plant], [channel]}, "-"), type text),
    chgTypes = Table.TransformColumnTypes(addUniqueRowID,{{"productionDate", type date}, {"plant", type text}, {"channel", type text}})
in
  chgTypes

 

 

You will need your plant unique list and channel unique list already set up in Power Query (and called exactly 'plant' & 'channel') for this to work.

 

You can now follow the steps I took to generate this template table.

 

Pete

View solution in original post

dannyboc
Frequent Visitor

Hi @BA_Pete ,

 

The method you previoulsy posted worked for me!

I manually created some tables like "plant" and "channel" in blank queries. Before I was extracting the unique plant and channel values from my original table, so that is why I was facing a memory error.

 

Thank you so much!

dannyboc
Frequent Visitor

Hi Pete,

 

Thank you for sharing this! On the preview mode, the query looks to work.

Replicating with current data, I found out that the new table created (Date) with the query would take tens of gigabytes. I'm not sure if I'm doing something wrong. Can you please check the pbi file attached?

 

https://drive.google.com/file/d/1iuPOFAmyT8rKXKPKNAjhXxsaaV8R9OG2/view?usp=sharing

 

Thanks

Hi @dannyboc ,

 

I can't seem to download the files you shared, but I don't think I actually need to. What you are describing sounds plausible due to the crossjoins.

In order to show a row even if there's no data, the row needs to exist in the source data (not always, but I'll explain this in a minute, it might work better for you with limitations).

The code that I shared effectively does this:

 

- [asOfDate] = Start with every date for up to three years, maximum 1,096 rows

- [productionDate] = For each [asOfDate], crossjoin every date of another two years (731 rows).

- [plant] = For each [productionDate], crossjoin every possible plant (3 rows in example data).

- [channel] = For each [plant], crossjoin every possible channel (3 rows in example data).

 

The resultant row expansion, therefore, looks like this:

 

- Date fields: 1,096 * 731 = 801,176 rows

- 801,176 * 3 plants = 2,403,528 rows

- 2,403,528 * 3 channels = 7,210,584 rows

 

You can see quite quickly that just adding one more plant or channel adds 1,096 * 731 * 3 rows to the table.

If your real data has maybe 20 plants and 15 channels, we're getting 240m+ rows. So your large data size sounds plausible.

 

However, there may be another way: Instead of pre-building the rows in Power Query, you can create them at runtime instead. You would need to make sure that your [plant] and [channel] dimension tables are related to your fact table, and you would also need to create a new dimension table from the [asOfDate] and [productionDate] steps from my code and relate this to the fact table too. Ensure that it is the dimension table fields that are used in any visualisations. Then you just need to add ' + 0' to the end of your Type1 and Type2 measures. It might look something like this:

 

_type1 = SUM(yourFactTable[Type1]) + 0

 

 

This is going to force the visual to apply a zero value to all dimensions, whether there's values or not.

 

HOWEVER!!

There's a very good reason I didn't suggest this seemingly simple solution in the first place: doing this will still require all these rows to be materialised somewhere in order to be displayed. The difference with this solution is that it's going to use end-user RAM to do it, rather than having your gateway machine do it overnight then just serve it pre-made. I can't say for certain, but my gut-feel is that this will grind your end-users computer/your report/the universe to a halt.

This could possibly be mitigated by using parameters or fixed report filters to limit the rows needing to be materialised at runtime, but this would require specific scenario testing and is getting outside the scope of this thread.

 

Pete

Helpful resources

Announcements
2022 Release Wave 1 760x460.png

2022 Release Wave 1 Plan

Power Platform release plan for the 2022 release wave 1 describes all new features releasing from April 2022 through September 2022.

Power BI December 2021 Update_carousel 768x460.jpg

Check it Out!

Click here to read more about the December 2021 Updates!

Jan 2022 Dev Camp 768x460 copy.png

Power BI Dev Camp- January 27th, 2022

Mark your calendars and join us for our next Power BI Dev Camp!

Top Solution Authors