cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Frequent Visitor

Infilling Missing data

Hi Power BI Community,

I have a problem where a series of providers supplying data every day. However providers may miss a day. See table below
(Provider 2 has missed 02/09/2020 and Provider 3 has missed 03/09/2020)

Date   Provider      A   B   C
01/09/2020     1a1b1c1
01/09/2020     2a2b2c2
01/09/2020     3a3b3c3
02/09/2020     1a4b4c4
02/09/2020     3a5b5c5
03/09/2020     1a6b6c6
03/09/2020     2a7b7c7

 

Is there a way I can duplicate the previously reported entree for the missing days? i.e add the following rows where required

Date    Provider ABC
02/09/2020     2 a2b2c2
03/09/2020     3 a5b5c5

 

Thank you in advance!
Liam

1 ACCEPTED SOLUTION
Super User III
Super User III

Hi @LCrossling 

This can also be done in DAX, and it would perhaps be simpler. Place the following M code in a blank query to see the steps of the transformation in Power Query.  It can be done in a more compact way but I have used simpler steps on purpose so that it is clearer.

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("dc+7DcAgDADRVSLXSID5KbNEFMH775ActEBxcvFki+eREH24vQYN4uTixX94ySAWpbsdUxgZxPTAEowMYmkx3R7NMGL5wOa2AiNWFkvbbRVGrB7Y/EKDEWvS+wc=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Date = _t, Provider = _t, A = _t, B = _t, C = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Provider", Int64.Type}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Date"}, {{"Providers", each [Provider], type table [Date=nullable date, Provider=nullable number, A=nullable text, B=nullable text, C=nullable text]}}),
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Missing", each List.RemoveMatchingItems(List.Distinct(#"Changed Type"[Provider]),[Providers])),
    #"Expanded Missing" = Table.ExpandListColumn(#"Added Custom", "Missing"),
    #"Filtered Rows" = Table.SelectRows(#"Expanded Missing", each ([Missing] <> null)),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Providers"}),
    #"Added Custom1" = Table.AddColumn(#"Removed Columns", "Previous date", each List.Max(Table.SelectRows(#"Changed Type", (inner)=> [Missing]=inner[Provider] and inner[Date] <[Date])[Date])),
    #"Added Custom2" = Table.AddColumn(#"Added Custom1", "Missing rows", each Table.SelectRows(#"Changed Type", (inner)=> inner[Date]=[Previous date] and inner[Provider] = [Missing] ) ),
    #"Removed Columns1" = Table.RemoveColumns(#"Added Custom2",{"Missing", "Previous date"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Columns1",{{"Date", "Actual Date"}}),
    #"Expanded Missing rows" = Table.ExpandTableColumn(#"Renamed Columns", "Missing rows", {"Date", "Provider", "A", "B", "C"}, {"Date", "Provider", "A", "B", "C"}),
    missingRowsT = Table.TransformColumnTypes(#"Expanded Missing rows",{{"Date", type date}, {"Provider", Int64.Type}, {"A", type text}, {"B", type text}, {"C", type text}}),
    #"Removed Columns2" = Table.RemoveColumns(missingRowsT,{"Date"}),
    #"Renamed Columns1" = Table.RenameColumns(#"Removed Columns2",{{"Actual Date", "Date"}}),
    res= Table.Combine({#"Changed Type", #"Renamed Columns1"}),
    #"Sorted Rows" = Table.Sort(res,{{"Date", Order.Ascending}, {"Provider", Order.Ascending}})
in
    #"Sorted Rows"

 

 

Please mark the question solved when done and consider giving kudos if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

 

SU18_powerbi_badge

View solution in original post

4 REPLIES 4
Impactful Individual
Impactful Individual

@LCrossling 

In the table, if no data is missing from each Provider on the first day, try my code.

// output
let
    Source = Table.FromRecords(Json.Document(Binary.Decompress(Binary.FromText("tdK9DsIgEMDxV2lubtLj6Ed08+MB3MUBtYOTSaMupu8uOVzwrpu3QPgHkl+A4xv28THCGtCFBlehISSEGsITMWKVp8N0f92u45RXeUxHij3fyf2erTZpZxR5m+pZ1F2qFwdz/W8W6SyRmSUqs8iA5XWWyMwSlVm+YJHlI7YqS1RmtQashdvqVJaozOoKlre8rV5licqs3oC18OUHlSUqswaYTx8=", BinaryEncoding.Base64),Compression.Deflate))),
    date_tbl = Table.FromValue(List.Distinct(Source[Date]),[DefaultColumnName="Date"]),
    fx = (tbl)=>
       let 
           nested = Table.NestedJoin(date_tbl, "Date", tbl, "Date", "t", JoinKind.LeftOuter),
           sTbl = Table.LastN(Table.RemoveLastN(nested, each not Table.IsEmpty([t])),2)
       in  Table.ReplaceValue(sTbl[t]?{0}? ??#table({"Date"},{}), each [Date], sTbl[Date]?{1}?, Replacer.ReplaceValue, {"Date"}),
       //Table.ReplaceValue(let t= sTbl[t]?{0}? in if t=null then #table({"Date"},{}) else t, each [Date], sTbl[Date]?{1}?, Replacer.ReplaceValue, {"Date"}),
    group = Table.Group(Source, {"   Provider  "}, {"t", fx})[t],
    result = Table.Combine(group)
in
    result

If the code with the consecutive question mark operator doesn't work on your computer, replace it with the code I've commented out.

Super User III
Super User III

Hi @LCrossling 

This can also be done in DAX, and it would perhaps be simpler. Place the following M code in a blank query to see the steps of the transformation in Power Query.  It can be done in a more compact way but I have used simpler steps on purpose so that it is clearer.

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("dc+7DcAgDADRVSLXSID5KbNEFMH775ActEBxcvFki+eREH24vQYN4uTixX94ySAWpbsdUxgZxPTAEowMYmkx3R7NMGL5wOa2AiNWFkvbbRVGrB7Y/EKDEWvS+wc=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Date = _t, Provider = _t, A = _t, B = _t, C = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Provider", Int64.Type}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Date"}, {{"Providers", each [Provider], type table [Date=nullable date, Provider=nullable number, A=nullable text, B=nullable text, C=nullable text]}}),
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Missing", each List.RemoveMatchingItems(List.Distinct(#"Changed Type"[Provider]),[Providers])),
    #"Expanded Missing" = Table.ExpandListColumn(#"Added Custom", "Missing"),
    #"Filtered Rows" = Table.SelectRows(#"Expanded Missing", each ([Missing] <> null)),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Providers"}),
    #"Added Custom1" = Table.AddColumn(#"Removed Columns", "Previous date", each List.Max(Table.SelectRows(#"Changed Type", (inner)=> [Missing]=inner[Provider] and inner[Date] <[Date])[Date])),
    #"Added Custom2" = Table.AddColumn(#"Added Custom1", "Missing rows", each Table.SelectRows(#"Changed Type", (inner)=> inner[Date]=[Previous date] and inner[Provider] = [Missing] ) ),
    #"Removed Columns1" = Table.RemoveColumns(#"Added Custom2",{"Missing", "Previous date"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Columns1",{{"Date", "Actual Date"}}),
    #"Expanded Missing rows" = Table.ExpandTableColumn(#"Renamed Columns", "Missing rows", {"Date", "Provider", "A", "B", "C"}, {"Date", "Provider", "A", "B", "C"}),
    missingRowsT = Table.TransformColumnTypes(#"Expanded Missing rows",{{"Date", type date}, {"Provider", Int64.Type}, {"A", type text}, {"B", type text}, {"C", type text}}),
    #"Removed Columns2" = Table.RemoveColumns(missingRowsT,{"Date"}),
    #"Renamed Columns1" = Table.RenameColumns(#"Removed Columns2",{{"Actual Date", "Date"}}),
    res= Table.Combine({#"Changed Type", #"Renamed Columns1"}),
    #"Sorted Rows" = Table.Sort(res,{{"Date", Order.Ascending}, {"Provider", Order.Ascending}})
in
    #"Sorted Rows"

 

 

Please mark the question solved when done and consider giving kudos if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

 

SU18_powerbi_badge

View solution in original post

Thanks for this seems to work but it is running very slowly on my huge dataset. What would be the method you would use in DAX?

@LCrossling 

Apologies, I forgot to answer earlier. Try creating a new calculated table in DAX, where Table1 is your initial table:

Table1_filled =
VAR t1_ =
    CROSSJOIN ( DISTINCT ( Table1[Date] ), DISTINCT ( Table1[Provider] ) )
VAR t2_ =
    SUMMARIZE ( Table1, Table1[Date], Table1[Provider] )
VAR filledRowsT_ =
    GENERATE (
        EXCEPT ( t1_, t2_ ),
        VAR previousDate_ =
            CALCULATE ( MAX ( Table1[Date] ), Table1[Date] < EARLIER ( Table1[Date] ) )
        RETURN
            CALCULATETABLE (
                ROW (
                    "A", DISTINCT ( Table1[A] ),
                    "B", DISTINCT ( Table1[B] ),
                    "C", DISTINCT ( Table1[C] )
                ),
                Table1[Date] = previousDate_
            )
    )
RETURN
    UNION ( Table1, filledRowsT_ )

If you can share your real data (perhaps anonymized if necessary) I can try to optimize it, either in DAX or PQ

 

Please mark the question solved when done and consider giving kudos if posts are helpful.

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

Cheers 

 

SU18_powerbi_badge

 

 

Helpful resources

Announcements
secondImage

Happy New Year from Power BI

This is a must watch for a message from Power BI!

December Update

Check it Out!

Click here to read more about the December 2020 Updates!

Community Blog

Check it Out!

Click here to read the latest blog and learn more about contributing to the Power BI blog!

Get Ready for Power BI Dev Camp

Get Ready for Power BI Dev Camp

Mark your calendars and join us for our next Power BI Dev Camp!.

Top Solution Authors
Top Kudoed Authors