Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more.
Get startedGrow your Fabric skills and prepare for the DP-600 certification exam by completing the latest Microsoft Fabric challenge.
Hi Everyone
I have data which has a modification date, modification field name and the corresponding data.
So people create address data (for instance) for an employee.
But the fields are not connected / sometimes modified together but not always.
Looks like this:
What I have | ||||||||
Employee | Modification Date | Address | Suburb | State | Zip | Date | Result / What I would like to see | |
EMP01 | 1-Jan-23 | Smith Street | ||||||
EMP01 | 1-Jan-23 | Maintown | ||||||
EMP01 | 1-Jan-23 | EastState | ||||||
EMP01 | 1-Jan-23 | 5000 | 1-Jan-23 | Smith Street Maintown, EastState 5000 | ||||
EMP01 | 1-Feb-23 | Hightown | 1-Feb-23 | Smith Street Hightown, EastState 5000 | ||||
EMP01 | 1-Mar-23 | Back Street | 1-Mar-23 | Back Street Hightown, EastState 5000 | ||||
EMP01 | 1-Apr-23 | New Street | ||||||
EMP01 | 1-Apr-23 | Bigtown | ||||||
EMP01 | 1-Apr-23 | WestState | ||||||
EMP01 | 1-Apr-23 | 6000 | 1-Apr-23 | New Street, Bigtown, WestState 6000 |
How can I clean this up so I get a full address and corresponding date?
Also, if a field is not updated, the data from the previous modification is "carried forward".
Let me know what you think - thanks!
Solved! Go to Solution.
Hi,
Edit 2nd step YourSource = Source and refer your data
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcvUNMDBU0lEy1PVKzNM1MgYyg3MzSzIUgkuKUlNLgFwFJKwUq4NVC0jONzEzryS/PI9I5SDsmlhcElySWJJKpHoQNjUwMMBU6paahFDqkZmegd8lvolFEOVOicnZRHnVsQCqwy+1nDQNIDmnzHT8DkJRDcLhqXiDBkM9CJvBgiYWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Employee = _t, #"Modification Date" = _t, Address = _t, Suburb = _t, State = _t, Zip = _t, Column1 = _t]),
YourSource = Source,
AddedSuburbSuffix = Table.TransformColumns(YourSource, {{"Suburb", each if Text.Trim(Text.From(_)) <> "" then Text.Trim(Text.From(_)) & ", " else "", type text}}),
ReplaceBlankToNull = Table.TransformColumns(AddedSuburbSuffix,
List.Transform(Table.ColumnNames(AddedSuburbSuffix), (colName)=>
{ colName, each if Text.Trim(_) = "" then null else Text.Trim(_), type text } ) ),
ColNamesToFill = List.Buffer(List.Select(Table.ColumnNames(ReplaceBlankToNull), each not List.Contains({"Employee", "Modification Date"}, _))),
StepBack = ReplaceBlankToNull,
FilledDown = Table.FillDown(StepBack, ColNamesToFill),
FilledUp = Table.FillUp(FilledDown, ColNamesToFill),
#"Grouped Rows" = Table.Group(FilledUp, {"Employee", "Modification Date"}, {{"Address", each Text.Combine(Record.ToList(Table.Last(Table.SelectColumns(_, ColNamesToFill))), " "), type text}})
in
#"Grouped Rows"
Import your csv. Rename it to MyData. Then create blank query. Open Advanced Editor and replace whole code with the one I created. Close Advanced editor and edit 2nd step called YourSource.
Now it is = Source. Change it to = MyData
Hi,
Edit 2nd step YourSource = Source and refer your data
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcvUNMDBU0lEy1PVKzNM1MgYyg3MzSzIUgkuKUlNLgFwFJKwUq4NVC0jONzEzryS/PI9I5SDsmlhcElySWJJKpHoQNjUwMMBU6paahFDqkZmegd8lvolFEOVOicnZRHnVsQCqwy+1nDQNIDmnzHT8DkJRDcLhqXiDBkM9CJvBgiYWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Employee = _t, #"Modification Date" = _t, Address = _t, Suburb = _t, State = _t, Zip = _t, Column1 = _t]),
YourSource = Source,
AddedSuburbSuffix = Table.TransformColumns(YourSource, {{"Suburb", each if Text.Trim(Text.From(_)) <> "" then Text.Trim(Text.From(_)) & ", " else "", type text}}),
ReplaceBlankToNull = Table.TransformColumns(AddedSuburbSuffix,
List.Transform(Table.ColumnNames(AddedSuburbSuffix), (colName)=>
{ colName, each if Text.Trim(_) = "" then null else Text.Trim(_), type text } ) ),
ColNamesToFill = List.Buffer(List.Select(Table.ColumnNames(ReplaceBlankToNull), each not List.Contains({"Employee", "Modification Date"}, _))),
StepBack = ReplaceBlankToNull,
FilledDown = Table.FillDown(StepBack, ColNamesToFill),
FilledUp = Table.FillUp(FilledDown, ColNamesToFill),
#"Grouped Rows" = Table.Group(FilledUp, {"Employee", "Modification Date"}, {{"Address", each Text.Combine(Record.ToList(Table.Last(Table.SelectColumns(_, ColNamesToFill))), " "), type text}})
in
#"Grouped Rows"
Hi @dufoq3 and thanks for offering a solution.
I'm using a CSV file and I'm not sure if I'm copying the Source info correctly.
I get an error with the "let _t " part of the first line in your code above but I don't know Advanced Editor well enough to understand the error.
Can you use a CSV so I can understand how the Source info looks?
Thanks
Import your csv. Rename it to MyData. Then create blank query. Open Advanced Editor and replace whole code with the one I created. Close Advanced editor and edit 2nd step called YourSource.
Now it is = Source. Change it to = MyData
Hi @dufoq3 That was terrific and such a clever way to solve this problem.
Now I understand this solution, I have a lot of scenarios where my data is downloaded in this "staggered way". Thanks so much for taking time to look at this.
Cheerio 👍🏽
Thanks - I will give this a try
Hi,
Unpivot, Pivot, FillDown into Group, Expand and CombineColumns
let
Source = Prev_Step,
Unpivot = Table.UnpivotOtherColumns(Source, {"Employee", "Modification Date"}, "Attribute", "Value"),
Pivot = Table.Pivot(Unpivot, List.Distinct(Unpivot[Attribute]), "Attribute", "Value"),
Group = Table.Group(Pivot, {"Employee"}, {{"Data", each Table.FillDown(_,{"Address", "Suburb", "State", "Zip"})}}),
Expand = Table.ExpandTableColumn(Group, "Data", {"Modification Date", "Address", "Suburb", "State", "Zip"}, {"Modification Date", "Address", "Suburb", "State", "Zip"}),
Result = Table.CombineColumns(Table.TransformColumnTypes(Expand, {{"Zip", type text}}),{"Address", "Suburb", "State", "Zip"},Combiner.CombineTextByDelimiter(",", QuoteStyle.None),"Result")
in
Result
Stéphane
Thanks @slorin - I appreciate your suggestion.
I have abandoned finding a solution for this issue. I will tackle the issue differently. Thanks
Hi @Linnil ,
I think the quickest/simplest way to achieve this would be to group in Power Query.
Multi-select (Ctrl+click) your [Employee] and [Modification Date] columns, then right-click on one of the selected column titles and go to 'Group By...'.
Once in the dialog, add all the other columns as aggregates using the MAX operator.
Pete
Proud to be a Datanaut!
Hi @BA_Pete - thanks for having a go.
That still gave me data on individual lines / not combined.
I have thought of another way to tackle this problem not using this kind of technique. Thanks for your help.
No worries, glad you got it sorted.
Any chance you could share your solution to help future readers?
Pete
Proud to be a Datanaut!
Hi - I have abandoned find a solution for this issue. I will tackle the issue differently. Thanks