Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hello,
I have machine data here which sometimes writes errors and forget to write the name of the creator of an order and only sometimes rewrite it correctly with the right content:
OrdeID | Starttime | CreatedBy | ||
1 | 2021-01-01 | Worker1 | ||
2 | 2021-01-02 | Worker2 | ||
3 | 2021-01-03 | Worker3 | ||
4 | 2021-01-04 | Worker4 | ||
1 | 2021-01-01 | |||
2 | 2021-01-02 | |||
3 | 2021-01-03 | |||
4 | 2021-01-04 | |||
11 | 2021-01-05 | |||
12 | 2021-01-06 | |||
13 | 2021-01-07 | |||
14 | 2021-01-08 | |||
In that case it creates a duplicate, but I only want to keep the ones which has "created by" data. The problem is I also need to consider other cells which are empty in "created by" and has no correct duplicates with data. these I also need to keep. In my sample table I only want to get rid of the ones with red orderID.
Hope someone has an idea how to get rid of the ones without deleting the other data.
Thank you very much in advance.
Best.
Solved! Go to Solution.
The way I understood the problem as you want to
keep all rows "OrdeID Starttime CreatedBy",
with group by "OrdeID Starttime" and keep the first created by without blanks as first preference. If this is true, you can do this.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTIyMDLUNQAhICc8vyg7tchQKVYnWskIWdIILmkEljRGljSGSxqDJU2QJU3gkiZgSQw7FbDbpoDdHgXsNkCEDVEMN0WIo5huhhBHMd4cIY5ivgVEPBYA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [OrdeID = _t, Starttime = _t, CreatedBy = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"OrdeID", Int64.Type}, {"Starttime", type date}, {"CreatedBy", type text}}),
#"Sorted Rows" = Table.Sort(#"Changed Type",{{"OrdeID", Order.Ascending}, {"Starttime", Order.Ascending}, {"CreatedBy", Order.Descending}}),
#"Grouped Rows" = Table.Group(#"Sorted Rows", {"OrdeID", "Starttime"}, {{"Alldatarows", each _, type table [OrdeID=nullable number, Starttime=nullable date, CreatedBy=nullable text]}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.AddIndexColumn([Alldatarows], "Sub Index", 1, 1)),
#"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Custom"}),
#"Expanded Custom" = Table.ExpandTableColumn(#"Removed Other Columns", "Custom", {"OrdeID", "Starttime", "CreatedBy", "Sub Index"}, {"Custom.OrdeID", "Custom.Starttime", "Custom.CreatedBy", "Custom.Sub Index"}),
#"Filtered Rows" = Table.SelectRows(#"Expanded Custom", each [Custom.Sub Index] = 1)
in
#"Filtered Rows"
Basically, I did sorting of data, tried creating subgroup index and removed subgroup index > 1
If this is what you need, rename the columns, and remove the sub index column.
Hope this helps!
The way I understood the problem as you want to
keep all rows "OrdeID Starttime CreatedBy",
with group by "OrdeID Starttime" and keep the first created by without blanks as first preference. If this is true, you can do this.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTIyMDLUNQAhICc8vyg7tchQKVYnWskIWdIILmkEljRGljSGSxqDJU2QJU3gkiZgSQw7FbDbpoDdHgXsNkCEDVEMN0WIo5huhhBHMd4cIY5ivgVEPBYA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [OrdeID = _t, Starttime = _t, CreatedBy = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"OrdeID", Int64.Type}, {"Starttime", type date}, {"CreatedBy", type text}}),
#"Sorted Rows" = Table.Sort(#"Changed Type",{{"OrdeID", Order.Ascending}, {"Starttime", Order.Ascending}, {"CreatedBy", Order.Descending}}),
#"Grouped Rows" = Table.Group(#"Sorted Rows", {"OrdeID", "Starttime"}, {{"Alldatarows", each _, type table [OrdeID=nullable number, Starttime=nullable date, CreatedBy=nullable text]}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.AddIndexColumn([Alldatarows], "Sub Index", 1, 1)),
#"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Custom"}),
#"Expanded Custom" = Table.ExpandTableColumn(#"Removed Other Columns", "Custom", {"OrdeID", "Starttime", "CreatedBy", "Sub Index"}, {"Custom.OrdeID", "Custom.Starttime", "Custom.CreatedBy", "Custom.Sub Index"}),
#"Filtered Rows" = Table.SelectRows(#"Expanded Custom", each [Custom.Sub Index] = 1)
in
#"Filtered Rows"
Basically, I did sorting of data, tried creating subgroup index and removed subgroup index > 1
If this is what you need, rename the columns, and remove the sub index column.
Hope this helps!
Hey @Applicable88 ,
Per chance you want to keep all the records but be able to create a filter to achieve your result you can try this:
Column 2 =
VAR a =
CALCULATE (
COUNT ( 'Table'[OrdeID] ),
FILTER ( 'Table', 'Table'[OrdeID] = EARLIER ( 'Table'[OrdeID] ) )
)
VAR b =
CALCULATE (
COUNT ( 'Table'[OrdeID] ),
ALLEXCEPT ( 'Table', 'Table'[CreatedBy] )
)
RETURN
IF ( a = 1, "YES", IF ( b = 1, "YES", "NO" ) )
You'll end up with a column where you'll want to filter on YES.
Proud to be a Super User!
Open Power Query. Select only the OrdeID column and go to Remove Rows and select Remove Duplicates. That should do it. I replicated your case, see result below:
Hello @Anonymous,
thanks for your effort, but I saw you use a step called "remove bottom rows" which implies it only works if the one needed to be removed always comes second, which is unfortunately not in my case. It can be mixed. Sometimes comes first sometimes second etc. Any possiblites to overcome thes irregularities?
Best.
That step was just because I created the table manually and the last row was full of nulls. This would not happen on your data source 🙂
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.