Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!
Hi,
I have an interesting scenario where I'd like to filter my main data table...
Row | Customer | Consignor | Supplier | Region | SubRegion | Count | Value |
1 | A | ZZ | 20 | 11.3 | |||
2 | A | 22 | 12.43 | ||||
3 | A | ZZ | BB | 1 | 2 | 36 | 20.34 |
4 | C | BB | 32 | 18.08 | |||
5 | C | 58 | 32.77 | ||||
6 | D | BB | 1 | 96 | 54.24 | ||
7 | D | 69 | 38.985 | ||||
8 | B | DD | 465 | 262.725 | |||
9 | B | AA | 3 | 4 | 51 | 28.815 | |
10 | B | DD | 5 | 5 | 55 | 31.075 | |
11 | E | AA | 66 | 37.29 | |||
12 | E | AA | 12 | 6.78 | |||
13 | E | AA | 6 | 3.39 |
By certain rules that reside in a "rules" table...
Company | Field | Check Field | Required |
A | Customer | Consignor | 1 |
BB | Supplier | Region | 1 |
BB | Supplier | SubRegion | 1 |
C | Customer | Supplier | 1 |
D | Customer | Supplier | 1 |
The "Company" column in the rules table doesn't map directly to a column in the data table. The "Field" column in the rules table defines which column it matches with in the data table. E.g. Company A, C and D are Customer's in the data table but company BB is a Supplier in the data table.
The "Check Field" column defines the column that the rule relates to in the data table. All of the rules are designed to enforce a value if defined, i.e. they can't be null or blank if defined in the rules table.
So, for the first rule, Company A who is a Customer in the data table must always have a Consignor entered.
The next rules, Company BB who is a Supplier in the data table must always have Region and SubRegion entered.
Following that logic, the end result I am looking for in this example is, only rows 2, 4, 5, 6 and 7 visible.
Reason for this (in case it is not clear):
Row 2 doesn't have a Consignor entered for Customer A.
Row 4 doesn't have a Region or SubRegion entered for Supplier BB so would fail on two rules.
Row 5 doesn't have a Supplier entered for Customer C.
Row 6 doesn't have a SubRegion entered for Supplier BB.
Row 7 doesn't have a Supplier entered for Customer D.
I need the solution in M code.
I thought I'd be able to do something with...
= Table.AddColumn(#"Removed Other Columns", "Custom", each List.Distinct(Table.SelectRows(Rules, (x) => Text.Contains([Customer], x[Company]) and ... )[Required]))
but I don't know how to incorporate the "Field" and "Check Field" columns to this.
Any suggestions appreciated.
Thanks,
Kim
Have I solved your problem? Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;). |
If you found this post helpful, please give Kudos. It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen. If you find my signature vaguely amusing, please give Kudos. | Proud to be a Super User! |
Solved! Go to Solution.
Hi @KNP
if your rules table not excessively big - under 50-60 rows (even 100 should be fine, but a bit slower).
This is the main data table:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("dZBNDsQgCEav0rhuiIC/y3Y6l6jp/a8xgnZCmnTxkRieT7A1h251W8959rKYkO8FEdhda3M0seVJSQMJwsDY2vZdegL1cFIncFAw9NNnOpS7hazCAr4oFw1nE4uykLNiIj+sDuehSicGoPFsNpxNqqIrUEtUTuz3VIe9EJJMRKm/TAOtBt1kd/kDWS/q6gUKDhD9UxrvSGEEnycpN7/W+Z9T1uEMVAdIb6B2EuTxi8ivQvEBd931Aw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Row = _t, Customer = _t, Consignor = _t, Supplier = _t, Region = _t, SubRegion = _t, Count = _t, Value = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Row", Int64.Type}, {"Customer", type text}, {"Consignor", type text}, {"Supplier", type text}, {"Region", Int64.Type}, {"SubRegion", Int64.Type}, {"Count", Int64.Type}, {"Value", type number}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Filter", each fApplyRules(_)),
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Filter] = false))
in
#"Filtered Rows"
This is the fApplyRules function:
(pRow as record) =>
let
mRow = pRow,
Source = Rules,
// Filter Supplier/Customer
#"SC Filtered" = Table.SelectRows(Source, each ([Field] = "Supplier" and [Company] = mRow[Supplier]) or ([Field] = "Customer" and [Company] = mRow[Customer])),
Output = List.Accumulate(#"SC Filtered"[Check Field], true, (s, a)=> s and (Record.Field(mRow, a) <> null and Text.Trim(Text.From(Record.Field(mRow, a))) <> "" ))
in
Output
Kind regards,
JB
This solution attempts to resolve the performance issue for larger rules tables by joining on Customer and Supplier separately.
First the rules query
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUXIuLS7Jz00tAjHz84oz0/PyQWxDpVidaCUnJyAzuLSgICcTrCIoNT0zPw+ndHBpEpoKZ1QbkNRC5F3wyccCAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [#"Company" = _t, Field = _t, #"Check Field" = _t, Required = _t]),
#"Filtered Rows" = Table.SelectRows(Source, each ([Required] = "1")),
#"Grouped Rows" = Table.Group(#"Filtered Rows", {"Company", "Field"}, {{"CheckField", each List.Combine(List.Transform([Check Field], each {_})), type text}})
in
#"Grouped Rows"
Now the main query
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bZBdDoMwCIDv0mdDCvT3UecuYeP9r7ECnXWLCRDFr1/B1hy6xa09j6OXK8j3ggjszqU5GswPIU0kCIbwXbNt8k2gnpzUBxwUDP3tZQrFLFhtBXxRKE7oiliUg5wVEet+86A9V+nHAGS35UldkaqICtQSlRHvmGSfbEgyBKV+IRlXJ7fKqrKybBN10wIFjUP/J4zflMIIPg9QDr5vxjGfrMAZqBpFj5S2E2T7YcjPKjEBd9H5AQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Row = _t, Customer = _t, Consignor = _t, Supplier = _t, Region = _t, SubRegion = _t, Count = _t, Value = _t]),
#"Replaced Value" = Table.ReplaceValue(Source,"",null,Replacer.ReplaceValue,{"Row", "Customer", "Consignor", "Supplier", "Region", "SubRegion", "Count", "Value"}),
#"Merged Queries" = Table.NestedJoin(#"Replaced Value", {"Customer"}, Rules, {"Company"}, "Rules", JoinKind.Inner),
#"Expanded Rules" = Table.ExpandTableColumn(#"Merged Queries", "Rules", {"Field", "CheckField"}, {"Field", "CheckField"}),
#"Filtered Rows" = Table.SelectRows(#"Expanded Rules", each ([Field] = "Customer") and
not List.Accumulate([CheckField], true, (s,c) => s and Record.Field(_,c) <> null)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Field", "CheckField"}),
StartSupplier = #"Replaced Value",
#"Merged Queries1" = Table.NestedJoin(StartSupplier, {"Supplier"}, Rules, {"Company"}, "Rules", JoinKind.Inner),
#"Expanded Rules1" = Table.ExpandTableColumn(#"Merged Queries1", "Rules", {"Field", "CheckField"}, {"Field", "CheckField"}),
#"Filtered Rows 1" = Table.SelectRows(#"Expanded Rules1", each ([Field] = "Supplier") and
not List.Accumulate([CheckField], true, (s,c) => s and Record.Field(_,c) <> null)),
#"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows 1",{"Field", "CheckField"}),
Combine = Table.Distinct(#"Removed Columns" & #"Removed Columns1")
in
Combine
Hi @KNP
if your rules table not excessively big - under 50-60 rows (even 100 should be fine, but a bit slower).
This is the main data table:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("dZBNDsQgCEav0rhuiIC/y3Y6l6jp/a8xgnZCmnTxkRieT7A1h251W8959rKYkO8FEdhda3M0seVJSQMJwsDY2vZdegL1cFIncFAw9NNnOpS7hazCAr4oFw1nE4uykLNiIj+sDuehSicGoPFsNpxNqqIrUEtUTuz3VIe9EJJMRKm/TAOtBt1kd/kDWS/q6gUKDhD9UxrvSGEEnycpN7/W+Z9T1uEMVAdIb6B2EuTxi8ivQvEBd931Aw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Row = _t, Customer = _t, Consignor = _t, Supplier = _t, Region = _t, SubRegion = _t, Count = _t, Value = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Row", Int64.Type}, {"Customer", type text}, {"Consignor", type text}, {"Supplier", type text}, {"Region", Int64.Type}, {"SubRegion", Int64.Type}, {"Count", Int64.Type}, {"Value", type number}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Filter", each fApplyRules(_)),
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Filter] = false))
in
#"Filtered Rows"
This is the fApplyRules function:
(pRow as record) =>
let
mRow = pRow,
Source = Rules,
// Filter Supplier/Customer
#"SC Filtered" = Table.SelectRows(Source, each ([Field] = "Supplier" and [Company] = mRow[Supplier]) or ([Field] = "Customer" and [Company] = mRow[Customer])),
Output = List.Accumulate(#"SC Filtered"[Check Field], true, (s, a)=> s and (Record.Field(mRow, a) <> null and Text.Trim(Text.From(Record.Field(mRow, a))) <> "" ))
in
Output
Kind regards,
JB
Hi @Anonymous @Anonymous ,
Thank you both. I like both of those solutions for different reasons. My rules table is not likely to get very big. Probably 20 max. Although there might be a few new combinations of Field and Check Field that would require either adding to the function or repeating the merge steps depending on which solution I use.
JB - if you have time could you please explain a couple of things to help my understanding.
This line in the function, I don't fully understand what it is doing...
Output = List.Accumulate(#"SC Filtered"[Check Field], true, (s, a)=> s and (Record.Field(mRow, a) <> null and Text.Trim(Text.From(Record.Field(mRow, a))) <> "" ))
and this line in the main query (I think it is just my understanding of how the function is invoked with the '_')...
#"Added Custom" = Table.AddColumn(#"Changed Type", "Filter", each #"fApplyRules"(_)),
@Anonymous - likewise, if you wouldn't mind helping my understanding of a couple of things.
I understand this is grouping the Check Field by the Company and Field and creating a list of possible CheckFields...
#"Grouped Rows" = Table.Group(#"Filtered Rows", {"Company", "Field"}, {{"CheckField", each List.Combine(List.Transform([Check Field], each {_})), type text}})
...I guess I just need to understand better what this part of the main query is doing...
#"Filtered Rows" = Table.SelectRows(#"Expanded Rules", each ([Field] = "Customer") and not List.Accumulate([CheckField], true, (s,c) => s and Record.Field(_,c) <> null)),
Thanks,
Kim
Have I solved your problem? Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;). |
If you found this post helpful, please give Kudos. It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen. If you find my signature vaguely amusing, please give Kudos. | Proud to be a Super User! |
Hi Kim,
1. This is where filtering magic happens. As you may have more than one column "required" for each customer, this piece of code checks every condition set in the rules table: i.e. for your BB rule it consequently checks that both region and then subregion are not empty (trimming was added to eliminate " " cases). "and" logic makes the output value false if any of the sub-checks fail. If you add another rule of type:
BB | Supplier | Consignor | 1 |
this code will make 3 checks (region, subregion & consignor) and so forth.
For more detail and examples, you can search the List.Accumulate(). This is one of the most useful functions in M. There are several really good articles on the Internet.
2. Correct. The _ is used as a reference to the current "hidden" parameter passed to several Table & List functions and usually has the meaning of "current row/record" for table functions and "current item" for list functions. In this case, it passes a current row from the Main table into the filtering function in a form of "type record".
Kind regards,
JB
Thank you both.
The function will work perfectly for my needs.
Have I solved your problem? Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;). |
If you found this post helpful, please give Kudos. It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen. If you find my signature vaguely amusing, please give Kudos. | Proud to be a Super User! |