Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
I have a number of data fields that I am trying to cleanse and am trying to find the most efficient way to go about this. Please see below sample of data:
Machine Names: |
Acritech Hole Popper ED-2000M |
Acritech Hole Popper ED-2000M2 |
Acritech Hole Popper ED-2000MR |
Acritech Hole Popper ED-2500M |
Dener Press Brake Elite-XL-175-3.6 |
Dener Press Brake Elite-XL-320-4 |
Gantry Laser 25LMX |
Gantry Laser 25LMXII |
Gantry Laser 30LMXIII |
Gantry Laser 35LMXRII |
Gantry Laser 40LMXVII |
Ingersoll Sinker EDM Eagle 500 |
Ingersoll Sinker EDM Gantry 1500HSJ |
Ingersoll Sinker EDM Gantry 500 |
Ingersoll Sinker EDM Gantry 500HSJ |
Ingersoll Sinker EDM Gantry 800 |
Goal: To be able to extract a specific portion of the machine name (EX: Acritech Hole Popper / Dener Press Brake / Gantry Laser / Ingersoll Sinker EDM) from a list simular to the sample data above.
Problem: I am struggling on finding the most efficient way of extracting a certain portion of the machine name because there are hundreds of names like this and not all of the companies have the same number of characters or any delimiters that I can extract from.
How would you go about this? Calculated column with hundreds of IF statements?
Any guidance that you can share will be greatly appreciated!
Thank you datanauts!
Solved! Go to Solution.
@Anonymous,
You can solve it in below way:
1. As because there is no common logic between rows then you need to create a dictionary by yourself. So, create a table called 'Dictionary' which will store all correct machine names as below:
(example - link)
2. Create a new blank query and call it fnFuzzyLookup. Then paste below Power Query code there:
(Input as text) => let ListCount = List.Count(Dictionary[Brand]), Result = List.Generate( () => [ i = 0, t = Dictionary[Brand]{i}, k = if Text.Middle(Input,0,Text.Length(t)) = t then 1 else 0 ], each (if ([i] < ListCount and [k] < 2) then true else false), each [ i = [i] + 1, t = Dictionary[Brand]{i}, k = if [k] = 1 then 2 else (if (Text.Middle(Input,0,Text.Length(t)) = t) then 1 else 0) ], each (if [k] = 0 then "" else [t]) ) in List.Last(Result)
3. Go to the main table (query) and then Add column -> Invoke custom function:
Regards,
Ruslan
-------------------------------------------------------------------
Did I answer your question? Mark my post as a solution!
@Anonymous,
You can solve it in below way:
1. As because there is no common logic between rows then you need to create a dictionary by yourself. So, create a table called 'Dictionary' which will store all correct machine names as below:
(example - link)
2. Create a new blank query and call it fnFuzzyLookup. Then paste below Power Query code there:
(Input as text) => let ListCount = List.Count(Dictionary[Brand]), Result = List.Generate( () => [ i = 0, t = Dictionary[Brand]{i}, k = if Text.Middle(Input,0,Text.Length(t)) = t then 1 else 0 ], each (if ([i] < ListCount and [k] < 2) then true else false), each [ i = [i] + 1, t = Dictionary[Brand]{i}, k = if [k] = 1 then 2 else (if (Text.Middle(Input,0,Text.Length(t)) = t) then 1 else 0) ], each (if [k] = 0 then "" else [t]) ) in List.Last(Result)
3. Go to the main table (query) and then Add column -> Invoke custom function:
Regards,
Ruslan
-------------------------------------------------------------------
Did I answer your question? Mark my post as a solution!
@zoloturu absolutely stunning! There was 100% chance that I could not have come up with this on my own! So much gratitude! Thank you!
Question for you, is there a way for me to add a further modification to this Power Query? Everything worked out great but I have a few fields that are null/blank and I have custom names that I would like to fill in.
Ex: Any descr named "Gantry Laser" be updated to = "Tanaka Gantry Laser"? Please see screenshot below.
My hope is to fill the blank/null values with names that I will have the option to custom.
Any guidance will be greatly appreciated!
@Anonymous,
If I understood you correctly then below approach can help you:
1. Go to the table where you plan to add a column.
2. Menu -> Add column -> Custom column
3. Enter a condition similar to below
if Text.Middle([NewColumn],0,12) = "Gantry Laser" then "Tanaka " & [NewColumn] else [NewColumn]
(basically, check if the left side is 'Gantry Laser' then add 'Tanaka' to it or you can use a function Text.Replace)
Text.Middle function reference - https://docs.microsoft.com/en-us/powerquery-m/text-middle
Text.Replace - https://docs.microsoft.com/en-us/powerquery-m/text-replace
Regards,
Ruslan
-------------------------------------------------------------------
Did I answer your question? Mark my post as a solution!
So much gratitude to you! THANK YOU!
@Anonymous,
You are welcome.
Regards,
Ruslan
If there is no repeatable rule, then there is no getting around the fact that you will need to Interveine manually at some point. I suggest, create a list of all the unique entries in Excel. Add a column with the result you want. Load the table to power query and merge it on the description. Then extract the new column you need. You can also create an audit query that shows which items are missing values.
As @MattAllington said.
These steps might help you preparing that list, as they remove all strings with numbers in them:
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("jY/BCoMwEER/ZfHclBi17bVFqRYFUSiCeAgSrBiiJF76941iT9XG6743s0xZWtdatiOrXxD2nEHaDwOTEPiIYIwTqzoYDGJWsv+K9/3jM6FPqWRKwU3SjkHAdQQVMbLPHnKOJ5PmEIzcWbpTMco3xFRpl3hxUmyco+gXOHgGa2SKZGvEnTLPhUSiYVL1nEPeim7emUBAGz1cr91WlkZbS2H+MHp7unZWXaaq6gM=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [#"Machine Names:" = _t]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"Machine Names:", type text}}), #"Added Custom" = Table.AddColumn(#"Changed Type", "TextToList", each Text.Split([#"Machine Names:"], " ")), #"Added Custom1" = Table.AddColumn(#"Added Custom", "StringToListOfCharacters", each List.Transform([TextToList], (x) => Text.ToList(x))), #"Added Custom2" = Table.AddColumn(#"Added Custom1", "SelectStringsWithoutNumbers", each List.Select([StringToListOfCharacters], (x) => not List.ContainsAny(x, {"0".."9"}))), #"Added Custom3" = Table.AddColumn(#"Added Custom2", "Reassemble", each Text.Combine(List.Transform([SelectStringsWithoutNumbers], (x) => Text.Combine(x, "")), " ")), #"Removed Other Columns" = Table.SelectColumns(#"Added Custom3",{"Machine Names:", "Reassemble"}) in #"Removed Other Columns"
and you'll get a table like this:
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
@Anonymous
Seems we can also simply split by space/delimiter ...by splitting at the right most Space/Delimiter
Indeed, that's much easier for the sample given.
... But will only work if there is just one string that contains numbers 😉
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
111 | |
99 | |
80 | |
64 | |
57 |
User | Count |
---|---|
145 | |
111 | |
92 | |
84 | |
66 |