Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi,
I have a column [Description] where in each row string of text may contain few order numbers (between 0 and 5 order numbers, and they start with “O-“) that I have to extract. I found starting positions of these orders and have them as a list in each row:
=Table.AddColumn(#"Added Custom", "PositionList", each Text.PositionOf([Description],"O-",Occurrence.All))
Order numbers are random alphanumeric combinations and have fixed length of 7. I need help with extraction of order numbers out of text – Text.Middle would work if I can do a loop and pass the list of starting positions as an argument. I am aiming to replace in my list starting positions with order numbers themselves as later I will need to remove duplicates (with List.Distict) from the output list.
I am certain i am making it more complicated than it needs to be and there is more streamlined way of achieving that☹, but i can't figure it out.
I am grateful for any help and advice.
Vlad
Solved! Go to Solution.
@Vladisam there you go
let
Source = Web.BrowserContents("https://community.powerbi.com/t5/Power-Query/Extraction-of-multiple-7-characters-text-substrings-starting/m-p/2117172#M62409"),
#"Extracted Table From Html" = Html.Table(Source, {{"Column1", "DIV[id='bodyDisplay_8'] > DIV.lia-message-body-content > TABLE:nth-child(3) > * > TR > :nth-child(1)"}}, [RowSelector="DIV[id='bodyDisplay_8'] > DIV.lia-message-body-content > TABLE:nth-child(3) > * > TR"]),
#"Changed Type" = Table.TransformColumnTypes(#"Extracted Table From Html",{{"Column1", type text}}),
fx = let fx =(input)=>
Web.Page(
"<script>
var x='"&input&"';
var b = x.match(/O[-][0-9A-Za-z]{5}/gm);
document.write(b);
</script>"){0}[Data]{0}[Children]{1}[Children]
in
fx,
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each fx([Column1])),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Text"}, {"Text"}),
#"Added Custom1" = Table.AddColumn(#"Expanded Custom", "Custom", each List.Distinct(Text.Split([Text],","))),
#"Extracted Values" = Table.TransformColumns(#"Added Custom1", {"Custom", each Text.Combine(List.Transform(_, Text.From), ","), type text}),
#"Removed Columns" = Table.RemoveColumns(#"Extracted Values",{"Text"})
in
#"Removed Columns"
Ok check it out:
=Table.AddColumn("Orders", each List.Select(Text.Split([Column1], " "), each Text.StartsWith(_, "OI-")))
-- Nate
@Vladisam please take a look https://community.powerbi.com/t5/Community-Blog/Using-JavaScript-in-power-query-for-regex-Part2/ba-p...
Thank you @smpa01 !!
I think option to use JS natively in Power Query unlocks tons of opportunities!
@Vladisam yes, it does and I will make few more posts to show what more can be done using js in PQ which are otherwise not doable (probably) without using R or Pythonon currently.
Yes, this is what I am looking for - and it should be only unique values. But let me create sample data as well.
(Very cool Sales Calendar).
Sorry got confused with R - it's from Pat's post (above) regarding regex.
Dupe- deleted:
I got away with List functions 🙂 - see reply above.
JS is something completely new for me as business user (but a lot of good stuff there, have to givve it a try at some point). Always nice to have alternative paths to solution. Thanks.
I would try a different approach. If the Order Numbers in the Description column are separated by commas, I would first use the split columns GUI function to split by the comma. If you don't have them separated by comma (or something else, I would use the Replace function to replace "OI" in the Description column with ",OI", and THEN split by comma.
Then, whatever the name of your list column is (let's say it's called "Split"), add your custom column:
Table.AddColumn(PriorStepName, "Orders", each List.Select(List.Distinct([Split]), Text.StartsWith(_, "OI)))
--Nate
Hi Nate,
Order numbers are embedded into string of text (300-400 characters) without commas. Not sure how many of them can be included (data is new to me, I came across 5 orders so far - and these are actually only 2 unique orders twice + error 🙂 ), so it makes it difficult to built robust solution. I managed to make it work with nested List functions (see response to Eyelyn9), thus avoiding to replicate logic on multiple split columns.
List functions rock :).
How about providing a representative sample of your data and expected output.
I should've done that (see my reply to Eyelyn9) but it's customer's data - needs too much obfuscation. I made it work with List functions (also in my reply to Eyelyn9). Will try watkinnc' approach shortly just to have alternative.
OK, I came up with a similar solution as I see in some of the other answers:
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom",
each List.Distinct(
List.Select(
Text.Split([Column1]," "),
each Text.StartsWith(_,"O-") and Text.Length(_)=7)))
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.