Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Vladisam
Helper II
Helper II

Extraction of multiple 7 characters text substrings starting with "O-" from string of text

Hi,

 

I have a column [Description] where in each row string of text may contain few order numbers (between 0 and 5 order numbers, and they start with “O-“) that I have to extract. I found starting positions of these orders and have them as a list in each row:

 

=Table.AddColumn(#"Added Custom", "PositionList", each Text.PositionOf([Description],"O-",Occurrence.All))

 

Order numbers are random alphanumeric combinations and have fixed length of 7. I need help with extraction of order numbers out of text – Text.Middle would work if I can do a loop and pass the list of starting positions as an argument. I am aiming to replace in my list starting positions with order numbers themselves as later I will need to remove duplicates (with List.Distict) from the output list.

I am certain i am making it more complicated than it needs to be and there is more streamlined way of achieving that☹, but i can't figure it out.
I am grateful for any help and advice.

Vlad

2 ACCEPTED SOLUTIONS

@Vladisam  there you go

 

 

let
    Source = Web.BrowserContents("https://community.powerbi.com/t5/Power-Query/Extraction-of-multiple-7-characters-text-substrings-starting/m-p/2117172#M62409"),
    #"Extracted Table From Html" = Html.Table(Source, {{"Column1", "DIV[id='bodyDisplay_8'] > DIV.lia-message-body-content > TABLE:nth-child(3) > * > TR > :nth-child(1)"}}, [RowSelector="DIV[id='bodyDisplay_8'] > DIV.lia-message-body-content > TABLE:nth-child(3) > * > TR"]),
    #"Changed Type" = Table.TransformColumnTypes(#"Extracted Table From Html",{{"Column1", type text}}),
    fx = let   fx =(input)=>
    Web.Page(
        "<script>
            var x='"&input&"';
            var b = x.match(/O[-][0-9A-Za-z]{5}/gm);
            document.write(b);
        </script>"){0}[Data]{0}[Children]{1}[Children]    
in
    fx,
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each fx([Column1])),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Text"}, {"Text"}),
    #"Added Custom1" = Table.AddColumn(#"Expanded Custom", "Custom", each List.Distinct(Text.Split([Text],","))),
    #"Extracted Values" = Table.TransformColumns(#"Added Custom1", {"Custom", each Text.Combine(List.Transform(_, Text.From), ","), type text}),
    #"Removed Columns" = Table.RemoveColumns(#"Extracted Values",{"Text"})
    
in
    #"Removed Columns"

 

 

smpa01_0-1633464502129.png

 

 

Did I answer your question? Mark my post as a solution!
Proud to be a Super User!
My custom visualization projects
Plotting Live Sound: Viz1
Beautiful News:Viz1, Viz2, Viz3
Visual Capitalist: Working Hrs

View solution in original post

watkinnc
Super User
Super User

Ok check it out:

 

=Table.AddColumn("Orders", each List.Select(Text.Split([Column1], " "), each Text.StartsWith(_, "OI-")))

 

-- Nate


I’m usually answering from my phone, which means the results are visualized only in my mind. You’ll need to use my answer to know that it works—but it will work!!

View solution in original post

31 REPLIES 31

@Vladisam  please take a look https://community.powerbi.com/t5/Community-Blog/Using-JavaScript-in-power-query-for-regex-Part2/ba-p...

 

Did I answer your question? Mark my post as a solution!
Proud to be a Super User!
My custom visualization projects
Plotting Live Sound: Viz1
Beautiful News:Viz1, Viz2, Viz3
Visual Capitalist: Working Hrs

Thank you @smpa01 !!

I think option to use JS natively in Power Query unlocks tons of opportunities!

@Vladisam  yes, it does and I will make few more posts to show what  more can be done using js in PQ which are otherwise not doable (probably) without using R or Pythonon currently.

Did I answer your question? Mark my post as a solution!
Proud to be a Super User!
My custom visualization projects
Plotting Live Sound: Viz1
Beautiful News:Viz1, Viz2, Viz3
Visual Capitalist: Working Hrs

Yes, this is what I am looking for - and it should be only unique values. But let me create sample data as well.

(Very cool Sales Calendar).

Sorry got confused with R - it's from Pat's post (above) regarding regex.

 

Dupe- deleted:

 
 
 
 

I got away with List functions 🙂 - see reply above.

JS is something completely new for me as business user (but a lot of good stuff there, have to givve it a try at some point). Always nice to have alternative paths to solution. Thanks.

watkinnc
Super User
Super User

I would try a different approach. If the Order Numbers in the Description column are separated by commas, I would first use the split columns GUI function to split by the comma. If you don't have them separated by comma (or something else, I would use the Replace function to replace "OI" in the Description column with ",OI", and THEN split by comma.

 

Then, whatever the name of your list column is (let's say it's called "Split"), add your custom column:

 

Table.AddColumn(PriorStepName, "Orders", each List.Select(List.Distinct([Split]),  Text.StartsWith(_, "OI)))

 

--Nate 


I’m usually answering from my phone, which means the results are visualized only in my mind. You’ll need to use my answer to know that it works—but it will work!!

Hi Nate,

Order numbers are embedded into string of text (300-400 characters) without commas. Not sure how many of them can be included (data is new to me, I came across 5 orders so far - and these are actually only 2 unique orders twice + error 🙂 ), so it makes it difficult to built robust solution. I managed to make it work with nested List functions (see response to Eyelyn9), thus avoiding to replicate logic on multiple split columns.
List functions rock :). 

ronrsnfld
Super User
Super User

How about providing a representative sample of your data and expected output.

I should've done that (see my reply to Eyelyn9) but it's customer's data - needs too much obfuscation. I made it work with List functions (also in my reply to Eyelyn9). Will try watkinnc' approach shortly just to have alternative.

OK, I came up with a similar solution as I see in some of the other answers:

 

 

    #"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", 
        each List.Distinct(
                List.Select(
                    Text.Split([Column1]," "), 
                        each Text.StartsWith(_,"O-") and Text.Length(_)=7)))

 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors