Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
electrobrit
Post Patron
Post Patron

split text if starts with "M" with all caps (3-6 characters)

I have a column with data I need to extract certain text. (sample data is attached)

If the "application" column starts with capital M (and 3-6 characters, mostly 6 but sometimes 3). I need this to go in a seperate column if it's there and if not, be left blank.

I am also struggling with those which have 2 applications, would end up with each of these on a seperate line and all the other data on that oroginal row just duplication on the 2nd one. 

electrobrit_0-1599668163932.png

 

This ultimately is an important filter column so trying to extract the data.
It's stumped me for a couple days now. Help is appreciated! 

Sample data 

1 ACCEPTED SOLUTION

Get rid of everything before the 'if' and remove ", type text)" .  That is Advanced Editor code, so it includes a name for the current step, a reference to the previous step , the action requested and finally the code to perform it.

 

I realise this will be challenging for you but if you want to learn to use Power Query, learning the difference between 'point-and-click' features, M language and debugging in the advanced editor will be helpful skills. 

View solution in original post

14 REPLIES 14
electrobrit
Post Patron
Post Patron

Although this worked perfectly with DAX, I'm finding I need to do this in power query because I need to use that column to merge 2 tables. Any thoughts?

@Greg_Deckler 

@Gregory_Felton I could not decifier this, it doesn't look like a custom column. 

@ImkeF @edhans @HotChilli 


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

I copied Greg's algorithm. Add these in the Advanced Editor and swap #"Removed Columns" for your previous step 

 

#"Inserted Text Before Delimiter" = Table.AddColumn(#"Removed Columns", "Text Before Delimiter", each Text.BeforeDelimiter([Application], " "), type text),
    #"Replaced Value" = Table.ReplaceValue(#"Inserted Text Before Delimiter","-","",Replacer.ReplaceText,{"Text Before Delimiter"}),
    #"Added Custom" = Table.AddColumn(#"Replaced Value", "Custom", each if Text.StartsWith([Text Before Delimiter], "M") then [Text Before Delimiter] else "", type text)
in
    #"Added Custom"

 

@HotChilli  thanks for looking and trying to offer a solution.

It didn't work in advanced editor (there are a lot of steps before and I kept getting errors such as "_ already referenced, did you miss a module?". So I tried to move the steps up to see if I could avoid the error, still error.
 I changed numbering of the steps it seemed to work but the custom column produced an error.

I finally tried to do it in steps (insert text before delimiter (space), replace value, then added custom) but the custom column says "False". 
I spent a long time trying to figure it out but to no avail.

 

error in custom column.png

Get rid of everything before the 'if' and remove ", type text)" .  That is Advanced Editor code, so it includes a name for the current step, a reference to the previous step , the action requested and finally the code to perform it.

 

I realise this will be challenging for you but if you want to learn to use Power Query, learning the difference between 'point-and-click' features, M language and debugging in the advanced editor will be helpful skills. 

it worked! Appreciate your help. Yes, this is an area I need to learn better.

thanks @Greg_Deckler and @HotChilli both Dax and Power Query solutions worked great and I have them in my toolbox now. 
So helpful to see how each work an does the same thing for different requirements in the report.

Your best bet is to use R or Python since they have regex capabilities. Otherwise you could use a pattern like this:

 

let
    Source = #table({"A"},{{"ONEa-#\Ee-/Ne"}}),
    GetFirstSixChars = Table.TransformColumns(Source, {{"A", each Text.Start(_, 6), type text}}),
    Result = (Table.SplitColumn(GetFirstSixChars, "A", Splitter.SplitTextByCharacterTransition({"A".."Z"}, {"-"," ", "/", "a".."z"}), {"A"}))[A]{0}
in
    Result

Greg_Deckler
Super User
Super User

@electrobrit - That's pretty ugly data. Probably could get there with enough FIND, LEFT, etc. But, you may be the 2nd candidate for my fuzzy matching algorithm. If you have a list of possible values, could work very well for you.

https://community.powerbi.com/t5/Quick-Measures-Gallery/Fuzzy/td-p/1352914

 

Otherwise, can you share sample data as text so that I can experiment?


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

@Greg_Deckler 

I did share a link in original post to the sample data.
Here it is in the body if that helps. It is ugly huh? thanks for looking

I guess if I could get the "M" application in the next column, for those with 2, I might can figure something else out.

 

 

ApplicationNew Application (3-6 char if begins with M)FINAL APP (if 2 in orig App column, 2 separate lines)  
DSAC - Web application is vulnerable to Clickjacking    
Enable Custom Attributes on the InspectionAttribute and InspectionScriptItem services    
Further Rostering Enhancements to predict Individual rosters when starting mid period    
MSB7TR- Change to Rate Change EventingMSB7TRMSB7TR  
MSE203- Review Supplier - Tax Details Tab - ATO fieldsMSE203MSE203  
MSEWCM - Add Custom Attributes to Warranty ClaimsMSEWCMMSEWCM  
MSO Performance InvestigationMSOMSO  
MSO010 - US Table File - Associated Value 03 - Ready for Digital Enterprise SchedulingMSO010 MSO010   
MSO14T/MSO14B - Rotation Requisitions - allow change of rotationMSO14T/MSO14BMSO14T  
  MSO14Ball other columns in this row will duplicate from other  row
Fixed Scheduling flag is CHAR(1) instead of Boolean on WSDL    
GIS Coordinates saved with 3 decimal places when manually entered    
MSB603 - Output Report MSR603 errorMSB603MSB603  
MSE17D - Freight Costing for Category Costed ItemsMSE17DMSE17D  
MSE603 - Include Existing List ErrorMSE603MSE603  
MSE623 - Unable to open cross-district Work OrderMSE623MSE623  
MSE660- Project & MSE66B Project Estimate - Unable to Maintain MSE660 MSE66B MSE660  
  MSE66B all other columns in this row will duplicate from other  row
MSE891 - Rate Override IssuesMSE891MSE891  
MSEAPM - Email Approval link does not redirect to the respective requestMSEAPMMSEAPM  

@Greg_Deckler

I do have a list of all possibles values I can add to pbix file as a table. How would I match those and return the value? Also, would I do that in Power Query or with Dax?

@electrobrit I will try to mock up your data in a PBIX. The fuzzy matching stuff is DAX - https://community.powerbi.com/t5/Quick-Measures-Gallery/Fuzzy/td-p/1352914

 

But! You also have fuzzy matching in Power Query when you do a Merge query. You could check that out and maybe it helps. We couldn't use it in our case, hence the DAX that gave us more control. Do a merge query on your two tables and look at the bottom of the merge dialog for fuzzy matching


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

@Greg_Deckler 
THANK YOU- will try that and see what happens.

@electrobrit - Turns out, you probably don't need fuzzy matching after all. 😞  See attached PBIX below sig, Table (11)

Column = 
    VAR __Left = LEFT([Application],SEARCH(" ",[Application],,LEN([Application])))
    VAR __Sub = SUBSTITUTE(__Left,"-","")
RETURN
    IF(FIND("M",[Application],,0)=1,__Sub,BLANK())

 


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

@Greg_Deckler  thank you, worked great. Will figure out a solution for when 2 show up in the cell. This is not a big issue though, the other was a problem. Thanks again.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.