Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
I_Like_Pi
Resolver II
Resolver II

Data Cleanse: remove text values from a field if they contain a symbol.

I am trying to clean up some data in an extract that was provided to me and will continue to be provided in the current form.

One of the columns is a list of text, I would like to get the desired result i.e.;

inputdesired result
abc abc jkl edr edr.gytabc jkl edr
cba wer wer dca.xyz mnocba wer dca mno

 

Essentially I need to remove the characters after the period (if they exist) and then dedup the list.

Or at least that is how my head is approching it.

I started with;

Text.Combine(List.Distinct(Text.Split([input], " ")), " ")

which returns

abc jkl edr edr.gyt

cba wer dca.xyz mno

 

So I have nothing for stripping the .gyt or .xyz prior to the distinct.

I am happy to do this in multiple passes/columns.

For the example above I used 3 character blocks and a period but in reality they are of variable length and there are a couple other symbols.

Does anyone know how some ideas to strip the text after the dot?

Thanks in advance for any takers.

 

1 ACCEPTED SOLUTION
Vijay_A_Verma
Super User
Super User

See the working here - Open a blank query - Home - Advanced Editor - Remove everything from there and paste the below code to test

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSkxKVgDhrOwchdSUIhDWS68sUYrViVZKTkpUKE8tAuOU5ES9isoqhdy8fKXYWAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [input = _t]),
    #"Added Custom" = Table.AddColumn(Source, "Result", each List.Distinct([l=Text.Split([input]," "), r=List.Transform(l,(x)=>Text.Split(x,"."){0})][r])),
    #"Extracted Values" = Table.TransformColumns(#"Added Custom", {"Result", each Text.Combine(List.Transform(_, Text.From), " "), type text})
in
    #"Extracted Values"

 

View solution in original post

4 REPLIES 4
AlexisOlson
Super User
Super User

You can also use Text.BeforeDelimiter.

 

Table.AddColumn(
    #"prev step",
    "desired result",
    each Text.Combine(
             List.Distinct(
                 List.Transform(
                     Text.Split([input], " "),
                     each Text.BeforeDelimiter(_, ".")
                 )
             ),
             " "
         )
)
I_Like_Pi
Resolver II
Resolver II

WOW!! Just realized how dumb that last post was. I thought that whole Let _t was the secret sauce required before the added column. I just realized the added column was all the sauce needed. 

List.Distinct([l=Text.Split([input]," "), r=List.Transform(l,(x)=>Text.Split(x,"."){0})][r])),

I get the next line combined the list but for me this is where the magic happened.

 

Thanks so much!!!

I_Like_Pi
Resolver II
Resolver II

Firstly thanks,

I am trying but i'm struggling to implement that. Your example initiates the let _t from from within

Table.FromRows( "rows from your json import", "column to add in your let _t form").

Do I need to itemize down my table to rows to come back to this form? My table is generated from an import of multiple XLM files so it starts from a folder then translates the binary to an xml table

....  each Xml.Tables(File.Contents([Folder Path]&[Name]))

To which I expand and drop a lot of data.

I dont really have a row by row import.

 

I will keep trying to smash the hammer regardless, hopefully I can find the other end of this potential rabbit hole.

Vijay_A_Verma
Super User
Super User

See the working here - Open a blank query - Home - Advanced Editor - Remove everything from there and paste the below code to test

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSkxKVgDhrOwchdSUIhDWS68sUYrViVZKTkpUKE8tAuOU5ES9isoqhdy8fKXYWAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [input = _t]),
    #"Added Custom" = Table.AddColumn(Source, "Result", each List.Distinct([l=Text.Split([input]," "), r=List.Transform(l,(x)=>Text.Split(x,"."){0})][r])),
    #"Extracted Values" = Table.TransformColumns(#"Added Custom", {"Result", each Text.Combine(List.Transform(_, Text.From), " "), type text})
in
    #"Extracted Values"

 

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors