Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
mgp-opti
Frequent Visitor

Find and extract 7 digit numbers starting with a 1 from a column containing other numbers and text

Hi all,

 

Please could someone let me know how I can extract an 7 digit number that starts with a 1 from a column which contains other numbers (luckily of a different length or which don't start with a 1) and text. It would be great to separate this number and put it in to it's own column if that's possible.


Examples below. As you can see the 7 digit number appears in different places in the column.

1002223 - 7003117328 - ID: 74526 - Attendance
1046890301 - 210438 1002515 - CW
2047074788 - 210525-1002834
1053847149 - 1016727- 212170 - cw upper

 

Many thanks in advance for any help. 
Marcus

1 ACCEPTED SOLUTION

OK, I had better get a kudo for this... 😜

 

PBIX is attached.

 

Column = 
    VAR __Length = LEN([Column1])
    VAR __Search1 = SEARCH("1",[Column1],1,BLANK())
    VAR __Search2 = IF(NOT(ISBLANK(__Search1)),SEARCH("1",[Column1],__Search1+1,BLANK()),BLANK())
    VAR __Search3 = IF(NOT(ISBLANK(__Search1)),SEARCH("1",[Column1],__Search2+1,BLANK()),BLANK())
    VAR __Search4 = IF(NOT(ISBLANK(__Search1)),SEARCH("1",[Column1],__Search3+1,BLANK()),BLANK())
    VAR __Search5 = IF(NOT(ISBLANK(__Search1)),SEARCH("1",[Column1],__Search4+1,BLANK()),BLANK())
    VAR __Search6 = IF(NOT(ISBLANK(__Search1)),SEARCH("1",[Column1],__Search5+1,BLANK()),BLANK())
    VAR __Search7 = IF(NOT(ISBLANK(__Search1)),SEARCH("1",[Column1],__Search6+1,BLANK()),BLANK())
    VAR __Search8 = IF(NOT(ISBLANK(__Search1)),SEARCH("1",[Column1],__Search7+1,BLANK()),BLANK())
    VAR __Search9 = IF(NOT(ISBLANK(__Search1)),SEARCH("1",[Column1],__Search8+1,BLANK()),BLANK())
    VAR __Search10 = IF(NOT(ISBLANK(__Search1)),SEARCH("1",[Column1],__Search9+1,BLANK()),BLANK())
    VAR __Numbers = { "1", "2", "3", "4", "5", "6", "7", "8", "9", "0" }
    VAR __Table = 
        ADDCOLUMNS(
            ADDCOLUMNS(
                ADDCOLUMNS(
                    FILTER(
                        { 
                            (1,__Search1), 
                            (2,__Search2), 
                            (3, __Search3), 
                            (4, __Search4), 
                            (5, __Search5), 
                            (6, __Search6), 
                            (7, __Search7), 
                            (8, __Search8), 
                            (9, __Search9), 
                            (10, __Search10)
                        },
                    NOT(ISBLANK([Value2]))
                    ),
                "__Extract",IF([Value2] + 7 > __Length,RIGHT([Column1],7),MID([Column1],[Value2],7))
                ),
                "__Eighth",IF([Value2] + 8 > __Length,RIGHT([Column1],1),MID([Column1],[Value2]+8,1))
            ),
            "__Check1",IF(ISNUMBER([__Eighth]),FALSE,TRUE),
            "__Check2",IFERROR(VALUE([__Extract]),-1),
            "__Check3",IF(SEARCH(" ",[__Extract],1,-1) = -1,TRUE,FALSE)
        )
RETURN
    MINX(
        FILTER(
            __Table,
            [__Check1] = TRUE && [__Check2] <> -1 && [__Check3] = TRUE
        ),
        [__Extract]
    )

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

View solution in original post

10 REPLIES 10
Jimmy801
Community Champion
Community Champion

Hello @mgp-opti 

 

applying some Text.Split and List.Select should to the trick.

Here the complete solution

let
	Source = #table
	(
		{"Get Number"},
		{
			{"1002223 - 7003117328 - ID: 74526 - Attendance"},	{"1046890301 - 210438 1002515 - CW"},	{"2047074788 - 210525-1002834"},	{"1053847149 - 1016727- 212170 - cw upper"}

					}
	),
    FirstNumberWith1 = Table.AddColumn
    (
        Source,
        "First number with 1 and length = 7 splitting with - or space",
        (add)=>try List.First( List.Select(List.Transform(Text.SplitAny(add[Get Number]," -"), each Text.Remove(_,{" "})),(sel)=> Text.Start(sel,1)="1" and Text.Length(sel)=7 )) otherwise "no number with 1 found"
    )
in
    FirstNumberWith1

 

Copy paste this code to the advanced editor in a new blank query to see how the solution works. If this solution fits your need, copy and past a part of it and implement it in your query

If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too

Have fun

Jimmy

Hi Jimmy,

Many thanks for your response, much appreciated. However it looks like I would need to enter the row data in to the query and I've got thousands of rows.

I might  be assuming wrong!

Many thanks anyway.

Best,

Marcus

Hello @mgp-opti 

 

no, you have to replace my Source-part with your data query and adapt the column-name to your real column.

 

Jimmy

Understood - sorry!

I'll give it a go and will report back.

Many thanks,

Marcus

mgp-opti
Frequent Visitor

Hi Greg,

 

Many thanks for your reply. I'm trying to separate 1002223, 1002515, 1002834 and 1016727 in the examples. But other them beginning with a 1 and being 7 digits long, there is no pattern. I'm hoping there is a way to select them using this criteria so that I can use these numbers (which are work orders) as unique identifiers which allow me to merge data.

 

Any advice would be fantastic.

 

Many thanks,

 

Marcus

 

dax
Community Support
Community Support

Hi @mgp-opti ,

I am not clear about your rquirement, you could try below M code to see whether it work or not

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("LYw7DsIwEESvYrkm0n69GzpEGk5AEblAwW0UoSCuzxrSzefNzHNGACLiNCQDYERj8jC36ZxMlEroy7639flYl5brqS+k+AgMGB2FY0/9RVEjuN5/EIEYmJj7H1LSoUPOcpwouxjKGD0CFiPrIKFBJMsnvbetvXKtXw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
    #"Duplicated Column" = Table.DuplicateColumn(#"Changed Type", "Column1", "Column1 - Copy"),
    #"Replaced Value" = Table.ReplaceValue(#"Duplicated Column","-"," ",Replacer.ReplaceText,{"Column1"}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Replaced Value", "Column1", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"Column1.1", "Column1.2", "Column1.3", "Column1.4", "Column1.5", "Column1.6", "Column1.7", "Column1.8", "Column1.9", "Column1.10", "Column1.11"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Column1.1", Int64.Type}, {"Column1.2", type text}, {"Column1.3", type text}, {"Column1.4", Int64.Type}, {"Column1.5", Int64.Type}, {"Column1.6", Int64.Type}, {"Column1.7", type text}, {"Column1.8", type text}, {"Column1.9", type text}, {"Column1.10", type text}, {"Column1.11", type text}}),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Changed Type1", { "Column1 - Copy"}, "Attribute", "Value"),
    #"Changed Type2" = Table.TransformColumnTypes(#"Unpivoted Columns",{{"Value", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type2", "Custom", each if Text.StartsWith([Value],"1") and Text.Length([Value])=7 then [Value] else null),
    #"Sorted Rows" = Table.Sort(#"Added Custom",{{"Column1 - Copy", Order.Ascending},{"Custom", Order.Ascending}}),
    #"Filled Up" = Table.FillUp(#"Sorted Rows",{"Custom"}),
    #"Removed Columns" = Table.RemoveColumns(#"Filled Up",{"Attribute", "Value"}),
    #"Removed Duplicates" = Table.Distinct(#"Removed Columns")
in
    #"Removed Duplicates"

Best Regards,
Zoe Zhi

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

 

Hmm, that is indeed unfortunate as I cannot think how you could have made that problem any harder! I am going to give it a go however, stay tuned.


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

OK, I had better get a kudo for this... 😜

 

PBIX is attached.

 

Column = 
    VAR __Length = LEN([Column1])
    VAR __Search1 = SEARCH("1",[Column1],1,BLANK())
    VAR __Search2 = IF(NOT(ISBLANK(__Search1)),SEARCH("1",[Column1],__Search1+1,BLANK()),BLANK())
    VAR __Search3 = IF(NOT(ISBLANK(__Search1)),SEARCH("1",[Column1],__Search2+1,BLANK()),BLANK())
    VAR __Search4 = IF(NOT(ISBLANK(__Search1)),SEARCH("1",[Column1],__Search3+1,BLANK()),BLANK())
    VAR __Search5 = IF(NOT(ISBLANK(__Search1)),SEARCH("1",[Column1],__Search4+1,BLANK()),BLANK())
    VAR __Search6 = IF(NOT(ISBLANK(__Search1)),SEARCH("1",[Column1],__Search5+1,BLANK()),BLANK())
    VAR __Search7 = IF(NOT(ISBLANK(__Search1)),SEARCH("1",[Column1],__Search6+1,BLANK()),BLANK())
    VAR __Search8 = IF(NOT(ISBLANK(__Search1)),SEARCH("1",[Column1],__Search7+1,BLANK()),BLANK())
    VAR __Search9 = IF(NOT(ISBLANK(__Search1)),SEARCH("1",[Column1],__Search8+1,BLANK()),BLANK())
    VAR __Search10 = IF(NOT(ISBLANK(__Search1)),SEARCH("1",[Column1],__Search9+1,BLANK()),BLANK())
    VAR __Numbers = { "1", "2", "3", "4", "5", "6", "7", "8", "9", "0" }
    VAR __Table = 
        ADDCOLUMNS(
            ADDCOLUMNS(
                ADDCOLUMNS(
                    FILTER(
                        { 
                            (1,__Search1), 
                            (2,__Search2), 
                            (3, __Search3), 
                            (4, __Search4), 
                            (5, __Search5), 
                            (6, __Search6), 
                            (7, __Search7), 
                            (8, __Search8), 
                            (9, __Search9), 
                            (10, __Search10)
                        },
                    NOT(ISBLANK([Value2]))
                    ),
                "__Extract",IF([Value2] + 7 > __Length,RIGHT([Column1],7),MID([Column1],[Value2],7))
                ),
                "__Eighth",IF([Value2] + 8 > __Length,RIGHT([Column1],1),MID([Column1],[Value2]+8,1))
            ),
            "__Check1",IF(ISNUMBER([__Eighth]),FALSE,TRUE),
            "__Check2",IFERROR(VALUE([__Extract]),-1),
            "__Check3",IF(SEARCH(" ",[__Extract],1,-1) = -1,TRUE,FALSE)
        )
RETURN
    MINX(
        FILTER(
            __Table,
            [__Check1] = TRUE && [__Check2] <> -1 && [__Check3] = TRUE
        ),
        [__Extract]
    )

@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

Hi Greg,

You are a total genius - your solution works perfectly! Thank you so much, I really appreciate all the work you put in on this as I know it was a really tough one.

You fully deserve the kudo!

Massive thanks again. This Community is amazing.

Best,

Marcus

Greg_Deckler
Super User
Super User

So in the example data provided, what numbers are you looking to extract? In generaly, you are going to use a combination of SEARCH/FIND and probably MID although you might use LEFT or RIGHT as well. Depends.


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Mastering Power BI 2nd Edition

DAX is easy, CALCULATE makes DAX hard...

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors