cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Helper I
Helper I

HELP NEEDED to use a custom function in Power Query in order to delete HTML TAGS

Hi, 

 

I'm having the same problem as that person -->Robust-function-to-remove-HTML-tags and apparently they found a way to create a function that will clean the HTML tags using M. 

 

I am fairly new to M and i'm not exactly sure where to paste the function they created nor how to apply it to my dataset. I have a Single table "LES REPORTING DRAFT" with multiple columns and the name of the column i need to clean is "EMAIL_BODY".

Should i create a blank query and copy and paste the function here?

Do i have to change the function so it applies to my dataset?

I'm guessing i need to create the function somewhere then go to the advanced editor in my "LES REPORTING DRAFT" Table where i will call that said function am i right ? If someone could walk me through steps on how to use that function it would be greatly appreciated!

 

Here is the function 

 

let
    removeOne = (input) =>
        let
            text = Text.From(input),
            length = Text.Length(text),
            position = Text.PositionOf(text, "<"),
            positionEnd = Text.PositionOf(text, ">"),
            range = positionEnd-position+1,
            result = if position >= 0 then Text.ReplaceRange(text, position, range, "") else input
        in
            result,

    removeAll = (input) =>
        let
            rmvOne = removeOne(input),
            rmvAll = if Text.PositionOf(rmvOne, "<") >= 0 then @removeAll(rmvOne) else rmvOne
        in
            rmvAll,

 

// Source = get from database,
    Source = "<p>hello I am text to replace, lol</p>",

// return = removeAll([columnToPassToFunction])
    return = removeAll(Source)
        in
            return

 

 

Thank you in advance, 

 

Nicolas

1 ACCEPTED SOLUTION

Hi @TerrificPoet,

no problem, we all were in your shoes at some time.

 

What I meant was this:

 

let
    removeOne = (input) =>
        let
            text = Text.From(input),
            length = Text.Length(text),
            position = Text.PositionOf(text, "<"),
            positionEnd = Text.PositionOf(text, ">"),
            range = positionEnd-position+1,
            result = if position >= 0 then Text.ReplaceRange(text, position, range, "") else input
        in
            result,

    removeAll = (HTML as text) =>
let
    Source = Text.From(HTML),
    SplitAny = Text.SplitAny(Source,"<>"),
    ListAlternate = List.Alternate(SplitAny,1,1,1),
    ListSelect = List.Select(ListAlternate, each _<>""),
    TextCombine = Text.Combine(ListSelect, "")
in
    TextCombine,

    Source = Excel.Workbook(File.Contents("F:\Lowe's\customer experience\SAMPLE FILE FOR PB FORUM.xlsx"), null, true),
    #"LES REPORTING DRAFT_Sheet" = Source{[Item="LES REPORTING DRAFT",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(#"LES REPORTING DRAFT_Sheet", [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"ROW_ID", type text}, {"ATTRIB_9", type text}, {"DESC_TEXT", type text}, {"CREATED", type datetime}, {"SR_CAT_TYPE_CD", type text}, {"SR_AREA", type text}, {"SR_TITLE", type text}, {"REOPENED_TS", type any}, {"ACT_OPEN_DT", type datetime}, {"ACT_CLOSE_DT", type any}, {"CREATED_1", type datetime}, {"COMMENTS_LONG", type text}, {"EMAIL_RECIP_NAME", type text}, {"EVT_STAT_CD", type text}, {"NAME", type text}, {"ROW_ID_2", type text}, {"EMAIL_BCC_LINE", type text}, {"EMAIL_BODY", type text}, {"EMAIL_CC_LINE", type text}, {"EMAIL_RECIP_ADDR", type text}, {"EMAIL_SNDR_ADDR", type text}, {"EMAIL_SNDR_NAME", type text}, {"EMAIL_TO_LINE", type text}, {"SR_NUM", type text}, {"SR_TITLE_3", type text}, {"SR_AREA_4", type text}, {"SR_SUB_AREA", type text}, {"SR_STAT_ID", type text}, {"X_ASG_LCT_ID", type text}, {"X_ASO_LCT_ID", type text}, {"SALES_ID", type any}, {"MAIN_ISSUE", type text}, {"LEVEL1", type text}, {"LEVEL2", type text}, {"DESCRIPTION", type text}, {"SUMMARY", type text}, {"RESOLVE STEPS", type text}, {"RESOLUTION", type text}, {"ALL TAGS", type text}, {"Category 1", type text}, {"Category 2", type text}, {"Category 3", type text}, {"Category 4", type text}, {"Category 5", type text}, {"Category 6", type any}, {" Category 7", type any}, {" Category 8", type any}, {" Category 9", type any}, {" Category 1", type any}, {"ACT_CLOSED_DT_REOPENS", type datetime}, {"ACT_OPEN_DT_REOPENS", type datetime}, {"Days Open", type number}, {"Email Created Date", type any}, {"Fiscal Week Start", type any}, {"Fiscal Month", type any}, {"Fiscal Week", type any}, {"Fiscal Year", type any}}),
    #"Extracted Date" = Table.TransformColumns(#"Changed Type",{{"CREATED", DateTime.Date, type date}}),
    #"Duplicated Column" = Table.DuplicateColumn(#"Extracted Date", "ACT_CLOSE_DT", "ACT_CLOSE_DT - Copy"),
    #"Changed Type1" = Table.TransformColumnTypes(#"Duplicated Column",{{"ACT_CLOSE_DT - Copy", type date}}),
    #"Replaced Errors" = Table.ReplaceErrorValues(#"Changed Type1", {{"ACT_CLOSE_DT - Copy", #date(1950, 1, 1)}}),
    #"Changed Type2" = Table.TransformColumnTypes(#"Replaced Errors",{{"Category 1", type text}, {"Category 2", type text}, {"Category 3", type text}, {"Category 4", type text}, {"Category 5", type text}, {"Category 6", type text}, {" Category 7", type text}, {" Category 8", type text}, {" Category 9", type text}, {" Category 1", type text}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type2",{{" Category 1", "Category 10"}}),
    #"Trimmed Text" = Table.TransformColumns(#"Renamed Columns",{{"Category 1", Text.Trim, type text}, {"Category 2", Text.Trim, type text}, {"Category 3", Text.Trim, type text}, {"Category 4", Text.Trim, type text}, {"Category 5", Text.Trim, type text}, {"Category 6", Text.Trim, type text}, {" Category 7", Text.Trim, type text}, {" Category 8", Text.Trim, type text}, {" Category 9", Text.Trim, type text}, {"Category 10", Text.Trim, type text}}),
    #"Replaced Value" = Table.ReplaceValue(#"Trimmed Text",null,"is null",Replacer.ReplaceValue,{"Category 10"}),
    #"Replaced Value1" = Table.ReplaceValue(#"Replaced Value",null,"is null",Replacer.ReplaceValue,{" Category 9"}),
    #"Replaced Value2" = Table.ReplaceValue(#"Replaced Value1",null,"is null",Replacer.ReplaceValue,{" Category 8"}),
    #"Replaced Value3" = Table.ReplaceValue(#"Replaced Value2",null,"is null",Replacer.ReplaceValue,{" Category 7"}),
    #"Replaced Value4" = Table.ReplaceValue(#"Replaced Value3",null,"is null",Replacer.ReplaceValue,{"Category 6"}),
    #"Replaced Value5" = Table.ReplaceValue(#"Replaced Value4",null,"is null",Replacer.ReplaceValue,{"Category 5"}),
    #"Replaced Value6" = Table.ReplaceValue(#"Replaced Value5",null,"is null",Replacer.ReplaceValue,{"Category 4"}),
    #"Replaced Value7" = Table.ReplaceValue(#"Replaced Value6",null,"is null",Replacer.ReplaceValue,{"Category 3"}),
    #"Replaced Value8" = Table.ReplaceValue(#"Replaced Value7",null,"is null",Replacer.ReplaceValue,{"Category 2"}),
    #"Replaced Value9" = Table.ReplaceValue(#"Replaced Value8","","is null",Replacer.ReplaceValue,{"Category 1"}),
    #"Replaced Value10" = Table.ReplaceValue(#"Replaced Value9","","is null",Replacer.ReplaceValue,{"Category 2"}),
    #"Replaced Value11" = Table.ReplaceValue(#"Replaced Value10","","is null",Replacer.ReplaceValue,{"Category 3"}),
    #"Removed Errors" = Table.RemoveRowsWithErrors(#"Replaced Value11", {"CREATED"}),
    #"Removed Errors1" = Table.RemoveRowsWithErrors(#"Removed Errors", {"Days Open"}),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Removed Errors1", {"ROW_ID", "ATTRIB_9", "DESC_TEXT", "CREATED", "SR_CAT_TYPE_CD", "SR_AREA", "SR_TITLE", "REOPENED_TS", "ACT_OPEN_DT", "ACT_CLOSE_DT", "CREATED_1", "COMMENTS_LONG", "EMAIL_RECIP_NAME", "EVT_STAT_CD", "NAME", "ROW_ID_2", "EMAIL_BCC_LINE", "EMAIL_BODY", "EMAIL_CC_LINE", "EMAIL_RECIP_ADDR", "EMAIL_SNDR_ADDR", "EMAIL_SNDR_NAME", "EMAIL_TO_LINE", "SR_NUM", "SR_TITLE_3", "SR_AREA_4", "SR_SUB_AREA", "SR_STAT_ID", "X_ASG_LCT_ID", "X_ASO_LCT_ID", "SALES_ID", "MAIN_ISSUE", "LEVEL1", "LEVEL2", "DESCRIPTION", "SUMMARY", "RESOLVE STEPS", "RESOLUTION", "ALL TAGS", "ACT_CLOSED_DT_REOPENS", "ACT_OPEN_DT_REOPENS", "Days Open", "Email Created Date", "Fiscal Week Start", "Fiscal Month", "Fiscal Week", "Fiscal Year", "ACT_CLOSE_DT - Copy"}, "Attribute", "Value"),
    #"Renamed Columns1" = Table.RenameColumns(#"Unpivoted Columns",{{"Value", "Root cause"}, {"EMAIL_BODY", "EMAILDIRTY"}}),
    return = Table.AddColumn(#"Renamed Columns1", "EMAILCLEAN", each removeAll([EMAILDIRTY]))
in
    return

You just replace the function code from your "removeAll" and it can be used as it is now in your last step. (the function code from removeOne will just be ignored, as it isn't used in the query)

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

9 REPLIES 9
Microsoft
Microsoft

@TerrificPoet,

I merged your threads because that the two threads state same issue. I create a sample PBIX file using your power query function, you can check that how to use this function with your source data.

https://1drv.ms/u/s!AhsotbnGu1Nok1pLXwJMXCUgvKI9

Regards,
Lydia

Community Support Team _ Lydia Zhang
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Thank you very much @v-yuezhe-msft the PBIX file you provided has been very helpful but i'm still having issues implementing the function within my own query. Here is my Advanced Editor Script :

let
    removeOne = (input) =>
        let
            text = Text.From(input),
            length = Text.Length(text),
            position = Text.PositionOf(text, "<"),
            positionEnd = Text.PositionOf(text, ">"),
            range = positionEnd-position+1,
            result = if position >= 0 then Text.ReplaceRange(text, position, range, "") else input
        in
            result,

    removeAll = (input) =>
        let
            rmvOne = removeOne(input),
            rmvAll = if Text.PositionOf(rmvOne, "<") >= 0 then @removeAll(rmvOne) else rmvOne
        in
            rmvAll,

    Source = Excel.Workbook(File.Contents("F:\Lowe's\customer experience\SAMPLE FILE FOR PB FORUM.xlsx"), null, true),
    #"LES REPORTING DRAFT_Sheet" = Source{[Item="LES REPORTING DRAFT",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(#"LES REPORTING DRAFT_Sheet", [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"ROW_ID", type text}, {"ATTRIB_9", type text}, {"DESC_TEXT", type text}, {"CREATED", type datetime}, {"SR_CAT_TYPE_CD", type text}, {"SR_AREA", type text}, {"SR_TITLE", type text}, {"REOPENED_TS", type any}, {"ACT_OPEN_DT", type datetime}, {"ACT_CLOSE_DT", type any}, {"CREATED_1", type datetime}, {"COMMENTS_LONG", type text}, {"EMAIL_RECIP_NAME", type text}, {"EVT_STAT_CD", type text}, {"NAME", type text}, {"ROW_ID_2", type text}, {"EMAIL_BCC_LINE", type text}, {"EMAIL_BODY", type text}, {"EMAIL_CC_LINE", type text}, {"EMAIL_RECIP_ADDR", type text}, {"EMAIL_SNDR_ADDR", type text}, {"EMAIL_SNDR_NAME", type text}, {"EMAIL_TO_LINE", type text}, {"SR_NUM", type text}, {"SR_TITLE_3", type text}, {"SR_AREA_4", type text}, {"SR_SUB_AREA", type text}, {"SR_STAT_ID", type text}, {"X_ASG_LCT_ID", type text}, {"X_ASO_LCT_ID", type text}, {"SALES_ID", type any}, {"MAIN_ISSUE", type text}, {"LEVEL1", type text}, {"LEVEL2", type text}, {"DESCRIPTION", type text}, {"SUMMARY", type text}, {"RESOLVE STEPS", type text}, {"RESOLUTION", type text}, {"ALL TAGS", type text}, {"Category 1", type text}, {"Category 2", type text}, {"Category 3", type text}, {"Category 4", type text}, {"Category 5", type text}, {"Category 6", type any}, {" Category 7", type any}, {" Category 8", type any}, {" Category 9", type any}, {" Category 1", type any}, {"ACT_CLOSED_DT_REOPENS", type datetime}, {"ACT_OPEN_DT_REOPENS", type datetime}, {"Days Open", type number}, {"Email Created Date", type any}, {"Fiscal Week Start", type any}, {"Fiscal Month", type any}, {"Fiscal Week", type any}, {"Fiscal Year", type any}}),
    #"Extracted Date" = Table.TransformColumns(#"Changed Type",{{"CREATED", DateTime.Date, type date}}),
    #"Duplicated Column" = Table.DuplicateColumn(#"Extracted Date", "ACT_CLOSE_DT", "ACT_CLOSE_DT - Copy"),
    #"Changed Type1" = Table.TransformColumnTypes(#"Duplicated Column",{{"ACT_CLOSE_DT - Copy", type date}}),
    #"Replaced Errors" = Table.ReplaceErrorValues(#"Changed Type1", {{"ACT_CLOSE_DT - Copy", #date(1950, 1, 1)}}),
    #"Changed Type2" = Table.TransformColumnTypes(#"Replaced Errors",{{"Category 1", type text}, {"Category 2", type text}, {"Category 3", type text}, {"Category 4", type text}, {"Category 5", type text}, {"Category 6", type text}, {" Category 7", type text}, {" Category 8", type text}, {" Category 9", type text}, {" Category 1", type text}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type2",{{" Category 1", "Category 10"}}),
    #"Trimmed Text" = Table.TransformColumns(#"Renamed Columns",{{"Category 1", Text.Trim, type text}, {"Category 2", Text.Trim, type text}, {"Category 3", Text.Trim, type text}, {"Category 4", Text.Trim, type text}, {"Category 5", Text.Trim, type text}, {"Category 6", Text.Trim, type text}, {" Category 7", Text.Trim, type text}, {" Category 8", Text.Trim, type text}, {" Category 9", Text.Trim, type text}, {"Category 10", Text.Trim, type text}}),
    #"Replaced Value" = Table.ReplaceValue(#"Trimmed Text",null,"is null",Replacer.ReplaceValue,{"Category 10"}),
    #"Replaced Value1" = Table.ReplaceValue(#"Replaced Value",null,"is null",Replacer.ReplaceValue,{" Category 9"}),
    #"Replaced Value2" = Table.ReplaceValue(#"Replaced Value1",null,"is null",Replacer.ReplaceValue,{" Category 8"}),
    #"Replaced Value3" = Table.ReplaceValue(#"Replaced Value2",null,"is null",Replacer.ReplaceValue,{" Category 7"}),
    #"Replaced Value4" = Table.ReplaceValue(#"Replaced Value3",null,"is null",Replacer.ReplaceValue,{"Category 6"}),
    #"Replaced Value5" = Table.ReplaceValue(#"Replaced Value4",null,"is null",Replacer.ReplaceValue,{"Category 5"}),
    #"Replaced Value6" = Table.ReplaceValue(#"Replaced Value5",null,"is null",Replacer.ReplaceValue,{"Category 4"}),
    #"Replaced Value7" = Table.ReplaceValue(#"Replaced Value6",null,"is null",Replacer.ReplaceValue,{"Category 3"}),
    #"Replaced Value8" = Table.ReplaceValue(#"Replaced Value7",null,"is null",Replacer.ReplaceValue,{"Category 2"}),
    #"Replaced Value9" = Table.ReplaceValue(#"Replaced Value8","","is null",Replacer.ReplaceValue,{"Category 1"}),
    #"Replaced Value10" = Table.ReplaceValue(#"Replaced Value9","","is null",Replacer.ReplaceValue,{"Category 2"}),
    #"Replaced Value11" = Table.ReplaceValue(#"Replaced Value10","","is null",Replacer.ReplaceValue,{"Category 3"}),
    #"Removed Errors" = Table.RemoveRowsWithErrors(#"Replaced Value11", {"CREATED"}),
    #"Removed Errors1" = Table.RemoveRowsWithErrors(#"Removed Errors", {"Days Open"}),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Removed Errors1", {"ROW_ID", "ATTRIB_9", "DESC_TEXT", "CREATED", "SR_CAT_TYPE_CD", "SR_AREA", "SR_TITLE", "REOPENED_TS", "ACT_OPEN_DT", "ACT_CLOSE_DT", "CREATED_1", "COMMENTS_LONG", "EMAIL_RECIP_NAME", "EVT_STAT_CD", "NAME", "ROW_ID_2", "EMAIL_BCC_LINE", "EMAIL_BODY", "EMAIL_CC_LINE", "EMAIL_RECIP_ADDR", "EMAIL_SNDR_ADDR", "EMAIL_SNDR_NAME", "EMAIL_TO_LINE", "SR_NUM", "SR_TITLE_3", "SR_AREA_4", "SR_SUB_AREA", "SR_STAT_ID", "X_ASG_LCT_ID", "X_ASO_LCT_ID", "SALES_ID", "MAIN_ISSUE", "LEVEL1", "LEVEL2", "DESCRIPTION", "SUMMARY", "RESOLVE STEPS", "RESOLUTION", "ALL TAGS", "ACT_CLOSED_DT_REOPENS", "ACT_OPEN_DT_REOPENS", "Days Open", "Email Created Date", "Fiscal Week Start", "Fiscal Month", "Fiscal Week", "Fiscal Year", "ACT_CLOSE_DT - Copy"}, "Attribute", "Value"),
    #"Renamed Columns1" = Table.RenameColumns(#"Unpivoted Columns",{{"Value", "Root cause"}, {"EMAIL_BODY", "EMAILDIRTY"}}),
    return = Table.AddColumn(#"Renamed Columns1", "EMAILCLEAN", each removeAll([EMAILDIRTY]))
in
    return

Whenever I run it i receive an error : 

Expression.Error: Evaluation resulted in a stack overflow and cannot continue.

What did i do wrong in the Advanced Editor ? Also do you think it would be better, performance wise, to create the column EMAILCLEAN before doing the unpivoted columns as unpivoting columns creates more rows? 

 

I created a sample file

 

Here is a link to a google drive where i uploaded a PBIX file and the correspondant excel file :

 

https://drive.google.com/open?id=1eDk05RFLdV-T2Kv_5ao2i8m6nLfwQW7w

 

Thank you for your help and your time,

 

Nicolas

Recursion is slow in M and therefore not recommended. Use List.Generate instead.

 

But for your problem, a simpler way could be this function: https://social.technet.microsoft.com/Forums/office/en-US/b080d122-14f0-43bc-8a86-f50cdf1c32d8/removi...

 

Simply replace your function code from removeAll with it and you don't have to adjust the other code at all.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

thank you @ImkeF for your answer,

 

I'm relatively new to M and i'm having trouble implementing your code in my own M Script. I get an error telling me that I cannot convert a value of type List to type text.  Could you help me thank you 

let 
    Source = Excel.Workbook(File.Contents("F:\Lowe's\customer experience\SAMPLE FILE FOR PB FORUM.xlsx"), null, true),
    #"LES REPORTING DRAFT_Sheet" = Source{[Item="LES REPORTING DRAFT",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(#"LES REPORTING DRAFT_Sheet", [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"ROW_ID", type text}, {"ATTRIB_9", type text}, {"DESC_TEXT", type text}, {"CREATED", type datetime}, {"SR_CAT_TYPE_CD", type text}, {"SR_AREA", type text}, {"SR_TITLE", type text}, {"REOPENED_TS", type any}, {"ACT_OPEN_DT", type datetime}, {"ACT_CLOSE_DT", type any}, {"CREATED_1", type datetime}, {"COMMENTS_LONG", type text}, {"EMAIL_RECIP_NAME", type text}, {"EVT_STAT_CD", type text}, {"NAME", type text}, {"ROW_ID_2", type text}, {"EMAIL_BCC_LINE", type text}, {"EMAIL_BODY", type text}, {"EMAIL_CC_LINE", type text}, {"EMAIL_RECIP_ADDR", type text}, {"EMAIL_SNDR_ADDR", type text}, {"EMAIL_SNDR_NAME", type text}, {"EMAIL_TO_LINE", type text}, {"SR_NUM", type text}, {"SR_TITLE_3", type text}, {"SR_AREA_4", type text}, {"SR_SUB_AREA", type text}, {"SR_STAT_ID", type text}, {"X_ASG_LCT_ID", type text}, {"X_ASO_LCT_ID", type text}, {"SALES_ID", type any}, {"MAIN_ISSUE", type text}, {"LEVEL1", type text}, {"LEVEL2", type text}, {"DESCRIPTION", type text}, {"SUMMARY", type text}, {"RESOLVE STEPS", type text}, {"RESOLUTION", type text}, {"ALL TAGS", type text}, {"Category 1", type text}, {"Category 2", type text}, {"Category 3", type text}, {"Category 4", type text}, {"Category 5", type text}, {"Category 6", type any}, {" Category 7", type any}, {" Category 8", type any}, {" Category 9", type any}, {" Category 1", type any}, {"ACT_CLOSED_DT_REOPENS", type datetime}, {"ACT_OPEN_DT_REOPENS", type datetime}, {"Days Open", type number}, {"Email Created Date", type any}, {"Fiscal Week Start", type any}, {"Fiscal Month", type any}, {"Fiscal Week", type any}, {"Fiscal Year", type any}}),
    #"Extracted Date" = Table.TransformColumns(#"Changed Type",{{"CREATED", DateTime.Date, type date}}),
    #"Duplicated Column" = Table.DuplicateColumn(#"Extracted Date", "ACT_CLOSE_DT", "ACT_CLOSE_DT - Copy"),
    #"Changed Type1" = Table.TransformColumnTypes(#"Duplicated Column",{{"ACT_CLOSE_DT - Copy", type date}}),
    #"Replaced Errors" = Table.ReplaceErrorValues(#"Changed Type1", {{"ACT_CLOSE_DT - Copy", #date(1950, 1, 1)}}),
    #"Changed Type2" = Table.TransformColumnTypes(#"Replaced Errors",{{"Category 1", type text}, {"Category 2", type text}, {"Category 3", type text}, {"Category 4", type text}, {"Category 5", type text}, {"Category 6", type text}, {" Category 7", type text}, {" Category 8", type text}, {" Category 9", type text}, {" Category 1", type text}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type2",{{" Category 1", "Category 10"}}),
    #"Trimmed Text" = Table.TransformColumns(#"Renamed Columns",{{"Category 1", Text.Trim, type text}, {"Category 2", Text.Trim, type text}, {"Category 3", Text.Trim, type text}, {"Category 4", Text.Trim, type text}, {"Category 5", Text.Trim, type text}, {"Category 6", Text.Trim, type text}, {" Category 7", Text.Trim, type text}, {" Category 8", Text.Trim, type text}, {" Category 9", Text.Trim, type text}, {"Category 10", Text.Trim, type text}}),
    #"Replaced Value" = Table.ReplaceValue(#"Trimmed Text",null,"is null",Replacer.ReplaceValue,{"Category 10"}),
    #"Replaced Value1" = Table.ReplaceValue(#"Replaced Value",null,"is null",Replacer.ReplaceValue,{" Category 9"}),
    #"Replaced Value2" = Table.ReplaceValue(#"Replaced Value1",null,"is null",Replacer.ReplaceValue,{" Category 8"}),
    #"Replaced Value3" = Table.ReplaceValue(#"Replaced Value2",null,"is null",Replacer.ReplaceValue,{" Category 7"}),
    #"Replaced Value4" = Table.ReplaceValue(#"Replaced Value3",null,"is null",Replacer.ReplaceValue,{"Category 6"}),
    #"Replaced Value5" = Table.ReplaceValue(#"Replaced Value4",null,"is null",Replacer.ReplaceValue,{"Category 5"}),
    #"Replaced Value6" = Table.ReplaceValue(#"Replaced Value5",null,"is null",Replacer.ReplaceValue,{"Category 4"}),
    #"Replaced Value7" = Table.ReplaceValue(#"Replaced Value6",null,"is null",Replacer.ReplaceValue,{"Category 3"}),
    #"Replaced Value8" = Table.ReplaceValue(#"Replaced Value7",null,"is null",Replacer.ReplaceValue,{"Category 2"}),
    #"Replaced Value9" = Table.ReplaceValue(#"Replaced Value8","","is null",Replacer.ReplaceValue,{"Category 1"}),
    #"Replaced Value10" = Table.ReplaceValue(#"Replaced Value9","","is null",Replacer.ReplaceValue,{"Category 2"}),
    #"Replaced Value11" = Table.ReplaceValue(#"Replaced Value10","","is null",Replacer.ReplaceValue,{"Category 3"}),
    #"Removed Errors" = Table.RemoveRowsWithErrors(#"Replaced Value11", {"CREATED"}),
    #"Removed Errors1" = Table.RemoveRowsWithErrors(#"Removed Errors", {"Days Open"}),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Removed Errors1", {"ROW_ID", "ATTRIB_9", "DESC_TEXT", "CREATED", "SR_CAT_TYPE_CD", "SR_AREA", "SR_TITLE", "REOPENED_TS", "ACT_OPEN_DT", "ACT_CLOSE_DT", "CREATED_1", "COMMENTS_LONG", "EMAIL_RECIP_NAME", "EVT_STAT_CD", "NAME", "ROW_ID_2", "EMAIL_BCC_LINE", "EMAIL_BODY", "EMAIL_CC_LINE", "EMAIL_RECIP_ADDR", "EMAIL_SNDR_ADDR", "EMAIL_SNDR_NAME", "EMAIL_TO_LINE", "SR_NUM", "SR_TITLE_3", "SR_AREA_4", "SR_SUB_AREA", "SR_STAT_ID", "X_ASG_LCT_ID", "X_ASO_LCT_ID", "SALES_ID", "MAIN_ISSUE", "LEVEL1", "LEVEL2", "DESCRIPTION", "SUMMARY", "RESOLVE STEPS", "RESOLUTION", "ALL TAGS", "ACT_CLOSED_DT_REOPENS", "ACT_OPEN_DT_REOPENS", "Days Open", "Email Created Date", "Fiscal Week Start", "Fiscal Month", "Fiscal Week", "Fiscal Year", "ACT_CLOSE_DT - Copy"}, "Attribute", "Value"),
    #"Renamed Columns1" = Table.RenameColumns(#"Unpivoted Columns",{{"Value", "Root cause"}, {"EMAIL_BODY", "EMAILDIRTY"}}),
    SELECT = Text.From(Table.Column(#"Renamed Columns1","EMAILDIRTY")),
    SplitAny = Text.SplitAny(SELECT,"<>"),
    ListAlternate = List.Alternate(SplitAny,1,1,1),
    ListSelect = List.Select(ListAlternate, each _<>""),
    TextCombine = Text.Combine(ListSelect, "")

in
    TextCombine

Hi @TerrificPoet,

no problem, we all were in your shoes at some time.

 

What I meant was this:

 

let
    removeOne = (input) =>
        let
            text = Text.From(input),
            length = Text.Length(text),
            position = Text.PositionOf(text, "<"),
            positionEnd = Text.PositionOf(text, ">"),
            range = positionEnd-position+1,
            result = if position >= 0 then Text.ReplaceRange(text, position, range, "") else input
        in
            result,

    removeAll = (HTML as text) =>
let
    Source = Text.From(HTML),
    SplitAny = Text.SplitAny(Source,"<>"),
    ListAlternate = List.Alternate(SplitAny,1,1,1),
    ListSelect = List.Select(ListAlternate, each _<>""),
    TextCombine = Text.Combine(ListSelect, "")
in
    TextCombine,

    Source = Excel.Workbook(File.Contents("F:\Lowe's\customer experience\SAMPLE FILE FOR PB FORUM.xlsx"), null, true),
    #"LES REPORTING DRAFT_Sheet" = Source{[Item="LES REPORTING DRAFT",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(#"LES REPORTING DRAFT_Sheet", [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"ROW_ID", type text}, {"ATTRIB_9", type text}, {"DESC_TEXT", type text}, {"CREATED", type datetime}, {"SR_CAT_TYPE_CD", type text}, {"SR_AREA", type text}, {"SR_TITLE", type text}, {"REOPENED_TS", type any}, {"ACT_OPEN_DT", type datetime}, {"ACT_CLOSE_DT", type any}, {"CREATED_1", type datetime}, {"COMMENTS_LONG", type text}, {"EMAIL_RECIP_NAME", type text}, {"EVT_STAT_CD", type text}, {"NAME", type text}, {"ROW_ID_2", type text}, {"EMAIL_BCC_LINE", type text}, {"EMAIL_BODY", type text}, {"EMAIL_CC_LINE", type text}, {"EMAIL_RECIP_ADDR", type text}, {"EMAIL_SNDR_ADDR", type text}, {"EMAIL_SNDR_NAME", type text}, {"EMAIL_TO_LINE", type text}, {"SR_NUM", type text}, {"SR_TITLE_3", type text}, {"SR_AREA_4", type text}, {"SR_SUB_AREA", type text}, {"SR_STAT_ID", type text}, {"X_ASG_LCT_ID", type text}, {"X_ASO_LCT_ID", type text}, {"SALES_ID", type any}, {"MAIN_ISSUE", type text}, {"LEVEL1", type text}, {"LEVEL2", type text}, {"DESCRIPTION", type text}, {"SUMMARY", type text}, {"RESOLVE STEPS", type text}, {"RESOLUTION", type text}, {"ALL TAGS", type text}, {"Category 1", type text}, {"Category 2", type text}, {"Category 3", type text}, {"Category 4", type text}, {"Category 5", type text}, {"Category 6", type any}, {" Category 7", type any}, {" Category 8", type any}, {" Category 9", type any}, {" Category 1", type any}, {"ACT_CLOSED_DT_REOPENS", type datetime}, {"ACT_OPEN_DT_REOPENS", type datetime}, {"Days Open", type number}, {"Email Created Date", type any}, {"Fiscal Week Start", type any}, {"Fiscal Month", type any}, {"Fiscal Week", type any}, {"Fiscal Year", type any}}),
    #"Extracted Date" = Table.TransformColumns(#"Changed Type",{{"CREATED", DateTime.Date, type date}}),
    #"Duplicated Column" = Table.DuplicateColumn(#"Extracted Date", "ACT_CLOSE_DT", "ACT_CLOSE_DT - Copy"),
    #"Changed Type1" = Table.TransformColumnTypes(#"Duplicated Column",{{"ACT_CLOSE_DT - Copy", type date}}),
    #"Replaced Errors" = Table.ReplaceErrorValues(#"Changed Type1", {{"ACT_CLOSE_DT - Copy", #date(1950, 1, 1)}}),
    #"Changed Type2" = Table.TransformColumnTypes(#"Replaced Errors",{{"Category 1", type text}, {"Category 2", type text}, {"Category 3", type text}, {"Category 4", type text}, {"Category 5", type text}, {"Category 6", type text}, {" Category 7", type text}, {" Category 8", type text}, {" Category 9", type text}, {" Category 1", type text}}),
    #"Renamed Columns" = Table.RenameColumns(#"Changed Type2",{{" Category 1", "Category 10"}}),
    #"Trimmed Text" = Table.TransformColumns(#"Renamed Columns",{{"Category 1", Text.Trim, type text}, {"Category 2", Text.Trim, type text}, {"Category 3", Text.Trim, type text}, {"Category 4", Text.Trim, type text}, {"Category 5", Text.Trim, type text}, {"Category 6", Text.Trim, type text}, {" Category 7", Text.Trim, type text}, {" Category 8", Text.Trim, type text}, {" Category 9", Text.Trim, type text}, {"Category 10", Text.Trim, type text}}),
    #"Replaced Value" = Table.ReplaceValue(#"Trimmed Text",null,"is null",Replacer.ReplaceValue,{"Category 10"}),
    #"Replaced Value1" = Table.ReplaceValue(#"Replaced Value",null,"is null",Replacer.ReplaceValue,{" Category 9"}),
    #"Replaced Value2" = Table.ReplaceValue(#"Replaced Value1",null,"is null",Replacer.ReplaceValue,{" Category 8"}),
    #"Replaced Value3" = Table.ReplaceValue(#"Replaced Value2",null,"is null",Replacer.ReplaceValue,{" Category 7"}),
    #"Replaced Value4" = Table.ReplaceValue(#"Replaced Value3",null,"is null",Replacer.ReplaceValue,{"Category 6"}),
    #"Replaced Value5" = Table.ReplaceValue(#"Replaced Value4",null,"is null",Replacer.ReplaceValue,{"Category 5"}),
    #"Replaced Value6" = Table.ReplaceValue(#"Replaced Value5",null,"is null",Replacer.ReplaceValue,{"Category 4"}),
    #"Replaced Value7" = Table.ReplaceValue(#"Replaced Value6",null,"is null",Replacer.ReplaceValue,{"Category 3"}),
    #"Replaced Value8" = Table.ReplaceValue(#"Replaced Value7",null,"is null",Replacer.ReplaceValue,{"Category 2"}),
    #"Replaced Value9" = Table.ReplaceValue(#"Replaced Value8","","is null",Replacer.ReplaceValue,{"Category 1"}),
    #"Replaced Value10" = Table.ReplaceValue(#"Replaced Value9","","is null",Replacer.ReplaceValue,{"Category 2"}),
    #"Replaced Value11" = Table.ReplaceValue(#"Replaced Value10","","is null",Replacer.ReplaceValue,{"Category 3"}),
    #"Removed Errors" = Table.RemoveRowsWithErrors(#"Replaced Value11", {"CREATED"}),
    #"Removed Errors1" = Table.RemoveRowsWithErrors(#"Removed Errors", {"Days Open"}),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Removed Errors1", {"ROW_ID", "ATTRIB_9", "DESC_TEXT", "CREATED", "SR_CAT_TYPE_CD", "SR_AREA", "SR_TITLE", "REOPENED_TS", "ACT_OPEN_DT", "ACT_CLOSE_DT", "CREATED_1", "COMMENTS_LONG", "EMAIL_RECIP_NAME", "EVT_STAT_CD", "NAME", "ROW_ID_2", "EMAIL_BCC_LINE", "EMAIL_BODY", "EMAIL_CC_LINE", "EMAIL_RECIP_ADDR", "EMAIL_SNDR_ADDR", "EMAIL_SNDR_NAME", "EMAIL_TO_LINE", "SR_NUM", "SR_TITLE_3", "SR_AREA_4", "SR_SUB_AREA", "SR_STAT_ID", "X_ASG_LCT_ID", "X_ASO_LCT_ID", "SALES_ID", "MAIN_ISSUE", "LEVEL1", "LEVEL2", "DESCRIPTION", "SUMMARY", "RESOLVE STEPS", "RESOLUTION", "ALL TAGS", "ACT_CLOSED_DT_REOPENS", "ACT_OPEN_DT_REOPENS", "Days Open", "Email Created Date", "Fiscal Week Start", "Fiscal Month", "Fiscal Week", "Fiscal Year", "ACT_CLOSE_DT - Copy"}, "Attribute", "Value"),
    #"Renamed Columns1" = Table.RenameColumns(#"Unpivoted Columns",{{"Value", "Root cause"}, {"EMAIL_BODY", "EMAILDIRTY"}}),
    return = Table.AddColumn(#"Renamed Columns1", "EMAILCLEAN", each removeAll([EMAILDIRTY]))
in
    return

You just replace the function code from your "removeAll" and it can be used as it is now in your last step. (the function code from removeOne will just be ignored, as it isn't used in the query)

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

Thank you very much for your help @ImkeF 

 

This really helped me !

 

Kind regards,

 

Nicolas

 

 

Helper I
Helper I

Hi, I'm in the process of moving Tableau reports to Power Bi. When I was using Tableau i was able to clean my data by deleting the HTML TAGS using that Formula :

 

 

REGEXP_REPLACE

(REGEXP_REPLACE

(REGEXP_REPLACE

(REGEXP_REPLACE

(REGEXP_REPLACE

(REGEXP_REPLACE

(REGEXP_REPLACE

(REGEXP_REPLACE

(REGEXP_REPLACE

(REGEXP_REPLACE

(REGEXP_REPLACE

(REGEXP_REPLACE

(REGEXP_REPLACE

(REGEXP_REPLACE

(REGEXP_REPLACE

(REGEXP_REPLACE

(REGEXP_REPLACE

(REGEXP_REPLACE

(REGEXP_REPLACE

(REGEXP_REPLACE

(REGEXP_REPLACE

(REGEXP_REPLACE

(REGEXP_REPLACE([Email Body], '\<.*?\>', "")

,'&nbsp',"")

,'&lt',"")

,'&gt',"")

,';',"")

,'&amp',"&")

,'Subject:',"Subject:"),

'&quot','"'),

'To:',"To:"),

'From:',"From:"),

"\{.*?\}",""),

"v\\:\*",""),

"o\\:\*",""),

"w\\:\*",""),

'.shape',""),

"div.*","Code prevents text display"),

"\@font-face",""),

"/*[a-zA-Z]*/",""),

"*[a-zA-Z]*",""),

"p.MsoNormal",""),

"li.MsoNormal",""),

"/*Font Definitions/*",""),

"/*Style Definitions/*","")

 

But as Tableau doesnt use the same language as Power Bi this won't work anymore. How could I clean my data in Power Bi ?

 

Kind regards,

 

Nicolas

@TerrificPoet,

You can use SUBSTITUTE function in Power BI Desktop, there is an example for your reference.

Column = SUBSTITUTE(SUBSTITUTE(Table1[Email Body],"\<.*?\>",""),"div*","Code prevents text display")

1.PNG

Regards,
Lydia

Community Support Team _ Lydia Zhang
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Hi @v-yuezhe-msft,

 

Thank you for your answer but i forgot to mention something in my earlier post ...

it might be difficult to see at first but the function i posted earlier is a Regular expression therefore when for example there is this :  '\<.*?\>' it is not a string to replace but a code for multiple possible word here is a link to a Regular Expression Code Cheat Sheet that might help you understand the code : http://www.cbs.dtu.dk/courses/27610/regular-expressions-cheat-sheet-v2.pdf

 

In the  Regular expression the code for multiple words are contained between  ['] and the exact string to replace are contained between ["] which is slightly different.  

Helpful resources

Announcements
secondImage

Congratulations!

We are excited to announce the Power BI Super Users!

Wave Release 2

Check out the updates in Power BI.

Overview of Power BI 2020 release wave 2!

Get Ready for Power BI Dev Camp

Get Ready for Power BI Dev Camp

Mark your calendars and join us for our next Power BI Dev Camp!.

secondImage

The largest Power BI virtual conference

100+ sessions, 100+ speakers, Product managers, MVPs, and experts. All about Power BI. Attend online or watch the recordings.

Top Solution Authors
Top Kudoed Authors