cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
jchungfluentco
Frequent Visitor

Removing HTML Script from data in Query Editor

Hi everyone,

 

I'm trying to remove all text between < and > including the carrots from a column in Query Editor.

 

I thought about splitting the column with the ? being the delimiter, then creating a column with "?", and then concatenating but the first example wants shows that there are cells with styling before the question mark.

 

Thanks in advance!

 

Here are two examples:

 

Click <b>YES/b> if you watched the Super Bowl?

 

Did you watch the Super Bowl?<span style="font-size:14px !important; display: block;">Click YES if you did.</span>

<div id="disclaimer-1">You may be eligible for a free high five if you watched the Super Bowl.</div>

<style type="text/css" id="customCSS">
.Qid1 .answers{
position: relative;
top: -130px;
}

.Qid1 #disclaimer-54269{
position: relative;
top: 175px;
display: block;
width: 100%;
margin: auto;
text-align: justify;
font-size: 10px;
line-height: 16px;
color: #999;
}

@media all and (min-width: 600px){
.Qid1 .answers{
top: -120px;
}
.Qid1 #disclaimer-54269{
font-size: 12px;
line-height: 20px;
}
}

@media all and (min-width: 601px){
.Qid1 .answers{
top: -75px;
}
.Qid1 #disclaimer-54269{
top: 175px;
float: none;
display: block;
width: 100%;
}
}

@media all and (min-width: 900px){
.Qid1 #disclaimer-54269{
top: 85px;
}
}

</style>

 

 

1 ACCEPTED SOLUTION

I transformed it into a proper function here 🙂 https://github.com/ImkeF/M/blob/master/Library/Text.RemoveHtmlTags.pq

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

23 REPLIES 23
Greg_Deckler
Super User
Super User

Web.Page("your html") or this technique:

 

https://social.technet.microsoft.com/Forums/en-US/473b2c06-7845-4df7-846a-d350ee84e942/remove-html-t...

 

Maybe @ImkeF has some more tricks up her sleeve.


@ me in replies or I'll lose your thread!!!
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Learn Power BI 2nd Edition

I transformed it into a proper function here 🙂 https://github.com/ImkeF/M/blob/master/Library/Text.RemoveHtmlTags.pq

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Hi @ImkeF,

 

I am having the same issue with a field in a table I am pulling from Salesforce object. Would you be able tell me what I'm doing wrong when trying to implement your code? Are there other lines I would need to modify besides source? I am pasting it below the exsisting code in the query for this particular table and getting "Token Eof expected" error. Sorry if these are overly basic questions, I'm only just beginning to learn power query so any help you can offer is much appreciated!

 

let
    Source = Salesforce.Data("https://login.salesforce.com/", [ApiVersion=48, CreateNavigationProperties=true]),
    Claims__c = Source{[Name="Claims__c"]}[Data],
    #"Renamed Columns" = Table.RenameColumns(Claims__c,{{"Name", "Claim Number"}, {"Claim_Notice_Date__c", "Claim Notice Date"}, {"Loss_Type__c", "Loss Type"}, {"Status__c", "Status"}, {"Date_of_Binding__c", "Date of Binding"}, {"Claim_Type__c", "Claim Type"}}),
    #"Inserted Merged Column" = Table.AddColumn(#"Renamed Columns", "Merged", each Text.Combine({[Status], " - ", Text.Proper([Loss Type])}), type text),
    #"Removed Columns" = Table.RemoveColumns(#"Inserted Merged Column",{"Merged"}),
    #"Inserted Merged Column1" = Table.AddColumn(#"Removed Columns", "Merged", each Text.Combine({[Status], " - ", [Loss Type]}), type text),
    #"Added Conditional Column" = Table.AddColumn(#"Inserted Merged Column1", "Ststus & Loss Type", each if [Merged] = "Open - Third Party" then "Open 3rd" else if [Merged] = "Open - First Party" then "Open 1st" else if [Merged] = "Closed - Third Party" then "Closed 3rd" else if [Merged] = "Closed - First Party" then "Closed 1st" else null, type text),
    #"Renamed Columns1" = Table.RenameColumns(#"Added Conditional Column",{{"Ststus & Loss Type", "Status & Loss Type"}}),
    #"Removed Columns1" = Table.RemoveColumns(#"Renamed Columns1",{"Status & Loss Type", "Merged"}),
    #"Inserted Merged Column2" = Table.AddColumn(#"Removed Columns1", "Merged", each Text.Combine({[Status], " ", [Loss Type]}), type text),
    #"Sorted Rows" = Table.Sort(#"Inserted Merged Column2",{{"CreatedDate", Order.Descending}}),
    #"Renamed Columns2" = Table.RenameColumns(#"Sorted Rows",{{"Merged", "Status & Loss Type"}}),
    #"Added Custom" = Table.AddColumn(#"Renamed Columns2", "Has Reserve?", each if [ET_Loss_Reserve__c] = null or [ET_Loss_Reserve__c] = 0 then "No" else "Yes"),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom", each [ET_Loss_Reserve__c]+[Paid_Loss__c]+[Paid_ALAE__c]+[Paid_Expense__c]),
    #"Renamed Columns3" = Table.RenameColumns(#"Added Custom1",{{"Custom", "Known Exposure"}}),
    #"Removed Columns2" = Table.RemoveColumns(#"Renamed Columns3",{"Known Exposure"}),
    #"Inserted Sum" = Table.AddColumn(#"Removed Columns2", "Addition", each List.Sum({[Paid_Loss__c], [Paid_Expense__c], [Paid_ALAE__c], [ET_Loss_Reserve__c]}), type number),
    #"Renamed Columns4" = Table.RenameColumns(#"Inserted Sum",{{"Addition", "Known Exposure"}})
in
    #"Renamed Columns4"

let func = (HTML) =>
    let
        Check = if Value.Is(Value.FromText(HTML), type text) then HTML else "",
        Source = Salesforce.Data("https://login.salesforce.com/", [ApiVersion=48, CreateNavigationProperties=true]),
        SplitAny = Text.SplitAny(Source,"<>"),
        ListAlternate = List.Alternate(SplitAny,1,1,1),
        ListSelect = List.Select(ListAlternate, each _<>""),
        TextCombine = Text.Combine(ListSelect, "")
    in
        TextCombine, 
        documentation = [
            Documentation.Name =  " Text.RemoveHtmlTags"
            , Documentation.Description = "Remove Html Tags"
            , Documentation.LongDescription = " Removes all Html tags from a text"
            , Documentation.Category = " Text.Modification"
            , Documentation.Source = " Inspired by a solution from Bill Szysz"
            , Documentation.Author = " Imke Feldmann: www.TheBIccountant.com & Mike Carlo: PowerBI.Tips"
            , Documentation.Examples = {[
                Description = "Function that enables one to pass in a column that has HTML tags."
                , Code = "<div>my bit of text</div>"
                , Result = "my bit of text "
                ]}] 
 in 
    Value.ReplaceType(func, Value.ReplaceMetadata(Value.Type(func), documentation))

Many thanks ImkeF, documentation and instruction very helpful to execute.

This did it for me, thanks to ImkeF for the many posts to help complement instrucitons for making use of it!

bchager6
Continued Contributor
Continued Contributor

@ImkeF Can you please provide guidance on where and how to insert this code? I am assuming it's to be inserted somewhere within the existing code found within the Advanced Editor?

Hi @bchager6 ,

you simply create a new query for this function like described here: https://www.youtube.com/watch?v=6TQN6KPG74Q

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

bchager6
Continued Contributor
Continued Contributor

@ImkeF  OK, thank you. 

 

I am importing data fom ServiceNow and every column contains HTML markup. I included a snipped of the URL below.

Can your function transform all of the columns on import somehow?

 

let func = (HTML) =>
    let
        Check = if Value.Is(Value.FromText(HTML), type text) then HTML else "",
        Source = Json.Document(Web.Contents("https://dev.service-now.com/api/now/table/project?sysparm_fields=assumptions%2Cbarriers%2Cbenefits%2...")),
        SplitAny = Text.SplitAny(Source,"<>"),
        ListAlternate = List.Alternate(SplitAny,1,1,1),
        ListSelect = List.Select(ListAlternate, each _<>""),
        TextCombine = Text.Combine(ListSelect, "")
    in
        TextCombine,
        documentation = [
            Documentation.Name =  " Text.RemoveHtmlTags"
            , Documentation.Description = "Remove Html Tags"
            , Documentation.LongDescription = " Removes all Html tags from a text"
            , Documentation.Category = " Text.Modification"
            , Documentation.Source = " Inspired by a solution from Bill Szysz"
            , Documentation.Author = " Imke Feldmann: www.TheBIccountant.com & Mike Carlo: PowerBI.Tips"
            , Documentation.Examples = {[
                Description = "Function that enables one to pass in a column that has HTML tags."
                , Code = "<div>my bit of text</div>"
                , Result = "my bit of text "
                ]}]
 in
    Value.ReplaceType(func, Value.ReplaceMetadata(Value.Type(func), documentation))

Hi @bchager6 ,

pretty sure my code can do that, but not with your modifications.

Just to re-assure: You want to apply my function on all columns of a table?

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

bchager6
Continued Contributor
Continued Contributor

Yes, all columns.

just take the original code of my function and use it as a separate function called "fnRemoveHtmlTags".

Then go back to your table and add a new step with the following code:

 

Table.TransformColumns(<PreviousStepName>, List.Zip({Table.ColumnNames( <PreviousStepName> ), List.Repeat({fnRemoveHtmlTags}, List.Count(Table.ColumnNames( <PreviousStepName> )))}))

 

You have to replace <PreviousStepName> by the name of the previous step (!! it occours 3 times in the code, but I'm not able to bold them all...)

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Hi @bchager6 ,

did you manage to get it working? If not, this article might help you: https://www.thebiccountant.com/2019/12/18/advanced-transformation-multiple-columns-at-once-in-power-... 

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

bchager6
Continued Contributor
Continued Contributor

@ImkeF  Thank you for asking. Yes, it works perfectly in terms of removing all of the HTML (thank you!), but now I'm receiving an "OLE DB or ODBC error: Type mismatch. (Exception from HRESULT: 0x80020005 (DISP_E_TYPEMISMATCH))" message after applying the query changes. So I'm trying to figure out what's causing that. Not sure if it's related to this function in that it changes the data type and I have to handle that or what.

Hi @bchager6 

you can try an adjusted function that doesn't do the type-conversion:

 

 (Table as table, Function, optional ColumnNames as list) =>

let

    columnNames = if ColumnNames = null then Table.ColumnNames(Table) else ColumnNames,

    Transformation = Table.TransformColumns( Table, List.Transform(columnNames, each {_, Function} ) )

in

    Transformation

 

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

bchager6
Continued Contributor
Continued Contributor

How would that change the line I added to my Advanced Editor for the original function you wrote?

 

#"Remove HTML Markup" = Table.TransformColumns(#"Extracted Text Between Delimiters", List.Zip({Table.ColumnNames(#"Extracted Text Between Delimiters" ), List.Repeat({fnRemoveHtmlTags}, List.Count(Table.ColumnNames(#"Extracted Text Between Delimiters" )))}))

That function didn't include any type conversion, so it cannot be the cause for it. 

Unfortunately I have no idea what causes it. 

Maybe you try using the other function instead.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

bchager6
Continued Contributor
Continued Contributor

@ImkeF  I went back to my report and ensured that all of the data types matched what was in Power Query Editor (in this case it's all text data from ServiceNow) and the Type Mismatch error disappeared.

 

Thanks for all of your help!

Awesome , you made my day very easy and simple. Thank you so much...

I am connecting to a SharePoint Online List.  Is there a way to use this code to remove the HTML Script for the data when I load?  Thanks.

Hello my friend!

 

I´m having the same issue. Did You get a solution?

 

Thanks.

 

Dan

Helpful resources

Announcements
Power Platform Conf 2022 768x460.jpg

Join us for Microsoft Power Platform Conference

The first Microsoft-sponsored Power Platform Conference is coming in September. 100+ speakers, 150+ sessions, and what's new and next for Power Platform.

Power BI Dev Camp Session 23 768x460.jpg

Check it Out!

Mark your calendars and join us on Thursday, June 30 at 11a PDT for a great session with Ted Pattison!

Top Solution Authors
Top Kudoed Authors