nirmit27
Helper II

Best way to remove HTML tags in M or DAX from very large text

Hi,

What is the best way to remove HTML tags from a very large text field in M or DAX?

Currently I am using Html.Table, but my table contains over 20 lakh (2 million) records with very large text (exceeding 300 characters), and it takes more than half an hour to load after I click "Close & Apply". It's very frustrating to wait this long for a simple modification. Can it be moved to DAX? Any ideas on how to do it?

 

Thanks

Nirmit

1 ACCEPTED SOLUTION
v-yangliu-msft
Community Support

Hi  @nirmit27 ,

I created some data:

[Screenshot of the sample data; image not available]

Here are the steps you can follow:

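1. Create a calculated column.

Column =
// "<tag>text</tag>" becomes "<tag|text|tag|"; item 2 of that "|"-delimited path is the text.
// Assumes each value holds a single tag pair.
PATHITEM(
    SUBSTITUTE(SUBSTITUTE([Html], ">", "|"), "</", "|"),
    2,
    TEXT
)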

2. Result:

[Screenshot of the result; image not available]

 

Best Regards,

Liu Yang

If this post helps, please consider accepting it as the solution to help other members find it more quickly.

v-yangliu-msft
Community Support

Hi  @nirmit27 ,

 

Here are some related threads you can look at:

https://community.powerbi.com/t5/Desktop/Removing-HTML-Script-from-data-in-Query-Editor/td-p/351995

https://community.powerbi.com/t5/Power-Query/Removing-HTML-tags-complex/m-p/1221727

https://community.powerbi.com/t5/Power-Query/Removing-HTML-tags-and-reordering-text/m-p/1900057
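
For illustration only, here is a minimal sketch of one Power Query approach that avoids Html.Table by using plain text functions. It is not taken from the threads above, and the function name StripTags is hypothetical. It simply drops everything between a "<" and the next ">", so it does not decode entities such as &amp;amp; and it would also swallow a literal "<" that is later followed by a ">". Whether it is actually faster than Html.Table on 2 million rows would need to be measured.

```powerquery
// Hypothetical helper (the name StripTags is an assumption, not from the thread).
// Removes anything between "<" and the next ">" using only text/list functions.
let
    StripTags = (html as nullable text) as nullable text =>
        if html = null then
            null
        else
            let
                // Text before the first "<" is plain text;
                // every later piece starts inside a tag.
                Pieces = Text.Split(html, "<"),
                // For each later piece, keep only what follows the closing ">".
                // A piece without ">" was not a tag, so its "<" is restored.
                Kept = List.Transform(
                    List.Skip(Pieces, 1),
                    each if Text.Contains(_, ">")
                        then Text.AfterDelimiter(_, ">")
                        else "<" & _
                ),
                Result = Text.Combine({Pieces{0}} & Kept)
            in
                Result
in
    StripTags
```

Assuming the query above is saved as StripTags, the previous step is called Source, and the column is named Html (as in the DAX example), it could be applied with Table.TransformColumns(Source, {{"Html", StripTags, type text}}). For an input such as "<p>Hello <b>world</b></p>" the function returns "Hello world".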

 

Best Regards,

Liu Yang


@v-yangliu-msft I have already tried Html.Table in M. These posts show more complicated ways of achieving the same result, so I am not sure they would be any faster. They are also older posts, probably from before the Html.Table function was released.

Any ideas on DAX? I am more keen on doing that to save time in loading up the queries.

parry2k
Super User

@nirmit27 have you tried Power BI dataflows? Just copy the queries from Desktop to a dataflow and see if it performs better. If you have Premium, turn on the enhanced compute engine for the dataflow and it should be much faster.

 


@parry2k we can't really move to the dataflow option because of the complexity involved. I will test it anyway and post an update here if I see any improvement. Thanks for replying.
Any ideas on DAX? I am more keen on doing that to save time in loading up the queries.
