Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!
I have a table which contains millions of rows - basically for each date since 2011, each product in my company, and each customer we have, there is an attribute.
The data source (analysis services database) presents the date in the form "1st January 2011" and unfortunately I cannot change this ... My aim is to only pull in data relating to the latest date available in this field. I have two problems that I suspect are connected when trying to achieve this.
The first one is that Power Query gives me an error saying that it can't parse this as a date (because of the Ordinal Indicators "st" "nd" "rd" "th") and defaults to treating the dates as Text. My solution to this was to do a simple find & replace on the ordinal indicator ("st" "nd" "rd" "th"), and then change type to date - this was a success. I then used the inbuilt options in the Power Query interface to filter to the latest date. Technically this solves my problem, but unfortunately, because of the table being quite large, and my solution being inefficient (i.e. I am performing operations on rows prior to discarding more than 99% of them), the performance is dreadful and it really takes a long time to pull through.
It made me think that I need to find a different way of dealing with the Ordinal Indicators ("st" "nd" "rd" "th") from the outset - otherwise I will always be performing a large number of transformations prior to filtering which is bad practice. Does anyone know how best to get this awkward formatting to be correctly interpretted as a date field by Power Query?
Thanks so much for your help in advance.
Best wishes,
Laurence
When you were replacing the "th","st" how did you get it to not pull from the month?
Example:
August 25th 2019 --> Augu 25 2019
Hi @lefinalzugzwang,
Based on my research, you could refer to below blog to have a test, but it also will effect your performence in query editor.
Reference:https://wmfexcel.com/2017/11/11/data-transformation-with-power-query/
Regards,
Daniel He
Hi Daniel, thanks for your reply.
This is basically what I was doing ... it did give me another idea though, which was to try:
Unfortunately the performance doesn't seem to have improved by this. It takes much more time "evaluating", "loading data to model", and then takes more time per row (albeit with a lesser number of rows) - the net result is it is similarly slow to execute.
This makes me wonder whether it is actually the filtering that is causing the performance issue? Is it normal for filtering rows by date to be such an intensive task?
Thanks for your time.
Best wishes,
Laurence
Hi @lefinalzugzwang,
Could you please tell me if your problem has been solved? If it is, could you please mark the helpful replies as Answered?
Regards,
Daniel He
Hi Daniel,
Unfortunately I am still suffering with the issues, and haven't yet found a solution based on the best practices.
Thanks for your help nevertheless.
Regards,
Laurence
Hi @lefinalzugzwang,
Based on my research, I am afraid you could avoid the performence problem, you could refer to below link to know how to improve the performence:
http://blog.pragmaticworks.com/power-bi-performance-tips-and-techniques
https://docs.microsoft.com/en-us/power-bi/power-bi-reports-performance
Regards,
Daniel He
User | Count |
---|---|
130 | |
108 | |
101 | |
67 | |
62 |
User | Count |
---|---|
137 | |
115 | |
102 | |
71 | |
61 |