Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Aweptimum
Helper II
Helper II

Optimizing DateTime in Power BI

I recently found a white paper on the the VertiPaq engine and downloaded DAX Studio to see the VertiPaq metrics for a large timeseries data-set I have (~60 million rows) down to the second.

 

It is imported from a PGSQL database and comes to ~500 MB. I was astounded that 73% of the ENTIRE thing is just the timestamp column (date-time), but it makes sense. Time is stored as a UNIX integer in the DB, but for plotting time-series it's better to have something human-readable.

 

What struck me though is that, if you split the date-time column between date and time, VertiPaq could easily build a dictionary for the date with run-length-encoding and ditto for the time. It would eliminate the uniqueness of the time-stamp column, massively reduce the size of the compressed time-series data, and yet still preserve the granularity.

 

I tried out splitting the column in two, but I wasn't able to plot it. Has anyone found a way to compress their timestamp column while still being able to plot it? Or is there a better plot somewhere in the visual store that could handle this?

 

Thanks!

 

Edit: sqlbi article recommending this very technique:
https://www.sqlbi.com/articles/optimizing-high-cardinality-columns-in-vertipaq/

2 REPLIES 2
Aweptimum
Helper II
Helper II

Sorry to necro this thread, I went off down the hole of Power BI custom visuals to see if they could offer a solution. It is possible to use D3/Plotly to format UNIX time as Date/Time without converting the underlying data, which is awesome. It is also entirely possible to knit stored Date and Time columns into Date/Time in a custom plot to enable compression in VertiPaq.

 

I myself used R and Plotly in an rhtml visual to make something that could support both. The performance of straight UNIX vs Date/Time knitting seems to be about the same, if a little slower on the knitting side. However, Plotly's scatter performance inside of Power BI seems to be the bottleneck. It takes ~3x as long to plot compared to a native Power BI line chart. An advantage Plotly has is that its built-in filtering doesn't require the visual to be updated, so it's a lot more responsive once rendered. I would recommend it if a Pro license is available to you.

 

If you don't have a Pro license, I don't recommend it - I don't think anyone wants to pay, nor should they pay, $10 a month just to plot UNIX data at 1/3 of the performance.

 

I believe the onus should be on Microsoft, not the end-user, to natively support UNIX formatting and Date/Time compression, and so I've submitted an idea here. ~20 people might see this, but at least it won't be implemented for lack of trying on our part.

 

Also, if you use Azure PostgreSQL, upvote this: Keep Timescale Updated

It's a DB extension that has a compression ratio of ~20:1 for timeseries data in version 1.7

Aweptimum
Helper II
Helper II

I posted this identical question in the Power BI subreddit, someone there pointed out it's possible to use drill-down to plot data based on two fields. The good news is that it worked, the bad news is that its fixed to categorical mode, so the plot only displays over a ~20 minute period. It is possible to scroll through the plot, but it's not practical, and I doubt anyone wants to scroll through a period of 30 days in such a manner.

 

If no one can offer an answer in the next few days, I'll submit a proposal for split date-time columns support to Power BI ideas and link it here. It would make storage of larger scale timeseries data to the second a lot more feasible in Power BI. (I think it's a shame it's not built into xVelocity/VertiPaq Engine though)

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.