Reply
Frequent Visitor
Posts: 15
Registered: ‎12-22-2015
Accepted Solution

Using SQL Server with Nonclustered Columnstore Index

[ Edited ]

Hi everyone,

 

I'm trying to load data from a SQL Server 2017 table with a Nonclustered Columstore Index.
I'm primarly using a columnstore index due to the compression benefits. The underyling rowstore table is >10gb in size (making refreshes in the web service impossbile) and greatly benefits from the columnstore compression (~800mb in the columnstore index).


I've previously sucessfully tested this setup with a clustered columnstore index (CCI); Web service refreshes worked and refresh times were unchanged, great! However, our usecases requires about 1 mil row updates per hour (queryied by ID) which results in terrible write performance on a CCI table as I can't create a clustered index on the primary key.
As storage is not an issue for us, I figured a Nonclustered Columnstore Index (NCCI) would be the silver bullet, giving good update performance whilst still benefitting from the compression in PowerBI.

However, when I try to load an NCCI table in PowerBI, the query does not seem to use the columnstore but rather the rowstore (according to SQL Server execution plan as well as service refresh errors due to table size).

 

I could come up with 2 workaround:

1. Force usage of NCCI via query hints. Suboptimal as I would like to keep our PowerBI team from having to write raw SQL. Also, not sure if this is possible to do on a "connection-level" / in a way that preserves query folding functionalities.

2. Creating seperate tables for the rowstore and CCI and regularly rebuilding the CCI off of the rowstore. This seems rather ugly as the entire CCI would have to be rebuild every time (instead of the delta being merged as is the case with an NCCI). Besides its hackyness this would be a workable solution depending on the cost of a CCI rebuild; Haven't tested that yet.

Does anyone else here have experience in using NCCIs with PowerBI? Is there some setting I'm missing?
Also, the issue applies to both direct query and import mode. Direct Query is however not really an option for us. We'd migrate to SSAS before we go to direct query.

Thanks a lot!


Accepted Solutions
Frequent Visitor
Posts: 15
Registered: ‎12-22-2015

Re: Using SQL Server with Nonclustered Columnstore Index

Forgot to update with the solution:
I had messed up the creation of the NCCI index and not included 2 columns in it that PowerBI was querying.
The SQL server therefore had to, of course, use the rowstore. It worked fine after fixing the index.

View solution in original post


All Replies
Moderator
Posts: 9,540
Registered: ‎03-10-2016

Re: Using SQL Server with Nonclustered Columnstore Index

@MoorsTech,

What connection mode do you choose to connect to SQL Server in Power BI currently, import or directquery? What detailed error message do you get in Power BI Service?

Regards,
Lydia

Community Support Team _ Lydia Zhang
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Frequent Visitor
Posts: 15
Registered: ‎12-22-2015

Re: Using SQL Server with Nonclustered Columnstore Index

[ Edited ]

@v-yuezhe-msft

I've tried both directquery and import with identical results. However, as previously stated, directquery is not actually an option for us.

The error message is just the standard error for SQL Server tables that are too large:
"The amount of uncompressed data on the gateway client has exceeded the limit of 10 GB for a single table. Please consider reducing the use of highly repetitive strings values through normalized keys, removing unused columns, or upgrading to Power BI Premium."

This is totally expected as the table in question, when using a standard rowstore scheme, is over 10gb in size.

 

My issue is that, when using a CCI, I can circumvent this issue as the CCI compresses this 10gb table to 800mb. I've tested this and it works as expected. But when connecting to an NCCI table, PowerBI does not explicitly ask to use the NCCI table and SQL Server then, for some reason, decides to "optimize" the query and use the rowstore table.
I can force the SQL Server to use the NCCI like this:

SELECT x from Fact with(index=ncci)

However that requires writing raw SQL in PowerBI and thereby disables query folding.
A solution that I'd be happy with would ultimately require either:

 

  • Some way to specify in PowerBI that it should append "with(index=ncci)" to all queries
  • Some user/connection/table level property to be configured on the SQL server that specifies the prefferential usage of the NCCI

 

Frequent Visitor
Posts: 15
Registered: ‎12-22-2015

Re: Using SQL Server with Nonclustered Columnstore Index

Forgot to update with the solution:
I had messed up the creation of the NCCI index and not included 2 columns in it that PowerBI was querying.
The SQL server therefore had to, of course, use the rowstore. It worked fine after fixing the index.