Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Anonymous
Not applicable

Power BI runs query multiple times on backend?

I'm developing a dashboard using Apache Hive as the backend storage (using an ODBC driver) and I've noticed something strange.

When I add a datasource, Power BI runs the query 3 times for each refresh. Some of these queries are quite complex on huge datasets, with many joins that can take > 15 minutes to complete. Having Power BI run the exact same query 3 times is causing issues.

What is Power BI doing, and is there any way to get it to not run the same query multiple times on each refresh?

I'm on Power BI version 2.47.4766.542 64-bit (June 2017)

33 REPLIES 33
pkoevesdi
Regular Visitor

I had the same problem with reading local excel files into Power Query. This helped me somehow:

https://docs.microsoft.com/en-us/power-query/multiple-queries

Freedom
Frequent Visitor

Same issue here. Running mysql connector and noticing multiple threads per single datasource. This causes huge performance issues especially when you have complex queries.

marcelgalema
New Member

I'm seeing this as well. I built an API that works perfectly fine in a browser, but when the call is executed as a web query it executes many times. I've seen > 10 queries being executed before the preview dialog pops up! 

 

This seems to be related to the query execution time as well. If PBI does not get an answer, it simply retries the query (every 15 secs or so). But some of my queries take more time than that and you can imagine what happens to my databases when the same heavy query is re-executed all the time.

 

I analyzed the actual requests and they are all the same, so I cannot 'filter' based on request info. I also tried to rewrite the query from select * from table to select 'all column names' from table. That is (of course) faster, but not always feasible (huge nr of columns).

 

Any ideas?

Hi there - I have the exact same issue and have tried to search for a solution, but it doesn't really seem that there is one, so I made a work-around which seems to work, but  it does involve a few steps.

 

In your backend:

- make a unique constraint on your tables primary_key (or another useful column) and a column with timestamp (or create a column if your don't have one), this ensures that the database only accepts one insert at a time.

 

In powerBI:

- make a dateTime column. If your query runs once a day simply take date with latest hour, in this way all the entries will be the same even if it takes a few minutes from insert to insert. powerBI will now throw an error because the database now has a unique constraint, but the error will only come at the second insert.

 

I use R to send data from powerBI to SQL andto avoid the error I have wrapped the insert_data into a tryCatch - see below.

 

# 'dataset' holds the input data for this script

# Install nescessaary package(s)

library(DBI)


# Connect to database

con <- dbConnect(odbc::odbc(), "odbc_writer", timeout = 10)


# Begin session

dbBegin(con)


# Skip to next if error is encounted (this WILL happen because poewrBI tries to send the data multiple times and the database table has been set up to be unique).

for (i in 1:10) { 

  skip_to_next <- FALSE

  # Append data to table - and skip by error

  tryCatch(dbAppendTable(con, "table", dataset, row.names = NULL), error = function(e) { skip_to_next <<- TRUE})

  if(skip_to_next) { next }     
}


# Commit what has been written

dbCommit(con)

 
# Disconnect data connection

dbDisconnect(con)

 

 

 

Anonymous
Not applicable

Hi Microsoft team !

 

What about this really important issue ?

It is not acceptable from customers not to have any answer or solution for this old issue, which is really problematic for us as several users run queries from PowerBi Desktop to our datasources (like Hive) and this become now a real performance issue for our platform.

We really need a solution from your team. Quickly.

Anonymous
Not applicable

Hi,

I am having the same problem with the connection to Big Query.

 

I have just received our bill for last month and have noticed that the queries I have run via Power BI are running 2 or 3 times and each run is costing the business as it seems it is saying they are new queries each time so it is not using the cached data from a previous run.

 

Has anyone had any responses on why this is happening and if so any recommendations to stop it

 

Anonymous
Not applicable

Hi @Anonymous  ! I didn't have any answer from Microsoft Power Bi Team unfortunately.

This problem is really critical and it is causing extra costs for your company and I don't understand why they are not answering us.

Several tickets are opened about the same issue but nobody moves ...

Anonymous
Not applicable

Hi,

 

I know we are getting extra costs as well.

 

I have raised a ticket via our helpdesk to Microsoft so hopefully if we keep bombarding them we might get somewhere.

 

If I hear anything will let you know if you could do the same

 

Anonymous
Not applicable

Hi,

 

I have been having an email conversation with the Power BI support team and first they said this was bug but now they are saying it is by design.

 

I have pushed it back to them to explain why they think this is acceptable

 

Will update when I know anything

Anonymous
Not applicable

@Anonymous Thank you for all those informations.

From my point of view, no answer from the Microsoft PowerBI Team.

Anonymous
Not applicable

Hi,

 

I have had numerous email converstaions and an actual call last Friday abou this.

 

In my case we are using Big Query and apparently by design PBI will run a query twice once for metadat and then for the actual data.

 

However the problem comes when you use an ODBC driver to connect ( I am using Simba as advised by Google) this then multiples the running of a query up by anothe two times so in all at least 4 times the query will  run, inmy case it runs a query anything up to 6 times .

 

I ran a test and the query runs multiple times if you make amends in power query , just openeing the PBIX file. Just seting up the PBIX adding a conditional column in PQ aloing with a custom column closing reopening and refreshing in total a simple query ran 22 times in all...not good.

 

We have pushed back to t hem now to sort out, theu did suggest to use there Big Query API but the problem is it runs the whole tables in and even if you set up views in Big Query it brings them in stages so back to multiple queries.

 

Its back with them now 😞

 

Anonymous
Not applicable

@Anonymous   do you have any updates on this?  We are running into a similar issue but ours is with an Azure Managed Instance.  

Anonymous
Not applicable

We are having the same problem. We run a complicated query with multiple joins. We're connecting to a Vertica DB running on AWS. We connect via ODBC. Running the query multiple times (concurrently!) means that the data base runs out of memory. It also costs us money unnecessarily. 

Hello BIDrone, did you find a way to avoid this?

I am having the same major issue.

¿Did you find a way to solve this?

cbailiss
Advocate II
Advocate II

The repeated running of the same query is a pain that affects multiple sources.

I raised an idea:

https://ideas.powerbi.com/forums/265200-power-bi-ideas/suggestions/36713977-stop-executing-queries-m...

If you feel strongly, vote for it.

Anonymous
Not applicable

Is this expected behavior for any ODBC connection? I am running into this issue with a Redbrick database also.

sirwinny
New Member

This is simply unacceptable. This is the definition of death by a thousand cuts. We have over 60 dashboards spread across 3 servers and that's still not enough to off load the issue this is creating. If a dashboard is pulling in 5 tables thats 10 connections and executions to pull one dashboard.

 

In the examply below I ran a trace on the dashboard refresh and you can clearly see the query running twice with a connection reset inbetween. It's impossible to negate the negative effect of this many connections even to multiple servers.

 

 sql profile trace of power bi refresh

The Problem occurs when Power BI issues a Hive Query something like 'SELECT * FROM DATE' but if imported as the table it uses only one connection. Hope this helps

 

Thanks

 

Anonymous
Not applicable

Hi,

 

This is because of you connection is established directly to your dataset. Microsoft recomends to build tabular model, so that your data will be stored in olap engine and will throw results alone to your PBI Reports.

Anonymous
Not applicable

I'm not sure I fully understand. The way I'm pulling data (and I bet the others in the tread are) is to hook up via ODBC to hive and run an import on a table or a query that results in a "tablular" dataset. As far as Power BI should be concerned, it's the same output as hooking up to an oracle table using ODBC, or any other database. 

 

Is the tabular model something different then an output that looks like a table (column/rows)?

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors