Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi, we have configured a gateway with a data source to AWS Athena. We are using Dataflow to connect to the AWS Athena dsn and then connecting to Dataflow in Power BI Desktop to build visuals. Initially, we found in Navigator that the "Table is Empty" . Per Microfoft's documentation, we found that this issue will arise because data has not been loaded to the table. When a dataflow is initially created, a refresh must occur to load data. When we selected "refresh now", it took about 30 minutes to load 302k rows from Athena. This seems fairly long considering the fact that we can query this same table in Athena in mere seconds.
Has anyone else encountered this same issue? Is this an expected behavior? Thank you very much.
@v-janeyg-msft Hi, I apologize, I'm a bit confused. The refresh / initial load was successful. However, the concern is that it took 30 minutes to load 302k rows. This seems fairly long considering the fact that we can query this same table in Athena in mere seconds. Is this common? Is it expected behavior?
Hi, @catcollector
You should know that query and loading are not the same.Loading involves the process of connecting to your data source and then loading the entire model data.It is related to the complexity of the data model, so the time may be longer. It is recommended that you directly try to connect to the data source in the desktop, which should shorten a part of the time.
Best Regards
Janey Guo
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
@v-janeyg-msft Can you explain what you mean by directly connecting to the data source in Power BI desktop? Do you mean that we should use the ODBC connector in Desktop as opposed to building a dataflow in Service to connect to Athena using an ODBC connector? TIA!
Hi, @catcollector
Yes,you can directly use ODBC connector to connect to the data source in powerbi desktop,then if you publish your report to PBI service,you need to set up a gataway to refresh.
You can download ODBC Driver here.
Follow the steps in these documents to install and use it in PBI desktop.
Install and Configuration:
How To Connect Microsoft Power BI to Amazon Athena | by Thomas Spicer | Openbridge
Set up refresh for gateway:
Connect Microsoft Power BI to Amazon Athena | by Gaurav Jain | Medium
Best Regards
Janey Guo
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hi, @catcollector
It’s my pleasure to answer for you.
According to your description,You first connect the data source in the service with the data stream, and then connect the data source in the service in the desktop.
Is it slow to refresh data using refresh in the service or slow to load data in desktop?
You can try to directly connect and load data directly in the desktop.Maybe it will improve.
Links about:
Connecting to Amazon Athena with ODBC - Amazon Athena
If it doesn’t solve your problem, please feel free to ask me.
Best Regards
Janey Guo
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
@v-janeyg-msft Hi, thank you for your response. We are connecting to the data using Dataflow (in Power BI server). However, you have to load data in the initial creation of the dataflow (refresh now option). We're finding that this "refresh" is taking 30+ minutes to load 302k rows. Unsure if this is common?
Hi, @catcollector
Does the first refresh take half an hour or is it needed every time, and can the refresh succeed in the end?It shouldn't be so long,can you share information about refresh history of the dataflow?
If it doesn’t solve your problem, please feel free to ask me.
Best Regards
Janey Guo
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
@v-janeyg-msft It is the first refresh after creating a dataflow that is taking half an hour to load 302k records. This is our first experience with Dataflow, so I don't have a whole lot of experience to go off of. I've read that after the first refresh/initial load, we should consider incremental refresh to prevent from a complete reload each time. This is the approach we plan to take.
Will you let me know what information you're seeking related to the refresh history? Admittedly, I am not an experience Power BI user, I am the IT partner helping our business partners. If you can explain what you're looking for, I can certainly ask our business partners to provide the necessary information.
Thank you!
Hi, @catcollector
Here are links about incremental refresh:
Incremental refresh in Power BI - Power BI | Microsoft Docs
Using incremental refresh with dataflows | Microsoft Docs
And you can view refresh history in PBI Service---datafolw---setting----refresh history,then check information like duration and status.
If it doesn’t solve your problem, please feel free to ask me.
Best Regards
Janey Guo
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.