cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
gulrog Frequent Visitor
Frequent Visitor

Azure DocumentDB and slow refresh

I'm testing PowerBi and the azure documentdb datasource. It works fine, but when I try to Refresh the data it takes forever. I have 500 000 records in this database, and it uses around 15 minutes to refresh. In a production environment it would be 25 million records in this database and I would like it to update as often as possible (every 1 hour it seems), but with this refresh speed it would not finnish an update within 1 hour. Am I doing someting wrong? I was thinking that PowerBi should handle large amount of data, but it looks like every time I update it fetches all records from the database again (even if none of them has changed).

1 ACCEPTED SOLUTION

Accepted Solutions
ankitpatira Super Contributor
Super Contributor

Re: Azure DocumentDB and slow refresh

@gulrog Below is the screenshot of data refresh rate in power bi based on subscription. Even if you have Pro it is going to take a lot to refresh 25 million records. Data refresh in power bi is not delta so each time you refresh it gets all the data.

 

I would suggest you look at limiting your data you import. Also power bi has direct query feature that doesn't import but connects live to data source. DirectQuery is available with azure sql db and azure sql dw but i don't think it is yet available with document db since it is still in beta. You can submit that as an idea to get it prioritised.

 

Capture.PNG

4 REPLIES 4
ankitpatira Super Contributor
Super Contributor

Re: Azure DocumentDB and slow refresh

@gulrog Below is the screenshot of data refresh rate in power bi based on subscription. Even if you have Pro it is going to take a lot to refresh 25 million records. Data refresh in power bi is not delta so each time you refresh it gets all the data.

 

I would suggest you look at limiting your data you import. Also power bi has direct query feature that doesn't import but connects live to data source. DirectQuery is available with azure sql db and azure sql dw but i don't think it is yet available with document db since it is still in beta. You can submit that as an idea to get it prioritised.

 

Capture.PNG

gulrog Frequent Visitor
Frequent Visitor

Re: Azure DocumentDB and slow refresh

Hmm... I get a feeling that the whole product is in beta. I was first trying to use the REST API, but it turned out it was not possible to do something as simple as search out records from yesterday (because of missing DAX support). I then I then decided to use Azure stream analytics together with power bi to analyze realtime events and have an auto updating dashboard. It worked fine, but I also need ot analyze the historical data, so I tried with azure documentdb, but as it turns out it is not a good match with PowerBI, so how I see it now, only traditional sql server will work, but then I need to investigate a lot into tuning indexes and partitions to make it perform good enough with 25 million records. SQL DW could be an option, but I fear the learning curve is a bit steep since I don't know anything about DW. 

 

I read that the team behind PowerBI had a goal that people that see the product should Wow within 5 minutes. I really did, but sadly that Wow turned into a dissapointment when digging a bit deeper.

 

Anyhow - thank you for taking time to answer my question.

Highlighted
Anonymous
Not applicable

Re: Azure DocumentDB and slow refresh

Filtering DocumentDB data based on relative times at the source is possible with some tweaking:

 

1. Create a DocumentDB data source as normal. Do not enter any custom SQL.

2. Right click the query and select "Advanced Editor"

3. Immediately after the let line add in your custom query such as the example below. This uses the Power Query functions to construct a string on the fly that will be sent to the database. In this case it works out the date for 30 days ago and embeds that date and the current date into the string to get data for the last 30 days only. If done correctly you should see a new "querystring" step added under Query Settings.

 

let

   querystring = "SELECT * FROM yourdoc WHERE (yourdoc.datetimefield BETWEEN '" & DateTimeZone.ToText(DateTimeZone.UtcNow() - #duration(30,0,0,0),"yyyy-MM-ddTHH:mm:ss.000Z" ) & "' AND '" & DateTimeZone.ToText(DateTimeZone.UtcNow(),"yyyy-MM-ddTHH:mm:ss.000Z") & "')",

 

*** rest of original script ***

 

4. Under query settings go to the Source step. Should look something like:

 

= DocumentDB.Contents("https://yourserver.documents.azure.com", "yourdb", "yourcollection")

 

5. Add a 4th parameter to the end exactly as shown here-> [Query = querystring]

 

= DocumentDB.Contents("https://yourserver.documents.azure.com", "yourdb", "yourcollection", [Query = querystring])

 

PowerBI will now use a filtered query against DocumentDB

Kpham Visitor
Visitor

Re: Azure DocumentDB and slow refresh

Dear all, 

 

How should my query look like when I specifically need to get data where  "Document.LaneId"=1

 

 

 
let
    Source = DocumentDB.Contents("https://xxx-iot-service-data.documents.azure.com/"),
    iotreportdata = Source{[id="iotreportdata"]}[Collections],
    iotreportdata_xxxTestingAggregatedBy5Minutes =      
    iotreportdata{[db_id="iotreportdata",id="XXXTestingAggregatedBy5Minutes"]}[Documents],
    #"Expanded Document" = Table.ExpandRecordColumn(iotreportdata_XXXTestingAggregatedBy5Minutes, "Document", 
    #"Changed Type2" = Table.TransformColumnTypes(#"Renamed Columns",{{"Document.ProductsInRange", Int64.Type}, {"Time", type time}})
in
    #"Changed Type2"

 

Helpful resources

Announcements
Can You Solve These Challenge

Challenge: Can You Solve These?

Find out how to participate in the first Power BI 'Can You Solve These?' challenge.

Community News & Announcements

Community News & Announcements

Get your latest community news and announcements.

Virtual Launch Event

Microsoft Business Applications October Virtual Launch Event

Join us for an in-depth look at the new innovations across Dynamics 365 and the Microsoft Power Platform.

Community Kudopalooza

Win Power BI Swag with Community Kudopalooza!

Each week, complete activities and be qualified in the drawing for cool Power BI Swag.

Top Kudoed Authors
Users Online
Currently online: 6 members 3,309 guests
Please welcome our newest community members: