I am new to Power BI, but over the last month I have done extensive research on it. I would like some suggestions on how to implement a use case.
I have a large flow of JSON data that can be fetched from a REST API, about 2.5 million records per day. I have tried to solve this in various ways:
Using the Power BI push APIs: but there is a cap of a maximum of 5 million rows per table for data pushed through these APIs.
Using a native connector: but custom REST APIs don't support incremental refresh, so every time I refresh I would have to fetch all of those records again, which would be very inefficient. Also, I think Power BI Desktop allows a maximum of 1 GB of data.
So, the above approaches, which I found promising, won't work. Is there a way that I am unaware of to fulfil such a use case?
Basically, I want to fetch a large amount of JSON data through an API and store it in a Power BI dataset.
Also, is there any alternative way such that I can push my JSON data to intermediate storage, and fetch from there in PowerBI?
I am stuck at this point, and any help would be highly appreciated.
AFAIK, a Power BI Desktop report file does not have a limit on the amount of data; the practical limit relates to your hard disk storage and system memory (system memory is used to process and calculate the DAX formulas in your report). The Power BI service currently has a 1 GB dataset size limit for published reports; it can be increased if you upgrade to a suitable Premium license.
For large data sources from a REST API, I'd suggest you store the data to a local file and create a batch script or scheduled job to auto-sync it to a local database (e.g. SQL Server, MySQL, Excel, Access). After these steps, you can simply get data from the synced database without huge refresh/processing times.
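A minimal Python sketch of that sync job, pulling JSON from the API and upserting into a local SQLite database that Power BI can then import from. The API URL, the `id`/`ts` field names, and `fetch_page()` are assumptions you would adapt to your actual API schema (SQLite stands in for whichever local database you pick):

```python
# Sync job sketch: pull JSON records from a REST API and upsert them into
# a local SQLite database; Power BI then imports from the database instead
# of hitting the raw API on every refresh.
import json
import sqlite3
import urllib.request

API_URL = "https://example.com/api/records"  # hypothetical endpoint


def fetch_page(start_date: str, end_date: str) -> list[dict]:
    """Fetch one date window of records from the (hypothetical) API."""
    url = f"{API_URL}?startDate={start_date}&endDate={end_date}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def ensure_table(conn: sqlite3.Connection) -> None:
    conn.execute(
        """CREATE TABLE IF NOT EXISTS records (
               id INTEGER PRIMARY KEY,   -- assumed unique key in the JSON
               ts TEXT NOT NULL,         -- assumed timestamp field
               payload TEXT NOT NULL     -- full JSON row, kept verbatim
           )"""
    )


def upsert_records(conn: sqlite3.Connection, records: list[dict]) -> int:
    """INSERT OR REPLACE on the key, so re-fetching an overlapping date
    window never duplicates rows. Returns the total row count."""
    conn.executemany(
        "INSERT OR REPLACE INTO records (id, ts, payload) VALUES (?, ?, ?)",
        [(r["id"], r["ts"], json.dumps(r)) for r in records],
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]
```

You could schedule this with cron or Windows Task Scheduler; Power BI then connects to the database (via ODBC for SQLite, or natively if you use SQL Server/MySQL) with a fast refresh.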
>>Also, is there any alternative way such that I can push my JSON data to intermediate storage, and fetch from there in PowerBI?
I would not really recommend trying this method if you are not familiar with programming languages for API development.
You would need to write a program that pulls data from your REST API and caches it to a temp file (given the data amount, I would not recommend caching all records and processing them in system memory), then gets the last records from the Power BI push dataset, compares them with the cached temp-file records, and pushes only the latest records. (Since your API does not have an incremental refresh feature, you need to compare records manually.)
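A rough Python sketch of that manual-incremental flow, assuming the last pushed timestamp is tracked in a small local state file: the dataset ID, table name, `ts` field, and access token are placeholders you would fill in; the rows endpoint itself is the Power BI push-dataset REST API, which (as documented) caps each POST at 10,000 rows:

```python
# Manual incremental push sketch: remember the last timestamp already
# pushed, keep only newer records, and push them to the Power BI
# push-dataset rows endpoint in batches.
import json
import urllib.request
from pathlib import Path

STATE_FILE = Path("last_sync.json")  # tiny local state: last pushed timestamp
PUSH_URL = ("https://api.powerbi.com/v1.0/myorg/datasets/"
            "{dataset_id}/tables/{table}/rows")


def load_last_ts(default: str = "1970-01-01T00:00:00") -> str:
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text())["last_ts"]
    return default


def save_last_ts(ts: str) -> None:
    STATE_FILE.write_text(json.dumps({"last_ts": ts}))


def newer_than(records: list[dict], last_ts: str) -> list[dict]:
    """Keep only records strictly newer than the last pushed timestamp
    (ISO-8601 strings compare correctly as plain strings)."""
    return [r for r in records if r["ts"] > last_ts]


def chunks(rows: list[dict], size: int = 10_000):
    """Batch rows, since each POST to the rows endpoint is capped."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]


def push_rows(dataset_id: str, table: str, rows: list[dict], token: str) -> None:
    """POST one batch of rows to the push dataset (token acquisition via
    Azure AD is out of scope here)."""
    req = urllib.request.Request(
        PUSH_URL.format(dataset_id=dataset_id, table=table),
        data=json.dumps({"rows": rows}).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req)  # raises on HTTP errors
```

Each run would then be: `load_last_ts()`, fetch from the source API, filter with `newer_than()`, push each batch from `chunks()`, and `save_last_ts()` with the newest timestamp seen.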
So, as I understand it, one possible approach is to store the data from the API in a database. In that case, are there any limits on the maximum amount of data I can import from the database at a time in DirectQuery and Import modes?
So, the API I use has date range parameters, Start Date and End Date, but it is not a database. In this case, would it be possible to create a custom connector and implement incremental refresh in it? I am not sure whether incremental refresh is even possible with custom APIs.
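Since the API exposes Start Date / End Date parameters, one workable "incremental" pattern, with or without a custom connector, is to split the time axis into fixed windows and only fetch the windows newer than what is already stored. This minimal Python sketch just generates the windows; the one-day window size and date handling are assumptions:

```python
# Generate (start, end) date windows covering [start, end), so each
# scheduled run only requests the windows after the latest date already
# stored locally -- a poor man's incremental refresh for a date-ranged API.
from datetime import date, timedelta


def date_windows(start: date, end: date, days: int = 1):
    """Yield (window_start, window_end) pairs covering [start, end)."""
    cur = start
    step = timedelta(days=days)
    while cur < end:
        nxt = min(cur + step, end)
        yield cur, nxt
        cur = nxt
```

Each window maps onto one Start Date / End Date API call, so a failed run can resume from the last completed window instead of refetching everything.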