Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
I almost see the light at the end of the tunnel.
I am able to make a connection through an API call to my data source (Okta in this case) and was very excite until I looked at the data and noticed that I only have about 100 items.
From what I could find, it looks like powerbi is pulling only the first page, and I need it to pull all available pages (Please correct me if I'm wrong).
Below is the query I have (I have to hardcode the api, see - http://community.powerbi.com/t5/Integrations-with-Files-and/Power-BI-and-Okta-logs/m-p/318715#M14535)
let Source = Json.Document(Web.Contents("https://<omitted>.okta.com/api/v1/logs", [Headers=[Authorization="SSWS <omitted>z", ContentType="application/json"]])) in Source
and according to Okta's Documentation here I thnk I need to insert tel=next in my source. - https://developer.okta.com/docs/api/getting_started/design_principles.html#pagination
I could be completely wrong, and I need to do some kind of loop.
I am able to filter my queries but I only still get 100 results.
Any help or suggestions?
An easier way to do this might be to include a skip or skiptoken into your url. Please see this post for more details.
You can generate list of skiptoken values separated by the # of records returned (in your case 100), and then use that number column concatenated with your web call to get 100 records on each row, which you can then expand that column to get the combined results. You can also make a call up front to get the number of total records so you know where to stop your list of numbers/skiptokens.
Regards,
Pat
To learn more about Power BI, follow me on Twitter or subscribe on YouTube.
Hi @Anonymous,
Okta's API returns an HTTP response header called "Link" which contains a URL for the next page. It doesn't seem to be currently possible for Power Query to look at the HTTP "Link" response header. I tried the code below, and it does return some HTTP response headers, but not the "Link" header.
response = Web.Contents(url, [Headers = [Authorization = "SSWS " & token]]), headers = Value.Metadata(response)[Headers]
One quick workaround is to use a PowerShell script like:
https://github.com/mbegan/Okta-Scripts/blob/master/saveEventLogs.md
to export logs to a JSON file which can be consumed by Power Query.
An alternative is to write a custom Data Connector since that has the ability to see all HTTP response headers, eg:
https://github.com/gabrielsroka/gabrielsroka.github.io/blob/master/Okta.pq
Hi @Anonymous
Here's some proof-of-concept code that might help. It doesn't use the HTTP response headers for pagination (since they're only available in a custom Data Connector such as https://github.com/gabrielsroka/gabrielsroka.github.io/blob/master/Okta.pq), so it makes repeated calls to /logs. It works, but would need more work to make it respect rate limits and to wait, maybe using Function.InvokeAfter(), etc.
See the comments and variables below for more info.
let startDate = #date(2018, 11, 1), // Change this. baseUrl = "https://XXXXXXXXXXXXXXXXXXXXX.oktapreview.com", // Change this. token = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX", // Change this. iterations = 2, // Max rate limit for "/logs" is 120 calls per minute. limit = 2, // For "/logs", this must between 1 and 1000. Set it lower while debugging. logsUrl = orgUrl & "/api/v1/logs", headers = [ Authorization = "SSWS " & token, Accept = "application/json", #"Content-Type" = "application/json", #"User-Agent" = "excel-power-query/2016 Windows/10.0.17134.345" ], iso = "s", // Fetch 1 day at a time. If you have more than 1000 events per day, you'll // have to use a more granular time-window, eg, 1 hr at a time, or 1 minute at a time. GetByDateRange = (i) => let since = Date.ToText(startDate + #duration(i, 0, 0, 0), iso), until = Date.ToText(startDate + #duration(i + 1, 0, 0, 0), iso), query = [limit = Text.From(limit), since = since, until = until], logs = Json.Document(Web.Contents(logsUrl, [Headers = headers, Query = query])) in logs, GeneratedList = List.Generate( () => [i = 0, res = GetByDateRange(i)], each [i] < iterations, each [i = [i] + 1, res = GetByDateRange(i)], each [res]), table1 = Table.FromList(GeneratedList, Splitter.SplitByNothing(), {"log"}), table2 = Table.ExpandListColumn(table1, "log"), fieldNames = {"displayMessage", "eventType", "published", "severity"}, Logs = Table.ExpandRecordColumn(table2, "log", fieldNames) in Logs
Hi @Anonymous
I spent some more time and I got a basic Custom Data Connector to work. It needs more work, but this should get you started.
This works with Power BI Desktop (as of now). Excel support may be added by Microsoft in the future.
1. In your Documents folder, create a folder called "Power BI Desktop\Custom Connectors" (mine already existed).
2. Download the code from https://github.com/gabrielsroka/gabrielsroka.github.io/blob/master/Okta.pq to Okta.pq and save it in the "Custom Connectors" folder.
3. In Power BI Desktop, go to File | Options and settings | Options.
4. Go the Security tab.
5. Under Data Extensions, select Allow any extension to load without validation or warning.
6. Restart Power BI Desktop
See https://github.com/Microsoft/DataConnectors for more info.
One thing I found by researching and experimenting (that I didn't see in the documentation) is you can use the Okta.pq file, you don't have to create a .mez file (which is a Zip file). You can even edit the Okta.pq file while it's loaded in Power BI Desktop -- this makes development much easier.
Thank you, @gabrielsroka for sharing.
I am starting to work on a project that gets data from OKTA API.
I stumbled upon pagination and found your Custom Connector solution.
I am new to Custom Connectors, so before I begin, just wanted to make sure it is working for you.
Are there any updates since you published it? Is it ready to use for scheduling in the service?
Thank you
Your code works great for my purposes @gabrielsroka, many thanks for sharing.
I have this working perfectly combining multiple pages for when the API response looks like this:
However some areas of the API give me a response in the following slightly awkward format:
Ultimately this looks like this:
{
"column_definitions":[
{"ColumnName1":{"type":"Numeric","isNullable":false,"identityColumn":true}},
{"ColumnName2":{"type":"Text","isNullable":true,"identityColumn":false}}],
"row_values":
[[2,"TEST"], [2,"TEST"]] etc
}
I can manipulate this format within PowerBI Desktop to extract the data I need but I cannot get the paging to work correctly for this format. The problem appears to lie somewhere in your List.Generate function after getting the first page & assigning the link value as metadata.
The error I'm getting is "“We cannot convert a value of type Record to type List”. Any suggestions would be greatly appreciated.
Hi @Anonymous
Are you connecting to an Okta API? If so, which one? Can you provide a little more detail? Maybe a code sample...
@Anonymous,
You can follow the guide in the blog below to implement Cursor-Based Pagination with Power Query.
https://datachant.com/2016/06/27/cursor-based-pagination-power-query/
Regards,
Lydia
Good Morning,
I've got it partially working (Its intersting since i have to embed the api key in the url, which Ibelieve I got it working, but instead of getting 100 results, i know have 0, any thoughts?
let iterations = 10, // Number of iterations url = "https://xxxxxxxxxxxxx.okta.com/api/v1/logs, [Headers=[Authorization=""SSWS xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"", ContentType=""application/json""]]", FnGetOnePage = (url) as record => let Source = Json.Document(Web.Contents(url)), data = try Source[data] otherwise null, next = try Source[paging][next] otherwise null, res = [Data=data, Next=next] in res, GeneratedList = List.Generate( ()=>[i=0, res = FnGetOnePage(url)], each [i]<iterations and [res][Data]<>null, each [i=[i]+1, res = FnGetOnePage([res][Next])], each [res][Data]) in GeneratedList
Friend, I just put "URL=https:xxxx" without "Headers" and "Content-Type" and it works!
Maybe
Source = Json.Document(Web.Contents(url))
already solves about json
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.