Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hello,
I'm pulling data into Power BI using the Elastic Stack REST API, but I'm running into a problem when trying to pull more than 10k records.
In order to do so, I need to use the scroll function: https://www.elastic.co/guide/en/elasticsearch/reference/master/search-request-scroll.html
The first call:
GET /_search?scroll=1m { "query": ... }
Will return the first set of results plus a scroll id:
DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4...
Which is used for subsequent calls:
GET /_search { "scroll_id": "DXF1Z..." }
Each call returns a set of records plus a new scroll id. The scroll id is then passed onto the next record, and so on.
For performance reasons, I don't want to increase the number of records that can be returned by the REST API.
Is it possible to solve this using Power BI?
Thanks,
Ken
Solved! Go to Solution.
There have been a few different threads like this but I believe the solution lies in creating a Power Query function that you call recursively. Here is an example of functions and recursion on a completely unrelated topic but might get you there:
Thanks for pointing me in the right direction.
For anyone else that runs into this problem, here's the first iteration of a working function:
let RecursiveElasticFetch = (queryUrl, scrollUrl, scrollId, counter) => let Counter = if (counter = null) then 0 else counter, Results = if (scrollId = null) then Json.Document(Web.Contents(queryUrl)) else Json.Document(Web.Contents(scrollUrl&scrollId, [Headers=[MyHeader=Text.From(Counter)]])), ParsedResults = Table.FromList(Results[hits][hits], Splitter.SplitByNothing(), null, null, ExtraValues.Error), Return = if (Counter < 10) then ParsedResults & RecursiveElasticFetch(queryUrl, scrollUrl, Results[_scroll_id], Counter+1) else ParsedResults in Return in RecursiveElasticFetch
For the parameters:
queryUrl - The query URL that kicks off the search: http://elasticsrch-dev:9200/intranet*/_search?scroll=1m&source={....}
scrollUrl - The URL used for subsequent searches (the scroll id is appended at the end): http://elasticsrch-dev:9200/_search/scroll?scroll=1m&scroll_id=
scrollId - Used to pass the scroll_id to subsequent calls. Leave it blank for the first call
counter - Used to limit the number of iterations. Leave it blank for the first call.
Hi there, I would like to recommend you a review of the melbetid casino app. What I liked is that Melbetid.app does its best to attract users with attractive promotions and bonuses. Frequent offers and rewards not only increase the excitement of the game but also reflect the platform's commitment to customer satisfaction. The generosity in providing bonuses when downloading the app demonstrates a genuine effort to create a rewarding experience for users.I highly recommend every gambling enthusiast to download this app.
Hey guys!
Does anyone here have an idea how to do this with an Elasticsearch composite query?
The scroll method worked, however due to the large number of records, it is cumbersome to pull the data.
in composite queries there is no scroll_id parameter, there is only after_key, which is key and value
Hi
im facing same issue
Hi ,
Im getting below error and tried to add config as provided below in the config file , not sure what is teh issue
An error occurred in the ‘RecursiveElasticFetch’ query. DataSource.Error: Web.Contents failed to get contents from 'https://monitor.sre.np.aws.csaa.pri/sre-availability-details/_search?scroll=1m' (406): Not Acceptable
Details:
DataSourceKind=Web
DataSourcePath=https://monitor.sre.np.aws.csaa.pri/sre-availability-details/_search
Url=https://monitor.sre.np.aws.csaa.pri/sre-availability-details/_search?scroll=1m
Also, I had this problem when using Self Signed certificates so you may need to allow that in PowerBi.
Re-posting here:
I believe the (406): Not Acceptable issue was because I was using a self-signed certificate.
I replaced the self-signed certificate with a much higher level signed certificate and that seemed to work locally and at the gateway level.
Otherwise, you have to tell PowerBi to trust self-signed certs:
Update the file C:\Program Files\Microsoft Power BI Desktop\bin\Microsoft.Mashup.Container.NetFX45.exe.config to:
<configuration> <system.net> <settings> <servicePointManager checkCertificateName="false" checkCertificateRevocationList="false" /> </settings> </system.net> </configuration>
Restart Power BI.
https://community.powerbi.com/t5/Power-Query/Self-Signed-Certificates-on-PowerBI/m-p/804777/highligh...
What version of Elasticsearch are you using?
Also, I added API Key functionality.
Query (you will need to modify this accordingly):
let
url = "https://CHANGEME:9200/weblogs/_search?scroll=1m",
scrollURL = "https://CHANGEME:9200/_search/scroll",
batchSize = "10000",
urlBody = Text.Combine({"{""size"":",batchSize,",""query"":{""match_all"":{}}}"}),
scrollBody = "{""scroll"":""1m"",""scroll_id"":""",
scrollID = null,
counter = null,
scrollEND = """}",
apiKey = "CHANGEME",
Source = RecursiveElasticFetch(url, scrollURL, batchSize, urlBody, scrollBody, scrollID, counter, scrollEND, apiKey),
#"Renamed Columns" = Table.RenameColumns(Source,{{"Column1", "es"}}),
#"Expanded es" = Table.ExpandRecordColumn(#"Renamed Columns", "es", {"_index", "_type", "_id", "_score", "_source"}, {"es._index", "es._type", "es._id", "es._score", "es._source"})
in
#"Expanded es"
Function:
(Right click on the query name on the left hand side and then click create function)
let
RecursiveElasticFetch = (url, scrollURL, batchSize, urlBody, scrollBody, scrollID, counter, scrollEND, apiKey) =>
let
Results = if (scrollID = null) then
//Initial Query
Json.Document(Web.Contents(url, [Headers=[#"Authorization"=Text.Combine({"ApiKey ",apiKey,""}), #"Content-Type"="application/json"], Content = Text.ToBinary(urlBody)]))
else
//All other queries execute this to gather results from the scroll api even if the scroll id changes.
Json.Document(Web.Contents(scrollURL, [Headers=[#"Authorization"=Text.Combine({"ApiKey ",apiKey,""}), #"Content-Type"="application/json", MyHeader=Text.From(counter)], Content = Text.ToBinary(scrollBody&scrollID&scrollEND)])),
//If this is the first time the function runs, the counter should be null so this will dynamically calcuate how many times this function needs to run.
counter = if (counter = null) then
//Dynamically get the counter - Note: You can uncomment the next line for testing and then comment out the Number.RoundUp
7
//Number.RoundUp(Results[hits][total][value]/Number.FromText(batchSize))
else
counter,
//Store the hits from the ElasticSearch query into ParsedResults and if results already exist, append more results to generate the full table of events.
ParsedResults = Table.FromList(Results[hits][hits], Splitter.SplitByNothing(), null, null, ExtraValues.Error),
Return = if (counter > 0) then
ParsedResults & RecursiveElasticFetch(url, scrollURL, batchSize, urlBody, scrollBody, Results[_scroll_id], counter-1, scrollEND, apiKey)
else
ParsedResults
in
Return
in
RecursiveElasticFetch
Where can I find the API key in elasticsearch on AWS?
I am not that familar with the cloud but you should be able to create an API key in Elastic by following this:
https://www.elastic.co/guide/en/elasticsearch/reference/current/security-api-create-api-key.html
You can use Kibana to run the requests and map them to the indices you want the API key to work for.
For example, if you want PowerBi to query winlogbeat-* you need to add that in the index names.
So the API Key is for Elasticsearch access instead of username/password.
Hi Dear all, Im currently working with this function but I need to connect at ElasticSearch server using credentials (Basic username:password) , Do you know if this is possible?
I could not get a good result 😞
thanks a lot for your help.
I haven't needed to do so yet.
I believe that the solutions on this page may help you though. They should how to pass headers through when making a web.contents call.
Thanks for you comment, I getting only a set of record not the whole dataset , Do you suggest me something?
There have been a few different threads like this but I believe the solution lies in creating a Power Query function that you call recursively. Here is an example of functions and recursion on a completely unrelated topic but might get you there:
Thanks for pointing me in the right direction.
For anyone else that runs into this problem, here's the first iteration of a working function:
let RecursiveElasticFetch = (queryUrl, scrollUrl, scrollId, counter) => let Counter = if (counter = null) then 0 else counter, Results = if (scrollId = null) then Json.Document(Web.Contents(queryUrl)) else Json.Document(Web.Contents(scrollUrl&scrollId, [Headers=[MyHeader=Text.From(Counter)]])), ParsedResults = Table.FromList(Results[hits][hits], Splitter.SplitByNothing(), null, null, ExtraValues.Error), Return = if (Counter < 10) then ParsedResults & RecursiveElasticFetch(queryUrl, scrollUrl, Results[_scroll_id], Counter+1) else ParsedResults in Return in RecursiveElasticFetch
For the parameters:
queryUrl - The query URL that kicks off the search: http://elasticsrch-dev:9200/intranet*/_search?scroll=1m&source={....}
scrollUrl - The URL used for subsequent searches (the scroll id is appended at the end): http://elasticsrch-dev:9200/_search/scroll?scroll=1m&scroll_id=
scrollId - Used to pass the scroll_id to subsequent calls. Leave it blank for the first call
counter - Used to limit the number of iterations. Leave it blank for the first call.
Hi
can you please describe how to integrate this on PBI?
Thanks
Hi
Thank you for this solution.
As it was in 2017, do you know of there is a new method to workaround this problem?
Does this script helps to ineract dynamically with the ELS data like as SQL Database?
Thanks
Regards
Hey @KenvM,
Thanks for sharing this iteration! Have you made any new changes? I'm having an issue after the second loop of this script. First call returns one set of records, second call returns a new set, third and all subsequent calls returns a repeat set of data (not sure if from first or second call, but I could verify if needed).
Have you encountered this yourself, and if so, have you found a solution?
Link to a more detailed description on the Elastic forums here: Elastic Forums
Thanks!
I looked at your example posted on the Elastic Search site, and I think the problem may be in this section:
## If function is called w/o a scrollID, then make the initial query, otherwise, query with a scrolling query Results = if (scrollID = null) then Json.Document(Web.Contents(url,[Content = Text.ToBinary(urlBody)])) else Json.Document(Web.Contents(scrollURL,[Content = Text.ToBinary(scrollBody&scrollID&scrollEND)])),
I believe you need to pass the scrollId as part of the query on subsequent calls. Otherwise, you get a new request each time.
Try:
## If function is called w/o a scrollID, then make the initial query, otherwise, query with a scrolling query Results = if (scrollID = null) then Json.Document(Web.Contents(url,[Content = Text.ToBinary(urlBody)])) else Json.Document(Web.Contents(scrollURL&scrollId,[Content = Text.ToBinary(scrollBody&scrollID&scrollEND)])),
Thanks for pointing that out! I actually am passing the scrollID with that calll, but I'm passing it as part of the body as opposed to the URL. Because the _scroll_id value passed back by Elasticsearch grows with the number of shards you have (if my understanding is correct), it can get crazy long. The one I'm returning is 13,873 characters long. Too long to send back as part of the URL. One forum post on that here.
The bolded part below is where I'm passing it.
## If function is called w/o a scrollID, then make the initial query, otherwise, query with a scrolling query Results = if (scrollID = null) then Json.Document(Web.Contents(url,[Content = Text.ToBinary(urlBody)])) else Json.Document(Web.Contents(scrollURL,[Content = Text.ToBinary(scrollBody&scrollID&scrollEND)])),
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
101 | |
49 | |
19 | |
12 | |
11 |