Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
KenvM
Frequent Visitor

Getting Data from Elastic Stack with REST with Scroll

Hello,

 

I'm pulling data into Power BI using the Elastic Stack REST API, but I'm running into a problem when trying to pull more than 10k records.

 

In order to do so, I need to use the scroll function: https://www.elastic.co/guide/en/elasticsearch/reference/master/search-request-scroll.html

 

The first call:

 

 

GET /_search?scroll=1m 
{ "query": ... }

Will return the first set of results plus a scroll id: 

 

DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4...

 

Which is used for subsequent calls:

 

 

GET /_search {
  "scroll_id": "DXF1Z..."
}

 

 

Each call returns a set of records plus a new scroll id. The scroll id is then passed onto the next record, and so on. 

 

For performance reasons, I don't want to increase the number of records that can be returned by the REST API.

 

Is it possible to solve this using Power BI?


Thanks,


Ken

 

2 ACCEPTED SOLUTIONS
Greg_Deckler
Super User
Super User

There have been a few different threads like this but I believe the solution lies in creating a Power Query function that you call recursively. Here is an example of functions and recursion on a completely unrelated topic but might get you there:

 

https://community.powerbi.com/t5/Community-Blog/Using-Recursion-to-Solve-Hex-to-Decimal-Conversion/b...

 


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

View solution in original post

Thanks for pointing me in the right direction.

 

For anyone else that runs into this problem, here's the first iteration of a working function: 

 

let
    RecursiveElasticFetch = (queryUrl, scrollUrl, scrollId, counter) =>

    let
        Counter = if (counter = null) then 0 else counter,

        Results = if (scrollId = null) then            
            Json.Document(Web.Contents(queryUrl))
        else
            Json.Document(Web.Contents(scrollUrl&scrollId, [Headers=[MyHeader=Text.From(Counter)]])),

        ParsedResults = Table.FromList(Results[hits][hits], Splitter.SplitByNothing(), null, null, ExtraValues.Error),

        Return = if (Counter < 10) then
            ParsedResults & RecursiveElasticFetch(queryUrl, scrollUrl, Results[_scroll_id], Counter+1)
        else
            ParsedResults 
    in
        Return 
in
    RecursiveElasticFetch

For the parameters:

 

queryUrl - The query URL that kicks off the search: http://elasticsrch-dev:9200/intranet*/_search?scroll=1m&source={....}

scrollUrl - The URL used for subsequent searches (the scroll id is appended at the end): http://elasticsrch-dev:9200/_search/scroll?scroll=1m&scroll_id=

scrollId - Used to pass the scroll_id to subsequent calls. Leave it blank for the first call

counter - Used to limit the number of iterations. Leave it blank for the first call. 

 

View solution in original post

40 REPLIES 40
nazaretalaragon
New Member

Hi there, I would like to recommend you a review of the melbetid casino app. What I liked is that Melbetid.app does its best to attract users with attractive promotions and bonuses. Frequent offers and rewards not only increase the excitement of the game but also reflect the platform's commitment to customer satisfaction. The generosity in providing bonuses when downloading the app demonstrates a genuine effort to create a rewarding experience for users.I highly recommend every gambling enthusiast to download this app.

fmooura
Regular Visitor

Hey guys!
Does anyone here have an idea how to do this with an Elasticsearch composite query?

The scroll method worked, however due to the large number of records, it is cumbersome to pull the data.

in composite queries there is no scroll_id parameter, there is only after_key, which is key and value

Maggi029
Helper II
Helper II

Hi 

 

im facing same issue

This is the error I get.

An error occurred in the ‘RecursiveElasticFetch’ query. DataSource.Error: Web.Contents failed to get contents from 'https://redacted:9200/index-*stats*/_search?scroll=1m' (406): Not Acceptable
Details:
DataSourceKind=Web
DataSourcePath=https://redacted:9200/filebeat-*stats*/_search
Url=https://redacted:9200/index-*stats*/_search?scroll=1m
 
Any idea how to resolve this?

Hi ,

 

Im getting below error and tried to add config as provided below in the config file , not sure what is teh issue 

An error occurred in the ‘RecursiveElasticFetch’ query. DataSource.Error: Web.Contents failed to get contents from 'https://monitor.sre.np.aws.csaa.pri/sre-availability-details/_search?scroll=1m' (406): Not Acceptable
Details:
DataSourceKind=Web
DataSourcePath=https://monitor.sre.np.aws.csaa.pri/sre-availability-details/_search
Url=https://monitor.sre.np.aws.csaa.pri/sre-availability-details/_search?scroll=1m

Also, I had this problem when using Self Signed certificates so you may need to allow that in PowerBi.

Re-posting here:

I believe the (406): Not Acceptable issue was because I was using a self-signed certificate.

 

I replaced the self-signed certificate with a much higher level signed certificate and that seemed to work locally and at the gateway level. 

 

Otherwise, you have to tell PowerBi to trust self-signed certs:

 

Update the file C:\Program Files\Microsoft Power BI Desktop\bin\Microsoft.Mashup.Container.NetFX45.exe.config to:

 

<configuration>
  <system.net>
    <settings>
      <servicePointManager
        checkCertificateName="false"
        checkCertificateRevocationList="false"         
      />
    </settings>
  </system.net>
</configuration>

Restart Power BI.

https://community.powerbi.com/t5/Power-Query/Self-Signed-Certificates-on-PowerBI/m-p/804777/highligh...

What version of Elasticsearch are you using?

Hi @nicpenning 

 

Im usig elastic search 7.14.0 version

nicpenning
Helper I
Helper I

Also, I added API Key functionality.

 

Query (you will need to modify this accordingly):

 

let
    url = "https://CHANGEME:9200/weblogs/_search?scroll=1m",
    scrollURL = "https://CHANGEME:9200/_search/scroll",
    batchSize = "10000",
    urlBody = Text.Combine({"{""size"":",batchSize,",""query"":{""match_all"":{}}}"}),
    scrollBody = "{""scroll"":""1m"",""scroll_id"":""",
    scrollID = null,
    counter = null,
    scrollEND = """}",
    apiKey = "CHANGEME",
    Source = RecursiveElasticFetch(url, scrollURL, batchSize, urlBody, scrollBody, scrollID, counter, scrollEND, apiKey),
    #"Renamed Columns" = Table.RenameColumns(Source,{{"Column1", "es"}}),
    #"Expanded es" = Table.ExpandRecordColumn(#"Renamed Columns", "es", {"_index", "_type", "_id", "_score", "_source"}, {"es._index", "es._type", "es._id", "es._score", "es._source"})
in
    #"Expanded es"

 

 

  

Function:

(Right click on the query name on the left hand side and then click create function)

 

let
    RecursiveElasticFetch = (url, scrollURL, batchSize, urlBody, scrollBody, scrollID, counter, scrollEND, apiKey) =>

    let
        Results = if (scrollID = null) then
            //Initial Query
            Json.Document(Web.Contents(url, [Headers=[#"Authorization"=Text.Combine({"ApiKey ",apiKey,""}), #"Content-Type"="application/json"], Content = Text.ToBinary(urlBody)]))
        else
            //All other queries execute this to gather results from the scroll api even if the scroll id changes.
            Json.Document(Web.Contents(scrollURL, [Headers=[#"Authorization"=Text.Combine({"ApiKey ",apiKey,""}), #"Content-Type"="application/json", MyHeader=Text.From(counter)], Content = Text.ToBinary(scrollBody&scrollID&scrollEND)])),

        //If this is the first time the function runs, the counter should be null so this will dynamically calcuate how many times this function needs to run.
        counter = if (counter = null) then
            //Dynamically get the counter - Note: You can uncomment the next line for testing and then comment out the Number.RoundUp
            7
            //Number.RoundUp(Results[hits][total][value]/Number.FromText(batchSize))
        else
            counter,

        //Store the hits from the ElasticSearch query into ParsedResults and if results already exist, append more results to generate the full table of events.
        ParsedResults = Table.FromList(Results[hits][hits], Splitter.SplitByNothing(), null, null, ExtraValues.Error),

        Return = if (counter > 0) then
            ParsedResults & RecursiveElasticFetch(url, scrollURL, batchSize, urlBody, scrollBody, Results[_scroll_id], counter-1, scrollEND, apiKey)
        else
            ParsedResults
    in
        Return
in
    RecursiveElasticFetch

 

 

Where can I find the API key in elasticsearch on AWS?

I am not that familar with the cloud but you should be able to create an API key in Elastic by following this:

https://www.elastic.co/guide/en/elasticsearch/reference/current/security-api-create-api-key.html

 

You can use Kibana to run the requests and map them to the indices you want the API key to work for.

 

For example, if you want PowerBi to query winlogbeat-* you need to add that in the index names.

 

So the API Key is for Elasticsearch access instead of username/password.

Anonymous
Not applicable

Hi Dear all, Im currently working with this function but I need to connect at ElasticSearch server using credentials (Basic username:password) , Do you know if this is possible?

 

I could not get a good result 😞

 

 

thanks a lot for your help.

I haven't needed to do so yet.

 

I believe that the solutions on this page may help you though. They should how to pass headers through when making a web.contents call.

 

https://community.powerbi.com/t5/Desktop/Issue-with-getting-data-via-API-with-bearer-token/td-p/1262...

 

Anonymous
Not applicable

Thanks for you comment, I getting only a set of record not the whole dataset , Do you suggest me something?

Greg_Deckler
Super User
Super User

There have been a few different threads like this but I believe the solution lies in creating a Power Query function that you call recursively. Here is an example of functions and recursion on a completely unrelated topic but might get you there:

 

https://community.powerbi.com/t5/Community-Blog/Using-Recursion-to-Solve-Hex-to-Decimal-Conversion/b...

 


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

Thanks for pointing me in the right direction.

 

For anyone else that runs into this problem, here's the first iteration of a working function: 

 

let
    RecursiveElasticFetch = (queryUrl, scrollUrl, scrollId, counter) =>

    let
        Counter = if (counter = null) then 0 else counter,

        Results = if (scrollId = null) then            
            Json.Document(Web.Contents(queryUrl))
        else
            Json.Document(Web.Contents(scrollUrl&scrollId, [Headers=[MyHeader=Text.From(Counter)]])),

        ParsedResults = Table.FromList(Results[hits][hits], Splitter.SplitByNothing(), null, null, ExtraValues.Error),

        Return = if (Counter < 10) then
            ParsedResults & RecursiveElasticFetch(queryUrl, scrollUrl, Results[_scroll_id], Counter+1)
        else
            ParsedResults 
    in
        Return 
in
    RecursiveElasticFetch

For the parameters:

 

queryUrl - The query URL that kicks off the search: http://elasticsrch-dev:9200/intranet*/_search?scroll=1m&source={....}

scrollUrl - The URL used for subsequent searches (the scroll id is appended at the end): http://elasticsrch-dev:9200/_search/scroll?scroll=1m&scroll_id=

scrollId - Used to pass the scroll_id to subsequent calls. Leave it blank for the first call

counter - Used to limit the number of iterations. Leave it blank for the first call. 

 

yamaga
Frequent Visitor

Hi

 

can you please describe how to integrate this on PBI?

 

Thanks

yamaga
Frequent Visitor

Hi

 

Thank you for this solution.

As it was in 2017, do you know of there is a new method to workaround this problem?

Does this script helps to ineract dynamically with the ELS data like as SQL Database?

 

Thanks

Regards

Hey @KenvM,

 

Thanks for sharing this iteration! Have you made any new changes? I'm having an issue after the second loop of this script. First call returns one set of records, second call returns a new set, third and all subsequent calls returns a repeat set of data (not sure if from first or second call, but I could verify if needed). 

Have you encountered this yourself, and if so, have you found a solution?

Link to a more detailed description on the Elastic forums here: Elastic Forums

Thanks!

KenvM
Frequent Visitor

I looked at your example posted on the Elastic Search site, and I think the problem may be in this section:

 

 

        ## If function is called w/o a scrollID, then make the initial query, otherwise, query with a scrolling query
        Results = if (scrollID = null) then
            Json.Document(Web.Contents(url,[Content = Text.ToBinary(urlBody)]))
        else
            Json.Document(Web.Contents(scrollURL,[Content = Text.ToBinary(scrollBody&scrollID&scrollEND)])),

 

 

I believe you need to pass the scrollId as part of the query on subsequent calls. Otherwise, you get a new request each time.

 

Try:

 

        ## If function is called w/o a scrollID, then make the initial query, otherwise, query with a scrolling query
        Results = if (scrollID = null) then
            Json.Document(Web.Contents(url,[Content = Text.ToBinary(urlBody)]))
        else
            Json.Document(Web.Contents(scrollURL&scrollId,[Content = Text.ToBinary(scrollBody&scrollID&scrollEND)])),

 

 

Thanks for pointing that out! I actually am passing the scrollID with that calll, but I'm passing it as part of the body as opposed to the URL. Because the _scroll_id value passed back by Elasticsearch grows with the number of shards you have (if my understanding is correct), it can get crazy long. The one I'm returning is 13,873 characters long. Too long to send back as part of the URL. One forum post on that here.

 

The bolded part below is where I'm passing it.

        ## If function is called w/o a scrollID, then make the initial query, otherwise, query with a scrolling query
        Results = if (scrollID = null) then
            Json.Document(Web.Contents(url,[Content = Text.ToBinary(urlBody)]))
        else
            Json.Document(Web.Contents(scrollURL,[Content = Text.ToBinary(scrollBody&scrollID&scrollEND)])),

 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors