Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
ejstutevil
New Member

Help Automatically Paginating Through Data

Hello all, 

Full disclosure here, this is for a class, however it is my senior project and this is a program that the company plans on using after it is finished. The project is to create a dashboard displaying some filtered data that is valuable to the company, in this case about cars who have been sitting in one place for too long. Filtering through it is not really the issue here. They don't have this data going back to a database, so the best bet to get live data is to pull it in with an API. Unfortunately, all of my group and I do not have any experience with APIs, or pulling data using them. The maximum entries we can pull at a time is 100, while we need to pull multiple data sets that have anywhere between 500 to thousands of entries. I think our best solution is having a query to automatically going through the pages and collection them in a table. I found a video on YouTube, Pagination and DO/WHILE in Power BI/Power Query by The Power User, that seemed like a good solution to my problem. However, when I've tried to implement it into my query, through Power Bi, the query ran for a very long time and while it did collect the data as I needed, it was all in the first 5 results while the query went on to collect 1,000 empty records. I'm not quite sure where I went wrong in the code, although I believe I need a different way to check if there's data on the page. Here's the code, although it's essentially what's done in the video:

my GetData:

= (page as number) as list =>
let
    Source = Json.Document(Web.Contents("https://secure.fleetio.com/api/v1/vehicles/?page=" & Number.ToText(page), 
    [Headers=[Authorization="this_is_an_api_key", #"Account-Token"="additional_authentication"]])),
    Data1 = Source{1}[Data],
    RemoveBottom = Table.RemoveLastN(Data1, 3)
in
    Source

my query:

= List.Generate( ()=>
         [Result= try GetData(1) otherwise null, Page =1],
       each [Result] <> null,
       each [Result = try GetData([Page]+1) otherwise null, Page =[Page]+1],
       each [Result]
)

I feel like the answer might be super obvious, but this is my first time seeing this language and I'm a bit lost. I would greatly appreciate any help, or even tips to point me in the right direction would be amazing.

Thanks

1 ACCEPTED SOLUTION
v-yuta-msft
Community Support
Community Support

@ejstutevil ,

 

It seems like your code is from this blog, right? You can also refer to blogs below to try different ways to scrape multiple pages using power query:

https://www.myonlinetraininghub.com/scrape-data-multiple-web-pages-power-query

https://adatis.co.uk/Loop-through-Multiple-Web-Pages-using-Power-Query/

 

However, if still can't work, I think this issue should be the website side has set some anti-spider rules. Unfortunately, power query does not support like User-Agent section in web referencing. So I would suggest you use python instead.

 

Community Support Team _ Jimmy Tao

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

1 REPLY 1
v-yuta-msft
Community Support
Community Support

@ejstutevil ,

 

It seems like your code is from this blog, right? You can also refer to blogs below to try different ways to scrape multiple pages using power query:

https://www.myonlinetraininghub.com/scrape-data-multiple-web-pages-power-query

https://adatis.co.uk/Loop-through-Multiple-Web-Pages-using-Power-Query/

 

However, if still can't work, I think this issue should be the website side has set some anti-spider rules. Unfortunately, power query does not support like User-Agent section in web referencing. So I would suggest you use python instead.

 

Community Support Team _ Jimmy Tao

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.