Schwadenfeld
Helper I

Web Scraping from Amazon - Bypass Cookies - Can't access data because of cookies

let
    Source = Excel.Workbook(File.Contents("C:\Users\tombo\OneDrive\Schwadenfeld®\4) Reporting\3) Tierhood\Business Reports 2 Month.xlsx"), null, true),
    #"2 Month_Sheet" = Source{[Item="2 Month",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(#"2 Month_Sheet", [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Date", type date}, {"(Parent) ASIN", type text}, {"(Child) ASIN", type text}, {"Title", type text}, {"Sessions - Total", Int64.Type}, {"Sessions – Total – B2B", Int64.Type}, {"Session Percentage - Total", type number}, {"Session Percentage – Total – B2B", type number}, {"Page Views - Total", Int64.Type}, {"Page Views – Total – B2B", Int64.Type}, {"Page Views Percentage - Total", type number}, {"Page Views Percentage – Total – B2B", type number}, {"Featured Offer (Buy Box) Percentage", type number}, {"Featured Offer (Buy Box) Percentage – B2B", Int64.Type}, {"Units ordered", Int64.Type}, {"Units ordered - B2B", Int64.Type}, {"Unit session percentage", type number}, {"Unit session percentage - B2B", Int64.Type}, {"Ordered product sales", type number}, {"Ordered product sales - B2B", type number}, {"Total order items", Int64.Type}, {"Total order items - B2B", Int64.Type}}),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Title", "Sessions - Total", "Sessions – Total – B2B", "Session Percentage - Total", "Session Percentage – Total – B2B", "Page Views - Total", "Page Views – Total – B2B", "Page Views Percentage - Total", "Page Views Percentage – Total – B2B", "Featured Offer (Buy Box) Percentage", "Featured Offer (Buy Box) Percentage – B2B", "Units ordered", "Units ordered - B2B", "Unit session percentage", "Unit session percentage - B2B", "Ordered product sales", "Ordered product sales - B2B", "Total order items", "Total order items - B2B"}),
    #"Added Custom" = Table.AddColumn(#"Removed Columns", "Custom", each Text.Combine({"https://www.amazon.de/dp/",[#"(Child) ASIN"]},"" )),
    #"Sorted Rows" = Table.Sort(#"Added Custom",{{"Custom", Order.Ascending}}),
    #"Removed Columns1" = Table.RemoveColumns(#"Sorted Rows",{"Date"}),
    #"Removed Duplicates" = Table.Distinct(#"Removed Columns1"),
    #"Filtered Rows" = Table.SelectRows(#"Removed Duplicates", each ([#"(Child) ASIN"] = "B083V8LST6" or [#"(Child) ASIN"] = "B08DR1PSJ7")),
    #"Added Custom1" = Table.AddColumn(#"Filtered Rows", "Custom.1", each Web.BrowserContents([Custom])),
    #"Added Custom2" = Table.AddColumn(#"Added Custom1", "Custom.2", each Html.Table([Custom.1], {{"text", ":root"}}))
in
    #"Added Custom2"

Hi!

I have attached my Power BI Desktop file.

I'm trying to pull data from the Amazon website via multiple unique URLs.

Example website:
https://www.amazon.de/gp/product/B08FMQ6NLH/ref=s9_acss_bw_cg_brgift_3b1_w?pf_rd_m=A3JWKAKR8XB7XF&pf...

The query runs, but I can't access the data because I get an error:
Screenshot 2023-01-31 at 16.27.39.png

I have attached the Power BI file. One query shows the desired outcome. In the other I'm trying to replicate that, but I want to drive it from a table that changes all the time, so I can't just paste the same URL every time.
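For illustration, the "changing table" idea could be sketched roughly like this: wrap the page fetch in a function and call it once per row. This is an untested sketch, not the attached solution. `GetPageText`, `Urls`, and `PageText` are hypothetical names; the `:root` selector and the two ASINs are taken from the query above, and the inline `Urls` table only stands in for the Excel-derived step.

```powerquery
let
    // Hypothetical helper: loads one page in the embedded browser and
    // returns the text matched by the ":root" selector used in the query above.
    GetPageText = (url as text) as nullable text =>
        let
            PageHtml = Web.BrowserContents(url),
            Extracted = Html.Table(PageHtml, {{"text", ":root"}})
        in
            Extracted{0}[text],

    // Stand-in for the always-changing table; in the real query this would be
    // the step that builds the "Custom" URL column from the Excel sheet.
    Urls = Table.FromRecords({
        [Url = "https://www.amazon.de/dp/B083V8LST6"],
        [Url = "https://www.amazon.de/dp/B08DR1PSJ7"]
    }),

    // try ... otherwise null keeps one failing URL from breaking the whole refresh.
    WithText = Table.AddColumn(
        Urls,
        "PageText",
        each try GetPageText([Url]) otherwise null,
        type nullable text
    )
in
    WithText
```

Because the URL comes from the row, rows can be added to or removed from the source table without touching the query itself.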

https://drive.google.com/drive/folders/1uZTih8Eqb6DRD7rU9NguoV7rhVM80lXX

Looking forward to your help, thanks!

Schwadenfeld
Helper I

I was able to solve it like this: https://drive.google.com/drive/folders/1uZTih8Eqb6DRD7rU9NguoV7rhVM80lXX

Only question left: why does it sometimes pull the website in English and sometimes in German?
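One thing that might be worth trying, as an untested sketch: request the page with an explicit `Accept-Language` header. This assumes the site picks the page language from that header, which is not confirmed here. It also uses `Web.Contents` instead of `Web.BrowserContents`, since as far as I know only `Web.Contents` takes a `Headers` option, so the site may return different HTML (for example a bot check) than the browser-based call does. The ASIN is one of those from the original query.

```powerquery
let
    // URL pattern and ASIN taken from the original query.
    Url = "https://www.amazon.de/dp/B083V8LST6",

    // Assumption: the server honours Accept-Language when choosing the page language.
    Response = Web.Contents(Url, [Headers = [#"Accept-Language" = "de-DE,de;q=0.9"]]),

    // Web.Contents returns binary, so decode it before handing it to Html.Table.
    PageHtml = Text.FromBinary(Response, TextEncoding.Utf8),

    // Same ":root" selector as in the original query.
    Extracted = Html.Table(PageHtml, {{"text", ":root"}})
in
    Extracted
```

If the language still varies with the header set, the cause is probably somewhere else, for example cookies or location-based redirects.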
