Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
b2wise
Helper III
Helper III

How to model price scrape data for repricing algorithm and historical analysis

Hi all,

 

I know the final report I want to build but I don't know how to go about it. Any help GREATLY appreciated!!

 

The data I'm working with is a scrape of a website where vendors compete to be the lowest price to 'win the buy box' (similar to amazon).

 

My company (Daisy) is scraping the prices for every item daily. I receive a daily scrape with about 200,000 rows daily. I am working on a repricing algorithm to adjust our prices based on the most recent lowest competitor price (and some other criteria).

 

Here is a sample.

 

DateItemVendorQty Break Price 
5/8/20228Daisy1                                     230.00
5/8/20228Carolina Dreaming1                                     360.00
5/8/20228Denways1                                     133.00
5/8/20226Daisy1                                     376.00
5/8/20226Carolina Dreaming1                                     336.00
5/8/20226Denways1                                     125.00
5/8/20227Daisy1                                     395.00
5/8/20227Carolina Dreaming1                                     235.00
5/8/20227Denways1                                     190.00
5/9/20227Daisy1                                     314.00
5/9/20227Carolina Dreaming1                                     300.00
5/9/20227Denways1                                     261.00
5/9/20226Daisy1                                     254.00
5/9/20226Carolina Dreaming1                                     387.00
5/9/20226Denways1                                     360.00
5/9/20228Daisy1                                     319.00
5/9/20228Carolina Dreaming1                                     140.00
5/9/20228Denways1                                     390.00
5/10/20226Daisy1                                     246.00
5/10/20226Carolina Dreaming1                                     226.00
5/10/20226Denways1                                     337.00
5/10/20227Daisy1                                     178.00
5/10/20227Carolina Dreaming1                                     372.00
5/10/20227Denways1                                     341.00
5/10/20228Daisy1                                     306.00
5/10/20228Carolina Dreaming1                                     151.00
5/10/20228Denways1                                        13.00

 

What I am currently doing in my PBI report is overwriting the previous day's prices (excel data source) and basing our new price off of that day's data only. My table visual looks like this (different price data than sample above):

 

ItemScrape DateLowest Competitor(s) NameLowest Competitor PriceCurrent Daisy PriceDaisy Winning?Daisy New Price
65/10/2022Denways$32.41$32.38Winning$32.40
75/10/2022Denways$48.42$48.40Winning$48.41
85/10/2022Carolina Dreaming$65.75$65.94Losing$65.74

 

 

 

The problem with that is two fold:

  1. I cannot look back in time at historical competitor pricing for my algorithm (Lowest competitor price for item last 30 days).
  2. I cannot show historical metrics such as 'Buy box ownership % for the last 7 days', 'Unique items listed for the last 30 days'.

My questions:

  1.  Where do I store 18,000,000 rows? (90 days of price history)
  2. How do I structure my data so that I am able to make the following visual and other similar visuals that can calculate winning vendor or price and show it as a percentage over time?

For this visual it would need to calculate the winner for each item first and divide the items won by total items for each day...

b2wise_0-1652198949350.png

 

Thank you in advance!!

 

 

 

 

 

1 REPLY 1
davehus
Memorable Member
Memorable Member

Hi @b2wise ,

 

The most efficient way to do this would be to build a table in an SQL database (if available) or equivalent and create an integration process to impor the data daily. It would not be practical to try and manage this volume of data in any other way.

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.