cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
apmulhearn
Helper I
Helper I

Advice - Best Way to Import Large quantities of data when I can only pull a certain number at a time

Hello,

 

I need help knowing the best way to pull a large quantity of historical data into a new Power BI report.

I have to pull Email Event Activity out of HubSpot, and then join that Activity with another table from HubSpot to get the campaign name. Each week's worth of email activity can be about 200,000 rows of data alone. 


I find that I'm able to pull about 15 days at a time to get the query to load into the report. So, I so far have 8 queries of about 200,000 rows of data each. Those rows contain the same data, but for different date ranges. 

The first of those queries is already joined to the Campaign table. My original thought was that I would Insert a step and append all of the additional queries, then turn off their abiltiy to refresh. However, several hours later, I am still not even seeing the data load into the model. It is simply "waiting for other queries." 

Is there a better way to import all of this data and make it run faster?


This is the primary query:
let
Source = CdataHubspot.DataSource("CData Power BI HubSpot", null, [Query="SELECT Id, RecipientEmail, CampaignID, Type, MAX(CreatedAt)

FROM EmailCampaignEvents

WHERE Type in ('CLICK', 'OPEN', 'DELIVERED') and CreatedAt > '2/28/2022' AND CreatedAt < '3/08/2022' GROUP BY CampaignID, RecipientEmail, Type, Id"]),
#"Merged Queries" = Table.NestedJoin(Source, {"CampaignID"}, HSEmails, {"AllEmailCampaignIDs"}, "HSEmails", JoinKind.LeftOuter),
#"Expanded HSEmails" = Table.ExpandTableColumn(#"Merged Queries", "HSEmails", {"CampaignName", "Name", "Audience", "CampaignYYMM"}, {"HSEmails.CampaignName", "HSEmails.Name", "HSEmails.Audience", "HSEmails.CampaignYYMM"})
in
#"Expanded HSEmails"

And from the picture below, each of queries 1-7 is approximately 200k rows of data with the same type of info as the first part of that main query.

apmulhearn_0-1647271165391.png

 

1 ACCEPTED SOLUTION

Hi @apmulhearn .  Try the following YouTube video to understand how to create a function:

How to create functions in Power Query - YouTube

Custom Functions Made Easy in Power BI - YouTube

Write Your First CUSTOM M FUNCTION in Power BI - YouTube

 

In your example, the custom function needs to receive the date as the parameter.  This date is converted to text to be included in the Query String.  I suggest using a step to prepare the string then include the string into the CData function.  It would look something like this:

(#"Date Parameter" as date) => 
let
#"Date Text" = Date.ToText( #"Date Parameter" , "mm/dd/yyy"),
#"Make Query String" = "Select blah blah" & #"Date Text" & "blah blah(" & #"Date Text" & ")",
#"Call CData" =  CdataHubspot.DataSource("CData Power BI HubSpot", null, [Query= #"Make Query String" ])
in
#"Call CData"

The result of the above should be a Table.  This table can be expanded.

View solution in original post

3 REPLIES 3
Daryl-Lynch-Bzy
Solution Sage
Solution Sage

Hi @apmulhearn , I see you are using the CData connector, so I am going to rule out using Dataflows.  Please consider the following suggestions and ideas:

  1. Don't use the "Merge" in Power Query - There is a simple join so you can leave this until the data is available in the Data Model.
  2. Start with Date Table + Extract Function - Convert the CData call into a function that can be used on Table with Date column (e.g. from 1/1/2022 to 31/12/2022).  The function will call results for each individual CreatedAt date.  Essentially you will create a Query String for each day and call them one by one.  You will then expand the results into a table (i.e. effective append them together). 
  3. Incremental Refresh - you can consider turn on incremental refresh because the Start and End date will  apply to the Date column.  Power BI will create partition for dates to store History and when you hit refresh it should only grab the last couple of days.

I hope this helps.

 

Hi Daryl - I've not yet worked with functions, so I've been trying to do some reading since your response to get a little bit better of an understanding.  I haven't found a good resource yet, though. Do you have a recommendation?  When I click on the query and select "Create Function," I get a message that I haven't set parameters. I don't really know where to start - and if I need to start over, it is ok. If you have any suggestion for a quality post on adding data this way, or if you have the time and interest to dumb your guidance down a little bit for me, I will be greatly appreciative.

Hi @apmulhearn .  Try the following YouTube video to understand how to create a function:

How to create functions in Power Query - YouTube

Custom Functions Made Easy in Power BI - YouTube

Write Your First CUSTOM M FUNCTION in Power BI - YouTube

 

In your example, the custom function needs to receive the date as the parameter.  This date is converted to text to be included in the Query String.  I suggest using a step to prepare the string then include the string into the CData function.  It would look something like this:

(#"Date Parameter" as date) => 
let
#"Date Text" = Date.ToText( #"Date Parameter" , "mm/dd/yyy"),
#"Make Query String" = "Select blah blah" & #"Date Text" & "blah blah(" & #"Date Text" & ")",
#"Call CData" =  CdataHubspot.DataSource("CData Power BI HubSpot", null, [Query= #"Make Query String" ])
in
#"Call CData"

The result of the above should be a Table.  This table can be expanded.

Helpful resources

Announcements
June 2022 update 768X460.jpg

Check it out!

Click here to read more about the June 2022 updates!

Power Platform Conf 2022 768x460.jpg

Join us for Microsoft Power Platform Conference

The first Microsoft-sponsored Power Platform Conference is coming in September. 100+ speakers, 150+ sessions, and what's new and next for Power Platform.

Power BI Dev Camp Session 23 768x460.jpg

Check it Out!

Mark your calendars and join us on Thursday, June 30 at 11a PDT for a great session with Ted Pattison!

June 20 episode 7 with aka link 768x460.jpg

The Power BI Community Show

Join us on June 20 at 11 am PDT when Kim Manis shares the latest on Azure Synapse analytics, the Microsoft Intelligent Data Platform, and notable Power BI Updates from Microsoft Build 2022.

Top Solution Authors