cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
apmulhearn
Helper I
Helper I

Advice - Best Way to Import Large quantities of data when I can only pull a certain number at a time

Hello,

 

I need help knowing the best way to pull a large quantity of historical data into a new Power BI report.

I have to pull Email Event Activity out of HubSpot, and then join that Activity with another table from HubSpot to get the campaign name. Each week's worth of email activity can be about 200,000 rows of data alone. 


I find that I'm able to pull about 15 days at a time to get the query to load into the report. So, I so far have 8 queries of about 200,000 rows of data each. Those rows contain the same data, but for different date ranges. 

The first of those queries is already joined to the Campaign table. My original thought was that I would Insert a step and append all of the additional queries, then turn off their abiltiy to refresh. However, several hours later, I am still not even seeing the data load into the model. It is simply "waiting for other queries." 

Is there a better way to import all of this data and make it run faster?


This is the primary query:
let
Source = CdataHubspot.DataSource("CData Power BI HubSpot", null, [Query="SELECT Id, RecipientEmail, CampaignID, Type, MAX(CreatedAt)

FROM EmailCampaignEvents

WHERE Type in ('CLICK', 'OPEN', 'DELIVERED') and CreatedAt > '2/28/2022' AND CreatedAt < '3/08/2022' GROUP BY CampaignID, RecipientEmail, Type, Id"]),
#"Merged Queries" = Table.NestedJoin(Source, {"CampaignID"}, HSEmails, {"AllEmailCampaignIDs"}, "HSEmails", JoinKind.LeftOuter),
#"Expanded HSEmails" = Table.ExpandTableColumn(#"Merged Queries", "HSEmails", {"CampaignName", "Name", "Audience", "CampaignYYMM"}, {"HSEmails.CampaignName", "HSEmails.Name", "HSEmails.Audience", "HSEmails.CampaignYYMM"})
in
#"Expanded HSEmails"

And from the picture below, each of queries 1-7 is approximately 200k rows of data with the same type of info as the first part of that main query.

apmulhearn_0-1647271165391.png

 

1 ACCEPTED SOLUTION

Hi @apmulhearn .  Try the following YouTube video to understand how to create a function:

How to create functions in Power Query - YouTube

Custom Functions Made Easy in Power BI - YouTube

Write Your First CUSTOM M FUNCTION in Power BI - YouTube

 

In your example, the custom function needs to receive the date as the parameter.  This date is converted to text to be included in the Query String.  I suggest using a step to prepare the string then include the string into the CData function.  It would look something like this:

(#"Date Parameter" as date) => 
let
#"Date Text" = Date.ToText( #"Date Parameter" , "mm/dd/yyy"),
#"Make Query String" = "Select blah blah" & #"Date Text" & "blah blah(" & #"Date Text" & ")",
#"Call CData" =  CdataHubspot.DataSource("CData Power BI HubSpot", null, [Query= #"Make Query String" ])
in
#"Call CData"

The result of the above should be a Table.  This table can be expanded.

View solution in original post

4 REPLIES 4
BayardBradford
Frequent Visitor

@apmulhearn  There is a HubSpot Certified App that is available in the HubSpot App Marketplace for connecting HubSpot to Power BI.  Because the app is a middleware solution that includes a data warehouse, it solves problems caused by HubSpot's API limitations.,  There is a free trial.

 

https://ecosystem.hubspot.com/marketplace/apps/marketing/analytics-data/powerbi

 

Here is a map of data flow from HubSpot to Power BI that the integration provides:

HubSpot Data           Data Flow      Power BI

Companies-->Companies
Contacts-->Contacts
Deals-->Deals
Engagements-->Engagements
Email Events-->Email Events
Tickets-->Tickets
Owners-->Owners
Products-->Products
Forms-->Forms
Companies-->Company Properties
Contacts-->Contact Properties
Associations-->CRM Associations
Deal Pipelines-->Deal Pipelines
Deals-->Deal Properties
CampaignEmailEvents-->Marketing Email
Custom Objects-->CRM Associations
Quotes-->CRM Associations
Contact Lists-->Contact Lists
Website Pages-->Analytics
Workflows-->Workflows

 

Here is a link to the documentation: https://support.datawarehouse.io/hc/en-us/articles/360006051874-Power-Bi-Connector-Manual

 

 

 

Daryl-Lynch-Bzy
Solution Sage
Solution Sage

Hi @apmulhearn , I see you are using the CData connector, so I am going to rule out using Dataflows.  Please consider the following suggestions and ideas:

  1. Don't use the "Merge" in Power Query - There is a simple join so you can leave this until the data is available in the Data Model.
  2. Start with Date Table + Extract Function - Convert the CData call into a function that can be used on Table with Date column (e.g. from 1/1/2022 to 31/12/2022).  The function will call results for each individual CreatedAt date.  Essentially you will create a Query String for each day and call them one by one.  You will then expand the results into a table (i.e. effective append them together). 
  3. Incremental Refresh - you can consider turn on incremental refresh because the Start and End date will  apply to the Date column.  Power BI will create partition for dates to store History and when you hit refresh it should only grab the last couple of days.

I hope this helps.

 

Hi Daryl - I've not yet worked with functions, so I've been trying to do some reading since your response to get a little bit better of an understanding.  I haven't found a good resource yet, though. Do you have a recommendation?  When I click on the query and select "Create Function," I get a message that I haven't set parameters. I don't really know where to start - and if I need to start over, it is ok. If you have any suggestion for a quality post on adding data this way, or if you have the time and interest to dumb your guidance down a little bit for me, I will be greatly appreciative.

Hi @apmulhearn .  Try the following YouTube video to understand how to create a function:

How to create functions in Power Query - YouTube

Custom Functions Made Easy in Power BI - YouTube

Write Your First CUSTOM M FUNCTION in Power BI - YouTube

 

In your example, the custom function needs to receive the date as the parameter.  This date is converted to text to be included in the Query String.  I suggest using a step to prepare the string then include the string into the CData function.  It would look something like this:

(#"Date Parameter" as date) => 
let
#"Date Text" = Date.ToText( #"Date Parameter" , "mm/dd/yyy"),
#"Make Query String" = "Select blah blah" & #"Date Text" & "blah blah(" & #"Date Text" & ")",
#"Call CData" =  CdataHubspot.DataSource("CData Power BI HubSpot", null, [Query= #"Make Query String" ])
in
#"Call CData"

The result of the above should be a Table.  This table can be expanded.

Helpful resources

Announcements
Power Platform Conf 2022 768x460.jpg

Join us for Microsoft Power Platform Conference

The first Microsoft-sponsored Power Platform Conference is coming in September. 100+ speakers, 150+ sessions, and what's new and next for Power Platform.

Power BI Dev Camp Session 23 768x460.jpg

Check it Out!

Mark your calendars and join us on Thursday, June 30 at 11a PDT for a great session with Ted Pattison!

Top Solution Authors