Element115
Power Participant

HOW-TO REST API calls + M + refresh in Power BI Service (cloud)

First off, I am writing this up because I would not wish on my worst enemy what I went through just to figure out why the Service refresh kept emitting this JSON error message:

 

{
	"error": {
		"code": "DM_GWPipeline_Gateway_MashupDataAccessError",
		"pbi.error": {
			"code": "DM_GWPipeline_Gateway_MashupDataAccessError",
			"parameters": {},
			"details": [
				{
					"code": "DM_ErrorDetailNameCode_UnderlyingErrorCode",
					"detail": {
						"type": 1,
						"value": "-2147467259"
					}
				},
				{
					"code": "DM_ErrorDetailNameCode_UnderlyingErrorMessage",
					"detail": {
						"type": 1,
						"value": "[Unable to combine data] Section1/RT_table/type_change references other queries or steps, so it may not directly access a data source. Please rebuild this data combination."
					}
				},
				{
					"code": "DM_ErrorDetailNameCode_UnderlyingHResult",
					"detail": {
						"type": 1,
						"value": "-2147467259"
					}
				},
				{
					"code": "Microsoft.Data.Mashup.MashupSecurityException.DataSources",
					"detail": {
						"type": 1,
						"value": "[{\"kind\":\"Web\",\"path\":\"https://api.domain.com/oauth/accesstoken\"}]"
					}
				},
				{
					"code": "Microsoft.Data.Mashup.MashupSecurityException.Reason",
					"detail": {
						"type": 1,
						"value": "PrivacyError"
					}
				}
			],
			"exceptionCulprit": 1
		}
	}
}

 

The first clue is to be found here:  
https://learn.microsoft.com/en-us/power-query/dataprivacyfirewall

 

Now, here's a step-by-step guide explaining as clearly as possible how to do it right.  But first, some context:

 

In this scenario, Power Query needs to get data from an on-prem DB and combine it with data from a REST API (three calls in total: one to get the auth token, one to get an ID, and a third to get the data associated with that ID).  The 2nd and 3rd API calls need a project ID in the relative path of the URL. This will become relevant in a moment. But first a screenshot:

[Screenshot: queries_pane_Screenshot 2022-12-12 145301.png (Queries pane)]

The query that was causing the problem is RT_table.  The solution is basically to inline all your queries into it--here, everything underlined in green (which are custom functions) as well as the query in the red box (which is just a string assigned to a variable).  That last one I had forgotten to inline, and as a result the Service would not refresh.

But once the query in the red box was inlined inside RT_table, the Service started refreshing properly.  

One more thing: if your Source is some sort of database, you need to bring the code that references the database and its tables or views into the same query.  I had it in a separate query, called OLC, and was only referencing that OLC query, like this:

 

let
    Source = OLC,
    ...
in
    last_step

 

and that does not work. 
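
What does work is inlining the database access itself, so that the query calling the API is also the one touching the data source.  A minimal sketch, based on the final code further down (server, database, schema and view names are the same placeholders used there):

let
    // DB access inlined instead of referencing the separate OLC query
    Source      = Sql.Databases("DB_instance"),
    DB_name     = Source{ [Name="DB_name"] }[Data],
    dbo_vOLC    = DB_name{ [Schema="dbo", Item="vOLC"] }[Data],
    // ...API calls and the remaining transformations follow here...
    last_step   = dbo_vOLC
in
    last_step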

###############################  EDIT  2023-01-23  #################################

No joke, you literally have to bring everything into the query that will call the REST API.

###########################  THE ABOVE IS NOT NECESSARY ###########################

As regards the above edit:  it appears that you can define all of your custom functions outside of the query, call them within the query, and the Service will refresh with no problem.  So, for example, you could put the getToken() function outside of the query (in this case it prevents the token request from being fired too often, i.e. once for each row being iterated inside the query).  And as regards inlining all the on-prem DB references that already exist in a query earlier in the query chain, you could use the trick @ImkeF mentions in her blog (see the sketch right after this edit block):  https://www.thebiccountant.com/2022/12/16/quick-fix-for-formula-firewall-issues-in-power-query-and-p...

##############################################################################
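
A rough sketch of that quick fix, assuming project_id started out as its own query holding just a string: turn the query into a function and call it from the query that accesses the API, so the firewall no longer sees a cross-query reference to a plain value.

// the standalone "project_id" query, rewritten as a function per the quick-fix post:
//     () => "###"

// inside RT_table, the project ID is then obtained by calling that function:
relative_path = "###/api/###/global/realtimelink/v1.0/en-us/" & project_id(),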

 

Now, as regards the data source settings for permissions, I set everything to Credentials = Anonymous and Privacy Level = Public.  I had no patience left to figure out whether it would also work with other settings, seeing as it is very difficult to find where this is documented.  Some users also mentioned that Web.Contents() does not work properly in the Service if you don't break the URL down into a static base URL plus a relative path, meaning you need to specify RelativePath as done in the code below.  To be honest, I don't know whether that is true, as I had no time and was not amused enough by the whole experience to start testing it.  So I just added it to my code along with all the other changes.  In the end it worked, and that's all I care about at this point.
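
To make that Web.Contents() point concrete, here is the shape of the call, lifted from the code below: a static base URL as the first argument, with the variable pieces moved into RelativePath and Query (placeholder host and variable names as in the rest of the post):

// dynamic URL built purely by string concatenation (reported to cause trouble in the Service):
// Web.Contents("https://api.###.com/" & relative_path & "?handleId=" & handle_ID)

// static base URL plus RelativePath/Query, the pattern used throughout the code below:
http_resp = Web.Contents(
                "https://api.###.com",
                [
                    RelativePath = relative_path,
                    Query        = [ handleId = handle_ID ]    // appended as ?handleId=...
                ]
            ),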

The final M code looks like this:

 

let 
    Source				=	Sql.Databases("DB_instance"),
    DB_name				=	Source{ [Name="DB_name"] }[Data],
    dbo_vOLC			=	DB_name{ [Schema="dbo", Item="vOLC"] }[Data],
    #"Changed Type"		=	Table.TransformColumnTypes(dbo_vOLC,{ {"ID_External", Int64.Type} }),
    #"Renamed Columns"	=	Table.RenameColumns(#"Changed Type", { {"FK_ID_###", "ID_###"} } ),
	remove_cols			=	Table.RemoveColumns(
								#"Renamed Columns",
								{
									"ID"
								,	"ID_OLC"
								,   "Longitude"
								,   "Latitude"
								,   "Is_remotely_managed"
								,   "Model"
								,   "FirmwareVersion"
								,   "HardwareAddress"
								,   "ControlSystem"
								,   "InstallationDate"
								,   "CommunicationStatus"
								,   "NumberOfMeteringChannels"
								,   "CLO_Enabled"
								,   "LastReportTime"
								,   "Is_metered"
								,   "UserSwitchType"
								,   "FactorySwitchType"
								}
							),	
	project_id			=	"###",
	getToken			=	() => 
    let 
		URL				=	"https://api.###.com",
        body            =   "app_key=###&app_secret=###&service=###&scope=###",
        response        =   Web.Contents(
								URL, 
								[
									RelativePath = "oauth/accesstoken"
								,	Headers =   [
													#"Authorization"="Basic ### base64 encoded creds ###"
												,   #"Content-Type"="application/x-www-form-urlencoded"
												]
								,   IsRetry = true
								,   Content = Text.ToBinary(body, BinaryEncoding.Base64) 
								]
							),
        jsonResponse    =   try Json.Document(response),
        token           =   if jsonResponse[HasError] then 
                                error jsonResponse[Error][Message]
                            else
                                jsonResponse[Value][token]
    in
        token,
	getHandleID		=	(external_ID as number, token as text) => 
	let
		URL				=	"https://api.###.com",
		relative_path	=	"###/api/###/global/realtimelink/v1.0/en-us/" & project_id,
		headers         =   [
								#"Authorization"="Bearer " & token
							,   #"Content-Type" = "application/json"
							],
		external_ID_txt	=	try Number.ToText(external_ID),
		// do not use: Json.FromValue([componentExternalId="###", componentType="OLC"]),
		post_data       =   if external_ID_txt[HasError] then 
								"cannot convert external_ID from number to text"
							else
								Text.ToBinary("[{'componentExternalId':'" & external_ID_txt[Value] & "', 'componentType':'OLC'}]"),
		http_resp       =   if external_ID_txt[HasError] then 
								post_data
							else 
								try										// start HTTP POST request 
									Web.Contents(
										URL
									,   [
											RelativePath	= relative_path
										,	Headers			= headers
										,   IsRetry			= true
										,   Content			= post_data  // <<- including the 'Content' field to this record signifies do a POST request
										]
									),
		response        =   if http_resp[HasError] then error http_resp[Error][Message]
							else try Json.Document(http_resp[Value])
	in
		if response[HasError] then 
			response[Error][Message] 
		else 
			try response[Value]{0}[handleId] otherwise response[Value][Message],
	addColumn			=	(T as table, new_col_name as text, f as function, col_name as text, token as text, col_type as type) =>
	let
		new_col			=	Table.AddColumn(
								T
							,   new_col_name
							,   each f(Record.Field(_, col_name), token)
							,   col_type
							)
	in
		new_col,
	add_handleID	=	addColumn(remove_cols, "handleID", getHandleID, "ID_External", getToken(), type text),
 	type_change0    =	Table.TransformColumnTypes( add_handleID, { {"handleID", type text} } ),
	getRTData		=	(handle_ID as nullable text, token as nullable text) as record => 
	let
		handleID_num	=   try Number.From(handle_ID),
		rt_data     	=   if handleID_num[HasError] then
								[ERROR=handleID_num[Error][Message]]
							else
								let
									URL				=	"https://api.###.com",
									relative_path	=	"###/api/###/global/realtimelink/v1.0/en-us/" & project_id & "/details", //  <<--NOTE: '?' symbol is not necessary, do not add '?' to the URL for the GET query
									headers         =    [
															#"Authorization" = "Bearer " & token
														,   #"Connection"    = "keep-alive"
														],
									query           =   [handleId = handle_ID],
									http_resp       =   try 					// start HTTP GET request
															Web.Contents(
																URL
															,   [
																	RelativePath	= relative_path
																,	Headers			= headers
																,   IsRetry			= true
																,   Query			= query
																]
															)
														,
									data            =   if not http_resp[HasError] then try Json.Document(http_resp[Value])
														else [ERROR=http_resp[Error][Message]],
									result          =   if data[HasError] then [ERROR=data[Error][Message]]
														else if Value.Is(data[Value], type list) and List.IsEmpty(data[Value]) then [ERROR="no data: wait for the next server refresh"]
														// the API returns a list, ie [] JSON array, when data is available
														else if Value.Is(data[Value], type list) and not List.IsEmpty(data[Value]) then data[Value]{0}
														// the API returns a record, ie {} JSON object, only if there was an error
														else if Value.Is(data[Value], type record) then [ERROR=data[Value][Message]] else null
								in
									result
	in
	// returns a record of real-time values or an error msg record
		rt_data,
	rt_table		    	=	addColumn(type_change0, "RT_DATA", getRTData, "handleID", getToken(), type record),
	expand_data             =	Table.ExpandRecordColumn(rt_table, "RT_DATA", {"status", "DateTime", "properties"}, {"status", "DateTime", "properties"}),
	expand_prop_list        =	Table.ExpandListColumn(expand_data, "properties"),
	#"Expanded properties"  =	Table.ExpandRecordColumn(expand_prop_list, "properties", {"Key", "Value", "Unit"}, {"Key", "Value", "Unit"}),
	#"Removed Columns1"		=	Table.RemoveColumns(#"Expanded properties",{"Unit"}),
	#"Filtered Rows"		=	Table.SelectRows(#"Removed Columns1", each ([Key] <> null)),
	type_change1			=	Table.TransformColumnTypes(#"Filtered Rows",{{"Key", type text}, {"Value", type text}}),
	#"Pivoted Column"		=	Table.Pivot(type_change1, List.Distinct(type_change1[Key]), "Key", "Value")
in 
    #"Pivoted Column"

 

The eagle-eyed amongst you will have noticed the generous use of the try keyword (with the occasional otherwise).  Once I found out we could catch errors, I immediately started using it, especially since I was working with a funky REST API that would fail to return results at random.  The important thing to notice is that when you use try on its own (without otherwise), even if there is no error, the result stored in the variable is not the plain value you would otherwise expect: it is a record that wraps that value.
OK, let's be more clear: 

 

// say you expect some text string to be returned
var_name = Web.Contents(...),

// var_name contains, for example, some text, say a company name
// but if you write the following, now using the try keyword on its own (no otherwise)

var_name = try Web.Contents(...),
// now var_name contains a record; if no error is caught by try, the record has 2 fields:
// HasError = false and Value = the value returned by the operation that try is monitoring
// so to access the value later, you cannot write
// some_var = var_name
// you have to write
// some_var = var_name[Value]
// if an error occurred, try catches it and var_name instead holds a record with the fields
// HasError = true and Error, which is itself a record containing a Message field (alongside Reason and Detail)
// so to pass this error along and surface it in a column if something went wrong, all you have to do is write var_name[Error][Message]
// the pattern I find useful to handle errors is in the full code above:
// if var_name[HasError] then var_name[Error][Message]  // return the description of the error
// else var_name[Value]                                 // return the value from the underlying operation if no error
// Of course, if the [Value] returned is a list or record, then you have to use the proper
// operator for element or field access, ie for lists: {i}, where i = 0, ..., n, and for records: [field_name]

 

 You can now put away the noose.  You are welcome!  😎


20 REPLIES
ImkeF
Super User

Hi @Element115 ,
you do that "outside" of the query editor: Go to "Schedule refresh". There, under "Data source credentials" you can adjust it:

[Screenshot: ImkeF_0-1673681748971.png (Data source credentials)]

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

*** EDIT:  forgot I had moved an inline function definition out into its own query.  Need to add that to the dataflow and then maybe it'll let me save & close. ***

 

The problem, though, is that PQ in the Service does not let me save and exit to go to the menu you showed me, because the database and API connections have not been set, even though they already exist in the gateway for this account, as we already have other reports using the DB connection.

 

Further, since the DB connection comes first in the M code, I try to set that one first, but the Connect button remains grayed out, and for some unknown reason there seems to be no way to progress to the stage of saving the M code and reaching the Dataflow --> Data source credentials menu.

 

[Screenshots: Screenshot 2023-01-14 214415.png, Screenshot 2023-01-14 214705.png]

What happens if you click on "Configure connection"?
That is the direct way to edit connections in the dataflow.
There you should also be able to choose the gateway connection.

 

Imke Feldmann (The BIccountant)


Mmmhh, very strange... the first time I clicked that button and set the connection, it failed.  I had to go through the same process a second time in a row before it accepted the already set-up gateway.  Now the M code is running and waiting for the API server...

ImkeF
Super User

Then you can try buffering the input step for that pivot operation.
Or even moving those transformations to a dataflow and just doing the pivot in the pbix.
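
Applied to the code posted above, buffering the step that feeds the pivot might look something like this (step names taken from the original query):

type_change1        = Table.Buffer( Table.TransformColumnTypes(#"Filtered Rows", {{"Key", type text}, {"Value", type text}}) ),
#"Pivoted Column"   = Table.Pivot(type_change1, List.Distinct(type_change1[Key]), "Key", "Value")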

Imke Feldmann (The BIccountant)


Hi Imke, re creating a dataflow: I noticed that the UI is slightly different from the one in the PQ Editor where you set the Data Source Settings.  Where is the Edit Permissions option to set the Privacy Level?  All PQ in the cloud does is fail to automatically pick up the already configured gateway and permissions.  And because DB and API access are mixed inside the same M script, when I select a custom function, say the getToken() func, and then click the Configure connection button that appears on the right, the dialog only shows me the DB connection already associated with the gateway, but not the internet/API connection that is already set up and used by the report.

 

Is this a bug? Or is there another UI path?

ImkeF
Super User

Agree, that sounds strange.
Does it load if you omit the last step (pivoting)?
Also: have you disabled background data refresh?

[Screenshot: ImkeF_0-1673564396893.png (background data setting)]

 

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Re omitting the pivot step: the answer is yes, and it loads a lot faster, of course.  Pivoting 96 fields out into columns takes much longer than I thought it would.

Yes, I disabled background data refresh.

ImkeF
Super User

Hi @Element115 ,
actually, I am not familiar with Table.StopFolding yet.

Imke Feldmann (The BIccountant)


So I tried both Table.Buffer and Table.StopFolding.  With the latter, there is a slight speed improvement.  But in the end it does not matter much, as the query completion is slowed down mostly in the last steps, which consist of pivoting 96 key:value pairs into their respective columns, followed by more than two dozen data type changes.

What is nice about the Table.StopFolding step is that you can click and inspect another query in the editor, and when you navigate back to the query using Table.StopFolding, it shows up as a fully populated table without the usual re-load wait time.  So that's nice.
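
For reference, swapping the buffer for the fold-stopper at that same early step looks like this; as I understand it, Table.StopFolding only marks the table so later steps are not folded back to the source, whereas Table.Buffer holds the rows in memory (step name as in the original query, column list abbreviated):

remove_cols = Table.StopFolding(
                  Table.RemoveColumns( #"Renamed Columns", { "ID", "ID_OLC" /* ...rest of the column list as before... */ } )
              ),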

 

What is not nice: even though I have all the data (1,000 x 96) in a table, upon clicking Close & Apply to get back to Power BI, I now get this OLE DB error message:

[Screenshot: type_mismatch_Screenshot 2023-01-12 062019.png (OLE DB type mismatch error)]

 

Solve one thing, another problem pops up.  That seems to be Power BI's SOP.  At least that's how my experience with this tool has been since the beginning, no matter how many updates Microsoft releases.

 

If anybody has any idea why I am getting a type mismatch error even though the Power Query editor does not complain about anything (it shows me that all the type casts have been applied without error, so presumably everything is fine), please let me know.

AFAIC there should be no type mismatch, as the new transformations applied in Power Query should simply override the table in the PBI model.  In other words, when the user goes from PQ to PBI, the PBI table definitions in the model should be updated according to what is coming through from PQ.  So a previous table definition in PBI, even one with incompatible column types, should cause no problem, as the new output coming from PQ should wipe it out and override it.  Isn't that how it is supposed to work?

 

And why is changing types inside PBI faster than it is inside the PQ Editor?  That too I find very puzzling.

ImkeF
Super User

Hi @Element115 ,
agree, that sounds slow.
I am doing quite a bit with APIs and don't come across such slow loading times (especially for the little transformation that you do).
Do you really need to retrieve the token for every single row?
Also, I could believe that the addColumn function slows things down, and I cannot see a reason to use it here, so I would refactor that bit of code to use the native function instead.
Using error handlers also slows down the code, so just make sure to only use them where you really need them.
Also, it could be that, due to the complexity of the code, PQ will attempt to download the data from the SQL server multiple times. To prevent that, I would buffer it at this stage:

remove_cols = Table.Buffer( Table.RemoveColumns( ... ) )

 

Imke Feldmann (The BIccountant)


Is it enough to buffer only once?  In other words, if buffering early on, at the top of the script as you suggested, does this mean that all subsequent ETL operations will use the buffered table?

Actually, as I am reading the docs on Table.Buffer, Microsoft suggests using Table.StopFolding.  What do you think?  If the table is not fully loaded (max is 1,000 records), would that mean that Table.StopFolding gets ignored?

Yes, re the getToken() call: the reason is that the token is valid for only one hour, and since it takes longer than one hour to get everything, and I didn't know for sure whether the hour-plus went into calling the API or into internal PQ processing, I did not want to take the chance of the token expiring before all the records got ingested.  And since I have no documentation from the API provider...

But I noticed something else yesterday about this black-box API, and I think I can chance calling getToken() only once.  Will try.

 

Re error handlers: unfortunately, they are unavoidable because... the API, again, is funky. 😉

 

I'll refactor addColumn (interesting, I wouldn't have thought that it would be so expensive) and add a Table.Buffer.  Excellent suggestions! Thanks!
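
A rough sketch of what that refactor could look like, under the assumption that one token stays valid long enough for the whole run: fetch the token once, buffer the SQL result, and use Table.AddColumn directly instead of the addColumn wrapper (step and function names as in the original query):

token           = getToken(),                   // fetched once and reused for every row
// buffer the SQL result so PQ does not hit the server multiple times
remove_cols     = Table.Buffer( Table.RemoveColumns( #"Renamed Columns", { "ID", "ID_OLC" /* ...rest of the column list... */ } ) ),
// native Table.AddColumn instead of the addColumn wrapper function
add_handleID    = Table.AddColumn(remove_cols, "handleID", each getHandleID([ID_External], token), type text),
rt_table        = Table.AddColumn(add_handleID, "RT_DATA", each getRTData([handleID], token), type record),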

ImkeF
Super User

Hi @Element115 ,
for me, converting the "project_id" query into a function often worked just fine, and I didn't have to rework all my existing queries by inlining them into the API call: Quick fix for Formula.Firewall issues in Power Query and Power BI (thebiccountant.com)

Imke Feldmann (The BIccountant)


Thanks, Imke.  You are correct.  I was getting confused.  I moved the variable assignment for the project ID out into a custom function, and it works fine.  I also read your blog post about how to trick the PQ engine into returning a function in the last step inside the let..in block, and then calling that function after the in as the last step.  Nifty trick.

Didn't use it for my solution, though, as keeping custom function code inlined in the query makes the custom code easier to find.  If the function definitions are their own queries, sitting outside the query that uses them, it might not be quick and clear for the next person who inherits the ETL solution to see which query relies on those functions, especially considering that Power Query does not have what I would call a real code editor with the typical code-management functionality (like Visual Studio Code, for example).
In conclusion, I am happy to say that the Power BI Service now refreshes the dataset with no error messages, and all pertinent data is imported--very slowly though: almost 1,000 records.  Looking at the diagnostics, the PQ engine seems to complete each API call in milliseconds, so it has to be the engine processing all the ETL transformations that is slow.  To give you an idea, it takes almost 2 hours to ingest a 1,000 x 96 (rows x columns) table.  And the total number of transformations is really small too, with a final pivot as the last step.
I am surprised that a 1,000 x 96 table would incur such a performance penalty.  It's scary to think what would happen if the dataset to ingest had half a million rows, for example.  Power Query would be running for more than a week!

It is a lot faster to ingest data from a DB than from a REST API.  So using Power Automate to dump the raw data into a DB, then importing from that DB, might be the speedier and preferred solution.

v-stephen-msft
Community Support

Hi @Element115 ,

 

Nice sharing and very helpful!

Thank you.

 

 

Best Regards,

Stephen Tao

Does this mean I get a free year of Azure SQL for doing the Microsoft user support job?  😋

Stephen, let's hit pause on my most recent follow-up question.  It appears the API provider's server is experiencing some issues.  I will update once I get more details.
