Showing results for 
Search instead for 
Did you mean: 
Super User V
Super User V

Collecting multiple files from multiple machines

I am trying to make sense of the Gateway logs. Yes, I know, crazy.  Since we are a real company we have real gateways. One is a cluster of six VMs, the other a cluster of four.  (Yes, Microsoft, there are people who use gateways with more than two cluster members!)


Each of the cluster members has their own set of GatewayInfo and GatewayErrors log files, usually a bit more than 24 hrs worth which translates into about 20 files per cluster member and category.


I have a Power Query function that retrieves the content of one type of CSV file from one cluster member . I did that to process Info and Errors separately but in hindsight that might not be necessary, they have largely the same format (or whatever you call that abomination, Microsoft!!!)


Then I have a list of my cluster members, and two referencing queries that use the cluster member list to collect all the Info files and all the Errors files. 


My suspicion is that I am vastly overcomplicating the retrieval.  I have a query that calls a custom function for each of the cluster members, and that custom function calls another custom function for each of the log files.   Each of these functions gets the result contents and expands it to table columns. I am basically  doing that Table.ExpandTableColumn in all three levels.  


Is it possible instead to just take the "raw"  table data and hand that back as the result of the called functions?


Here is the bottom (lowest level) function GetLog - this handles the raw log file contents for a single log file (regardless if Info or Errors)


(Content) => let
        Source = Csv.Document(Content,[Delimiter="#(tab)", Columns=8, Encoding=65001, QuoteStyle=QuoteStyle.None]),
        #"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1, Int64.Type),
        #"Added Conditional Column" = Table.AddColumn(#"Added Index", "Action", each if [Column8] = "" then [Column1] else [Column8]),
        #"Added Conditional Column1" = Table.AddColumn(#"Added Conditional Column", "Timestamp", each if Text.StartsWith([Column1], "DM.") then Text.Middle([Column1],Text.PositionOf([Column1],":")+6,28) else null),
        #"Removed Columns" = Table.RemoveColumns(#"Added Conditional Column1",{"Column1", "Column8"}),
        #"Replaced Value" = Table.ReplaceValue(#"Removed Columns","",null,Replacer.ReplaceValue,{"Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Action", "Timestamp"}),
        #"Filled Down" = Table.FillDown(#"Replaced Value",{"Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Action", "Timestamp"})
        #"Filled Down"



The next level up combines all the logs of a certain type on one machine (here showing the GetInfo version)


(Machine) => let
    Source = Folder.Files("\\" & Machine & "\c$\Users\PBIEgwService\AppData\Local\Microsoft\On-premises data gateway"),
    #"Filtered Rows" = Table.SelectRows(Source, each Text.StartsWith([Name], "GatewayInfo")),
    #"Invoke Custom Function1" = Table.AddColumn(#"Filtered Rows", "Transform File", each GetLog([Content])),
    #"Expanded Transform File" = Table.ExpandTableColumn(#"Invoke Custom Function1", "Transform File", {"Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Index", "Action", "Timestamp"}, {"Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Index", "Action", "Timestamp"}),
    #"Removed Other Columns" = Table.SelectColumns(#"Expanded Transform File",{"Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Index", "Action", "Timestamp"})
    #"Removed Other Columns"



Next level up combines the results into one single table for Info (and one for Errors)


    Source = Gateways,
    #"Filtered Rows" = Table.SelectRows(Source, each [Active]=true),
    #"Invoked Custom Function" = Table.AddColumn(#"Filtered Rows", "GetInfo", each GetInfo([Gateway])),
    #"Expanded GetInfo" = Table.ExpandTableColumn(#"Invoked Custom Function", "GetInfo", {"Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Index", "Action", "Timestamp"}, {"Column2", "Column3", "Column4", "GatewaySessionID", "ClientPipelineID", "Column7", "Index", "Action", "Timestamp"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Expanded GetInfo",{{"Timestamp", type datetimezone}, {"Index", Int64.Type}})
    #"Changed Type"



Can I somehow shortcut the GetInfo function to return the raw table data to the last query, instead of expanding the table columns twice? Or asked differently - can a function return a (single column) table of tables, and can the receiving query expand that in one go?

Community Support
Community Support

Re: Collecting multiple files from multiple machines

Hi @lbendlin , 

I am not sure I get your requirement, you could try to add new step to call function again to see whether it work or not. By the way, I suggest you could @ other super users for more suggestions.

Best Regards,
Zoe Zhi


Helpful resources


August Community Highlights

Check out a full recap of the month!

Experience what’s next for Power BI

Join us for an in-depth look at the new Power BI features and capabilities at the free Microsoft Business Applications Launch Event.

Top Solution Authors