cancel
Showing results for 
Search instead for 
Did you mean: 
Reply
Highlighted
simo-climb Frequent Visitor
Frequent Visitor

Export, manage Column/text label transforms ( replacer text) - programmatic data governance

Hi folks

 

I'm trying to implement a range of data governance workflows in an organizational context in Power BI desktop and service. I want to programatically capture transforms that occur on data, specifcally aiming for column and ID/code text label transforms and want to ultimately manage these in the CDM folders on azure storage as entity mapping terms. It's to support a controlled taxonomy of terms in essence. 

 

Even if some parts require some intervention. I can do it all manually but that simply won't be sustainable.

 

This is a long multipart use case ( a thought dump). I don't expect a full answer, but a simple... your dreaming, yes it's possible, use this approach, or look at this tool it does similar things would be so so appreciated ! 🙂

 

Request: I'm hoping for some inspiration on how to automate the extraction of a set of data transformations in the Power BI stack

 

Goal:  Automatically (as much as possible) capture Power Query function transformation actions such as column or text pattern replacements, that occur on original SQL/csv column names or text strings e.g. ID's or short codes which are changed to user labels for final BI reports. 

 

Tech requirements: It's in a government agency built on .NET, Azure and centered on Power BI. So tools in those frameworks are key dependencies.

 

For background.... I use RDF https://www.w3.org/RDF/ ,  SKOS https://www.w3.org/TR/skos-reference/#labels and an enterprise level vocabularby manager to do all this with other clients. But this specific client does not have the option to go down that level of investment.... but they still want data governance ( he bangs head against wall Man Tongue )

 

My envisaged workflow is:

1. BI report authors publish a report and/or dataset to Power BI service

2. a scheduled event .... a  dataflow Power Query script or R script or an Azure service or powershell scriot or cron job or something else ! ... runs and performs introspection on a) each table in the loaded data model and/or b) dumps a copy of the M-code for parsing. 

3. a script extracts key metadata about a set of transformations that I define. e.g. I want to capture all replace.text function calls on columns or text patterns so that textOriginal and textFinal are captured (plus some other metadata). Hence at this stage I see that I need to be able to parse the M-query itself, not just get the "final" table headers or cell contents

3. create, append or export this information to a persisted file or dataset. e.g.

        3a) directly create/append data to a new power bi table: or

         3b. if an intermediate step is required, a csv, for example in sharepoint, then run a script to ingest this into a             Power BI dataset/flow

4. export/serialize the final table/data file in the required CDM folder structure ( I woud use python most likely)

5. load the custom entity mapping to the CDM folder ( python)

 

So what have I acheived so far ....

 

6. I've experimented with some R sctipts to export a table as csv - works fine but only gets end state column labels or cell content

7. I've set up a bunch of regex functions that extract all the text replace functions and NameOriginal, NameFinal on columns or text strings from the M-Code.

 

Next steps/ Gaps

8. work out how to export this data to a sharepoint folder

9. how to run "something" automatically to extract the M-Code to somewhere e.g. sharepoint power bi storage folder? 

10. how to run the regex parsing on this file

       10a) I'm assume Powershell is the line here. And could be scheduled ?

Can powershell tasks be automated on Azure ? i guess yest

        10b) it appears that R can now use regex and so I wonder if R scripts can scheduled in Power BI service in anyway? 

 

11. Key issue..... I haven't found a likely candidate to then run a scheduled/event driven process to create/append this data to a Power BI dataset or report. e.g. M code that can scheduled to create stage table/data

 

At this point I'm expecting to have a set of scripts run on an azure server, likely python flask to orchestrate things. 

 

Ideas/Questions:

I've connected to the tabular model of the power bi desktop files a lot.

I know once in the tabular model or .bim files all dax and I think M-code can be obtained. So I could then parse these. But it's all manual so far as I've seen via Daxstudio or excel.

Any ideas on automating this ?

Can the same be acheived to Power BI service ? or Azure gen2 storage ?

 

Any ideas or feedback most welcome !

 

thanks in advance

 

Simon

1 ACCEPTED SOLUTION

Accepted Solutions
Community Support Team
Community Support Team

Re: Export, manage Column/text label transforms ( replacer text) - programmatic data governance

Hi @simo-climb ,

By my research, I'm afraid that there is no specific document or blogs about getting the M query automatically currently in power bi.

You could post your idea in Power BI ideas Forum and add your comments there to improve Power BI and make this feature coming sooner.

Best  Regards,

Cherry

 

Community Support Team _ Cherry Gao
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

1 REPLY 1
Community Support Team
Community Support Team

Re: Export, manage Column/text label transforms ( replacer text) - programmatic data governance

Hi @simo-climb ,

By my research, I'm afraid that there is no specific document or blogs about getting the M query automatically currently in power bi.

You could post your idea in Power BI ideas Forum and add your comments there to improve Power BI and make this feature coming sooner.

Best  Regards,

Cherry

 

Community Support Team _ Cherry Gao
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

Helpful resources

Announcements
Exclusive LIVE Community Event #3 – Ask Arun Anything

Exclusive LIVE Community Event #3 – Ask Arun Anything

Join us in the third Triple A event!

Meet the 2020 Season 1 Power BI Super Users!

Meet the 2020 Season 1 Power BI Super Users!

It’s the start of a new Super User season! Learn all about the new Super Users and brand-new tiered recognition system.

January 2020 Community Highlights

January 2020 Community Highlights

Make sure you didn't miss any of the things that happened in the community in January!

Top Solution Authors
Top Kudoed Authors