Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The ultimate Microsoft Fabric, Power BI, Azure AI & SQL learning event! Join us in Las Vegas from March 26-28, 2024. Use code MSCUST for a $100 discount. Register Now

Find articles, guides, information and community news

Most Recent
TedPattison
Employee
Employee

Do you every dream of becoming the Heavy-weight Power Query Champion of the World? Join our Power BI Dev Camp session on March 25th for a fast-paced primer on the fundamentals of the M programming language. The goal of this deep dive session is to give campers a stronger foundation for working directly with M code in the Advanced Editor when designing queries for datasets in Power BI Desktop or when designing queries for dataflows in the browser.

Read more...

amitchandak
Super User
Super User

If you are Power BI enthusiastic, what all links you should have handy with you. I compiled a list for you.

Read more...

v-lili6-msft
Community Support
Community Support

This blog is about how to use SEARCH/FIND function when <find_text> is a list values.

Read more...

ibarrau
Super User
Super User

Recently, I have noticed that there was a lot of content talking about gzip decompress with Power Query, but I haven't seen someone talking about decompressing .zip files. The truth is that gzip is a most common file extension for data with the Azure suite. We can handle gzip with Azure Data Lake and Azure Data Factory. Anyway, in this article we will check how to decompress a ZIP file to explore your files inside, like a Windows folder in Power Query, in a very similar way as the gzip is done. If you haven't seen how gzip works, you can check this post about it.

Read more...

ahmedoye
Resolver III
Resolver III

Only 6% of Analytics Reporting in organizations source their data from standard databases. Oh no, I made up that number. I have no idea what the numbers are. But I am certain that the majority of self-service BI & data analytics users must deal with a lot of crappy data.

One of the big issues Power Query users must also deal with is Table Structuring. When should you use “Transpose” and when should you use “Unpivot”?

Read more...

Nolock
Resident Rockstar
Resident Rockstar

I have prepared an article about a transitive closure in Power Query (coming in 2 weeks). I wanted to publish the article already, but I decided to wait a little bit and write another one about a performance boost I’ve used in the code. It is a lookup table in Power Query.

Read more...

Nolock
Resident Rockstar
Resident Rockstar

Every respected computer scientist has heard about recursive functions (more details on Wiki: https://en.wikipedia.org/wiki/Recursion). There are many algorithms which are mentioned as example of such functions, i.e. Fibonacci sequence as the simplest example:

Fib(0) = 0
Fib(1) = 1
For all integers n > 1, Fib(n) = Fib(n-1) + Fib(n-2)

 

How do we rewrite this pseudo code into a function in PowerQuery?

let
    fnFibonacci = (value as number) as number =>
        if value = 0 then
            0
        else if value = 1 then
            1
        else
            @fnFibonacci(value - 1) + @fnFibonacci(value - 2),
    Source = List.Transform({0..10}, fnFibonacci)
in
    Source

 

The most important part is the use of @ before we call the recursive function. We tell the PowerQuery to reference its own name from inside itself.

Let’s test our function:

Screenshot1.PNG

 

Well, that is great but not very useful in real life. I have been searching for long time for a scenario which can use recursive functions in a meaningful way. And I have found one!

In PowerQuery, you can select or rename many columns in one step. Moreover, you can change their data types at once. But repetitive modifications of a table are not so easy anymore.

 

In my ETL process written in PowerQuery, I wanted to split every datetime column in a fact table into 2 separate columns: date and time. It has at least 2 big advantages. First of all, date and time values become a foreign key for my date and time dimensions; second, the size of my dataset will decrease dramatically. This means that instead of many unique values I’ll get only 365 unique values a year for date dimension and 86 400 unique values for time dimension with the precision of one second.

What are my options? I can create a new custom column of date datatype for every datetime column and another custom column for time. Yes, I can, but I do not like doing a repetitive work. Let’s try to automate it.

 

Firstly, we need some test data.

// create a table with some datetime columns
    Source = Table.FromRecords(
        {  
            [OrderID = 1, CustomerID = 1, OrderPlaced_dt = #datetime(2019,6,16,10,0,0), OrderPaid_dt = #datetime(2019,6,16,10,5,0), OrderShipped_dt = #datetime(2019,6,16,11,0,0), Price = 100.0],
            [OrderID = 2, CustomerID = 1, OrderPlaced_dt = #datetime(2019,6,16,12,12,12), OrderPaid_dt = #datetime(2019,6,16,13,13,13), OrderShipped_dt = null, Price = 200.0]  
        },
        type table[OrderID = Int64.Type, CustomerID = Int64.Type, OrderPlaced_dt = DateTime.Type, OrderPaid_dt = DateTime.Type, OrderShipped_dt = DateTime.Type, Price = Decimal.Type]
    ),

 

Next, let’s create a function, which has 2 parameters - a source table and a name of a datetime column. This function does the same what you would do in the UI. It creates 2 new columns and removes the original one.

    // split a datetime column into 2 columns: date and time
    fnSplitDateTimeColumn = (parTbl as table, parColumnName as text) as table =>
        let
            // add a new column with date
            transformDateExpression = Expression.Evaluate("each Date.From([" & parColumnName & "])", #shared),
            addDateColumn = Table.AddColumn(parTbl, parColumnName & "_date", transformDateExpression, type date),

            // add a new column with time
            transformTimeExpression = Expression.Evaluate("each try #time(Time.Hour([" & parColumnName & "]),Time.Minute([" & parColumnName & "]),Time.Second([" & parColumnName & "])) otherwise null", #shared),
            addTimeColumn = Table.AddColumn(addDateColumn, parColumnName & "_time", transformTimeExpression, type time),

            // remove datetime column
            removeDateTimeColumn = Table.RemoveColumns(addTimeColumn, parColumnName)
        in
            removeDateTimeColumn,

 

And finally, in the last step we create another function which is recursive. This function has also 2 parameters - a source table and a list of all datetime column names which we haven’t transformed yet. The function takes the first item from the list (a datetime column name), splits this datetime column into 2 columns, and calls itself recursively. The recursive call uses the last step as the new source table and a list of all datetime column names except the first one (which is already processed). Once the list is empty, the recursion terminates. The recursion function returns a modified source table – instead of one datetime columns there are now 2 columns.

    // recursive function which splits all datetime columns into date and time columns
    // parTbl is a source table to modify, parColumnNameList is a list of columns to split
    fnSplitAllDateTimeColumns = (parTbl as table, parColumnNameList as list) as table =>
        // if parColumNameList is empty, terminate the recursion
        if List.IsEmpty(parColumnNameList) then
            parTbl
        else
            let
                // get one column name to process
                currentColumnName = List.First(parColumnNameList),
                // remove first item from the parColumnNameList
                nextColumNameList = List.RemoveFirstN(parColumnNameList, 1),

                // split current column
                splitOneColumnTable = fnSplitDateTimeColumn(parTbl, currentColumnName),
                // call itself recursively with a new created table and a shortend column name list
                nextIterationTable = @fnSplitAllDateTimeColumns(splitOneColumnTable, nextColumNameList)
            in
                nextIterationTable,

And everything together:

let
    // create a table with some datetime columns
    Source = Table.FromRecords(
        {  
            [OrderID = 1, CustomerID = 1, OrderPlaced_dt = #datetime(2019,6,16,10,0,0), OrderPaid_dt = #datetime(2019,6,16,10,5,0), OrderShipped_dt = #datetime(2019,6,16,11,0,0), Price = 100.0],
            [OrderID = 2, CustomerID = 1, OrderPlaced_dt = #datetime(2019,6,16,12,12,12), OrderPaid_dt = #datetime(2019,6,16,13,13,13), OrderShipped_dt = null, Price = 200.0]  
        },
        type table[OrderID = Int64.Type, CustomerID = Int64.Type, OrderPlaced_dt = DateTime.Type, OrderPaid_dt = DateTime.Type, OrderShipped_dt = DateTime.Type, Price = Decimal.Type]
    ),

    // split a datetime column into 2 columns: date and time
    fnSplitDateTimeColumn = (parTbl as table, parColumnName as text) as table =>
        let
            // add a new column with date
            transformDateExpression = Expression.Evaluate("each Date.From([" & parColumnName & "])", #shared),
            addDateColumn = Table.AddColumn(parTbl, parColumnName & "_date", transformDateExpression, type date),

            // add a new column with time
            transformTimeExpression = Expression.Evaluate("each try #time(Time.Hour([" & parColumnName & "]),Time.Minute([" & parColumnName & "]),Time.Second([" & parColumnName & "])) otherwise null", #shared),
            addTimeColumn = Table.AddColumn(addDateColumn, parColumnName & "_time", transformTimeExpression, type time),

            // remove datetime column
            removeDateTimeColumn = Table.RemoveColumns(addTimeColumn, parColumnName)
        in
            removeDateTimeColumn,

    // recursive function which splits all datetime columns into date and time columns
    // parTbl is a source table to modify, parColumnNameList is a list of columns to split
    fnSplitAllDateTimeColumns = (parTbl as table, parColumnNameList as list) as table =>
        // if parColumNameList is empty, terminate the recursion
        if List.IsEmpty(parColumnNameList) then
            parTbl
        else
            let
                // get one column name to process
                currentColumnName = List.First(parColumnNameList),
                // remove first item from the parColumnNameList
                nextColumNameList = List.RemoveFirstN(parColumnNameList, 1),

                // split current column
                splitOneColumnTable = fnSplitDateTimeColumn(parTbl, currentColumnName),
                // call itself recursively with a new created table and a shortend column name list
                nextIterationTable = @fnSplitAllDateTimeColumns(splitOneColumnTable, nextColumNameList)
            in
                nextIterationTable,

    // get all columns having the datatype datetime
    DateTimeColumnNames = Table.ColumnsOfType(Source, {type datetime}),
    // split all datetime columns
    SplitAllDateTimeColumns = fnSplitAllDateTimeColumns(Source, DateTimeColumnNames)
in
    SplitAllDateTimeColumns

 

And the result:

Screenshot2.PNG

 

Is it a lot of code for such a simple task which you can manage in PowerQuery Editor with shiny UI? Yes and no. If you have many fact tables with a lot of datetime columns and you do an ETL, you’ll be very happy having the opportunity to automate that. But if you have just one table with 2 datetime columns, it is an overkill.

 

Do you know any other pragmatic use case for recursive functions in PowerQuery? Please, let me know down in the comments 😊

rsaprano
MVP

This 2-part blog post is for working with ranges/bands of numbers in the Edit Queries stage of Power BI. Part 1 covers inexact lookups - looking up a number against a set of bands/ranges and returning the appropriate band which the number sits in. The second part looks at finding overlapping ranges - the scenario where we have two ranges/bands of numbers (rather than having one number and one set of bands) and we want to see the areas of overlap between the ranges.

Read more...

Helpful resources

Join Blog
Interested in blogging for the community? Let us know.