cancel
Showing results for
Did you mean:

Most Recent
Super User

## Lookup Table in Power Query

I have prepared an article about a transitive closure in Power Query (coming in 2 weeks). I wanted to publish the article already, but I decided to wait a little bit and write another one about a performance boost I’ve used in the code. It is a lookup table in Power Query.

Super User

## Solving Real Life Problems with Recursive Functions in PowerQuery

Every respected computer scientist has heard about recursive functions (more details on Wiki: https://en.wikipedia.org/wiki/Recursion). There are many algorithms which are mentioned as example of such functions, i.e. Fibonacci sequence as the simplest example:

Fib(0) = 0
Fib(1) = 1
For all integers n > 1, Fib(n) = Fib(n-1) + Fib(n-2)

How do we rewrite this pseudo code into a function in PowerQuery?

```let
fnFibonacci = (value as number) as number =>
if value = 0 then
0
else if value = 1 then
1
else
@fnFibonacci(value - 1) + @fnFibonacci(value - 2),
Source = List.Transform({0..10}, fnFibonacci)
in
Source```

The most important part is the use of @ before we call the recursive function. We tell the PowerQuery to reference its own name from inside itself.

Let’s test our function:

Well, that is great but not very useful in real life. I have been searching for long time for a scenario which can use recursive functions in a meaningful way. And I have found one!

In PowerQuery, you can select or rename many columns in one step. Moreover, you can change their data types at once. But repetitive modifications of a table are not so easy anymore.

In my ETL process written in PowerQuery, I wanted to split every datetime column in a fact table into 2 separate columns: date and time. It has at least 2 big advantages. First of all, date and time values become a foreign key for my date and time dimensions; second, the size of my dataset will decrease dramatically. This means that instead of many unique values I’ll get only 365 unique values a year for date dimension and 86 400 unique values for time dimension with the precision of one second.

What are my options? I can create a new custom column of date datatype for every datetime column and another custom column for time. Yes, I can, but I do not like doing a repetitive work. Let’s try to automate it.

Firstly, we need some test data.

```// create a table with some datetime columns
Source = Table.FromRecords(
{
[OrderID = 1, CustomerID = 1, OrderPlaced_dt = #datetime(2019,6,16,10,0,0), OrderPaid_dt = #datetime(2019,6,16,10,5,0), OrderShipped_dt = #datetime(2019,6,16,11,0,0), Price = 100.0],
[OrderID = 2, CustomerID = 1, OrderPlaced_dt = #datetime(2019,6,16,12,12,12), OrderPaid_dt = #datetime(2019,6,16,13,13,13), OrderShipped_dt = null, Price = 200.0]
},
type table[OrderID = Int64.Type, CustomerID = Int64.Type, OrderPlaced_dt = DateTime.Type, OrderPaid_dt = DateTime.Type, OrderShipped_dt = DateTime.Type, Price = Decimal.Type]
),```

Next, let’s create a function, which has 2 parameters - a source table and a name of a datetime column. This function does the same what you would do in the UI. It creates 2 new columns and removes the original one.

```    // split a datetime column into 2 columns: date and time
fnSplitDateTimeColumn = (parTbl as table, parColumnName as text) as table =>
let
// add a new column with date
transformDateExpression = Expression.Evaluate("each Date.From([" & parColumnName & "])", #shared),

// add a new column with time
transformTimeExpression = Expression.Evaluate("each try #time(Time.Hour([" & parColumnName & "]),Time.Minute([" & parColumnName & "]),Time.Second([" & parColumnName & "])) otherwise null", #shared),

// remove datetime column
in
removeDateTimeColumn,```

And finally, in the last step we create another function which is recursive. This function has also 2 parameters - a source table and a list of all datetime column names which we haven’t transformed yet. The function takes the first item from the list (a datetime column name), splits this datetime column into 2 columns, and calls itself recursively. The recursive call uses the last step as the new source table and a list of all datetime column names except the first one (which is already processed). Once the list is empty, the recursion terminates. The recursion function returns a modified source table – instead of one datetime columns there are now 2 columns.

```    // recursive function which splits all datetime columns into date and time columns
// parTbl is a source table to modify, parColumnNameList is a list of columns to split
fnSplitAllDateTimeColumns = (parTbl as table, parColumnNameList as list) as table =>
// if parColumNameList is empty, terminate the recursion
if List.IsEmpty(parColumnNameList) then
parTbl
else
let
// get one column name to process
currentColumnName = List.First(parColumnNameList),
// remove first item from the parColumnNameList
nextColumNameList = List.RemoveFirstN(parColumnNameList, 1),

// split current column
splitOneColumnTable = fnSplitDateTimeColumn(parTbl, currentColumnName),
// call itself recursively with a new created table and a shortend column name list
nextIterationTable = @fnSplitAllDateTimeColumns(splitOneColumnTable, nextColumNameList)
in
nextIterationTable,```

And everything together:

```let
// create a table with some datetime columns
Source = Table.FromRecords(
{
[OrderID = 1, CustomerID = 1, OrderPlaced_dt = #datetime(2019,6,16,10,0,0), OrderPaid_dt = #datetime(2019,6,16,10,5,0), OrderShipped_dt = #datetime(2019,6,16,11,0,0), Price = 100.0],
[OrderID = 2, CustomerID = 1, OrderPlaced_dt = #datetime(2019,6,16,12,12,12), OrderPaid_dt = #datetime(2019,6,16,13,13,13), OrderShipped_dt = null, Price = 200.0]
},
type table[OrderID = Int64.Type, CustomerID = Int64.Type, OrderPlaced_dt = DateTime.Type, OrderPaid_dt = DateTime.Type, OrderShipped_dt = DateTime.Type, Price = Decimal.Type]
),

// split a datetime column into 2 columns: date and time
fnSplitDateTimeColumn = (parTbl as table, parColumnName as text) as table =>
let
// add a new column with date
transformDateExpression = Expression.Evaluate("each Date.From([" & parColumnName & "])", #shared),

// add a new column with time
transformTimeExpression = Expression.Evaluate("each try #time(Time.Hour([" & parColumnName & "]),Time.Minute([" & parColumnName & "]),Time.Second([" & parColumnName & "])) otherwise null", #shared),

// remove datetime column
in
removeDateTimeColumn,

// recursive function which splits all datetime columns into date and time columns
// parTbl is a source table to modify, parColumnNameList is a list of columns to split
fnSplitAllDateTimeColumns = (parTbl as table, parColumnNameList as list) as table =>
// if parColumNameList is empty, terminate the recursion
if List.IsEmpty(parColumnNameList) then
parTbl
else
let
// get one column name to process
currentColumnName = List.First(parColumnNameList),
// remove first item from the parColumnNameList
nextColumNameList = List.RemoveFirstN(parColumnNameList, 1),

// split current column
splitOneColumnTable = fnSplitDateTimeColumn(parTbl, currentColumnName),
// call itself recursively with a new created table and a shortend column name list
nextIterationTable = @fnSplitAllDateTimeColumns(splitOneColumnTable, nextColumNameList)
in
nextIterationTable,

// get all columns having the datatype datetime
DateTimeColumnNames = Table.ColumnsOfType(Source, {type datetime}),
// split all datetime columns
SplitAllDateTimeColumns = fnSplitAllDateTimeColumns(Source, DateTimeColumnNames)
in
SplitAllDateTimeColumns```

And the result:

Is it a lot of code for such a simple task which you can manage in PowerQuery Editor with shiny UI? Yes and no. If you have many fact tables with a lot of datetime columns and you do an ETL, you’ll be very happy having the opportunity to automate that. But if you have just one table with 2 datetime columns, it is an overkill.

Do you know any other pragmatic use case for recursive functions in PowerQuery? Please, let me know down in the comments 😊

## Working with ranges (Numbers/Dates) in Power Query/Power BI - Part 2

This 2-part blog post is for working with ranges/bands of numbers in the Edit Queries stage of Power BI. Part 1 covers inexact lookups - looking up a number against a set of bands/ranges and returning the appropriate band which the number sits in. The second part looks at finding overlapping ranges - the scenario where we have two ranges/bands of numbers (rather than having one number and one set of bands) and we want to see the areas of overlap between the ranges.