Fun with Graphing in Power BI - Part 5-by-5

Greg_Deckler · ‎04-30-2018

Introduction

This is the latest post in my series "Fun with Graphing in Power BI". It might be my last as well. There are a few different reasons for that. First, and most importantly, I believe I may be running out of cool, nifty tricks to show you in Power BI. Second, I am most definitely starting to run a bit thin on lame math jokes. Third; comparatively, this one was really, really difficult!! All that being said, I may give the Koch Snowflake a go or maybe the Sierpinski Carpet. Who knows, if you want to see this series continue, leave something encouraging in the comments!

Background

OK, so the topic of this article is the Vicsek fractal, which is also known as the Vicsek snowflake or box fractal. This particular fractal was proposed by Tamás Vicsek and has applications in things like cell phone antennas. Now, this is where I was going to make a joke that later in life Tamás Vicsek was ostracized by the mathematician community for having only clubbed 162 baby seals to death when upwards of several thousand were the norm but then I realized that this dude is still alive! Hmm, I wonder if I can connect to him on LinkedIn? So, anyway, no disparaging mathematician jokes this time so as to ward off defamation of character lawsuits and the like. Besides, by this time we are all well aware that mathematicians are evil and would gladly club baby seals to death, no need to belabor the point.

OK, so the Vicsek fractal is constructed thusly. Start with a basic square. This basic square is then decomposed into nine smaller squares in a 3-by-3 grid. The four squares at the corners and the middle square are left, the other squares being removed. The process is repeated recursively for each of the five remaining subsquares. The Vicsek fractal is the set obtained at the limit of this procedure.

Alright, so why was this particular fractal so difficult? Well, there are a number of reasons. First, despite my best efforts, I couldn't really find any mathematical formula for it or code examples out there on the Internet. And, as anyone in technology knows, if you couldn't steal code from someone else, nothing would ever get done in technology...at all...like ever. Therefore, I sort of had to take the description of it and translate that into code.

Second, actual, legit recursion is required to generate this sucker. I know what you are saying. We used recursion to solve our Mandelbrot Set and our Sierpinski Triangle so why is this one so different? Well, the thing is that if we are honest with ourselves; which is something that I typically try to avoid at all costs, while we might be meeting the technical definition of programmatic recursion; having a programmatic function call itself, what we have actually done with the other two fractals really just amounts to glorified looping. If we had a FOR or WHILE loop available to us in Power Query we could have probably just used that instead of recursion.

Finally, the other thing that made this one difficult is that debugging true recursive code in Power Query...well...quite frankly...sucks. When you get back errors in your code, Power Query fails to point you to the particular line of code that is causing the problem. This is exacerbated by the fact that you also have no idea of where in the iterative process that the error occurred. Your code may work perfectly well during initial iterations but run into problems the deeper you go. Power Query's error handling in these circumstances leaves quite a bit to be desired.

The Solution

OK, so you will see some similar kinds of techniques in the solution presented here that are similar to some of the previous articles in this series. However I've refined things a bit where our functions return the actual table that we want versus what I did with the Sierpinski Triangle, for example. I'm just going to paste the code and then do a little mansplaining. And I'm going to see about using the Snippet functionality of Pulse and see how that goes. Cross your fingers.

Alright, so the first function we create is fnVicsek. This one is the Mack Daddy if you will:

let
    fnVicsek = (originX as number, originY as number, span as number, iteration as number, max as number) as table => 
 let
            ySign = if originY = 0 or Number.Sign(originY) = -1 then -1 else 1,
            count = span,
     rows = (count+1)*(count+1)-1,
     Source = List.Generate(()=>0, each _ <= rows, each _ + 1),
     #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
     #"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "Index"}}),
     #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"Index", Int64.Type}}),
     #"Added x" = Table.AddColumn(#"Changed Type", "x", each Number.IntegerDivide([Index],count+1)*count/count - count/2 + originX),
     #"Added y" = Table.AddColumn(#"Added x", "y", each if Number.Mod([Index],count+1) = 0 then if ySign=-1 then ySign*count/2+originY else originY-ySign*count/2 else -1*count/2 + count/count * Number.Mod([Index],count+1) + originY),
     #"Added Color" = Table.AddColumn(#"Added y", "Color", each if iteration=0 then 1 else if ([x]<originX-1*count/2/3 and [y]<originY+count/2/3 and [y]>originY-1*count/2/3) or ([x]>originX+count/2/3 and [y]<originY+count/2/3 and [y]>originY-1*count/2/3) or ([x]>originX-1*count/2/3 and [x]<originX+count/2/3 and ([y]>originY+count/2/3 or [y]<originY-1*count/2/3)) then 2 else 1),
     #"Changed Type1" = Table.TransformColumnTypes(#"Added Color",{{"x", type number}, {"y", type number}, {"Color", Int64.Type}}),
     #"Added Iteration" = Table.AddColumn(#"Changed Type1", "Iteration", each iteration),
     #"Changed Type2" = Table.TransformColumnTypes(#"Added Iteration",{{"Iteration", Int64.Type}}),
     #"Filtered Rows" = Table.SelectRows(#"Changed Type2", each ([Color] = 1)),
            Middle = if iteration < max then Table.Combine({#"Filtered Rows",fnVicsek(originX, originY, if iteration < 1 then count else count/3, iteration+1, max)}) else #"Filtered Rows",
            UpperLeft = if (iteration > 0 and iteration <> max) then Table.Combine({Middle,fnVicsek(if (iteration < 1) then originX else (-1*count/3+originX), if (iteration < 1) then originY else (1*count/3+originY), if iteration < 1 then count else count/3, iteration+1, max)}) else Middle,
            UpperRight = if iteration > 0 and iteration <> max then Table.Combine({UpperLeft,fnVicsek(if (iteration < 1) then originX else (1*count/3+originX), if (iteration < 1) then originY else (1*count/3+originY), if iteration < 1 then count else count/3, iteration+1, max)}) else UpperLeft,
            LowerLeft = if iteration > 0 and iteration <> max then Table.Combine({UpperRight,fnVicsek(if (iteration < 1) then originX else (-1*count/3+originX), if (iteration < 1) then originY else (-1*count/3+originY), if iteration < 1 then count else count/3, iteration+1, max)}) else UpperRight,
            LowerRight = if iteration > 0 and iteration <> max then Table.Combine({LowerLeft,fnVicsek(if (iteration < 1) then originX else (1*count/3+originX), if (iteration < 1) then originY else (-1*count/3+originY), if iteration < 1 then count else count/3, iteration+1, max)}) else LowerLeft,
            
            Return = LowerRight
 in
            Return
in
    fnVicsek

OK, so this function is actually returning a complete table of values, the values we want to plot for our Vicsek fractal. As in, ALL of the values. Since this is a square, we compute how many rows of values we need for each box from the span passed in as a parameter to our function. So, if we pass in a span of 42, our box will go from -21>=x<=21 and -21>=y<=21. We use this rows variable to generate our list of rows, converting it to a table.

After some cleanup, we add our x column similar to what we have done in the past but note that whatever we come up with we add originX to it. This allows us to keep track of which box we are in essentially as we recurse. We add a y column similar to what we have done in the past but with a bit more complicated code. Debugging this y column calculation had me pulling my hair out.

After that we have a complex looking Color calculation, which really never caused any issues. This is where the magic happens to subdivide our box into thirds and assigns a color of 1 to points we want to keep and 2 to points that we do not want to keep. Next we add an iteration column so that we keep track of which iteration we are on, which is vitally important if we want to visualize each iteration separately. Finally, we filter out all of the rows with a color of 2. I say finally here because this completes once iteration of our recursive process. If we feed a box into this function with a certain origin and span, it will subdivide it into thirds.

But, we need to add in the recursive part of this and this is handled by the Middle, UpperLeft, UpperRight, LowerLeft, LowerRight lines. So, we know that once we go through a single iteration of the process, we are going to be left with 5 boxes that also need to be subdivided and that this process continues until we reach the "max" of how many times we wish to iterate. Therefore, each of these 5 lines recursively call our function and pass in the dimensions of our next subdivided box.

So, think about what is happening here. We call this function with a single box. It is going to calculate the subdivisions of that box and then hit the Middle function. That middle function will calculate out the subdivisions of all of the iterations of the middle box up to the limit of our "max" iterations. Let's say that max is 5. Then it is going to "pop" back an iteration (to iteration 4) and call UpperLeft and subdivide that box to the max of the iterations, which includes the Middle of that box first. Once that process is done, we "pop" back to our main thread of iteration 4 and call UpperRight and the process repeats. Once this process repeats again for LowerLeft and LowerRight then we will eventually pop back to iteration 3 and the cycle repeats.

OK, so I'm pretty sure I am doing a good job of mansplaining the recursive process that is going on here because my head is starting to spin. What is probably important to point out here is that we aren't building this in a "normal" way where we would perform iteration 1 and then 2 and then 3, etc and all of the points are known at each cycle of iteration, we are kind of bouncing around all over the place calculating various parts of the fractal at different iterations at different times. Consider that at one point in the recursive process we would know all of the points of our fractal for all iterations except that we wouldn't have even started on the 2nd iteration of the LowerLeft box from our first, original subdivision! Neat.

This is what makes recursion really, really powerful. With about 20 lines of code we could literally generate a near infinite set of data points. That's stupid crazy. When you can break programmatic problems down like this, it's very powerful indeed but it will also make you tear your hair out trying to conceptualize it and debug it.

OK, so now that we have our super cool fnVicsek function, we can simplify its use with an "init" function, fnVicsekInit:

let
    fnVicsekInit = (span as number,max as number) => 
        
    let
        Return = fnVicsek(0, 0, span, 0, max)
    in
        Return
in
    fnVicsekInit

OK, as long as we make the assumption that we want to always center our box fractal on x=0, y=0 we can reduce the number of parameters to 2, the size of our fractal and the max number of iterations we want to perform.

Finally, we just need to create a query that calls our fnVicsekInit function like this Vicsek table:

let
    Source = fnVicsekInit(162,5),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"x", type number}, {"y", type number}, {"Color", Int64.Type}, {"Iteration", Int64.Type}, {"Index", Int64.Type}})
in
    #"Changed Type"

As you can see here, I am creating a Vicsek fractal that is a total of 162 points wide and high (-81<=x<=81, -81<=y<=81) and I want 5 iterations to be performed. This will generate almost 65,000 data points. Well, it generates way more than that, but we are filtering out all of the ones that we don't care about.

In the end, you end up with the following iterations:

Iteration 0:

$boxfractal2.png$

Note that the box is not solid due to sampling.

Iteration 1:

$boxfractal3.png$

Iteration 2:

$boxfractal4.png$

Iteration 3:

$boxfractal5.png$

Iteration 4:

$boxfractal6.png$

Iteration 5:

$boxfractal7.png$

Pretty cool. But, I know what you are saying, there's not really much difference between iterations 3, 4 and 5. You are saying to yourself, "How can we trust you that this is really working at the higher iterations?" Well, first of all, you should definitely never, ever trust me. Especially if you are female. I mean, if you are female, let me just be very clear...you should most definitely never, ever trust a single word that comes out of my mouth. If I were to tell you that 1+1=2 you would be entirely justified if you were at the very least be extremely skeptical.

In this particular case, however, I have nothing up my sleeve. We can actually "zoom in" on a portion of the fractal by limiting our visual to just a single quadrant for the higher iterations. In particular, the following images zoom in on the initial upper right square by limiting our x and y values to greater than 20:

Iteration 2:

$boxfractal8.png$

Iteration 3:

$boxfractal9.png$

Iteration 4:

$boxfractal10.png$

Iteration 5:

$boxfractal11.png$

Conclusion

Tamás Vicsek is probably a very nice man insofar as evil mathematicians go and therefore should not sue me over any perceived disparaging remarks about him having a penchant for clubbing baby seals to death.

Posts in this series: 1, 2, 3, 4, 5