Ah yes, the Pythagorean Theorem. Many a young lass has been woo'd by the dulcet sounds of the Pythagorean Theorem being gently whispered in her ear. And by many, I mean absolutely zero...in like the entire history of mankind. So, while the Pythagorean Theorem has nothing to do with helping you get a date, it does have something to do with this next article in my Fun with Graphing in Power BI series.
As you may or may not have guessed from the picture at the top of this article, the next subject continues our exploration of fractals with the Sierpinski triangle. The Sierpinski who now? The Sierpinski triangle is a fractal named after a Polish mathematician, Waclaw Sierpinski, who went by the evil moniker, "The Claw", in the villainous mathematician underworld. Basically, the Sierpinski triangle is an equilateral triangle subdivided recursively into smaller equilateral triangles.
There are at least seven different ways of constructing a Sierpinski triangle, including methods like the Towers of Hanoi, Pascal's triangle, Arrowhead curve, etc. For this exercise however, we are going to use the "Chaos game" method because it lends itself to a relatively simple mathematical solution and it presents a very interesting challenge for us in Power Query. Plus, it sounds cool. Chaos...
The Chaos game goes a little something like this:
Seems simple enough, right? Well, what you may or may not notice from that series of steps that is different than our last fractal, the Mandelbrot Set, is that with the Sierpinski triangle, we care about the interim values of our recursion versus with the Mandelbrot we did NOT care about the interim values of our recursion. We care about those interim values because they become our plot points. This is actually a significant difference and forces us to do things a little bit differently. But first...
OK, so the first attempt at this one didn't go so well. I figured I would approach it sort of like the Mandelbrot Set so I started out generating a table using a source of:
=List.Generate(()=>0, each _ <= 2000, each _ + 1)
Then I wrote a couple quick little functions, fnSierpinski and fnSierpinskiInit:
fnSierpinskiInit
let fnSierpinskiInit = (start as text) => let startX = Number.FromText(Text.BeforeDelimiter(start,",")), startY = Number.FromText(Text.AfterDelimiter(start,",")), Return = fnSierpinski(-1,0,1,0,0,2,startX,startY) in Return in fnSierpinskiInit
fnSierpinski
let fnSierpinski = (vert1x,vert1y,vert2x,vert2y,vert3x,vert3y,startX,startY) => let originX = startX, originY = startY, randomVert = Number.IntegerDivide(Number.RandomBetween(1,4),1), targetVertX = if randomVert = 1 then vert1x else if randomVert = 2 then vert2x else vert3x, targetVertY = if randomVert = 1 then vert1y else if randomVert = 2 then vert2y else vert3y, plotX = (originX + targetVertX)/2, plotY = (originY + targetVertY)/2, Return = Text.From(plotX) & "," & Text.From(plotY) in Return in fnSierpinski
So, OK, I know what you are thinking. You are looking at that code and thinking, where's the recursion? Well, so I sort of got it in my head that I could use this little M trick of referring the current row back to the row above it and so I created a custom column called "Sierpinski" with the following formula:
=if [Index] = 0 then fnSierpinskiInit("0,1") else fnSierpinskiInit(#"Renamed Columns"{[Index]-1}[Sierpinski])
FYI, #"Renamed Columns" is the step in my query just prior to adding this custom column. Well, the problem is that this didn't really work out too well. The first row gets a value but it's all "Error" the rest of the way. Turns out, you can't reference a column that doesn't quite exist just yet. Hmm, that's a definite problem.
OK, so then I got to thinking. Well, what if I just change the rules of the game a wee bit. My mom always used to tell me that I can't just go around changing the rules of a game, but it never really ever sunk in and so now I do it all the time. It's quite an effective strategy for winning.
So, I figured I could just feed the same starting value into each row and for each row let the program recurse for the same number of times as the row's index. Since it is using random numbers, I'm probably pretty likely to get different points at the end of the day.
So, I put that plan into action:
= List.Generate(()=>0, each _ <= 2000, each _ + 1)
fnSierpinski2Init
let fnSierpinski2Init = (start as text, counter) => let startX = Number.FromText(Text.BeforeDelimiter(start,",")), startY = Number.FromText(Text.AfterDelimiter(start,",")), Return = fnSierpinski2(-1,0,1,0,0,2,startX,startY,counter) in Return in fnSierpinski2Init
fnSierpinski2
let fnSierpinski2 = (vert1x,vert1y,vert2x,vert2y,vert3x,vert3y,startX,startY,counter) => let originX = startX, originY = startY, randomVert = Number.IntegerDivide(Number.RandomBetween(1,4),1), targetVertX = if randomVert = 1 then vert1x else if randomVert = 2 then vert2x else vert3x, targetVertY = if randomVert = 1 then vert1y else if randomVert = 2 then vert2y else vert3y plotX = (originX + targetVertX)/2, plotY = (originY + targetVertY)/2, Return = if counter = 0 then Text.From(plotX) & "," & Text.From(plotY) else fnSierpinski2(vert1x,vert1y,vert2x,vert2y,vert3x,vert3y,plotX,plotY,counter-1) in Return in fnSierpinski2
Finally, I created a custom column, "Sierpinski" in my table like this:
=fnSierpinski2Init("0,1",[Index]))
After that, I added an x column and y column that returned back the portions before and after the comma delimiter, made sure those columns were set to decimal numbers and then I could plot those columns in a Scatter chart like the following:
OK, so that's not a half-bad representation of the Sierpinski triangle if I do say so myself. And I do. And thus, things almost ended right there. You see, engineers have this time-honored tradition called "Eh, close enough". This tradition is probably best explained by that joke involving an engineer, a mathematician and a pretty girl.
The joke goes like this. "An engineer and a mathematician are standing next to one another at a dance and both spot a pretty girl across the floor at exactly the same time. They both decide to halve the distance between themselves and the pretty girl every five minutes. Who gets to dance with the girl?" In the classic telling of this joke it's the engineer because since they are always halving the distance, the mathematician never actually ever reaches the girl while the engineer eventually decides, "Eh, close enough".
It should be noted that in the mathematician's retelling of this joke it is the mathematician because the mathematician treacherously stabs the engineer in the back with a knife and eventually gets close enough to whisper the Pythagorean theorem in the girl's ear. Yeah...I know, right? Mathematicians, in addition to being innately evil, are also really, really bad at telling jokes.
OK, so the last solution just didn't quite sit very well with me. The reason is that I figured that I would eventually get into a situation where a hack like that wasn't going to work. I really needed to find a way to do recursion and keep track of all of the interim values. So, I just sucked it up and solved it like this:
fnSierpinski3Init
let fnSierpinski3Init = (start as text,counter) => let startX = Number.FromText(Text.BeforeDelimiter(start,",")), startY = Number.FromText(Text.AfterDelimiter(start,",")), list = {Text.From(startX) & "," & Text.From(startY)}, Return = fnSierpinski3(-1,0,1,0,0,2,list,counter) in Return in fnSierpinski3Init
fnSierpinski3
let fnSierpinski3 = (vert1x,vert1y,vert2x,vert2y,vert3x,vert3y,list,counter) => let origin = List.Last(list), originX = Number.FromText(Text.BeforeDelimiter(origin,",")), originY = Number.FromText(Text.AfterDelimiter(origin,",")), randomVert = Number.IntegerDivide(Number.RandomBetween(1,4),1), targetVertX = if randomVert = 1 then vert1x else if randomVert = 2 then vert2x else vert3x, targetVertY = if randomVert = 1 then vert1y else if randomVert = 2 then vert2y else vert3y, plotX = (originX + targetVertX)/2, plotY = (originY + targetVertY)/2, end = {Text.From(plotX) & "," & Text.From(plotY)}, list1 = list & end, Return = if counter = 0 then list else fnSierpinski3(vert1x,vert1y,vert2x,vert2y,vert3x,vert3y,list1,counter-1) in Return in fnSierpinski3
So, what I am doing this time is maintaining all of my values in a List and passing that list between my recursive function calls. So, the output of these functions is simply the list of values that I have built up over the course of calling my recursive function recursively a specified number of times. Thus, I can then create a blank query and give it a Source like this:
=fnSierpinski3Init("0,0",3000)
I then just have to convert the list to a table and do a little renaming of columns, etc. The full query looks like this:
let Source = fnSierpinski3Init("0,0",3000), #"Converted to Table" = Table.FromList(Source, Splitter.SplitTextByDelimiter(","), null, null, ExtraValues.Error), #"Changed Type" = Table.TransformColumnTypes(#"Converted to Table",{{"Column1", type number}, {"Column2", type number}}), #"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"Column1", "x"}, {"Column2", "y"}}), #"Added Index" = Table.AddIndexColumn(#"Renamed Columns", "Index", 0, 1), #"Added Custom" = Table.AddColumn(#"Added Index", "Color", each if [x]<0 and [y]<1 then 0 else if [x]>0 and [y]<1 then 1 else 2) in #"Added Custom"
And we can plot it like the following:
I added the colors just for the heck of it.
Solving the Sierpinski triangle in Power Query M code demonstrates how we can perform looping/recursion in M to feed the previous loop's output as the input to our next iteration and preserve the calculations made at each step along the way. This capability is crucial for many complex calculations that are needed within the realm of data analytics.
Also, watch your back around mathematicians, they are vicious.
Posts in this series: 1, 2, 3, 4, 5
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.