It has been almost two weeks since I first posted Excel to DAX Translation with the goal to catalog as many DAX solutions to Excel functions as possible. Let’s first start by taking stock of the numbers.
Excel has nearly 500 functions. 478 is the count that I have.
While there are different ways of deciding such things, by my count only 190, or about 40%, of Excel’s functions have direct or nearly direct equivalents in DAX.
I have been able to document an additional 196 DAX solutions for Excel functions
Another 49 Excel functions have been classified as “N/A” or not applicable within DAX or modern information systems practices.
This brings our total percentage of DAX solutions for Excel functions to 80.75%. Adding in the N/A, we reach 91%. This still leaves a total of 43 Excel functions that have yet to have a DAX equivalent identified for them, so still work to be done but I anticipate that progress will slow considerably from here on in if for no other reason than the low hanging fruit has already been picked and there are few large “classes” of Excel functions left (see below).
It has been a pretty interesting journey thus far and I, personally, have learned quite a bit which I will be covering in this series of blog posts (at least 2!)
Excel’s Function Documentation is Pretty Bad
Perhaps the biggest thing I have learned is that Excel’s function documentation is, well, it’s not what I would call stellar or even much better than mediocre; at best. And I am bending over backwards to be generous here because some of it is just absolutely horrendous or even just flat out, demonstrably inaccurate/wrong. I mean, perhaps I just hit a few bad ones early on but I quickly found myself plugging the examples for the functions into Excel just to make sure that the answers being provided in the documentation were actually what Excel returns! Take DISC for example:
Or this gem from RRI:
Folks, the correct calculation is NOT Fv^(1/Nper) / Pv - 1 as is shown, it is actually (Fv/Pv)^(1/Nper) – 1. Order of operations and parentheses man...they matter.
Perhaps we could write these off as trivial but when these sorts of things waste hours of your life, you tend to get somewhat bitter. So, I am struggling not to call Excel’s function documentation complete and utter garbage because it’s at least sort of, kind of mediocre or at the very least has legitimate aspirations of one day becoming relatively close to mediocre. I mean, let’s be honest, it’s a C- at best...maybe?
But then you add in functions that provide no detail at all about how they are calculated and my grade goes down to a D+. By far the most frustrating thing I found was just a total lack of any detail what-so-ever about how a function operated or was calculated. Many Excel functions do actually include the correct formula used in the calculation but there are also more than a few that basically state a brief introduction for the function, the parameters and then an example. Literally no information about how the calculation is done at all. Those are incredibly difficult to reproduce because you essentially have to reverse engineer them!!
A great example of the utter lack of documentation are many of the distribution family of functions like F.DIST. Just utterly devoid of anything even resembling a formula. And then there is perhaps the coup-de-grace, the COUP* family of functions. I mean, the COUP* family of functions are prime examples of this a lack of decent documentation. Except, that’s perhaps not really even the worst thing that’s wrong with the COUP* functions because after dealing with them I began to question whether or not even anyone at Microsoft knows how the flip these things work or are supposed to work or even if there are any quality standards that existed at the time to make sure that naming conventions were followed...
Let’s see, you have COUPNCD (coupon next coupon date), COUPPCD (coupon previous coupon date), COUPDAYS (coupon days), COUPDAYSNC (coupon days next coupon) and then…drum roll...COUPDAYBS (coupon days before settlement?), I mean, shouldn't it be COUPDAYSPC for; oh I don't know, freaking consistency sake if nothing else? It’s not just violating one naming convention; oh no, it goes for the trifecta! Eh, let’s drop the S and we’ll use before instead of previous and then we will specify settlement instead of coupon. That’s just…that's just lazy. If you think about it, it's really kind of BS…or a developer TOTALLY slipped that one by management…
By contrast, DAX’s documentation is marginally better. Sure, there is lots of it that was obviously just copy/pasted from Excel but the new functions generally have fairly decent information and more detail. But, they don’t include everything. For example, in coding IRR, I learned that GENERATESERIES cannot have a number less than 1E-6 as its increment value. That’s not in the documentation. Also, while it has since been fixed, I identified at least four functions that existed in DAX that didn’t exist in the documentation:
So, you know, documentation can always be improved.
Microsoft Did a Pretty Good Job Deciding Which Excel Functions to Include in DAX
I'm not sure what the selection criteria was for which Excel functions were included in DAX. I imagine Microsoft had some telemetry data on it perhaps? I have to wonder if there were raging debates or not. But whatever the criteria or process, Microsoft did a solid job in my opinion. First of all, being able to get to 90%+ coverage from the original 40% means that Microsoft obviously did a pretty decent job of identifying the core Excel functions that they needed to port to DAX. If you can recreate that high of a percentage of functions with just a base set, then that’s a pretty good job in my opinion. But early on in the project, I got to thinking about why certain Excel functions were not included in DAX. I came up with a number of classifications.
Not Applicable (N/A)
This classification of functions is somewhat subjective but I tried to reserve this designation for functions that just truly make absolutely zero sense for DAX. For example, the CUBE* family of functions. I mean, DAX is natively accessing a cube so…kind of pointless to have specific functions for such things. Or the CELL function. A cell is not a concept in DAX so I don’t think it translates. Or the ASC function, we really don’t need that these days with UNICODE. We’ve moved beyond it. Granted, 49 functions are currently classified this way so it is a fairly sizeable classification at just over 10% of all Excel functions. Take a look through the list and let me know if there is anything you do not agree with.
Not to keep harping on Excel, but I never realized just how bloated Excel’s function library had become. And this one comes in a number of flavors. First up are duplicates. There are quite a number of just straight-out duplicates in Excel. And not just the obvious ones like BINOMDIST and BINOM.DIST but bizarre ones like CRITBINOM is exactly BINOM.INV and VALUE versus NUMBERVALUE versus N? Questionable.
Next are the useless functions. Like the function T. What T does is test if something is text and if it is, returns the text or if not returns “”. That’s literally it. No option to return something other than “” if false. Oh no. Either the text or “”. I mean, first, one must question the value of something that can be done with IF and ISTEXT. But to be THAT specific in functionality? I mean, why not include that second parameter for a return value if false? It makes the function so specific as to be generally useless. There were a few of these. Two more are GESTEP and DELTA. I mean, why?
Finally in this category are the functions that left me slapping my forehead. I mean, functions that are so absolutely brainlessly trivial that you are utterly mystified why in the world they were included in the function library. For example, PERMUTATIONA is literally just POWER(x,y). I mean, what? There’s actually more typing involved with PERMUTATIONA than it’s equivalent so, not exactly helping. And CSC is 1/COS so…why bother with a whole new function? The math is not exactly rocket science here folks.
Perhaps because of Excel’s history, there are a ton of financial functions in Excel but scant few made their way to DAX with the notable exceptions of XNPV and XIRR, which I didn’t even realize existed until I had already coded separate solutions. So, technically those wouldn’t have been necessary either. The vast majority of these functions are not difficult to code. The only difficulty is the documentation, which is generally pretty poor. I found myself doing a lot of work to reverse engineer these functions, lots of Algebra, etc.
Overall, I think it was a solid choice to not include the vast majority of these functions in DAX. I mean, is Power BI really the tool for calculating the return on treasury bills? Probably not. And since the math is fairly easy on these, not the end of the world to not have a function for them.
Excel also has a large library of what I classify as conversion functions. Many of these deal with converting between different bases, like decimal to binary to octal to hexadecimal. Others deal with converting units of measurement, which is pretty handy, and still others convert between Roman and Arabic numerals! Again, overall I think this was a wise decision to not include these functions. Maybe BASE and DECIMAL would have been nice to allow you to convert between any two bases but then again, that seems so kind of like an 80’s thing to do. The CONVERT function is wicked, but you’d just handle that in Power BI by importing a table of conversion factors.
By not including conversions to other bases, this automatically also excluded bitwise operators like BITOR, BITAND, etc. Once I was able to come up with the conversions from decimal to binary, these sort of fell like dominos.
Excel is no MatLab but it does have a few nifty functions for doing matrix arithmetic. These didn’t make it to DAX and if you see the hoops I had to jump through to come up with anything even remotely equivalent you will understand why. For these to work in DAX you would really need the ability to know how many columns you have in a table and be able to reference those columns by position. Those are just foreign concepts for DAX.
The first major group here are the statistical distributions. A couple of the statistical distributions made it over, such as BETA*, CHISQ* and NORM*. The rest didn’t. This includes F*, BINOM*, NEGBINOM*, GAMMA*, WEIBULL*, etc. The math tends to be harder I suppose. In some cases it’s no longer simple Algebra, you’re dealing with integrals and other fun things. I had to use numerical estimating methods to solve GAMMA for example. And GAMMA is kind of central to some of the others like F.DIST
The second group is the complex number functions, IM*. These deal with complex numbers (numbers involving imaginary numbers). Don’t get me started. I mean, if they are imaginary…why are we worrying about them again? We might as well spend our time hunting ghosts, vampires and dragons right? But anyway, they were definitely imaginary in DAX, at least up until now. So then does that make them real? At least real in DAX? Regardless, I have mixed feelings about whether they should be included in DAX because while I'm an engineer by education I’m not sure how often imaginary numbers are used in real life. See, I told you to not get me started…
A third group are all of the curve fitting and estimation algorithms. I can see why these were not included, Power BI really does this sort of thing as part of the visuals and more or less dumbs down the math so that mere mortals can leverage it more effectively. Probably a good call.
The fourth, smaller group in this classification are the BESSEL* functions. Those…I’m…I’m still working on those. Would have been nice if DAX had the BESSELJ function, then they could have all been handled fairly easily.
Overall, this is probably the area I would have liked to see a few more functions included in DAX. I mean, DAX is a data analysis tool after all. You know, data analysis…statistics, math, kind of important. However, I am guessing that the integrations with R and Python have made much of this somewhat mute perhaps? I mean, if you wanted to be slick about it, give us an R and/or PYTHON function in DAX, now that would be pretty freaking cool… Hey DAX, go run this Python or R script and return the result back as a variable in DAX. VAR __MyPython = PYTHON(siiiiiiiiiiicccckkkk)!
Excel's function documenation is bad. Then again, I might be the first person to ever poor over it with a fine tooth comb looking to recreate each function from scratch. But more importantly, the important Excel functions made it over to DAX and all but a handful now have DAX equivalent calculations.
The second part of this series will cover some additional things I learned along the way as well as cover some of my favorite functions that I was able to recreate!
I would like to know your thoughts. What Excel functions do you wish were included in DAX?