Power BI has R custom visuals that let R programmers build their own graphs and analyses within Power BI.
While trying to do so, I noticed that the default code (which we can't change) loads the data into a data.frame and then deduplicates it (see the screenshot and the code line "unique(dataset)").
The deduplication limits what you can do with the R visual: e.g. you can't create a histogram (because all duplicates have been removed), and you can't create a proper decision tree (because, again, all duplicates have been removed and the tree would be biased).
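To make the effect concrete, here is a minimal sketch in plain R, outside Power BI. The data frame dataset_raw and its age column are made up for illustration; only the unique(dataset) step mirrors the line from the visual's default code.

```r
# Hypothetical stand-in for the rows selected for the R visual.
dataset_raw <- data.frame(age = c(30, 30, 30, 45, 45, 60))

# The deduplication step that the visual's default code applies.
dataset <- unique(dataset_raw)

rows_before <- nrow(dataset_raw)
rows_after  <- nrow(dataset)

# Frequency of each value before and after deduplication. After unique(),
# every value occurs exactly once, so a histogram or a tree fitted on
# `dataset` no longer sees the real distribution.
counts_before <- table(dataset_raw$age)
counts_after  <- table(dataset$age)

print(rows_before)
print(rows_after)
print(counts_before)
print(counts_after)
```

With a single column the loss is obvious; with several columns, only rows that are identical in every selected column are collapsed.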
Can we remove the deduplication from the R custom visual core code and make it optional? Is there any way I can bypass this deduplication in the meantime?
PS: As a result of this deduplication, the decision tree results produced by the custom visual 'Decision Tree' are wrong. This is how I actually came to find out.
Am I overlooking something?
I have actually posted an Idea on this here:
It is Under Review, please vote for it.
The only workaround I have is to ignore the data frame that is created automatically and load the data from the source into my own data frame within the R code itself. Not optimal at all.
Until they go along with the idea of taking that out, what I am doing is creating an ID column which runs from 1 to nrow of the dataset and importing that into R as well. That makes all rows different so none gets deleted, and you can delete the dummy column in R and use your normal code.
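A minimal sketch of that ID-column workaround, again in plain R. The column name row_id and the sample data are assumptions for illustration; in Power BI the index column would be added to the table and included in the visual's fields along with the real columns.

```r
# Hypothetical data as it would arrive with an index column included.
dataset_raw <- data.frame(
  row_id = 1:6,                       # assumed name for the dummy ID column
  age    = c(30, 30, 30, 45, 45, 60)
)

# The visual's deduplication step: row_id makes every row distinct,
# so nothing is dropped.
dataset <- unique(dataset_raw)

# Remove the dummy column before running the usual analysis code.
dataset$row_id <- NULL

rows_kept <- nrow(dataset)
counts    <- table(dataset$age)

print(rows_kept)
print(counts)
# From here the normal code can run on all rows, e.g. hist(dataset$age).
```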