With this new capability you can create an ML model from your data in a few simple steps, so this post aims to shed some light on the next steps, model performance and application, and share a few tips about the reports.
Each model type (Prediction/Classification/Forecasting) will provide you with a performance report once it has been trained. This can be accessed by navigating to the ‘Machine learning models’ tab in your dataflow and clicking on the ‘View performance report and apply model’ button for your newly trained model.
When first accessed, the report will take roughly 10-15 minutes to refresh and load. When you create a machine learning model, an empty entity is created with a query to train the model; this entity is populated with content when the dataflow is refreshed. The report itself is created from the contents of the model entity the first time you access it, so on subsequent access it should not take as long to display.
Each report will contain several pages with the first one being a high-level overview of the model performance. Information includes how the model was tested and how the results should be interpreted.
If the model is a Binary Prediction model, the report will also let you test out different probability thresholds. This allows you to select a threshold that makes sense for your business case by balancing false positives against false negatives.
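To make the tradeoff concrete, here is a small illustrative sketch (plain Python, not anything Power BI runs internally) of how raising or lowering the threshold shifts the balance between false positives and false negatives; the scored rows below are made-up example data:

```python
# Toy scored rows: (predicted_probability, actual_label), hypothetical data.
scored = [(0.95, 1), (0.80, 1), (0.65, 0), (0.55, 1),
          (0.40, 0), (0.30, 1), (0.20, 0), (0.05, 0)]

def confusion(threshold, rows):
    """Count false positives and false negatives at a given threshold."""
    fp = sum(1 for p, y in rows if p >= threshold and y == 0)
    fn = sum(1 for p, y in rows if p < threshold and y == 1)
    return fp, fn

for t in (0.3, 0.5, 0.7):
    fp, fn = confusion(t, scored)
    print(f"threshold={t}: false positives={fp}, false negatives={fn}")
```

On this toy data, a low threshold of 0.3 yields 2 false positives and 0 false negatives, while a high threshold of 0.7 flips that to 0 and 2, which is exactly the tradeoff the report's threshold slider lets you explore against your own data.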
The second part of the report surfaces new perspectives on the model performance through a variety of charts (Cumulative Gains, ROC curves, Residual error, Predicted vs Actual), depending on the selected model type.
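As an illustration of what one of those charts represents, this sketch derives the points of a Cumulative Gains curve from scored rows: sort by predicted probability in descending order, then track the cumulative share of actual positives captured as you move down the list (the data here is hypothetical, not from a real report):

```python
# Toy scored rows: (predicted_probability, actual_label), hypothetical data.
scored = [(0.95, 1), (0.80, 1), (0.65, 0), (0.55, 1),
          (0.40, 0), (0.30, 1), (0.20, 0), (0.05, 0)]

def cumulative_gains(rows):
    """Fraction of all positives captured after each row, best-scored first."""
    rows = sorted(rows, key=lambda r: r[0], reverse=True)
    total_pos = sum(y for _, y in rows)
    gains, captured = [], 0
    for _, y in rows:
        captured += y
        gains.append(captured / total_pos)
    return gains

print(cumulative_gains(scored))
```

A model that ranks positives well climbs to 1.0 quickly; a random ranking climbs along the diagonal, which is why the chart in the report is useful for judging how well the model prioritises likely positives.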
The last part of the report provides a more in-depth view of the model itself, as well as how it was trained.
The model can be applied through the performance report by clicking on the ‘Apply now’ button and selecting the appropriate entity.
Certain entities will not show up in the dropdown, as the model cannot be applied to the entities created for the model training operation (Train/Test).
If your model is a binary prediction model, you will also be able to provide a threshold for the application. Ideally the threshold would be chosen based on the performance report, as described above.
Once the model has been applied, the dataflow will need to be refreshed so the new data can be scored.
Because the report is a standard PBIX report, it can be downloaded and modified like any other report. Go to the workspace view and look for a report whose name follows the pattern ‘<ModelType> report for <DataflowName>[MlModelName]’.
Once downloaded, you can see that the actual model is a JSON payload with several properties. Look at the ‘Output’ table; all the other tables are derived from it. You will also see a number of parameters: these are used to connect to the model entity in the dataflow, and also describe some of the properties that were set during attribute selection (LabelColumnName, EntityName).
At this point the report can be modified and enhanced like any other report.
When done, you can upload the report back with the same name (this overwrites the existing report and makes the new report accessible from the ‘View performance report’ access point), or rename it before uploading, which leaves the existing report untouched.
Issue: My report is not showing, even after 10-15 minutes.
In the workspace view, look for a dataset that matches the name pattern ‘<ModelType> report for <DataflowName>[<ModelName>]’.
Check its refresh history; for the report to show, there should be at least one successful refresh.
Issue: I deleted my report. How do I get it back?
Access the ‘View performance report’ button again to recreate the report.
Issue: The data in the report is old – what do I do?
The report is backed by a dataset with the same name. If the model has been refreshed but the performance report still shows old data, refreshing that dataset should update the report to show the new data.
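If you prefer to script the refresh rather than trigger it from the UI, the Power BI REST API exposes a ‘Refresh Dataset’ endpoint. Below is a minimal sketch assuming you already have an Azure AD access token with permission on the dataset; the dataset ID and token are placeholders you would supply yourself:

```python
import urllib.request

API_ROOT = "https://api.powerbi.com/v1.0/myorg"

def refresh_url(dataset_id):
    """Build the Refresh Dataset endpoint URL for a dataset in 'My workspace'."""
    return f"{API_ROOT}/datasets/{dataset_id}/refreshes"

def trigger_refresh(dataset_id, access_token):
    """Queue an asynchronous refresh; HTTP 202 means queued, not finished."""
    req = urllib.request.Request(
        refresh_url(dataset_id),
        method="POST",
        headers={"Authorization": f"Bearer {access_token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

For datasets in a shared workspace the endpoint takes the form `/groups/{groupId}/datasets/{datasetId}/refreshes` instead; the sketch above covers only the simpler ‘My workspace’ case.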