Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
I would enjoy anyone's opinion on using addIndex in Query Editor to use as a primary key in my dimEmployee table and dimVendor table, respectively.
The data source is an Excel file that comes from a Quickbooks list export that I update monthly. I am connected to this file and do whatever cleaning I need in Query Editor. I should mention that the fact table I am doing analysis on also comes from quickbooks and contains vendor names and employee names and no ID's. I prefer to use integer fields to create relationships to my dim tables rather than a string field. To do this, in query editor, I thought about adding an Index field on the dim tables, then doing a join/expand to get the Index field into the fact table and delete the string fields containing the vendor and employee names.
The problem naturally, is that the whenever new names are added in the dim tables, the Index would adjust and re-number everything.
My question is, If I'm refreshing the 2 dim tables every month and the fact table (thus running the join/expand step), is there any issue when new names are added to the dim tables and the Index adjusts to re-Index(?) the list. In essense, would the data model ever "know the difference" that everybody has a new index ID?
built-in add Index function in power query will always re-evalute its value during each data loading.
to remain indexes you should setup your ETL process correctly or (and my opinion, it maybe better) use some surrogate key like string "Vendor-Employee" or whatever you need as key
I do not see any valuable reasons to use exactly integer, not string keys
Thanks for that, and I am working to improve the ETL. It was a broad what if question to see if I would run into an issue in the short-term. Hopefully I won't have to do it this way for long.
What you are proposing is fine, assuming that the correct index # gets assigned in related tables each refresh. It is not required to have an index if you don't need one. Many people just use existing fields, and often people concatenate 2+ existing fields to create a key column with the right granularity. In any case, new index #s each refresh is no problem.
If this solution meets your need, please mark it as the solution. Kudos are appreciated too. Please let me know if not.
Regards,
Pat
To learn more about Power BI, follow me on Twitter or subscribe on YouTube.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
110 | |
94 | |
80 | |
67 | |
59 |
User | Count |
---|---|
150 | |
119 | |
104 | |
87 | |
67 |