Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Ok, so we have a fact table that is missing records (rows). However, our data is such that we can (I hope!) fill in the missing records using information from the records we do have and extrapolating the related info from a dimensional table.
Here's the scenario:
We have a table of student transcripts (online learning courses). The courses are sometimes arranged in predetermined packages (and some packages may even be nested within other packages). If a student is enrolled in a package, the system only generates a record for the top level of the hierarchy. Records for any nested packages/courses are only created once the learner has begun those courses. Thus, our transcript data is incomplete. (Only when a package is Complete will all of the nested coursework have records). Luckily, the trail of package/course hierarchy is outlined in our Course Library Table.
Here's what some sample data looks like:
Table 1 (Fact Transcipt):
Student ID | Course/Package Name | Type {course, package} | Status {not started, started, complete}
Learner001 | Learning Fundamentals Package | Package | Started
Learner001 | LF: Course 1| Course | Complete
Learner001 | LF: Course 2 | Course | Complete
Learner002 | Learning Fundamentals Package | Package | Complete
Learner002 | LF: Course 1 | Course | Complete
Learner002 | LF: Course 2 | Course | Complete
Learner002 | LF: Course 3 | Course | Complete
Learner003 | Learning Fundamentals Package | Package | Not Started
As you can see, everyone was enrolled in the same package, but they have different records based on their varying degrees of completion. All of the learners should have 4 records because they are all enrolled in the same courses.
Table 2 (Dim Course Library):
Course/Package Name | Type| Child Package/Course Name (null if none)
Learning Fundamentals Package | Package | LF: Course 1
Learning Fundamentals Package | Package | LF: Course 2
Learning Fundamentals Package | Package | LF: Course 3
I need an expression that will look at the Transcript rows and, if that row is a Package and not Complete, then it will know there must be missing records, so check the Library for its sub-packages/courses and generate transcript records for them.
Here's the logic, I just don't know how to write it in DAX, M, etc.:
If the Transcript record is Type equals Package and Status not equal Complete, then:
It seems like this should be possible, though a bit complicated. Anyone want to brew up a pot of coffee and take a crack at this?
In my experience, it is better to solve data cleansing and management tasks in tools other than Power BI before or during data load. Personally i would look at solving this in SQL to maintain the full list of records at all time.
Thanks, Matt. Unfortunately I don't have access to change the source database. I've been continuing to research this and it seems R may be the solution I need to use for rebuilding missing data.
OK, so if this were me I would try to solve it using Power Query on load. I don't know if that is possible.
To create an analogy, if you don't have access to a Word processor, you could write a 250 page book using Microsoft Excel. You could probably get the job done this way, but you would create a lot of new problems and pain along the way. Personally I would not do that and instead I would focus on the real problem, which is not having access to a Word processor.
JMO.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
109 | |
99 | |
77 | |
66 | |
54 |
User | Count |
---|---|
144 | |
104 | |
101 | |
86 | |
64 |