Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
I am currently building reports for some survey data that asks a bunch of questions. Each question response is its own column in the data. The survey can be completed for multiple people in one entry so there is columns for Person 1, Person 2, Person 3.
For the purpose of reporting on a specific questions values (Ex. How many times quesiton 1 was a no) and bucketing some questions into groups I felt like I needed to unpivot the question columns so they each have their own row per question but then I ran into the issue of reporting on people individually so I then thought to create 3 seperate query for each person and combine them. That creates alot of rows for just one survey record and can put a toll on running the query the bigger the dataset gets. Is there a better way to do it, acheive the same result and not hurt the performance?
Here is an example Data Set
Date | Location | Staff 1 Name | Staff 1 ID | Staff 2 Name | Staff 2 ID | Staff 3 Name | Staff 3 ID | Question 1 | Question 2 | Question 3 | Question 4 | Question 5 | Question 6 | Question 7 | Question 8 | Question 9 | Question 10 |
2/1/2019 | Location A | Person 1 | 001 | Person 8 | 008 | Person 15 | 015 | yes | yes | yes | no | yes | no | yes | no | no | yes |
2/1/2019 | Location B | Person 2 | yes | yes | yes | no | yes | no | yes | no | no | yes | |||||
2/2/2019 | Location C | Person 3 | 003 | Person 10 | 010 | Person 17 | 017 | yes | yes | yes | no | yes | no | yes | no | no | yes |
2/2/2019 | Location D | Person 4 | 004 | Person 11 | 011 | yes | yes | yes | no | yes | no | yes | no | no | yes | ||
2/3/2019 | Location E | Person 5 | 005 | Person 12 | 012 | Person 19 | 019 | yes | yes | yes | no | yes | no | yes | no | no | yes |
2/3/2019 | Location F | Person 6 | yes | yes | yes | no | yes | no | yes | no | no | yes | |||||
2/4/2019 | Location G | Person 7 | 007 | Person 14 | 014 | yes | yes | yes | no | yes | no | yes | no | no | yes |
I thought about not unpivoting the question columns which would save alot of rows but then I felt like I would lose the grouping of the questions. For example:
If I want to group questions 1-5 as Section 1 and 6-10 as Section 2 and then say Section 1 is being answered as yes 20% of the time and Section 2 as yes 80% of the time in a matrix visual (each section getting its own row and then the column would be the percentage value) is that possible?
Hoping I'm getting the end goal across in text the best I can!
In my experience in IRT, the optimal model - even for large scale - is one row per respondent and one column per question.
I would split up the different rows per person and add a column with a an identifier per grouped survey (repeating 1 to 3 times based on the number of people participating).
Lastly, performance wise, you should remap your yes/no into 0/1 (as integer). If it is just one table, Power BI should be able to perform well, even with a large number of rows.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
111 | |
100 | |
80 | |
64 | |
58 |
User | Count |
---|---|
148 | |
111 | |
93 | |
84 | |
66 |