Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi all! I hope someone can help me out with best practice ways to deal with non-ideal data input from Microsoft Forms.
The tricky part of the Forms data is that the survey is evaluating 7 different modules, and so the same identical questions (4 of them in this dummy data) are asked 7 times. The Forms is branched, so that the users only have to answer for those modules that they are using and are able to give feedback to. This is good for data collection, but not ideal for data analysis (at least with my current experience in power query and dax).
The resulting data from the forms (excel file) will look like the table below, except that it contains the questions 7 times rather than 2.
ID | Department | Do you have feedback regarding M1? | What is working well in M1? | What is not working well in M1? | Opportunities for development for M1? | Do you have feedback regarding M2? | What is working well in M2? | What is not working well in M2? | Opportunities for development for M2? |
1 | IT | No | Yes | M2-Q1-ID1 | M2-Q2-ID1 | M2-Q3-ID1 | |||
2 | Finance | No | No | ||||||
3 | Operations | Yes | M1-Q1-ID3 | M1-Q2-ID3 | M1-Q3-ID3 | Yes | M2-Q1-ID3 | M2-Q2-ID3 | M2-Q3-ID3 |
I want the end-result to look like the table below. That means:
- The header "do you have feedback regarding module x?" will be replaced by "module number", and for those that have "Yes" as an answer, the input in the row is the actual module number.
- The question headers that were specific for each module (there are 7 of them in total) are changed to general question headers.
- One row per respondent in forms (with a lot of similarly named columns that actually contain the same data 7 times), are displayed as 7 rows (with common question headers).
ID | Department | Module number | What is working well? | What is NOT working well? | What opportunities for development do you see? |
1 | IT | No | |||
1 | IT | Module 2 | M2-Q1-ID1 | M2-Q2-ID1 | M2-Q3-ID1 |
2 | Finance | No | |||
2 | Finance | No | |||
3 | Operations | Module 1 | M1-Q1-ID3 | M1-Q2-ID3 | M1-Q3-ID3 |
3 | Operations | Module 2 | M2-Q1-ID3 | M2-Q2-ID3 | M2-Q3-ID3 |
I know one way to get to my desired end result (which is probably far away from best practice). It involves the following steps:
- Duplicate the dataset 7 times
- Remove all rows, except for one module (for all 7 duplicates)
- Replacing values in "do you have feedback regarding M1?" from "yes" to the module number, for all 7 duplicates
- Generalise the other question headers for all 7 duplicates.
- Append all 7 queries to one
In example with adding new columns, all duplicates of the query imports the now columns, and so it requires more work than I hope is necessary.
I would much appreciate it if someone could assist on this issue.
Solved! Go to Solution.
Hi @Anonymous ,
you can try this code.
Asusmption is, that each question ends with "Mx?", for example "What is working well in M1?". There is a split of last 3 characters.
Basicaly, each question is changed from: "What is working well in M1?" to "What is working well in the module?"
Code should work with random number of questions.
let
Source = Excel.Workbook(File.Contents("path to excel file"), null, true),
Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
#"Promoted Headers" = Table.PromoteHeaders(Sheet1_Sheet, [PromoteAllScalars=true]),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Promoted Headers", {"ID", "Department"}, "Attribute", "Value"),
#"Split Column by Position" = Table.SplitColumn(#"Unpivoted Other Columns", "Attribute", Splitter.SplitTextByPositions({0, 3}, true), {"Question", "Module"}),
#"Add question mark" = Table.TransformColumns(#"Split Column by Position",{{"Question", each _ & "the module?", type text}}),
#"Delete question mark" = Table.ReplaceValue(#"Add question mark","?","",Replacer.ReplaceText,{"Module"}),
#"Pivoted Column" = Table.Pivot(#"Delete question mark", List.Distinct(#"Delete question mark"[Question]), "Question", "Value"),
#"Replace null with empty" = Table.ReplaceValue(#"Pivoted Column",null,"",Replacer.ReplaceValue,List.Difference(Table.ColumnNames(#"Pivoted Column"),{"ID","Department","Module"}))
in
#"Replace null with empty"
Hi, @Anonymous
Try this:
// output
let
Source = Table.FromRecords(Json.Document(Binary.Decompress(Binary.FromText("rZJBSwMxEIX/SthzCybRtfXiwUXYw1oEQUQ8xO60DU2TkM22iPS/u1lLnYGy7cHje3kD37zJ+3f24Ey7sTy7y8oiGx2k6GQBXoW4ARuPtky2Y1+uZSu1BbYAqD/VfM0CLFWotV2yit8f49dd/HWlItMN27mwTu87MIZpS3I3KGddHMzmXXbmvQuxtTpqaNjCBVbDFozzCbbXeOL2AmjxF58MQaPc9Bw0yvKry6i7kf0InYTje5Qv5A5PDvdsW2NQnVjmVKY63qAh+1Zi/MzHZcHJdskVxO33SLbsbcIqMOujtsrO4d+A0eyEvk6pTIRJEzSJ0WYegora2YbQ4UrSv634byWS/NLkCuLmB1cSd6hkebJkebpkme0/fgA=", BinaryEncoding.Base64),Compression.Deflate))),
pmhd = Table.PromoteHeaders(Source),
toList = Table.ToList(pmhd, each Table.FromRows(List.Transform(fx(List.Split(List.Skip(_,2),4)),(lst)=> List.FirstN(_,2)&lst ))),
cmb = Table.Combine(toList)
in
cmb
// fx
(lsts)=>
let
//lsts = toList{0},
pos = List.Split(List.Positions(lsts),1),
zip = List.Zip({lsts,pos}),
trans = List.Transform(
zip,
each let nlst= List.Combine(_)
in List.RemoveLastN(
List.ReplaceValue(
nlst,
"Yes",
Number.ToText(List.Last(nlst)+1,"Module 0"),
Replacer.ReplaceValue
)
)
)
in
trans
Hi, @Anonymous
Try this:
// output
let
Source = Table.FromRecords(Json.Document(Binary.Decompress(Binary.FromText("rZJBSwMxEIX/SthzCybRtfXiwUXYw1oEQUQ8xO60DU2TkM22iPS/u1lLnYGy7cHje3kD37zJ+3f24Ey7sTy7y8oiGx2k6GQBXoW4ARuPtky2Y1+uZSu1BbYAqD/VfM0CLFWotV2yit8f49dd/HWlItMN27mwTu87MIZpS3I3KGddHMzmXXbmvQuxtTpqaNjCBVbDFozzCbbXeOL2AmjxF58MQaPc9Bw0yvKry6i7kf0InYTje5Qv5A5PDvdsW2NQnVjmVKY63qAh+1Zi/MzHZcHJdskVxO33SLbsbcIqMOujtsrO4d+A0eyEvk6pTIRJEzSJ0WYegora2YbQ4UrSv634byWS/NLkCuLmB1cSd6hkebJkebpkme0/fgA=", BinaryEncoding.Base64),Compression.Deflate))),
pmhd = Table.PromoteHeaders(Source),
toList = Table.ToList(pmhd, each Table.FromRows(List.Transform(fx(List.Split(List.Skip(_,2),4)),(lst)=> List.FirstN(_,2)&lst ))),
cmb = Table.Combine(toList)
in
cmb
// fx
(lsts)=>
let
//lsts = toList{0},
pos = List.Split(List.Positions(lsts),1),
zip = List.Zip({lsts,pos}),
trans = List.Transform(
zip,
each let nlst= List.Combine(_)
in List.RemoveLastN(
List.ReplaceValue(
nlst,
"Yes",
Number.ToText(List.Last(nlst)+1,"Module 0"),
Replacer.ReplaceValue
)
)
)
in
trans
@ziying35 little bit confusing, but nice 🙂
@Anonymous if you use this solution, you can adjust the code for actual number of questions in this step:
toList = Table.ToList(pmhd, each Table.FromRows(List.Transform(fx(List.Split(List.Skip(_,2),4)),(lst)=> List.FirstN(_,2)&lst )))
Yes, @Bohumil_Uhrin , with your special instructions, my solution is much more complete, and all that's left is for @Anonymous to change the field titles.
Hi @Anonymous ,
you can try this code.
Asusmption is, that each question ends with "Mx?", for example "What is working well in M1?". There is a split of last 3 characters.
Basicaly, each question is changed from: "What is working well in M1?" to "What is working well in the module?"
Code should work with random number of questions.
let
Source = Excel.Workbook(File.Contents("path to excel file"), null, true),
Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
#"Promoted Headers" = Table.PromoteHeaders(Sheet1_Sheet, [PromoteAllScalars=true]),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Promoted Headers", {"ID", "Department"}, "Attribute", "Value"),
#"Split Column by Position" = Table.SplitColumn(#"Unpivoted Other Columns", "Attribute", Splitter.SplitTextByPositions({0, 3}, true), {"Question", "Module"}),
#"Add question mark" = Table.TransformColumns(#"Split Column by Position",{{"Question", each _ & "the module?", type text}}),
#"Delete question mark" = Table.ReplaceValue(#"Add question mark","?","",Replacer.ReplaceText,{"Module"}),
#"Pivoted Column" = Table.Pivot(#"Delete question mark", List.Distinct(#"Delete question mark"[Question]), "Question", "Value"),
#"Replace null with empty" = Table.ReplaceValue(#"Pivoted Column",null,"",Replacer.ReplaceValue,List.Difference(Table.ColumnNames(#"Pivoted Column"),{"ID","Department","Module"}))
in
#"Replace null with empty"
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
100 | |
51 | |
19 | |
12 | |
11 |