Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
AoJueMing
Helper I
Helper I

Casting a wide net as I'm not even sure of how to ask the question - survey data related

Hey

 

I deal with survey data, I have for a long time and recently decided to see how BI can develop my work.  

I'm pretty new to it and I'm not sure of the esoteric data management terminologies so I'm just going to get straight to it and hopefully we can find a way to communicate!

 

Let's say we have a survey with 3 multiple-choice questions.  Each question has 3 possible answers and only one answer can be given per question.  When you pull that data you will very easily be able to organise the data in the fashion that PBI needs it to be in in order to read/use it.  From there is it very simple to create your charts and explore your data.  You will, for example, be able to click on a segment of the piechart you made for the first question, and that will automatically show you how those people from that segment answered the rest of the survey questions.  Very useful, very easy, very good.

 

Now let's say we've added a multiple-choice matrix question.  Let's say we're now using checkboxes instead of radio buttons also, to really spice it up.  If you've followed me to this point and you know what I'm talking about, you'll know that some fairly serious cleansing is required in order to use the data you receive from pulling that survey data.  Typically you'll need to unpivot the data for that question.  You do this, and you end up with a 'flattened' table of the format that is required by PBI.  (Can this flattened question data be merged with the rather more simple data from the other simple questions?  Can anyone point me in a direction?)

 

And here is my problem;

I am trying to find a way to use this flattend data from various questions with the simple data from simpler questions so that I can use the features of PBI that I initially mentioned in paragraph 2.

 

To my knowledge, there is no way to aggregate all of this data into one table which can create charts from all the data you collect from the survey.  I do not know of a way to organise data from simple questions with flattened data from more complex questions in order to do everything from one data source/table.  

 

I have tried breaking down each question into it's own table, using the respondentID as the link between all tables (so PBI knows that answers from table x are linked to answers from table y, etc).  Like an idiot I thought that creating a relationship between all these tables using the respondentID as the link would work - but apparently I cannot create a relationship because the columns need unique entries - the whole point was that they were not unique entries and that's why a relationship should have been easy to create - clearly not.  

 

Ultimately, what I am currently trying to learn is how to take survey data formatted in excel and use that data in PBI - essentially it's a data modelling problem.  I have discovered that if you have a sales list with sales dates, prices, brands etc - then PBI is just about the easiest thing in the world to use - if you're trying to display survey data, it seems to be a whole different sort of trial.

1 ACCEPTED SOLUTION

I believe both are correct, the difference only matters if you are evaluating a 2nd level within the same pie chart.  For example, you might have your pie chart that shows respondents' favorite color, but then within that color show the division by heads or tails.  In that case the Color would be the Legend and Heads/Tails would be the Details.  Note that I wend to the pie chart settings in this example and turned on the Legend so I could see both Heads/Tails and the favorite color.

 

PBI Survey 6.jpg

 

Dan

View solution in original post

18 REPLIES 18
danrmcallister
Resolver II
Resolver II

Can you provide a screenshot of some sample data we could play with?  

 

It seems like you could bring two tables into PBI, unpivot your multiple selection questions, and then merge the tables in query editor before bringing the data in.  If questions 1 and 2 are multiple choice and question 3 is "pick any of these four options", then your table of data would show how any particular survey taker chose for question 1 and question 2, and then for each of the possible options in question 3 if they answered Yes or No.

Sure.  Here's a madeup sample that creates the same issues I am dealing with.  

 

This is a pretty true to life example of the data you'll get back from a basic survey (We're talking about Survey Monkey specifically here if that makes any difference, though I want to solve this issue so I know how to manipulate data from any source and use it in PBI).  You can see that there are 10 respondents, each with their own respondentID.  Questions 1, 2 and 3 are all radio-button multiple-choice questions.  As giving only a single answer is possible for Q1, 2, 3, only a single answer is returned (the possible selection elements of that question are not displayed in the survey data).  All pretty crisp and not so much preparitory work in order to use in PBI.  My favourite.

 

Questions 4 and 5 are examples of data you will receive back from multiple-choice-matrix tick-box question.  That is to say; a question is asked and multiple selection elements are offered to the respondent (as with the first three questions) - the respondent may click none or all of the answers (unlike the first three questions).  You can see that the selection elements are listed (in the second row), and they are listed again on the respondents row if they ticked that selection during the survey. 

 

Perhaps with this example you can see the main issue; Once you have flattened the matrix questions data - how do you then link each respondents data with their previous answers.  Ultimately, I am trying to work out how to model this data so that I can follow respondents through the entire survey with PBIs analysis tools.  

 

RespondentIDHeads or Tales?What is your favourite colour?Where you born in the…Where do you spend time on your phone?  What do you look at on your phone?  
ResponseResponseResponseResponseOn the sofaOn the busOn the trainStanding aroundOn my boatEmailFacebookTwitterWhatsAppBrowser
1TailsBlue80'sOn the sofaOn the busOn the trainStanding around EmailFacebook WhatsAppBrowser
2TailsGreen90'sOn the sofaOn the busOn the train  EmailFacebookTwitterWhatsApp 
3HeadsGreen80's   Standing aroundOn my boatEmail Twitter Browser
4HeadsYellow00'sOn the sofa On the train  EmailFacebook WhatsApp 
5HeadsRed00'sOn the sofaOn the bus   EmailFacebook WhatsApp 
6HeadsBlue70's On the bus   Email  WhatsAppBrowser
7TailsBlue70's On the busOn the train  Email Twitter Browser
8HeadsBlue50'sOn the sofaOn the bus Standing around EmailFacebook  Browser
9HeadsRed80'sOn the sofa  Standing around Email   Browser
10TailsPurple90'sOn the sofa   On my boatEmailFacebookTwitterWhatsApp 

Your columns for the multi-answer questions - should those be a single column where the answers are split by a delimiter, or is it multiple columns with the header only in one column?  This is how it looks when I copy into Excel:

 

PBI Survey.jpg

 

Just want to make sure I have the problem right!

Absolutely.  Much apprecaited.  There was an HTML error when I tried to post so some cells may have become confused.  Here's a screenshot of the original:

 

fgtrf.PNG

 

You should be able to see that it is multiple columns with the header (question) only in one column.  The appearance of the next header (question) (on row 1) shows where data from one question ends, and the next begins. 

 

For clarification, row 2 is occupied by either 'response', which is essentailly a place holder for when a more complex questions data is shown, whereby 'response' is replaced with the title of the columns from the survey.  

Okay, this is not as elegant a solution as I would like and there could quite possibly be an easier way to do it, but I think it does what you're asking for.  Problem is that each time you want to do this for a dataset you'd have to build the following query and add custom columns... not as automated as I'd like.  Anyone have feedback on this?

 

Steps I took:

 

1. In the query editor, I merged the columns for the two multi-answer questions with a delimiter (screenshots below)

2. In the query editor, I removed the columns with the multi-answer questions

3. I created a measure for each possible answer of the multi-answer questions to respond "Yes" or "No" 

4. Now the pie charts I have are interactive when using slicers; I can modify the output of multiple things and see how it impacts other answers

 

Is there a different way you would look at the multi-answer question results than by individual question as I've done here?  I am not experienced with reviewing survey results!

 

I'm happy to send you the pbix file, just send me a private message with your email if you want to test it out.

 

Query editor: merge the columns

PBI Survey Merge Columns.jpg

 

Query editor: remove columns

PBI Survey Remove Columns.jpg

 

Code for measures:

On the bus = IF(
    iferror(SEARCH("On the bus",Data[Where?]),0) > 0,
    "Yes",
    "No"
)

Basically this is saying "return the character number of the phrase, if it is found in the merged column.  If no character number exists, return 0.  If the number is > 0, then "Yes" this person answered "On the bus", if no the answer is "No".  Would need to create a measure for each possible answer to each multi-answer question.

 

Screenshots of visual interactivity:

PBI Survey 1.jpgPBI Survey 2.jpgPBI Survey 3.jpg

 

Let me know what you think,

Dan

Hi Dan

 

Your solution is a stepping stone towards my ultimate goal, so thank you very much for that.  

 

Further to everything however - and if you're still remotely interested - I've been toying around with your solution - but I became stuck on stage 3 - creating the measure.  

I have been playing around a lot and breaking your code down, working out what each bit does.  I'm not totally new to DAX so it was slow but steady - but I simply could not get the full expression to work.  Do you know what to make of this error?

 

jnmuhgnbd.PNG

"A single value for column 'Where' in table 'Data' cannot be determined.  This can happen when a measure formula refers to a column that contains many values without specifying an aggregation such as min, max, count, or sum to get a single result."

 

 

Hmm, it looks like it's not recognizing the "Where" field in your table.  My table was named "Data", is yours named something different?  

I believe you are right.  After incrementally playing around with the code I came to the same conclusion - though sadly I had named my table 'Data' too.

jndtyhyfgjhn.PNG

 

Obviously I couldn't reach the end and play around with your suggested solution - though I did get the jist and learnt a bit more DAX.  Little bit frustrated with this error though - really couldn't see the problem at all.

Yeah, that is very strange.  When you type part of the Data[Where] statement does it try to autofill the table/column name?  If not then it's not recognizing your format for some reason.  Are you adding it as a calculated column or as a measure?  I'm not sure if that would make a difference but I was adding it as a calculated column.

 

PBI Survey 4.jpg

 

If you want to PM me your email address I'd be happy to send over my PBIX file.

Hi Dan

 

Couple quick questions if you're still around.

 

1) Why is your 'RespondentID' field a "sum".  What's the difference between yours and mine?

2) Can you please show me a breakdown (printscreen) of one of your piecharts.  

 

Learning the basics of data modelling really is where I am with BI.  It requires a level of granularity that I'm just not accustomed to.

 

Thanks.

Sure thing.

 

1. The sigma indicates that the number can be used as a "Measure" rather than a dimension - it can be averaged, summed up, etc.  Obviously this doesn't make sense with a respondent ID, I was just lazy in putting together my data since it didn't matter for this particular example.  To have it as "text" is more proper, though I'll note if you ever need to order your IDs in numerical order while having them as text, you'll want to number your respondents "001, 002, 003" etc. rather than "1, 2, 3" as you need the leading zeros for the system to know that, for example, 10 comes after 9 instead of after 1.

 

2. Is this what you're looking for?

PBI Survey 5.jpg

1) OK, cool.  Yes, I have realised the importance of 001, 002 etc already.  A nice little quirk.  

 

2) Yup, that's great, thanks.  I actually had my calculated column 'fields' in 'Details' under the 'Vizualization' tab, rather than in 'Legend' as ytou had them.  presumably that would make a difference at some point?  They both seem to work? Is one more crrect than the other?

 

 

I believe both are correct, the difference only matters if you are evaluating a 2nd level within the same pie chart.  For example, you might have your pie chart that shows respondents' favorite color, but then within that color show the division by heads or tails.  In that case the Color would be the Legend and Heads/Tails would be the Details.  Note that I wend to the pie chart settings in this example and turned on the Legend so I could see both Heads/Tails and the favorite color.

 

PBI Survey 6.jpg

 

Dan

Thanks for your help, Dan.  Seriously apprecaited.

 

 

That did indeed make the difference.  It wasn't happy as a measure, but calculated column worked fine.

 

Thanks very much for your help, Dan.  Much appreciated.  

Hi Dan

 

Have been absolutley back-to-back this last week.  I am aware I've left you hanging for a few days but wanted you to know I absolutley appreciate your attention and I will get round to reading/emulating/testing what you've suggested, probably tomorrow.

 

Thanks very much.

First, can you share your pbix or excel data (anonymized is fine)?

 

Second, you are correct that it is a data modeling problem, but, despite your articulate description, without seeing/working with the data, it is hard to give you a meaningful answer on how to properly model your data.  Maybe someone here has specific domain knowledge and has worked with survey data before, but for us generalists, it is too hard without working with the actual data.

I understand completely - it is quite a specific inquiry.  I did not expect a PBI wizard to come and fix all my problems off the back of that post.  I was more hoping to bump into someone who had already suffered the same issue.

 

More detail follows in my reply to the second response ITT, if you were at all interested.

 

Thanks

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.