Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
jett254
Regular Visitor

Label categorical data - Please help!

Hi,

 

I'm working with market research data that I imported from SPSS (first exporting to Excel and then connecting from there).  Some of my variables are categorical - for instance, Ethnicity -- where in SPSS they were coded as 1=Caucasian 2=AA 3=Asian.  I originally imported my data with the data labels and everything categorical was a string which did not work well, so I reimported my data as numeric -- however, let's say I am reporting a stacked bar by Year on the axis, and I have the variable for Race as Legend.  Now my Race values are labeled as 1, 2, 3 instead of Caucasian, AA, Asian, etc...  

 

What is the best way to handle this?  Is there a way to label the data?  Or compute a new measure with IF statements into a string column?  Help!!

 

Thanks!

1 ACCEPTED SOLUTION
dedelman_clng
Community Champion
Community Champion

Add a table with the translation of ID to Description, join it to the fact table on the ID, then use the Description field as your category.  Below I show a partial data model that I have used, where "Answer" is a number and "Translation" is the description of that number (1 = Strongly Disagree, 2 = Disagree, etc).

 

Capture.PNG

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Hope this helps

David

View solution in original post

4 REPLIES 4
dedelman_clng
Community Champion
Community Champion

Add a table with the translation of ID to Description, join it to the fact table on the ID, then use the Description field as your category.  Below I show a partial data model that I have used, where "Answer" is a number and "Translation" is the description of that number (1 = Strongly Disagree, 2 = Disagree, etc).

 

Capture.PNG

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Hope this helps

David

That is a great approach.  

 

I ended up finding two other solutions, but yours is much better.

 

I did a grouping of the variable, and I also did a new calculated column using a SWITCH function.

 

But moving forward, I will create a data label file from SPSS for Market Research data and set the relationship, so I'm really glad I asked!

I tried out this solution and I'm running into an issue.

 

So, my answer data table looks like:

Unique ID

Question1

Question2

Question3

Question4

Let's say Question 1 responses are 1 and 2 where 1="Yes" and 2="No".  Let's say Question 2 responses in the data are 1, 2 and 3 where 1 = Not at all likely" 2 "Moderately likely" and 3 = "Extremely Likely".  I only have the numeric responses that correspond to those labels in my data.

 

So I have another table that has 3 colums:

Variable Name

Variable Value

Variable Label

 

It would have Question 1, 1, Yes; Question 1, 2, No.

 

How can I link those together so I can label my data in tables and graphs?

- Concatenate the Question# with the response value (e.g. Question1-1, Question2-3)

 

- Do the same on the decode table with name and value

 

- Join concatenated column on fact table to concatenated column on new decode table

 

Note that this method might require a lot of maintenance.

 

You can also look into splitting the responses into two tables, with survey identifier, question# and response values (e.g. "YesNoTab", "LikelyTab") with each pointing to its own decode table.  A table with survey identifier can be used for the one-side from both (or all) response tables.

 

Hope this helps

David

Helpful resources

Announcements
LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.