Solved: Label categorical data - Please help!

jett254 · ‎05-25-2017

Hi,

I'm working with market research data that I imported from SPSS (first exporting to Excel and then connecting from there). Some of my variables are categorical - for instance, Ethnicity -- where in SPSS they were coded as 1=Caucasian 2=AA 3=Asian. I originally imported my data with the data labels and everything categorical was a string which did not work well, so I reimported my data as numeric -- however, let's say I am reporting a stacked bar by Year on the axis, and I have the variable for Race as Legend. Now my Race values are labeled as 1, 2, 3 instead of Caucasian, AA, Asian, etc...

What is the best way to handle this? Is there a way to label the data? Or compute a new measure with IF statements into a string column? Help!!

Thanks!

dedelman_clng · ‎05-25-2017

Add a table with the translation of ID to Description, join it to the fact table on the ID, then use the Description field as your category. Below I show a partial data model that I have used, where "Answer" is a number and "Translation" is the description of that number (1 = Strongly Disagree, 2 = Disagree, etc).

Hope this helps

David

View solution in original post

dedelman_clng · ‎05-25-2017

Add a table with the translation of ID to Description, join it to the fact table on the ID, then use the Description field as your category. Below I show a partial data model that I have used, where "Answer" is a number and "Translation" is the description of that number (1 = Strongly Disagree, 2 = Disagree, etc).

Hope this helps

David

jett254 · ‎05-25-2017

That is a great approach.

I ended up finding two other solutions, but yours is much better.

I did a grouping of the variable, and I also did a new calculated column using a SWITCH function.

But moving forward, I will create a data label file from SPSS for Market Research data and set the relationship, so I'm really glad I asked!

jett254 · ‎06-12-2017

I tried out this solution and I'm running into an issue.

So, my answer data table looks like:

Unique ID

Question1

Question2

Question3

Question4

Let's say Question 1 responses are 1 and 2 where 1="Yes" and 2="No". Let's say Question 2 responses in the data are 1, 2 and 3 where 1 = Not at all likely" 2 "Moderately likely" and 3 = "Extremely Likely". I only have the numeric responses that correspond to those labels in my data.

So I have another table that has 3 colums:

Variable Name

Variable Value

Variable Label

It would have Question 1, 1, Yes; Question 1, 2, No.

How can I link those together so I can label my data in tables and graphs?

dedelman_clng · ‎06-26-2017

- Concatenate the Question# with the response value (e.g. Question1-1, Question2-3)

- Do the same on the decode table with name and value

- Join concatenated column on fact table to concatenated column on new decode table

Note that this method might require a lot of maintenance.

You can also look into splitting the responses into two tables, with survey identifier, question# and response values (e.g. "YesNoTab", "LikelyTab") with each pointing to its own decode table. A table with survey identifier can be used for the one-side from both (or all) response tables.

Hope this helps

David

Label categorical data - Please help!

Helpful resources

Fabric certifications survey

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024

How to Get Your Question Answered Quickly