Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Betsy
Helper IV
Helper IV

Best way to create many calculated columns?

Hi all!

 

I'm completely new to Power BI. I work for an education company and the data I am compiling is from multiple data dumps, which have student IDs and demographic info in one csv file, questions they have received in another csv, and responses to those questions in a third csv.

 

I was using excel to answer three basic queries for subsets of students:

 

1. What questions did (each) student receive?

2. What questions did (each) student answer?

3. What did students answer?

 

(Two additional queries are when did they receive the questions, and when did they answer).

 

The first and second queries need to be calculated (not all students receive all questions, and not all students answer all questions they receive) so I calculated a received column with 0/1 values and an answered column with the same, for each question. Here's an example formula, where I named the column with the question ID:

=COUNTIFS(QTR!$B:$B,$A2,QTR!$C:$C,RIGHT(AV$1,3))

 

For each question received column, I just copied across the formula. I did the same for each question answered.

 

For the overall data set, there are approximately 500 questions, so I would need to do this for 1000 columns. It's too much calculation for excel, in addition to the vlookups for what the students answered, and when they received and answered the questions. So I started to investigate Power BI to do this. From what I've been able to learn so far, I don't need to flatten the data for query 3 (what they answered) or for when they received/answered a question, since that data is explicitly in the csv files. But I'm stuck on queries 1 and 2.

 

Any tips for how to go about this? I'm quessing creating 1000 calculated columns is not the best way to go?

 

Thanks!!

 

Betsy

 

 

 

1 ACCEPTED SOLUTION

@Betsy

Thanks for your clarification. However the two of the snapshots are blurred and I can't enlarge them to view details. So your scenario is still a little confusing. So maybe you can upload your pbix to a net storage and share?

 

Based on what I've got from your description. It seems that you're looking for "Merge Query" and "Pivot".

Capture.PNG

View solution in original post

6 REPLIES 6
Eric_Zhang
Employee
Employee

@Betsy

 

I don't quite get your case, so please be more specific. Please post sample data of your 3 csvs and expected output.

 

According to the description, likely below tips would help.

  1. Creating proper relationship among your tables.
  2. You don't have to create 1000 columns manually, probably what your need is a PIVOT operation.

 Hi Eric,

 

Thanks for your help!

 

Here are the samples.

 

Students.png

This first csv lists student info, and the relationship join with the other 2 csvs is student ID (column 1).

 

Packages.png

This csv above lists the questions that students received (and other things like statements and notes which are not important right now). The question ID is Package ID in column 5.

 

Responses.png

 

This csv lists responses (in the text file). It is linked to the question csv above through package ID.

 

So what I can do now is look at what students replied, but not if they received each question and whether they answered the question or not.

 

Here's how I did this with excel:

 

Output.png

 

 

@Betsy

Thanks for your clarification. However the two of the snapshots are blurred and I can't enlarge them to view details. So your scenario is still a little confusing. So maybe you can upload your pbix to a net storage and share?

 

Based on what I've got from your description. It seems that you're looking for "Merge Query" and "Pivot".

Capture.PNG

@Eric_ZhangThank you! That seems to have worked with a little cleaning. Sorry for the delay, I ended up having to work on something else before I was able to try this out.

@Eric_Zhang

 

Not sure if I should ask this as a follow-up, or start a new thread. Merging, then pivoting worked perfectly as you desctibed above for the first set of data, which was the questions that students answered. I cannot get the same procedure to work for the second pivot (questions received). The merge works fine, I expand the column to just include the question #, then run the pivot with question # transformed into text with an added character in a new column (because the numbers are the same as the first pivot which have already been used as column headers, eg., 20144 becomes 20144G) and with count (all), the pivot doesn't complete. It just hangs, no error message. It gets to the same 14.4 MB on the input data sheet, and stays there.

 

I can't figure out what the issue is. 

 

1. It's the second pivot appended to the original student query.

2. Too much data? There will be an additional 460 or so columns on a sheet with 500 columns already (with 460 of those from the first pivot). I ended up severely cleaning the worksheet from which the question received data is being pulled from. I started out just filtering it, but even with a much smaller data input query it's not working.

3. I've tried changing the data type (basically because I've tried everything). I can't set the column header pivot column as number because I had to add a character, but it has been text/number and text, and the value column has also been set as number and as text.

 

Unfortunately, I can't upload this data itself.

 

Anything anyone can think of I should try?

 

Thanks a million!

 

Betsy

@Betsy

 

You can't or you're not allowed to upload the data? In former case, you can try any web storage, such as MS onedrive, dropbox etc.

 

For latter case, can you upload any sample/dummy data? You can mask sensitive infomation before uploading. I can understand your problem better by knowing the data.

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.