Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
cole_lehmkuhler
Helper II
Helper II

Trying to get a count of years for each company, but not total years in dataset

Hey all,

 

So I have a dataset of a bunch of customers and their sales data over the last 5 years. I'm trying to get a graph that shows which customers have continued to come back to make purchases. I'm trying to create a column that will say whether a customer has made a purchase in 1 year out of the last 5, 2 years out of the past 5, 3 years out of the last 5, and so on. However, I can't think of a formula or I have tried to think of a pathway I could use multiple columns to help me get the results I want. I can post a screenshot of the dataset if that will help.

1 ACCEPTED SOLUTION

@cole_lehmkuhler Ah, that's a different story. For a column, use this:

 

Years w/ Sales = CALCULATE(
DISTINCTCOUNT('Repeat Customer CSV'[Year]),
FILTER(
ALLEXCEPT('Repeat Customer CSV', 'Repeat Customer CSV'[NAME]),
'Repeat Customer CSV'[Amount] > 0
)
)

 

That assumes that each company Name is unique. Otherwise you could substitute whatever you're using as their unique key. Account number?





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!




View solution in original post

18 REPLIES 18
cole_lehmkuhler
Helper II
Helper II

I also need to make sure that I keep the sales amount  with each purchase in this table as well Thanks!

@cole_lehmkuhler you should be able to do this as a measure rather than a column. I'm guessing as to your structure, but I assume there's a company column, a sales amount column, and a sales date column, and that the sales date column has a relationship to a date table's date field.

 

Years w/ Sales = CALCULATE(
DISTINCTCOUNT(DateTable[Year]),
FILTER(
DateTable,
DateTable[Year] >= YEAR(TODAY()) - 5
),
FILTER(
SalesTable,
SUM(SalesTable[Sales]) > 0
)
)

 

Skip the FILTER(DateTable...) part if you don't want to hardcode the last five years part (if for instance you want to do that part with a slicer or a page level filter or whatever).





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!




@khorseman There actually is no date table and there's not even a date column just a year column. and there is only the one table. I'll post a screenshot of the table for you!

@cole_lehmkuhler well that's ok. I think the same basic logic will work. Try this:

 

Years w/ Sales = CALCULATE(
DISTINCTCOUNT('Repeat Customer CSV'[Year]),
FILTER(
'Repeat Customer CSV',
'Repeat Customer CSV'[Year] >= YEAR(TODAY()) - 5 &&
SUM('Repeat Customer CSV'[Amount]) > 0
)
)





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!




support picture2222.PNG

 

I tried that and it gave me 5 years for every single company. Do you think referencing the table and using the group by function with the count operator would help me? or do you have another possible solution? Some of these companies are on this list up to 5 times due to their purchases in the past 5 years and I know I didn't specify this earlier, but I will now. Don't worrya about the year filtering. I already filtered the entire dataset to only show the past 5 years.

@cole_lehmkuhler  This a measure, not a column. You've added it as a column.

 

Also you can strip out the year filter, and even the SUM statement is probably unnecessary. It should work fine this way:

 

Years w/ Sales = CALCULATE(
DISTINCTCOUNT('Repeat Customer CSV'[Year]),
FILTER(
'Repeat Customer CSV',
'Repeat Customer CSV'[Amount] > 0
)
)





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!




Oh okay I was a little unclear. I need this by each customer. It is going to be the Axis of a bar graph so I  can have 5 groups in a bar graph that say "1 Year (w/sales)" "2 Years" "3 Years" etc. and then I can have the sales amounts be the values to these groups. Does that make sense? I can't have it as a measure, because then I won't know from customer to customer

@cole_lehmkuhler Ah, that's a different story. For a column, use this:

 

Years w/ Sales = CALCULATE(
DISTINCTCOUNT('Repeat Customer CSV'[Year]),
FILTER(
ALLEXCEPT('Repeat Customer CSV', 'Repeat Customer CSV'[NAME]),
'Repeat Customer CSV'[Amount] > 0
)
)

 

That assumes that each company Name is unique. Otherwise you could substitute whatever you're using as their unique key. Account number?





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!




Well that is where I think the problem comes in. Some lines are unique only by multiple fields (Account # and the Year) I don't know if it is possible to get what I'm wanting for that reason. I tried your formula again, and it populated the entire column with "5" though so it's not counting distinct based off of the name or account # then. Again, thanks for all of your help! If you have any more ideas I'd be thrilled because I've been stuck on this for awhile too long now.

@cole_lehmkuhler there has to be a way to uniquely identify each company. That's all I meant by unique company values.

 

I guess that's a little unclear. I don't mean that each company name should appear only once, but rather that there shouldn't be two companies with the same name. I'm used to working with data where name is not a unique identifier, but there's some key. CustomerName for instance. There could be more than one person named John Smith, but one of them has a CustomerKey of 155645656 and the other has a customer key of 8723401. Each one shows up multiple times in the table because they bought multiple items. I can identify them uniquely with SalesTable[CustomerKey], but not with SalesTable[CustomerName].

 

Like you don't have a machine tool company named Acme as well as a demolition equipment company also named Acme, right? If all your company names are unique companies and there are no double-Acmes, then the formula as given should work and you can ignore the stuff about account #. If you do have two companies with the same name, you'll need a unique identifier column of some sort. Most database tables have keys for this sort of thing.





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!




Oh no there are not two company's with the same name, and you guessed correctly earlier by saying the unique identifier was account #. I am sure that there are no two company's with the same name. I will try it using the account # just to make sure, but I was getting 5 (meaning all company's have bought from mine in the past 5 years, which I know isn't true) for every row. I'm going to do a quick double check though on everything

@cole_lehmkuhler I don't know. I guess I'd need to take a look at your actual dataset. I made a test set with 10 companies where 9 of them have some sales in each of the last 5 years but one of them only has sales in 3  of those years and it worked fine for that set. Either you're mistaken about the years or I'm mistaken about my understanding of your table. If you want me to test on yours, feel free to PM me a download link to a sample table that gives the incorrect 5s and I'll see what I can figure out.





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!




I just tried it again and the Power BI Gods let it go through haha or I just made an error typing it in last time anyways thanks a bunch for your help!

@cole_lehmkuhler glad it worked. I was wondering how we could get such different results.

 

I agree with @Vvelarde. This could definitely benefit from a cleaner data model. There should be a fact table for the sales and a company dimension table to hold unique company info, with a relationship between the two. It's not strictly necessary in the specific case described in this thread but the more other stuff you're trying to do in this data model, the more necessary this will become. And troubleshooting formulas is easier with that kind of model anyway.





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!




 

 




Lima - Peru

Ignore the table above the "Repeat Customer CSV" table that table doesn't relate to the Repeat Customer CSV table at allSupport picture.PNG

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.