Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!
I need help in understanding how to optimize some measures and have a few questions. My system configuration is :
I am using PowerQuery to import 1.12 GB of external CSV data from multiple CSV files. After transforming that data by removing unnecessary columns, changing formats etc and adding to PowerPivot Data Model, the size is now reduced to 555MB.
I have created 2 tables in PowerPivot:
I have created a Simple Pivot and added Slicers on sheet for the following:
I have the following measures created:
DistinctCount:=DISTINCTCOUNT(fData[Char Value])
TotalCountWithAllFiltersRemoved:=IF(CALCULATE( COUNTA('fData'[Char Value]), ALL( 'fData'[Category], 'fData'[Country], 'fData'[Brand Owner], 'fData'[Item Appearance Date], 'fData'[Char Description] ) ) = COUNTA('fData'[CHAR VALUE]), COUNTA('fData'[Char Value]) )
UniquesCount:=IF( HASONEFILTER('fData'[Category]) && HASONEFILTER('fData'[Country]) && HASONEFILTER('fData'[Brand Owner]) && HASONEFILTER('fData'[Item Appearance Date]) && HASONEFILTER('fData'[Char Description]) && HASONEFILTER('fData'[Char Value]) , [TotalCountWithAllFiltersRemoved], SUMX ( SUMMARIZE ( 'fData', [Category], [Country], [Brand Owner], [Item Appearance Date], [Char Description], [Char Value] ), [TotalCountWithAllFiltersRemoved] ) )
UniqueChars:=IF(CALCULATE( COUNTA('fData'[Char Value]), ALL( 'fData'[Category], 'fData'[Country], 'fData'[Brand Owner], 'fData'[Item Appearance Date], 'fData'[Char Description] ) ) = COUNTA('fData'[Char Value]), 1 )
UniqueCharsCount:=IF( HASONEFILTER('fData'[Category]) && HASONEFILTER('fData'[Country]) && HASONEFILTER('fData'[Brand Owner]) && HASONEFILTER('fData'[Item Appearance Date]) && HASONEFILTER('fData'[Char Description]) && HASONEFILTER('fData'[Char Value]) , [UniqueChars], SUMX ( SUMMARIZE ( 'fData', [Category], [Country], [Brand Owner], [Item Appearance Date], [Char Description], [Char Value] ), [UniqueChars] ) )
My Questions:
Any help would be most appreciated.
Hi,
Can anybody from the PowerBI community go through the comments and help quickly?
Here is the sample testdata.xlsx file.
hi @Anonymous, I didn't have time to look into it earlier
can you have a look at this measure, I believe it gives the correct values e.g. 0.12OZ has 1 and 0.15OZ has null as it's was sold in 2004 & 2005)
UniquesCount = VAR _Chars = VALUES ( fData[Char Value] ) VAR _CharsAllYears = CALCULATETABLE ( fData, _Chars, ALL ( dDate[Year] ) ) VAR _CharsSummary = GROUPBY ( _CharsAllYears, fData[Char Value], "MaxYear", MAXX ( CURRENTGROUP (), RELATED ( dDate[Year] ) ), "MinYear", MINX ( CURRENTGROUP (), RELATED ( dDate[Year] ) ) ) VAR _SameYearChars = FILTER ( _CharsSummary, [MaxYear] = [MinYear] ) RETURN COUNTROWS ( _SameYearChars )
Hi @Stachu,
Thanks for your quick response.
This measure is pretty fast!
A few observations:
Best,
Sifar
I'm still not clear what you want to measure exactly - e.g. if there is [Char Value] in 1999 in France and in 2000 in DE should it be counted as unique or not?
regarding the error message - do you have relationships between dimensions tables? in order for them to work there should be 1 to many relationship, I don't really see why it would be many to many, and I don't think they're necessary in fData if you have them in dimension tables
Hi @Stachu,
You can see it in the relationship window in PowerPivot. Am i doing it correctly?
Hope this makes it clear.
OK, so I guess something like this should work, as long as you add the additional criteria you want to ignore to the
_CharsAll
UniquesCount:=
VAR _Chars =
VALUES ( fData[Char Value] )
VAR _CharsAll =
CALCULATETABLE ( fData, _Chars, ALL ( dDate[Year] ), ALL(dCountry[Country]) ) --add other filters here
VAR _CharsSummary =
GROUPBY (
_CharsAll,
fData[Char Value],
"MaxYear", MAXX ( CURRENTGROUP (), RELATED ( dDate[Year] ) ),
"MinYear", MINX ( CURRENTGROUP (), RELATED ( dDate[Year] ) ),
"NrOfChars", COUNTAX(CURRENTGROUP(), fData[Char Value])
)
VAR _SameYearChars =
FILTER ( _CharsSummary, [MaxYear] = [MinYear] )
RETURN
SUMX(_SameYearChars,[NrOfChars])
Hi @Stachu,
I am getting the following memory error everytime (tried restarting excel also) when applying the measure on my original dataset, though i am using Excel 2016 Pro 64 bit, with 8 GB RAM on a Windows 10 64 bit machine.
UniquesCount:= // get unique Char values VAR _Chars = VALUES ( fData[Char Value] ) // Filter fData table based on only the unique Char values VAR _CharsAll = CALCULATETABLE ( fData, _Chars, ALL ( dDate[Year] ), ALL ( dDate[Date] ), ALL ( dCountry[Country] ), ALL ( dCategory[Category] ), ALL ( dBrandOwner[BrandOwner] ), ALL ( dBrand[Brand] ), ALL ( dSubBrand[SubBrand] ), ALL ( dInnType[InnType] ), ALL ( dInnSubType[InnSubType] ), ALL ( dCharDescription[CharDescription] ) ) // VAR _CharsSummary = GROUPBY ( _CharsAll, fData[Char Value], "MaxYear", MAXX ( CURRENTGROUP (), RELATED ( dDate[Year] ) ), "MinYear", MINX ( CURRENTGROUP (), RELATED ( dDate[Year] ) ), "NrOfChars", COUNTAX( CURRENTGROUP (), fData[Char Value] ) ) // VAR _SameYearChars = FILTER ( _CharsSummary, [MaxYear] = [MinYear] ) RETURN SUMX( _SameYearChars, [NrOfChars] )
I have added the additional filters. Can this measure be optimized?
Hi @Anonymous
Here is the document about Optimising Memory Usage for your reference. Simplified sample data and expected output will be helpful to provide an accurate solution.
https://www.linkedin.com/pulse/ten-techniques-optimising-memory-usage-microsoft-power-dejan-pajk
Regards,
Cherie
my guess the measure are not performaing beacause of the number of conditionals and iterator function you use
what do you want to measure in business terms with e.g. UniquesCount and UniqueCharsCount? why use IF and 1 as alternative resource?
can you also share anonymised sample of your data in copyable format here?
Hi @Stachu,
Thanks for your quick response.
UniquesCount : is the count of unique Char Values that occurred first time within a year and not found in subsequent years. They can occur more than once within the year. I need to know their Firstdate of appearance (Item Appearance Date).
UniqueCharsCount : is the same as above. The only thing is i am counting their more than once occurence within the year as 1.
I need to know when UNIQUE Char Values appeared First time i.e FirstDate
I also need to know when DISTINCT Char Values appeared First time i.e FirstDate for :
i.e. how their paths have traversed across different Countries, Categories, Manufacturers over the years.
Also, is there a way to reduce file size for sharing purposes?
P.S: How do i anonymize this large data stretching different countries, categories, brand owners and years? Is there some free tool to do this?
can you just share the Top20 rows from your data table and the expected output for the shared sample?
here is a post on how to share:
https://community.powerbi.com/t5/Community-Blog/How-to-Get-Your-Question-Answered-Quickly/ba-p/38490
regarding the measures:
does UniqueCount give you correct results? for the first date of appearance SUMMARIZE by Char with ALL and MIN on date should work
Hi @Stachu,
I have shared the sample anonymized testdata workbook for your inspection that you can download. I have added the 2 tables to the Data Model + the measures and created a sample pivot with Slicers. You can add Unique conditional formatting to E column to know if Char Values are indeed unique across Categories, Countries, BrandOwners & Years.
With the huge amount of data, i am not sure i am getting Uniques correctly. This is because a Char Value may appear in multiple Countries or Categories.
Regarding the FirstCharValDateofAppearance, i am not sure how to write this measure or represent it in pivot. Would need some help here.
Let me know if this is sufficient or you need anything else.
User | Count |
---|---|
140 | |
113 | |
104 | |
77 | |
65 |
User | Count |
---|---|
136 | |
118 | |
101 | |
71 | |
61 |