11-21-2018 12:45 AM - edited 11-21-2018 05:31 AM
I need help in understanding how to optimize some measures and have a few questions. My system configuration is :
- OS : Windows 10 (64 bit)
- Memory : 8 GB RAM
- MSOffice : Office 2016 Pro (64 bit)
- Page File Size (Minimum MB) : 12888 MB
- Page File Size (Maximum MB) : 49152 MB
- "Disable Hardware Graphics Accelaration" is checked.
- "Ignore other applications that use Dynamic Data Exchange (DDE)".
I am using PowerQuery to import 1.12 GB of external CSV data from multiple CSV files. After transforming that data by removing unnecessary columns, changing formats etc and adding to PowerPivot Data Model, the size is now reduced to 555MB.
I have created 2 tables in PowerPivot:
- fData table from multiple csv's.
- dDate table from a linked spreadsheet table. I have then deleted the Spreadsheet table and removed the link from dDate table.
- dDate is linked to fData on Date ---> Item Appearance Date.
I have created a Simple Pivot and added Slicers on sheet for the following:
- Year (from dDate table)
- Country, Category, Brand Owner, Char Description, Char Value (from fData table)
I have the following measures created:
TotalCountWithAllFiltersRemoved:=IF(CALCULATE( COUNTA('fData'[Char Value]), ALL( 'fData'[Category], 'fData'[Country], 'fData'[Brand Owner], 'fData'[Item Appearance Date], 'fData'[Char Description] ) ) = COUNTA('fData'[CHAR VALUE]), COUNTA('fData'[Char Value]) )
UniquesCount:=IF( HASONEFILTER('fData'[Category]) && HASONEFILTER('fData'[Country]) && HASONEFILTER('fData'[Brand Owner]) && HASONEFILTER('fData'[Item Appearance Date]) && HASONEFILTER('fData'[Char Description]) && HASONEFILTER('fData'[Char Value]) , [TotalCountWithAllFiltersRemoved], SUMX ( SUMMARIZE ( 'fData', [Category], [Country], [Brand Owner], [Item Appearance Date], [Char Description], [Char Value] ), [TotalCountWithAllFiltersRemoved] ) )
UniqueChars:=IF(CALCULATE( COUNTA('fData'[Char Value]), ALL( 'fData'[Category], 'fData'[Country], 'fData'[Brand Owner], 'fData'[Item Appearance Date], 'fData'[Char Description] ) ) = COUNTA('fData'[Char Value]), 1 )
UniqueCharsCount:=IF( HASONEFILTER('fData'[Category]) && HASONEFILTER('fData'[Country]) && HASONEFILTER('fData'[Brand Owner]) && HASONEFILTER('fData'[Item Appearance Date]) && HASONEFILTER('fData'[Char Description]) && HASONEFILTER('fData'[Char Value]) , [UniqueChars], SUMX ( SUMMARIZE ( 'fData', [Category], [Country], [Brand Owner], [Item Appearance Date], [Char Description], [Char Value] ), [UniqueChars] ) )
- Measures Issues:
- Is it correct to base the Year Slicer on the Year field in the dDate table, instead of Item Appearance Date from fData table?
- The above measures "UniquesCount" and "UniqueCharsCount" when applied to Pivot, do not give the Unique Char Values that appeared every Year. This is evident when i apply a Conditional formatting on the Char Values list for Uniques. Such Char Values newly appeared in that Year i.e. they do not repeat every year. How do i create such a Meaure?
- I would also like to find the Unique Counts of Char Values for SAMEPERIODLASTYEAR. Such Char Values newly appeared in that Year i.e. they do not repeat every year. How do i create such a Meaure?
- How do i also show the FirstDate a Char Value appeared in a particular Country (i.e. Market) or Category or for a Brand Owner (i.e. Manufacturer)? How do i create such a measure?
- Memory issues:
- If i add the DistinctCount measure or add a "Distinct Count of Char Value" or a "Count of Char Value" in Values section of Pivot, the Pivot table calculates and renders quickly. But if i add any of the other Measures, it mostly results in the following Memory error:
- How do i optimize the above measures for better and quick performance?
- Why should the file size show 555MB when i have loaded external data as Connection to Model? Is it because i had clicked saved data to model? How can i conveniently share this file easily with others excluding the large size?
Any help would be most appreciated.
11-21-2018 02:53 AM
my guess the measure are not performaing beacause of the number of conditionals and iterator function you use
what do you want to measure in business terms with e.g. UniquesCount and UniqueCharsCount? why use IF and 1 as alternative resource?
can you also share anonymised sample of your data in copyable format here?
11-21-2018 05:44 AM
Thanks for your quick response.
UniquesCount : is the count of unique Char Values that occurred first time within a year and not found in subsequent years. They can occur more than once within the year. I need to know their Firstdate of appearance (Item Appearance Date).
UniqueCharsCount : is the same as above. The only thing is i am counting their more than once occurence within the year as 1.
I need to know when UNIQUE Char Values appeared First time i.e FirstDate
- for a BrandOwner (Manufacturer)
- in a Country
- in a Category
I also need to know when DISTINCT Char Values appeared First time i.e FirstDate for :
- other Brand Owners (Manufacturers) and when (FirstDate)
- in other Countries and when (FirstDate)
- in other Categories and when (FirstDate)
i.e. how their paths have traversed across different Countries, Categories, Manufacturers over the years.
Also, is there a way to reduce file size for sharing purposes?
P.S: How do i anonymize this large data stretching different countries, categories, brand owners and years? Is there some free tool to do this?
11-21-2018 06:01 AM
can you just share the Top20 rows from your data table and the expected output for the shared sample?
here is a post on how to share:
regarding the measures:
does UniqueCount give you correct results? for the first date of appearance SUMMARIZE by Char with ALL and MIN on date should work
11-21-2018 12:12 PM - edited 11-25-2018 08:10 PM
I have shared the sample anonymized testdata workbook for your inspection that you can download. I have added the 2 tables to the Data Model + the measures and created a sample pivot with Slicers. You can add Unique conditional formatting to E column to know if Char Values are indeed unique across Categories, Countries, BrandOwners & Years.
With the huge amount of data, i am not sure i am getting Uniques correctly. This is because a Char Value may appear in multiple Countries or Categories.
Regarding the FirstCharValDateofAppearance, i am not sure how to write this measure or represent it in pivot. Would need some help here.
Let me know if this is sufficient or you need anything else.
11-21-2018 07:04 PM
Here is the document about Optimising Memory Usage for your reference. Simplified sample data and expected output will be helpful to provide an accurate solution.
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
11-24-2018 01:28 AM
I have mailed you the sample testdata.xlsx file yesterday. It has the 2 tables added to data model ( links removed).
As you will see, there is a many-to-many relationship between country and category as well as brand owner. Also, you will see, some characteristic description are common across categories. I think this is the reason why uniques are not getting identified correctly. How do we tackle this M2M?
12-03-2018 08:32 AM
hi @sifar786, I didn't have time to look into it earlier
can you have a look at this measure, I believe it gives the correct values e.g. 0.12OZ has 1 and 0.15OZ has null as it's was sold in 2004 & 2005)
UniquesCount = VAR _Chars = VALUES ( fData[Char Value] ) VAR _CharsAllYears = CALCULATETABLE ( fData, _Chars, ALL ( dDate[Year] ) ) VAR _CharsSummary = GROUPBY ( _CharsAllYears, fData[Char Value], "MaxYear", MAXX ( CURRENTGROUP (), RELATED ( dDate[Year] ) ), "MinYear", MINX ( CURRENTGROUP (), RELATED ( dDate[Year] ) ) ) VAR _SameYearChars = FILTER ( _CharsSummary, [MaxYear] = [MinYear] ) RETURN COUNTROWS ( _SameYearChars )
12-03-2018 10:30 AM
Thanks for your quick response.
This measure is pretty fast!
A few observations:
- It works well when i apply Year from dDate table and Char Value from fData table to Rows section of Pivot. I check using Conditional Formatting on the Char Values. However, if i add Country, Category or any other columns to the Columns section of pivot, i do not get unique Char Values in Rows anymore. I get a yellow message in Pivot Filter section saying "Relationships need to be created". This i think is because of Many-to-Many relationship in the fData table (see my previous posts or messages in your Inbox for the columns). I tried solving this by creating the addtional single-column tables for relationship with fData table e.g. dBrandOwner, dDate, dCharDescription, dCountry, dCategory etc which contain distinct values collected from fData only. But i think i does not solve the issue. How do i resolve such M2M issues?
- Will any other measures i create work properly, if i use this measure in the other measures? e.g. =CALCULATE ( [UniquesCount], filter, filter, filter....)
- Can you please also answer my questions from my 1st post & 2nd post?
- Can you please briefly explain how this measure you have created works?