Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
So I'm trying to analyse the metadata of Danbooru2018, to get practice with large datasets. The metadata includes a list of tags for each image : Which work it's from, which characters appear in it, etc.
I think the most common way to work with these types of tags is to make a number of columns, one per tag.
Problem is, I'm pretty sure I can't do that, because a fair amount of images have 100+ tags, and I'm very much not sure my computer can handle a table with 3,3M rows and 100 columns, (that's if I choose to forgo anything over 100 because if I go by the worst picture, it's 874, but I can live without it, I think) (current approach is 92M rows and 4 columns, to compare) and furthermore, when trying to split the list of tags for an image it gives me an error.
So what I've done instead is create another "tags" table and expanded as rows the tag list, as Power BI gave me the option and it worked. So basically for each image I have a number of rows, one for each tag. Here's for instance how it works for an image :
So now, I can look at the tag counts individually just fine, but I can't really filter by one : because when I filter by a tag, I would like it to filter based on the images with that tag, but it filters by that tag instead.
So if I go back to the exemple and I pick "Pokémon" as a filter, well, it's only going to return the fifth row, so I'm not going to be able to see the other tags.
It goes like this : Filter by tag "Pokémon" -> 5th row of Picture #19496 has the tag -> Return this row.
which only gives me the row corresponding to the tag,which means for each image I only see the single tag I selected which... is exactly how Power BI is expected to behave, to be honest, but not what I would like, sadly.
What I would like instead, is for it to filter based on the IDs of the pictures with that tag, not the tag itself. So, if we go back to the exemple I gave, what I would like the system to do is pretty much this logic :
Filter by tag "Pokémon" -> 5th row of Picture #19496 has the tag, thus Picture #19496 has the "Pokémon" tag -> filter table by Picture #19496.
That way it would also return the other rows, so I would be able to see which tags are associated with Pokémon images.
Is there any way to do this ? I've searched for how to work with tags in Power BI, but it's pretty much always with a limited amount of tags, so splitting into columns works.
Solved! Go to Solution.
Hi @Jepacor ,
According to your description, you want to use category fields to filter table instead selected value, right? If this is a case, you can't direct use slicer to achieve this.
I'd like to suggest you create a new table with all tags as source of slicer, then you can write measure to find out selected tag and use its category fields as filter parameter.
Table formula: Selector = DISTINCT ( ALL ( Table[Tag_ID], [Tag] ) ) Measure formula: Measure = VAR selected = ALLSELECTED ( Selector[Tag_ID] ) VAR _list = CALCULATE ( VALUES ( Table[Source_ID] ), FILTER ( ALLSELECTED ( Table ), [Tag_ID] IN selected ) ) RETURN IF ( SELECTEDVALUE ( Table[Source_ID] ) IN _list, "Y", "N" )
After these steps, you can drag this measure to visual level filter to filter unmatched records.
Regards,
Xiaoxin Sheng
Hi @Jepacor ,
According to your description, you want to use category fields to filter table instead selected value, right? If this is a case, you can't direct use slicer to achieve this.
I'd like to suggest you create a new table with all tags as source of slicer, then you can write measure to find out selected tag and use its category fields as filter parameter.
Table formula: Selector = DISTINCT ( ALL ( Table[Tag_ID], [Tag] ) ) Measure formula: Measure = VAR selected = ALLSELECTED ( Selector[Tag_ID] ) VAR _list = CALCULATE ( VALUES ( Table[Source_ID] ), FILTER ( ALLSELECTED ( Table ), [Tag_ID] IN selected ) ) RETURN IF ( SELECTEDVALUE ( Table[Source_ID] ) IN _list, "Y", "N" )
After these steps, you can drag this measure to visual level filter to filter unmatched records.
Regards,
Xiaoxin Sheng
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
114 | |
97 | |
86 | |
70 | |
62 |
User | Count |
---|---|
151 | |
120 | |
103 | |
87 | |
68 |