Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!
I am trying to import a large dataset from Redshift into Power BI. To make the data more manageable for PowerBI, I am trying to filter it so it only has the rows that I need in it. However, one of the values I want to include in the filter is not in the dropdown box when I try to filter the column. There is a message saying "List may be incomplete" which is expected as the dataset is very large and PowerQuery is only looking at the top 1000.
So I clicked on "Number Filters" then "equals" and entered the value that is in the rows I want to keep. This results in a message saying "This table is empty."
I thought this might just be for the preview in Power Query, as the value was not in the intial 1000 row preview. However, when I clicked "Close and Apply" the data table in BI was still empty.
What is going on? I am able to filter by the values that DO show up in the preview but that doesn't help me.
Here is the (redacted) code that is in the advanced editor, in case that helps:
let
Source = AmazonRedshift.Database(XXXXXX, XXXXXX, null),
source = Source{[Name="source"]}[Data],
table = source{[Name="table"]}[Data],
#"Filtered Rows1" = Table.SelectRows(table, each Date.IsInPreviousNYears([time_field], 1)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows1",{"XXXXXX"}),
#"Changed Type" = Table.TransformColumnTypes(#"Removed Columns",{{"XXXXX", type text}}),
#"Removed Other Columns" = Table.SelectColumns(#"Changed Type",{"XXXXX", "XXXXX", "type", "column_to_filter_on", "time_field"}),
#"Filtered Rows" = Table.SelectRows(#"Removed Other Columns", each ([type] <> 2 and [type] <> 4)),
#"Filtered Rows2" = Table.SelectRows(#"Filtered Rows", each [column_to_filter_on] = 9999)
in
#"Filtered Rows2"
Hi @sdoggett
Try querying your dataset from the command line to confirm your data is actually there as expected
https://docs.aws.amazon.com/redshift/latest/mgmt/setting-up-rs-cli.html
https://towardsdatascience.com/redshift-from-the-command-line-5d6b3233f649
Phil
Proud to be a Super User!
Well, if you are sure the data you are filtering for is in the data set, remove the last 2 steps that are doing the filtering, load the entire dataset and check for the values you expect to be there.
Phil
Proud to be a Super User!
I can't load the entire dataset, it is too large. The redshift table I am connecting to has over 100 million records. I want to filter this down to only the records I need for my dashboard before it is loaded into PowerBI, but the value I want to filter by does not show up in the preview.
Hi @sdoggett ,
I doubt you miss the step shown below...Please kindly check the data type of the type and column_to_filter_on columns are number. And you could see the output by clicking the each applied steps.
#"Changed Type" = Table.TransformColumnTypes(Source,{{"type", Int64.Type}, {"column_to_filter_on", Int64.Type}})
Did I answer your question ? Please mark my reply as solution. Thank you very much.
If not, please upload some insensitive data samples and expected output.
Best Regards,
Eyelyn Qin
The type and column_to_filter_on fields were imported as numbers, so this is not the problem.
Hi @sdoggett
The last 2 Filtered Rows steps are causing the empty table. Have you confirmed that your source data actually contains the values you want to keep?
Check the data type of the data you are filering. Although you are filtering numbers,if the data is imported as text, it will be filtered out.
Regards
Phil
If I answered your question please mark my post as the solution.
If my answer helped solve your problem, give it a kudos by clicking on the Thumbs Up.
Proud to be a Super User!
I am sure that my source dataset contains the value I want and that it is imported as a number.
@sdoggett , The last two filters, might be causing no data. remove and check the exact value in the field.
Use data profiling before these steps and check values
https://docs.microsoft.com/en-us/power-query/data-profiling-tools
Will the data profiling tool only be for the preview data or for the entire dataset?
User | Count |
---|---|
139 | |
113 | |
103 | |
73 | |
63 |
User | Count |
---|---|
135 | |
126 | |
110 | |
70 | |
61 |