Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Anonymous
Not applicable

Help filtering in Advanced Editor

Hi all!,

 

So I'm having a problem, in my BI i'm importing big .csv files and i want to filter some information in the advanced editor in order to reduce the size. But I'm new doing this and something is wrong because no filter is being apply when writting this:

 

let
Source = Folder.Files("C:\Users\60047182\Desktop\MARK\Producción\Archivos de produccion"),
#"C:\Users\60047182\Desktop\MARK\Producción\Archivos de produccion\_produccin-de-pozos-de-gas-y-petrleo-2006 csv" = Source{[#"Folder Path"="C:\Users\60047182\Desktop\MARK\Producción\Archivos de produccion\",Name="produccin-de-pozos-de-gas-y-petrleo-2006.csv"]}[Content],
#"Imported CSV" = Csv.Document(#"C:\Users\60047182\Desktop\MARK\Producción\Archivos de produccion\_produccin-de-pozos-de-gas-y-petrleo-2006 csv",[Delimiter=",", Columns=38, Encoding=65001, QuoteStyle=QuoteStyle.None]),
#"Promoted Headers" = Table.PromoteHeaders(#"Imported CSV", [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"idempresa", type text}, {"anio", Int64.Type}, {"mes", Int64.Type}, {"idpozo", Int64.Type}, {"prod_pet", Int64.Type}, {"prod_gas", Int64.Type}, {"prod_agua", Int64.Type}, {"iny_agua", Int64.Type}, {"iny_gas", Int64.Type}, {"iny_co2", Int64.Type}, {"iny_otro", Int64.Type}, {"tef", Int64.Type}, {"vida_util", Int64.Type}, {"tipoextraccion", type text}, {"tipoestado", type text}, {"tipopozo", type text}, {"observaciones", type text}, {"fechaingreso", type datetime}, {"rectificado", type text}, {"habilitado", type text}, {"idusuario", Int64.Type}, {"empresa", type text}, {"sigla", type text}, {"formprod", type text}, {"profundidad", Int64.Type}, {"formacion", type text}, {"idareapermisoconcesion", type text}, {"areapermisoconcesion", type text}, {"idareayacimiento", type text}, {"areayacimiento", type text}, {"cuenca", type text}, {"provincia", type text}, {"tipo_de_recurso", type text}, {"proyecto", type text}, {"clasificacion", type text}, {"subclasificacion", type text}, {"sub_tipo_recurso", type text}, {"fecha_data", type date}}),
#"filter rows" =Table.SelectRows(#"C:\Users\60047182\Desktop\MARK\Producción\Archivos de produccion\_produccin-de-pozos-de-gas-y-petrleo-2006 csv","prod_pet">0,"prod_gas">0)
in
#"Changed Type"

 

Any ideas? Any suggestion in order to use the best practice to reduce BI size and optimize it?

Thanks!

2 ACCEPTED SOLUTIONS
KBO
MVP

Hi @Anonymous ,

I guess because you must mark it as last step in your code :).

 

#"filter rows" =Table.SelectRows(#"C:\Users\60047182\Desktop\MARK\Producción\Archivos de produccion\_produccin-de-pozos-de-gas-y-petrleo-2006 csv","prod_pet">0,"prod_gas">0)
in
#"filter rows"

 

 

Best,

Kathrin

 

 

 

 

If this post has helped you, please give it a thumbs up!
Did I answer your question? Mark my post as a solution!

View solution in original post

Hi @Anonymous ,

 

I got your CSV file. 290MB, over 940,000 records. Given the critera you gave of Pet_Prod > 0 and Pet_Gas > 0 it reduced it to 215,000 records, which took less than 20 seconds here to load in to the data model.

 

The query I used was this:

let
    Source = Csv.Document(File.Contents("C:\Users\Ed Hansberry\OneDrive - eHansalytics\Downloads\produccin-de-pozos-de-gas-y-petrleo-2019.csv"),[Delimiter=",", Columns=38, Encoding=65001, QuoteStyle=QuoteStyle.None]),
    #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"idempresa", type text}, {"anio", Int64.Type}, {"mes", Int64.Type}, {"idpozo", Int64.Type}, {"prod_pet", type number}, {"prod_gas", Int64.Type}, {"prod_agua", type number}, {"iny_agua", type number}, {"iny_gas", Int64.Type}, {"iny_co2", Int64.Type}, {"iny_otro", Int64.Type}, {"tef", type number}, {"vida_util", Int64.Type}, {"tipoextraccion", type text}, {"tipoestado", type text}, {"tipopozo", type text}, {"observaciones", type text}, {"fechaingreso", type datetime}, {"rectificado", type text}, {"habilitado", type text}, {"idusuario", Int64.Type}, {"empresa", type text}, {"sigla", type text}, {"formprod", type text}, {"profundidad", type number}, {"formacion", type text}, {"idareapermisoconcesion", type text}, {"areapermisoconcesion", type text}, {"idareayacimiento", type text}, {"areayacimiento", type text}, {"cuenca", type text}, {"provincia", type text}, {"tipo_de_recurso", type text}, {"proyecto", type text}, {"clasificacion", type text}, {"subclasificacion", type text}, {"sub_tipo_recurso", type text}, {"fecha_data", type date}}),
    #"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([prod_pet] > 0) and ([prod_gas] > 0))
in
    #"Filtered Rows"

I did that 100% in the Power Query UI model. No advanced coding at all. Took just a few seconds. Your Table.SelectRows statement had the wrong syntax and too many arguments. You can see the correct syntax above.

 

I strongly recommend until you get a really good handle on M code you use the UI. I have a decent handle on M code and I rarely hand-craft a Table.SelectRows() statement from scratch. Even if I need it in a structured column, I'll often create the core statement with the UI then copy and paste the code where I need it. M is case sensitive and even with Intellisense it isn't the easist language to just type out.

 



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

View solution in original post

8 REPLIES 8
v-lili6-msft
Community Support
Community Support

hi  @Anonymous 

The main problem is as @KBO noted, the last statement should generally be

in
   #"Filtered Rows"

And the last step code should be like this:

let
Source = Folder.Files("C:\Users\60047182\Desktop\MARK\Producción\Archivos de produccion"),
#"C:\Users\60047182\Desktop\MARK\Producción\Archivos de produccion\_produccin-de-pozos-de-gas-y-petrleo-2006 csv" = Source{[#"Folder Path"="C:\Users\60047182\Desktop\MARK\Producción\Archivos de produccion\",Name="produccin-de-pozos-de-gas-y-petrleo-2006.csv"]}[Content],
#"Imported CSV" = Csv.Document(#"C:\Users\60047182\Desktop\MARK\Producción\Archivos de produccion\_produccin-de-pozos-de-gas-y-petrleo-2006 csv",[Delimiter=",", Columns=38, Encoding=65001, QuoteStyle=QuoteStyle.None]),
#"Promoted Headers" = Table.PromoteHeaders(#"Imported CSV", [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"idempresa", type text}, {"anio", Int64.Type}, {"mes", Int64.Type}, {"idpozo", Int64.Type}, {"prod_pet", Int64.Type}, {"prod_gas", Int64.Type}, {"prod_agua", Int64.Type}, {"iny_agua", Int64.Type}, {"iny_gas", Int64.Type}, {"iny_co2", Int64.Type}, {"iny_otro", Int64.Type}, {"tef", Int64.Type}, {"vida_util", Int64.Type}, {"tipoextraccion", type text}, {"tipoestado", type text}, {"tipopozo", type text}, {"observaciones", type text}, {"fechaingreso", type datetime}, {"rectificado", type text}, {"habilitado", type text}, {"idusuario", Int64.Type}, {"empresa", type text}, {"sigla", type text}, {"formprod", type text}, {"profundidad", Int64.Type}, {"formacion", type text}, {"idareapermisoconcesion", type text}, {"areapermisoconcesion", type text}, {"idareayacimiento", type text}, {"areayacimiento", type text}, {"cuenca", type text}, {"provincia", type text}, {"tipo_de_recurso", type text}, {"proyecto", type text}, {"clasificacion", type text}, {"subclasificacion", type text}, {"sub_tipo_recurso", type text}, {"fecha_data", type date}}),
#"filter rows" =Table.SelectRows(#"Changed Type","prod_pet">0,"prod_gas">0)
in
#"filter rows"

 

Regards,

Lin

Community Support Team _ Lin
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
edhans
Super User
Super User

@Anonymous Your code has a lot of errors.

as @KBO noted, the last statement should generally be

in
   #"Filtered Rows"

As that is the last statement you've done.

 

But the #"Filtered Rows" is also pointing to your cource, and a file name to boot. It should be Source, but in reailty it should be #"Changed Type" the step above it.

If you can send me your source file so my query works, I can fix it, but Honestly I'd recommend starting over and just using the GUI to create the steps and not the Advanced Editor. It is a simple 3 step process and using the Advanced Editor can get you in trouble if you don't have a good grasp of M code.



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting
Anonymous
Not applicable

Thanks every one for your answers,

@edhans if I work with steps the problem of having a really big Power Bi will persist and it won't be functional.

This is one of the files I'm using LINK .

 

There is any good guide to strart getting familiar with M??

 

Thanks,

Ignacio.

@Anonymous your linked file was empty - zero bytes

 

Not sure what you are saying about not being able to use Power Query's UI due to the file size. I do it all of the time. Hundreds of MB CSV files. 



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting
Anonymous
Not applicable

I don't know what happend with the link, here it goes again this is one LINK

In the file you gonna find the information for all the 2019 and I I have one of this files since 2006, as soon as I upload all of them de BI size is already 80mb.

Hi @Anonymous ,

 

I got your CSV file. 290MB, over 940,000 records. Given the critera you gave of Pet_Prod > 0 and Pet_Gas > 0 it reduced it to 215,000 records, which took less than 20 seconds here to load in to the data model.

 

The query I used was this:

let
    Source = Csv.Document(File.Contents("C:\Users\Ed Hansberry\OneDrive - eHansalytics\Downloads\produccin-de-pozos-de-gas-y-petrleo-2019.csv"),[Delimiter=",", Columns=38, Encoding=65001, QuoteStyle=QuoteStyle.None]),
    #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"idempresa", type text}, {"anio", Int64.Type}, {"mes", Int64.Type}, {"idpozo", Int64.Type}, {"prod_pet", type number}, {"prod_gas", Int64.Type}, {"prod_agua", type number}, {"iny_agua", type number}, {"iny_gas", Int64.Type}, {"iny_co2", Int64.Type}, {"iny_otro", Int64.Type}, {"tef", type number}, {"vida_util", Int64.Type}, {"tipoextraccion", type text}, {"tipoestado", type text}, {"tipopozo", type text}, {"observaciones", type text}, {"fechaingreso", type datetime}, {"rectificado", type text}, {"habilitado", type text}, {"idusuario", Int64.Type}, {"empresa", type text}, {"sigla", type text}, {"formprod", type text}, {"profundidad", type number}, {"formacion", type text}, {"idareapermisoconcesion", type text}, {"areapermisoconcesion", type text}, {"idareayacimiento", type text}, {"areayacimiento", type text}, {"cuenca", type text}, {"provincia", type text}, {"tipo_de_recurso", type text}, {"proyecto", type text}, {"clasificacion", type text}, {"subclasificacion", type text}, {"sub_tipo_recurso", type text}, {"fecha_data", type date}}),
    #"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([prod_pet] > 0) and ([prod_gas] > 0))
in
    #"Filtered Rows"

I did that 100% in the Power Query UI model. No advanced coding at all. Took just a few seconds. Your Table.SelectRows statement had the wrong syntax and too many arguments. You can see the correct syntax above.

 

I strongly recommend until you get a really good handle on M code you use the UI. I have a decent handle on M code and I rarely hand-craft a Table.SelectRows() statement from scratch. Even if I need it in a structured column, I'll often create the core statement with the UI then copy and paste the code where I need it. M is case sensitive and even with Intellisense it isn't the easist language to just type out.

 



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting
Greg_Deckler
Super User
Super User

Yep, @KBO 's got it. I've made that same mistake a few times!!


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...
KBO
MVP

Hi @Anonymous ,

I guess because you must mark it as last step in your code :).

 

#"filter rows" =Table.SelectRows(#"C:\Users\60047182\Desktop\MARK\Producción\Archivos de produccion\_produccin-de-pozos-de-gas-y-petrleo-2006 csv","prod_pet">0,"prod_gas">0)
in
#"filter rows"

 

 

Best,

Kathrin

 

 

 

 

If this post has helped you, please give it a thumbs up!
Did I answer your question? Mark my post as a solution!

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.