Hi. I have been trying to import a zipped CSV file directly from a web URL.
I attempted this with Python, but the script just seems to run forever.
Is my code incorrect, or is there a simpler solution?
from io import BytesIO
from zipfile import ZipFile
import pandas
import requests
url = "https://www.anac.gov.br/assuntos/dados-e-estatisticas/dados-estatisticos/arquivos/Dados_Estatisticos..."
content = requests.get(url)
zf = ZipFile(BytesIO(content.content))
for item in zf.namelist():
    print("File in zip: " + item)
# find the first matching csv file in the zip:
match = [s for s in zf.namelist() if "Dados" in s][0]
# the first line of the file contains a string - that line shall be ignored, hence skiprows
df = pandas.read_csv(zf.open(match),encoding='latin-1', error_bad_lines=False)
print(df)
Hello @Pandadev,
Try the code below.
I rebuilt the URL.
from io import BytesIO
from zipfile import ZipFile
import pandas
import requests
url = "https://www.anac.gov.br/assuntos/dados-e-estatisticas/dados-estatisticos/arquivos/Dados_Estatisticos.zip/@@download/file/Dados%20Estat%C3%ADsticos.zip"
filename = requests.get(url).content
zf = ZipFile( BytesIO(filename), 'r' )
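for item in zf.namelist():
    print("File in zip: " + item)
# take the first archive member whose name contains "Dados"
match = [s for s in zf.namelist() if "Dados" in s][0]
# on_bad_lines='skip' replaces error_bad_lines=False, which was removed in pandas 2.0
df = pandas.read_csv(zf.open(match), encoding='latin-1', on_bad_lines='skip')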
print(df)
Best regards
Lionel Chen
If this post helps, please consider accepting it as the solution so that other members can find it more quickly.
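For anyone whose script still seems to hang, here is a minimal sketch of the same download-and-unzip approach with two extra guards: a request timeout and a check that the response really is a zip archive. The helper names `find_csv_member` and `load_csv_from_url`, the 60-second timeout, and the `"Dados"` default are assumptions made for illustration, not part of the accepted answer. `on_bad_lines="skip"` assumes pandas 1.3 or later.

```python
from io import BytesIO
from zipfile import ZipFile, is_zipfile


def find_csv_member(data: bytes, needle: str = "Dados") -> str:
    """Return the first archive member whose name contains `needle` (hypothetical helper)."""
    # A server may answer with an HTML page instead of the archive,
    # so confirm the payload looks like a zip before opening it.
    if not is_zipfile(BytesIO(data)):
        raise ValueError("response body is not a zip archive")
    with ZipFile(BytesIO(data)) as zf:
        for name in zf.namelist():
            if needle in name:
                return name
    raise ValueError(f"no archive member contains {needle!r}")


def load_csv_from_url(url: str, needle: str = "Dados", timeout: float = 60.0):
    """Download a zip from `url` and read the matching CSV into a DataFrame (hypothetical helper)."""
    # Third-party imports live here so find_csv_member needs only the standard library.
    import pandas
    import requests

    # timeout limits how long requests waits for the server to connect or to send
    # more data; it is not a cap on the total download time.
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    data = response.content

    member = find_csv_member(data, needle)
    with ZipFile(BytesIO(data)) as zf, zf.open(member) as fh:
        # Same read options as the answer above; on_bad_lines needs pandas >= 1.3.
        return pandas.read_csv(fh, encoding="latin-1", on_bad_lines="skip")
```

Calling `load_csv_from_url(url)` with the URL from the answer above should then either return a DataFrame or fail with a clear error (timeout, HTTP status, or "not a zip archive") rather than appearing to run indefinitely.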
When I run the Python script in Power BI it does not work, but when I run it directly in Python it works. Any ideas why this is the case?
Thanks, that worked perfectly.
@Pandadev, check whether these help:
https://www.youtube.com/watch?v=OzQ44gwi5Kw
https://www.youtube.com/watch?v=6vg7u1WeK7U
https://community.powerbi.com/t5/Desktop/Zip-file-from-web-source/td-p/528194