Hi All,
Is there a way to get the image URLs from a webpage? For example, the poster images from IMDB.
I don't want to manually right click each of them and copy the image url.
Proud to be a Super User!
Just for fun, I scraped together some pretty abhorrent M using the GUI, pointed at a Wikipedia article. Here's what it looks like if you scrape all images from https://en.wikipedia.org/wiki/The_Lord_of_the_Rings:_The_Return_of_the_King
After all the mangling and filtering, the M came out like this:
let
    // Load the page as lines of UTF-8 text, one row per line
    Source = Table.FromColumns({Lines.FromBinary(Web.Contents("https://en.wikipedia.org/wiki/The_Lord_of_the_Rings:_The_Return_of_the_King"), null, null, 65001)}),
    // Keep only lines that reference an uploaded image
    #"Filtered Rows" = Table.SelectRows(Source, each Text.Contains([Column1], "src=""//upload")),
    // Split at the last src="// on each line; the second part starts with the image path
    #"Split Column by Delimiter" = Table.SplitColumn(#"Filtered Rows", "Column1", Splitter.SplitTextByEachDelimiter({"src=""//"}, QuoteStyle.None, true), {"Column1.1", "Column1.2"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Column1.1", type text}, {"Column1.2", type text}}),
    // Split at the first closing quote so the first part is the bare path
    #"Split Column by Delimiter1" = Table.SplitColumn(#"Changed Type", "Column1.2", Splitter.SplitTextByEachDelimiter({""""}, QuoteStyle.None, false), {"Column1.2.1", "Column1.2.2"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Column1.2.1", type text}, {"Column1.2.2", type text}}),
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type1",{"Column1.1", "Column1.2.2"}),
    // Keep only image files
    #"Filtered Rows1" = Table.SelectRows(#"Removed Columns", each Text.EndsWith([Column1.2.1], ".jpg") or Text.EndsWith([Column1.2.1], ".png") or Text.EndsWith([Column1.2.1], ".gif")),
    // Prefix each path with https:// to make a full URL
    #"Added Custom" = Table.AddColumn(#"Filtered Rows1", "https", each "https://"),
    #"Reordered Columns" = Table.ReorderColumns(#"Added Custom",{"https", "Column1.2.1"}),
    #"Merged Columns" = Table.CombineColumns(#"Reordered Columns",{"https", "Column1.2.1"},Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Merged"),
    #"Renamed Columns" = Table.RenameColumns(#"Merged Columns",{{"Merged", "Images"}}),
    #"Duplicated Column" = Table.DuplicateColumn(#"Renamed Columns", "Images", "Images - Copy"),
    #"Renamed Columns1" = Table.RenameColumns(#"Duplicated Column",{{"Images - Copy", "ImageURLs"}})
in
    #"Renamed Columns1"
There are definitely a few things I could do to clean this up, but the point is that you have to load the webpage as a text file and filter down to the context around the link you need. Once you isolate the links as full URLs, you can set the column's data category to "Image URL" and use those images in your Power BI report.
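For anyone who wants a shorter starting point, here is a rough sketch of the same idea (load the page as text, cut out whatever follows each src attribute, rebuild full URLs). It is not equivalent to the GUI-generated query: it picks up every src="//..." attribute on the page rather than one per line, and the step names and the ImageURL column name are my own. It also assumes the page uses protocol-relative image paths, as the Wikipedia article did.

```powerquery
let
    // Assumption: a page whose <img> tags use protocol-relative src="//..." attributes
    PageUrl = "https://en.wikipedia.org/wiki/The_Lord_of_the_Rings:_The_Return_of_the_King",
    // Read the whole page as a single UTF-8 text value
    Html = Text.FromBinary(Web.Contents(PageUrl), TextEncoding.Utf8),
    // Split on the src="// marker; every chunk after the first begins with a path
    Chunks = List.Skip(Text.Split(Html, "src=""//")),
    // Keep only the text before the closing quote of each attribute
    Paths = List.Transform(Chunks, (chunk) => Text.BeforeDelimiter(chunk, """")),
    // Keep common image extensions, ignoring case
    Extensions = {".jpg", ".png", ".gif"},
    ImagePaths = List.Select(
        Paths,
        (path) => List.AnyTrue(
            List.Transform(Extensions, (ext) => Text.EndsWith(path, ext, Comparer.OrdinalIgnoreCase))
        )
    ),
    // Rebuild full URLs and drop duplicates
    Urls = List.Distinct(List.Transform(ImagePaths, (path) => "https://" & path)),
    // Return a one-column table that can be categorized as Image URL
    Result = Table.FromColumns({Urls}, type table [ImageURL = text])
in
    Result
```

After loading, select the ImageURL column and set its data category to Image URL so visuals render the pictures instead of the link text.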
@Anonymous
Yes, a little finicky but awesome. I've been scraping data from IMDB since last night. Lol. Thanks for the help.
You could do a web query
All I know is how to get table or text data. Is there a link to a tutorial on how to do this?
@danextian wrote:
All I know is how to get table or text data. Is there a link to a tutorial on how to do this?
I'd say it is outside the scope of Power BI. As for a tutorial, you can find various blog posts on this with the help of Google.
So it is not possible?
It's possible, but it's a bit finicky at best. You can get the page as a text document and strip out the URL you're looking for.
If it's specifically IMDB or movie information you're looking for, I'd suggest finding an API rather than attempting to scrape.
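To give a feel for what the API route might look like in M, here is a minimal sketch. The endpoint, the title query parameter, and the poster_url field are all made up for illustration; a real movie API will have its own URL, authentication, and response shape, so check its documentation before adapting this.

```powerquery
let
    // Hypothetical endpoint: replace with the real API's base URL
    ApiUrl = "https://api.example.com/movies",
    // Hypothetical query parameter; Web.Contents appends it to the URL
    Raw = Web.Contents(ApiUrl, [Query = [title = "The Return of the King"]]),
    // Parse the JSON response (assumed here to be a single record)
    Response = Json.Document(Raw),
    // Hypothetical field name; returns null if the field is missing
    PosterUrl = Record.FieldOrDefault(Response, "poster_url", null)
in
    PosterUrl
```

The appeal of this route is that the image URL arrives as a named field, so there is no text splitting to break when the page layout changes.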