Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
sebbyp
Helper III
Helper III

how to import website data including additional table pages under the same web address

I can import website data through the power bi desktop app but the following web address http://www.electionguide.org/elections/past/ has more data when you click next within the table using the same web address.  How do i access all the data in the table?

1 ACCEPTED SOLUTION

Sorry for the late response. On my computer this query works fine:

 

let
    Source = Json.Document(Web.Contents("http://www.electionguide.org/ajax/election/past/?sEcho=3&iColumns=5&sColumns=&iDisplayStart=0&iDisplayLength=100000&mDataProp_0=0&mDataProp_1=1&mDataProp_2=2&mDataProp_3=3&mDataProp_4=4&sSearch=&bRegex=false&sSearch_0=&bRegex_0=false&bSearchable_0=true&sSearch_1=&bRegex_1=false&bSearchable_1=true&sSearch_2=&bRegex_2=false&bSearchable_2=true&sSearch_3=&bRegex_3=false&bSearchable_3=true&sSearch_4=&bRegex_4=false&bSearchable_4=true&iSortCol_0=3&sSortDir_0=desc&iSortingCols=1&bSortable_0=false&bSortable_1=true&bSortable_2=false&bSortable_3=true&bSortable_4=true&_=1461528463674")),
    aaData = Source[aaData],
    #"Converted to Table" = Table.FromList(aaData, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandListColumn(#"Converted to Table", "Column1"),
    #"Added Index" = Table.AddIndexColumn(#"Expanded Column1", "Index", 1, 1),
    #"Inserted Modulo" = Table.AddColumn(#"Added Index", "Inserted Modulo", each Number.Mod([Index], 9), type number),
    #"Added Custom" = Table.AddColumn(#"Inserted Modulo", "Custom", each if [Inserted Modulo] = 1 then [Index] else null),
    #"Filled Down" = Table.FillDown(#"Added Custom",{"Custom"}),
    #"Removed Columns" = Table.RemoveColumns(#"Filled Down",{"Index"}),
    #"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Removed Columns", {{"Inserted Modulo", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(#"Removed Columns", {{"Inserted Modulo", type text}}, "en-US")[#"Inserted Modulo"]), "Inserted Modulo", "Column1"),
    #"Renamed Columns" = Table.RenameColumns(#"Pivoted Column",{{"1", "Flag URL"}, {"4", "Date"}, {"5", "Status"}, {"7", "Election For"}, {"8", "Election Type"}}),
    #"Removed Columns1" = Table.RemoveColumns(#"Renamed Columns",{"2", "3", "6", "0", "Custom"})
in
    #"Removed Columns1"

Can you confirm that you used the same query?

Can you copy and paste the URL below and open it in your browser?

http://www.electionguide.org/ajax/election/past/?sEcho=3&iColumns=5&sColumns=&iDisplayStart=0&iDispl...

 

If you get a valid json response in your browser, it means that the query should work well in Power BI. If you don't get the response, it means that the URL above only works from my computer. If this is the case you will need to install Fiddler and follow the steps I recommended above to extract the URL that was used from your browser when you clicked on of the pages. Then you can use that URL in the query above. I marked the URL to replace in bold red.

 

Hope it helps.

Gil

DataChant.com

 

View solution in original post

15 REPLIES 15

Assuming you are allowed to extract the data, you can run Fiddler and perform reverese engineering on the AJAX that was used on that site.

 

Below is the M query that will pull 100 records from the site you have mentioned, and transform it into a table - My code is quick and dirty, but it works 🙂

The parameters iDisplayStart and iDisplayLength control the offset and count of the results you can fetch

iDisplayStart=0&iDisplayLength=100
let
    Source = Json.Document(Web.Contents("http://www.electionguide.org/ajax/election/past/?sEcho=3&iColumns=5&sColumns=&iDisplayStart=0&iDisplayLength=100&mDataProp_0=0&mDataProp_1=1&mDataProp_2=2&mDataProp_3=3&mDataProp_4=4&sSearch=&bRegex=false&sSearch_0=&bRegex_0=false&bSearchable_0=true&sSearch_1=&bRegex_1=false&bSearchable_1=true&sSearch_2=&bRegex_2=false&bSearchable_2=true&sSearch_3=&bRegex_3=false&bSearchable_3=true&sSearch_4=&bRegex_4=false&bSearchable_4=true&iSortCol_0=3&sSortDir_0=desc&iSortingCols=1&bSortable_0=false&bSortable_1=true&bSortable_2=false&bSortable_3=true&bSortable_4=true&_=1461528463674")),
    aaData = Source[aaData],
    #"Converted to Table" = Table.FromList(aaData, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandListColumn(#"Converted to Table", "Column1"),
    #"Added Index" = Table.AddIndexColumn(#"Expanded Column1", "Index", 1, 1),
    #"Inserted Modulo" = Table.AddColumn(#"Added Index", "Inserted Modulo", each Number.Mod([Index], 9), type number),
    #"Added Custom" = Table.AddColumn(#"Inserted Modulo", "Custom", each if [Inserted Modulo] = 1 then [Index] else null),
    #"Filled Down" = Table.FillDown(#"Added Custom",{"Custom"}),
    #"Removed Columns" = Table.RemoveColumns(#"Filled Down",{"Index"}),
    #"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Removed Columns", {{"Inserted Modulo", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(#"Removed Columns", {{"Inserted Modulo", type text}}, "en-US")[#"Inserted Modulo"]), "Inserted Modulo", "Column1"),
    #"Renamed Columns" = Table.RenameColumns(#"Pivoted Column",{{"1", "Flag URL"}, {"4", "Date"}, {"5", "Status"}, {"7", "Election For"}, {"8", "Election Type"}}),
    #"Removed Columns1" = Table.RemoveColumns(#"Renamed Columns",{"2", "3", "6", "0", "Custom"})
in
    #"Removed Columns1"

 

Im not sure what i am doing wrong but i have copied and pasted your code and it brings up an error for me.  Help please. 

As Inke pointed out, the last parameter is the UNIX time when you access the website, so you can try adding this code instead of the parameter value to dynamically provide the correct value:
FixedLocalNow(), -1) - #datetime(1970, 1, 1, 0, 0, 0))

Hi DataChant,

 

Apologies for my inability to resolve this issue.  I am an amateur and new to PowerBI and some of the language goes over my head.

 

Can i double check what steps i need to follow to pull in all of the data from the paginated data in the tables.

1) Import data from website and paste in wesbite

2) Go to advanced editor and paste in the text you have provided.  This throws up an error

3) I can't move on to the next step as previous step is still an issue

 

Any help would be much appreicated

Please follow these steps:

 

  1. In Power BI Desktop, click the lower part of Get Data, and then click Blank Query.
  2. In Query Editor window, click Advanced Editor.
  3. Replace the default formula with the following one, and click Done.
let
    Source = Json.Document(Web.Contents("http://www.electionguide.org/ajax/election/past/?sEcho=3&iColumns=5&sColumns=&iDisplayStart=0&iDisplayLength=100000&mDataProp_0=0&mDataProp_1=1&mDataProp_2=2&mDataProp_3=3&mDataProp_4=4&sSearch=&bRegex=false&sSearch_0=&bRegex_0=false&bSearchable_0=true&sSearch_1=&bRegex_1=false&bSearchable_1=true&sSearch_2=&bRegex_2=false&bSearchable_2=true&sSearch_3=&bRegex_3=false&bSearchable_3=true&sSearch_4=&bRegex_4=false&bSearchable_4=true&iSortCol_0=3&sSortDir_0=desc&iSortingCols=1&bSortable_0=false&bSortable_1=true&bSortable_2=false&bSortable_3=true&bSortable_4=true&_=1461528463674")),
    aaData = Source[aaData],
    #"Converted to Table" = Table.FromList(aaData, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandListColumn(#"Converted to Table", "Column1"),
    #"Added Index" = Table.AddIndexColumn(#"Expanded Column1", "Index", 1, 1),
    #"Inserted Modulo" = Table.AddColumn(#"Added Index", "Inserted Modulo", each Number.Mod([Index], 9), type number),
    #"Added Custom" = Table.AddColumn(#"Inserted Modulo", "Custom", each if [Inserted Modulo] = 1 then [Index] else null),
    #"Filled Down" = Table.FillDown(#"Added Custom",{"Custom"}),
    #"Removed Columns" = Table.RemoveColumns(#"Filled Down",{"Index"}),
    #"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Removed Columns", {{"Inserted Modulo", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(#"Removed Columns", {{"Inserted Modulo", type text}}, "en-US")[#"Inserted Modulo"]), "Inserted Modulo", "Column1"),
    #"Renamed Columns" = Table.RenameColumns(#"Pivoted Column",{{"1", "Flag URL"}, {"4", "Date"}, {"5", "Status"}, {"7", "Election For"}, {"8", "Election Type"}}),
    #"Removed Columns1" = Table.RemoveColumns(#"Renamed Columns",{"2", "3", "6", "0", "Custom"})
in
    #"Removed Columns1" 

Now click Close & Apply in the Query Editor to load the data.

If you have any issues, please add a screenshot.

 

Thank you,

Gil 

Screenshot.PNG

 

I did exactly as you instructed and an error pops up.  Please see screenshot above

Can you verify that the URL that you pasted is the same as the one I used in my query? The error includes some URL encoded characters in the URL (including the time stamp value which has only numbers, so it doesn't make sense to use URL encoding there).

Yes the same url as i previously mentioned.  Here it is again  http://www.electionguide.org/elections/upcoming/

Sorry for the late response. On my computer this query works fine:

 

let
    Source = Json.Document(Web.Contents("http://www.electionguide.org/ajax/election/past/?sEcho=3&iColumns=5&sColumns=&iDisplayStart=0&iDisplayLength=100000&mDataProp_0=0&mDataProp_1=1&mDataProp_2=2&mDataProp_3=3&mDataProp_4=4&sSearch=&bRegex=false&sSearch_0=&bRegex_0=false&bSearchable_0=true&sSearch_1=&bRegex_1=false&bSearchable_1=true&sSearch_2=&bRegex_2=false&bSearchable_2=true&sSearch_3=&bRegex_3=false&bSearchable_3=true&sSearch_4=&bRegex_4=false&bSearchable_4=true&iSortCol_0=3&sSortDir_0=desc&iSortingCols=1&bSortable_0=false&bSortable_1=true&bSortable_2=false&bSortable_3=true&bSortable_4=true&_=1461528463674")),
    aaData = Source[aaData],
    #"Converted to Table" = Table.FromList(aaData, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandListColumn(#"Converted to Table", "Column1"),
    #"Added Index" = Table.AddIndexColumn(#"Expanded Column1", "Index", 1, 1),
    #"Inserted Modulo" = Table.AddColumn(#"Added Index", "Inserted Modulo", each Number.Mod([Index], 9), type number),
    #"Added Custom" = Table.AddColumn(#"Inserted Modulo", "Custom", each if [Inserted Modulo] = 1 then [Index] else null),
    #"Filled Down" = Table.FillDown(#"Added Custom",{"Custom"}),
    #"Removed Columns" = Table.RemoveColumns(#"Filled Down",{"Index"}),
    #"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Removed Columns", {{"Inserted Modulo", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(#"Removed Columns", {{"Inserted Modulo", type text}}, "en-US")[#"Inserted Modulo"]), "Inserted Modulo", "Column1"),
    #"Renamed Columns" = Table.RenameColumns(#"Pivoted Column",{{"1", "Flag URL"}, {"4", "Date"}, {"5", "Status"}, {"7", "Election For"}, {"8", "Election Type"}}),
    #"Removed Columns1" = Table.RemoveColumns(#"Renamed Columns",{"2", "3", "6", "0", "Custom"})
in
    #"Removed Columns1"

Can you confirm that you used the same query?

Can you copy and paste the URL below and open it in your browser?

http://www.electionguide.org/ajax/election/past/?sEcho=3&iColumns=5&sColumns=&iDisplayStart=0&iDispl...

 

If you get a valid json response in your browser, it means that the query should work well in Power BI. If you don't get the response, it means that the URL above only works from my computer. If this is the case you will need to install Fiddler and follow the steps I recommended above to extract the URL that was used from your browser when you clicked on of the pages. Then you can use that URL in the query above. I marked the URL to replace in bold red.

 

Hope it helps.

Gil

DataChant.com

 

Hi DataChant,

 

I am in my 2nd week entering into the Power BI world. I am trying to extract data from a website to monitor the housing market in our city. The following code is where I have got to so far, however I just couldn't quite understand your solution for to handle the multiple page on the website. I want to extract all the property listings on the website. I know nothing about JSON apart from the name. 2 questions following:

 

Is there another way in Power BI apart from using JSON?

Is there any chance you could explain in a bit more detail about the JSON approach?

Also from my code, what should I do next to make it a table? Tried Transpose and it's not what I want. I only want the header once. 

 

Thanks a lot.

 

let
Source =Table.FromColumns({Lines.FromBinary(Web.Contents("http://www.realestate.co.nz/residential/search/districts/237/property_types/1%2C2%2C3%2C4%2C5%2C6%2C7%2C8%2C9%2C50"),null,null,1252)}),
#"Renamed Columns" = Table.RenameColumns(Source,{{"Column1", "HTML"}}),
#"Added Conditional Column" = Table.AddColumn(#"Renamed Columns", "FindListingHeader", each if Text.Contains([HTML], "id=""listing-") then "ListingNo" else if Text.Contains([HTML], "itemprop=""name""") then "Slogan" else if Text.Contains([HTML], "itemprop=""description""") then "Description" else if Text.Contains([HTML], "itemprop=""streetAddress""") then "StreetAddress" else if Text.Contains([HTML], "itemprop=""addressLocality""") then "Suburb" else if Text.Contains([HTML], "class=""price""") then "Price" else if Text.Contains([HTML], "Bedrooms") then "Bedrooms" else if Text.Contains([HTML], "Bathrooms") then "Bathrooms" else null ),
#"Added Conditional Column1" = Table.AddColumn(#"Added Conditional Column", "FindListingDetails", each if Text.Contains([HTML], "id=""listing-") then [HTML] else if Text.Contains([HTML], "itemprop=""name""") then [HTML] else if Text.Contains([HTML], "itemprop=""description""") then [HTML] else if Text.Contains([HTML], "itemprop=""addressLocality""") then [HTML] else if Text.Contains([HTML], "itemprop=""streetAddress""") then [HTML] else if Text.Contains([HTML], "class=""price""") then [HTML] else if Text.Contains([HTML], "Bedrooms</h6></li>") then [HTML] else if Text.Contains([HTML], "Bathrooms</h6></li>") then [HTML] else null ),
#"Changed Type" = Table.TransformColumnTypes(#"Added Conditional Column1",{{"FindListingDetails", type text}, {"FindListingHeader", type text}}),
#"Trimmed Text" = Table.TransformColumns(#"Changed Type",{{"FindListingDetails", Text.Trim}}),
#"Replaced Value" = Table.ReplaceValue(#"Trimmed Text","<span itemprop=""name"">","",Replacer.ReplaceText,{"FindListingDetails"}),
#"Replaced Value1" = Table.ReplaceValue(#"Replaced Value","<p itemprop=""description"">","",Replacer.ReplaceText,{"FindListingDetails"}),
#"Replaced Value2" = Table.ReplaceValue(#"Replaced Value1","<span class=""location"" itemprop=""addressLocality"">","",Replacer.ReplaceText,{"FindListingDetails"}),
#"Replaced Value3" = Table.ReplaceValue(#"Replaced Value2","<span itemprop=""streetAddress"">","",Replacer.ReplaceText,{"FindListingDetails"}),
#"Replaced Value4" = Table.ReplaceValue(#"Replaced Value3","<div class=""price"">","",Replacer.ReplaceText,{"FindListingDetails"}),
#"Replaced Value5" = Table.ReplaceValue(#"Replaced Value4","<li><h6>","",Replacer.ReplaceText,{"FindListingDetails"}),
#"Replaced Value6" = Table.ReplaceValue(#"Replaced Value5","</h6></li>","",Replacer.ReplaceText,{"FindListingDetails"}),
#"Replaced Value7" = Table.ReplaceValue(#"Replaced Value6","</span>","",Replacer.ReplaceText,{"FindListingDetails"}),
#"Replaced Value8" = Table.ReplaceValue(#"Replaced Value7","</div>","",Replacer.ReplaceText,{"FindListingDetails"}),
#"Replaced Value9" = Table.ReplaceValue(#"Replaced Value8","</p","",Replacer.ReplaceText,{"HTML"}),
#"Remove """ = Table.ReplaceValue(#"Replaced Value9","""","",Replacer.ReplaceText,{"FindListingDetails"}),
#"Replaced Value12" = Table.ReplaceValue(#"Remove ""","<div id=listing-","",Replacer.ReplaceText,{"FindListingDetails"}),
#"Replaced Value13" = Table.ReplaceValue(#"Replaced Value12"," class=listing featuredListing itemscope itemtype=http://schema.org/Residence data-gtm={","",Replacer.ReplaceText,{"FindListingDetails"}),
#"Remove ," = Table.ReplaceValue(#"Replaced Value13",",","",Replacer.ReplaceText,{"FindListingDetails"}),
#"Remove }" = Table.ReplaceValue(#"Remove ,","}","",Replacer.ReplaceText,{"FindListingDetails"}),
#"Remove >" = Table.ReplaceValue(#"Remove }",">","",Replacer.ReplaceText,{"FindListingDetails"}),
#"Filtered Rows1" = Table.SelectRows(#"Remove >", each [FindListingDetails] <> null and [FindListingDetails] <> ""),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows1",{"HTML"})
in
#"Removed Columns"

 

Thats very cool!

I'd be very happy if you could provide a link where a rookie like me could start to learn that AJAX reverse engineering.

 

You might be interested in the power of "Record.FromList" that I've just recently discovered to elegantly open JSON-files or other stuff that comes in list/record format. Add this as a step after #"Converted to Table", delete the rest and expand the results. Just some cleanup and you're done 🙂

    #"Added Custom" = Table.AddColumn(#"Converted to Table", "Custom", each Record.FromList([Column1], type [FlagURL=text, Del1=any, Del2=any, Date=date, Status=text, x=number,ElectionFor=text, ElectionType=text, y=number])),

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Thank you Imke,

 

I like your propsal to use Record.FromList.

 

Regarding the reverse engineering - 

You can install Fiddler (here) and run it. Then go to the website and click on one of the page numbers.

In the Fiddler window you should be able to see the relevant HTTP request and copy the URL.

 

Now, if we are lucky the URL will contain meanginful parameters that represent the page offset and number of results to display, so you can later modify these parameters to get more results, or run a query function to paginate over all the results.

Thank you Gil, works like a charm!

 

Tail of the day: _=1461562205709 🙂 (last element of the URL-string to be replaced)

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

That's really clever, didn't even think of using Fiddler! Duh!!


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...
Greg_Deckler
Super User
Super User

Can't think of a way from looking at the page source. Looks like a lot of code behind and bootstrap doing the interface magic.


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.