Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi all,
I have a task of scraping the composite index historical available in Yahoo Finance. Let's take an example of NASDAQ (^IXIC) with the following URL: https://finance.yahoo.com/quote/%5EIXIC/history?period1=1570147200&period2=1601769600&interval=1d&fi...
Start date: 4/10/2019
End date: 4/10/2020
However, I noticed a problem where by default, the URL only displays the first 100 rows. It will only be expanded when you scroll down in the website. As such, the result of my code only returns the earliest date of 13/5/2020 (while obviously expecting the earliest date in the table to be 4/10/2019).
I am not familiar with inline CSS and unfortunately I was not able to understand most of the articles that I could find. I suspect that I could use the second optional argument in Web.BrowserContents( ) by using the WaitFor function. I have inspected the CSS element of the website and I noticed the only thing that changes before the table is fully displayed on web is the class element. I don't know how to add that into my code.
<html id="atomic" class="chrome desktop JsEnabled themelight layoutEnhance(TwoColumnLayout) CollapsibleUh onDemandFocusSupport hasScrolled scrollDown HideNavrail" lang="en-US">
I hope that someone here could help me. Below is my full M code.
let
Source = Web.BrowserContents("https://finance.yahoo.com/quote/%5EIXIC/history?period1=1577836800&period2=1609372800&interval=1d&filter=history&frequency=1d"),
#"Extracted Table From Html" = Html.Table(Source, {
{"Column1",
"TABLE.W\(100\%\).M\(0\) > * > TR > TH:not([colspan]):not([rowspan]):nth-child(1):nth-last-child(7),
TABLE.W\(100\%\).M\(0\) > * > TR > TD:not([colspan]):not([rowspan]):nth-child(1):nth-last-child(7),
TABLE.W\(100\%\).M\(0\) > * > TR > TD[colspan=""7""]:not([rowspan]):nth-child(1):nth-last-child(1)"},
{"Column2",
"TABLE.W\(100\%\).M\(0\) > * > TR > TH:not([colspan]):not([rowspan]):nth-child(1):nth-last-child(7)
+ TH:not([colspan]):not([rowspan]):nth-child(2):nth-last-child(6),
TABLE.W\(100\%\).M\(0\) > * > TR > TD:not([colspan]):not([rowspan]):nth-child(1):nth-last-child(7)
+ TD:not([colspan]):not([rowspan]):nth-child(2):nth-last-child(6),
TABLE.W\(100\%\).M\(0\) > * > TR > TD[colspan=""7""]:not([rowspan]):nth-child(1):nth-last-child(1)"},
{"Column3",
"TABLE.W\(100\%\).M\(0\) > * > TR > TH:not([colspan]):not([rowspan]):nth-child(1):nth-last-child(7)
+ TH:not([colspan]):not([rowspan]):nth-child(2):nth-last-child(6)
+ TH:not([colspan]):not([rowspan]):nth-child(3):nth-last-child(5),
TABLE.W\(100\%\).M\(0\) > * > TR > TD:not([colspan]):not([rowspan]):nth-child(1):nth-last-child(7)
+ TD:not([colspan]):not([rowspan]):nth-child(2):nth-last-child(6)
+ TD:not([colspan]):not([rowspan]):nth-child(3):nth-last-child(5),
TABLE.W\(100\%\).M\(0\) > * > TR > TD[colspan=""7""]:not([rowspan]):nth-child(1):nth-last-child(1)"},
{"Column4",
"TABLE.W\(100\%\).M\(0\) > * > TR > TH:not([colspan]):not([rowspan]):nth-child(1):nth-last-child(7)
+ TH:not([colspan]):not([rowspan]):nth-child(2):nth-last-child(6)
+ TH:not([colspan]):not([rowspan]):nth-child(3):nth-last-child(5)
+ TH:not([colspan]):not([rowspan]):nth-child(4):nth-last-child(4),
TABLE.W\(100\%\).M\(0\) > * > TR > TD:not([colspan]):not([rowspan]):nth-child(1):nth-last-child(7)
+ TD:not([colspan]):not([rowspan]):nth-child(2):nth-last-child(6)
+ TD:not([colspan]):not([rowspan]):nth-child(3):nth-last-child(5)
+ TD:not([colspan]):not([rowspan]):nth-child(4):nth-last-child(4),
TABLE.W\(100\%\).M\(0\) > * > TR > TD[colspan=""7""]:not([rowspan]):nth-child(1):nth-last-child(1)"},
{"Column5",
"TABLE.W\(100\%\).M\(0\) > * > TR > TH:not([colspan]):not([rowspan]):nth-child(1):nth-last-child(7)
+ TH:not([colspan]):not([rowspan]):nth-child(2):nth-last-child(6)
+ TH:not([colspan]):not([rowspan]):nth-child(3):nth-last-child(5)
+ TH:not([colspan]):not([rowspan]):nth-child(4):nth-last-child(4)
+ TH:not([colspan]):not([rowspan]):nth-child(5):nth-last-child(3),
TABLE.W\(100\%\).M\(0\) > * > TR > TD:not([colspan]):not([rowspan]):nth-child(1):nth-last-child(7)
+ TD:not([colspan]):not([rowspan]):nth-child(2):nth-last-child(6)
+ TD:not([colspan]):not([rowspan]):nth-child(3):nth-last-child(5)
+ TD:not([colspan]):not([rowspan]):nth-child(4):nth-last-child(4)
+ TD:not([colspan]):not([rowspan]):nth-child(5):nth-last-child(3),
TABLE.W\(100\%\).M\(0\) > * > TR > TD[colspan=""7""]:not([rowspan]):nth-child(1):nth-last-child(1)"},
{"Column6",
"TABLE.W\(100\%\).M\(0\) > * > TR > TH:not([colspan]):not([rowspan]):nth-child(1):nth-last-child(7)
+ TH:not([colspan]):not([rowspan]):nth-child(2):nth-last-child(6)
+ TH:not([colspan]):not([rowspan]):nth-child(3):nth-last-child(5)
+ TH:not([colspan]):not([rowspan]):nth-child(4):nth-last-child(4)
+ TH:not([colspan]):not([rowspan]):nth-child(5):nth-last-child(3)
+ TH:not([colspan]):not([rowspan]):nth-child(6):nth-last-child(2),
TABLE.W\(100\%\).M\(0\) > * > TR > TD:not([colspan]):not([rowspan]):nth-child(1):nth-last-child(7)
+ TD:not([colspan]):not([rowspan]):nth-child(2):nth-last-child(6)
+ TD:not([colspan]):not([rowspan]):nth-child(3):nth-last-child(5)
+ TD:not([colspan]):not([rowspan]):nth-child(4):nth-last-child(4)
+ TD:not([colspan]):not([rowspan]):nth-child(5):nth-last-child(3)
+ TD:not([colspan]):not([rowspan]):nth-child(6):nth-last-child(2),
TABLE.W\(100\%\).M\(0\) > * > TR > TD[colspan=""7""]:not([rowspan]):nth-child(1):nth-last-child(1)"},
{"Column7",
"TABLE.W\(100\%\).M\(0\) > * > TR > TH:not([colspan]):not([rowspan]):nth-child(1):nth-last-child(7)
+ TH:not([colspan]):not([rowspan]):nth-child(2):nth-last-child(6)
+ TH:not([colspan]):not([rowspan]):nth-child(3):nth-last-child(5)
+ TH:not([colspan]):not([rowspan]):nth-child(4):nth-last-child(4)
+ TH:not([colspan]):not([rowspan]):nth-child(5):nth-last-child(3)
+ TH:not([colspan]):not([rowspan]):nth-child(6):nth-last-child(2)
+ TH:not([colspan]):not([rowspan]):nth-child(7):nth-last-child(1),
TABLE.W\(100\%\).M\(0\) > * > TR > TD:not([colspan]):not([rowspan]):nth-child(1):nth-last-child(7)
+ TD:not([colspan]):not([rowspan]):nth-child(2):nth-last-child(6)
+ TD:not([colspan]):not([rowspan]):nth-child(3):nth-last-child(5)
+ TD:not([colspan]):not([rowspan]):nth-child(4):nth-last-child(4)
+ TD:not([colspan]):not([rowspan]):nth-child(5):nth-last-child(3)
+ TD:not([colspan]):not([rowspan]):nth-child(6):nth-last-child(2)
+ TD:not([colspan]):not([rowspan]):nth-child(7):nth-last-child(1),
TABLE.W\(100\%\).M\(0\) > * > TR > TD[colspan=""7""]:not([rowspan]):nth-child(1):nth-last-child(1)"}},
[RowSelector="TABLE.W\(100\%\).M\(0\) > * > TR"]),
#"Promoted Headers" = Table.PromoteHeaders(#"Extracted Table From Html", [PromoteAllScalars=true]),
#"Removed Columns" = Table.RemoveColumns(#"Promoted Headers",{"High", "Low", "Adj Close**"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Close*", "Close"}}),
#"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"Date", type date}, {"Open", type number}, {"Close", type number}, {"Volume", type number}}),
#"Removed Errors" = Table.RemoveRowsWithErrors(#"Changed Type", {"Date"}),
#"Sorted Rows" = Table.Sort(#"Removed Errors",{{"Date", Order.Ascending}})
in
#"Sorted Rows"
Thank you
Regards,
Alex
Solved! Go to Solution.
Hi Alex, notice the download link on the right!
= Csv.Document(Web.Contents("https://query1.finance.yahoo.com/v7/finance/download/%5EIXIC?period1=1570147200&period2=1601769600&i..."),[Delimiter=",", Columns=7, Encoding=1252, QuoteStyle=QuoteStyle.None])
Give the above a shot and that should pull everything in!
Hi, what if the page number is hidden in the metadata, for example: https://dailymed.nlm.nih.gov/dailymed/services/v2/ndcs
, this will only give me the first 100 records... any thoughts?
Hi Alex, notice the download link on the right!
= Csv.Document(Web.Contents("https://query1.finance.yahoo.com/v7/finance/download/%5EIXIC?period1=1570147200&period2=1601769600&i..."),[Delimiter=",", Columns=7, Encoding=1252, QuoteStyle=QuoteStyle.None])
Give the above a shot and that should pull everything in!
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
109 | |
96 | |
77 | |
66 | |
54 |
User | Count |
---|---|
144 | |
104 | |
102 | |
88 | |
63 |