The Web Design Group

... Making the Web accessible to all.

Correctly "Point" to a Data Element on Webpage
Crusader
post Nov 26 2022, 11:57 AM
Post #1





Group: Members
Posts: 3
Joined: 26-November 22
Member No.: 28,654



New member, first post.

For quite a while I had been using the formula below to import "Earnings Date" from CNBC into Google Sheets. For the past few months, the formula has not worked. My understanding is that data elements have been moved around on the redesigned CNBC page and I am no longer correctly "pointing" to the element I wish to import ("Earnings Date"). I don't know HTML to figure out how to correct the formula below so it points to "Earnings Date" (under "Events") on the "redesigned" web page. All my trial and error efforts have been in vain.

A ticker symbol has to be entered in the search box at the top right side of the page; this prompts CNBC to pull data for that ticker symbol. One of the data elements is "Earnings Date." It can be found on the lower half of the page, under the heading "Events." A ticker symbol picked at random: AAL. In this example, the data element I am looking for is "01/18/2023(est)."

Please note: The formula below is only the HTML portion of the formula I use in Google Sheets; I have left out the part that pertains to the "spreadsheet" portion. I will be happy to share the full formula if that helps.

I am requesting assistance in getting the formula below corrected so it "points" to "Earnings Date" on CNBC's web site.

If this is not the correct forum, please guide me to an appropriate forum.

CODE
"//html/body/div[2]/div/div[1]/div[3]/div/div[2]/div[1]/div[5]/div[2]/section/div[3]/ul/li[1]/span[2]"

All help will be greatly appreciated!
Christian J
post Nov 29 2022, 11:02 AM
Post #2



Group: WDG Moderators
Posts: 9,630
Joined: 10-August 06
Member No.: 7



QUOTE
CODE
"//html/body/div[2]/div/div[1]/div[3]/div/div[2]/div[1]/div[5]/div[2]/section/div[3]/ul/li[1]/span[2]"

The above looks like a mix of a URL and a JavaScript DOM tree. Is it a proprietary format used by Google Sheets? Alas, I have no idea how Google Sheets works.
Crusader
post Nov 29 2022, 01:53 PM
Post #3





QUOTE(Christian J @ Nov 29 2022, 12:02 PM)

The above looks like a mix of a URL and a JavaScript DOM tree. Is it a proprietary format used by Google Sheets? Alas, I have no idea how Google Sheets works.

I don't know if this is a proprietary Google Sheets format. The complete formula is:
CODE
=IMPORTXML("https://www.cnbc.com/quotes/"&A4,"//html/body/div[2]/div/div[1]/div[3]/div/div[2]/div[1]/div[5]/div[2]/section/div[3]/ul/li[1]/span[2]")

Where A4 is the cell containing the ticker symbol (AAL in my example).
Christian J
post Nov 29 2022, 05:27 PM
Post #4


QUOTE(Crusader @ Nov 26 2022, 05:57 PM)

I don't know HTML to figure out how to correct the formula below so it points to "Earnings Date" (under "Events") on the "redesigned" web page.

How did you arrive at this formula in the first place (when it still worked)? Can't you redo the process for the redesigned CNBC page?
jimlongo
post Nov 30 2022, 11:07 PM
Post #5



Group: Members
Posts: 1,128
Joined: 24-August 06
From: t-dot
Member No.: 16



Why not use some named elements so you don't have to traverse from the top of the DOM?

ul.Summary-events-stock li.Summary-stat span.Summary-value


In the "teach a man to fish" department: you should learn to use the Inspector.
In your browser right click on the date data you want and choose "Inspect Element".
This will tell you a lot about the structure.

What I'm suggesting is that the <ul class="Summary-events-stock"> is unique on the page, so you can start there and follow it down to the first <li> and the correct <span>.
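
The selector above is written in CSS syntax, whereas IMPORTXML takes an XPath query, so it would need translating before it goes into the sheet. Here is a minimal sketch of that translation in Python, run against a made-up, simplified fragment. The real CNBC markup is not shown in this thread: only the class names Summary-events-stock, Summary-stat and Summary-value come from the post above, and Summary-label plus the surrounding structure are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up, simplified fragment. Only the class names Summary-events-stock,
# Summary-stat and Summary-value come from this thread; Summary-label and
# the surrounding structure are assumptions for illustration.
PAGE = """
<html><body>
  <section>
    <ul class="Summary-events-stock">
      <li class="Summary-stat">
        <span class="Summary-label">Earnings Date</span>
        <span class="Summary-value">01/18/2023(est)</span>
      </li>
    </ul>
  </section>
</body></html>
"""

# CSS:   ul.Summary-events-stock li.Summary-stat span.Summary-value
# XPath: start at the uniquely named <ul>, then walk down by class.
# Child steps ("/") are enough here because the toy markup nests directly.
# ElementTree only supports exact attribute matches, so each element in
# the toy fragment carries exactly one class.
XPATH = (
    ".//ul[@class='Summary-events-stock']"
    "/li[@class='Summary-stat']"
    "/span[@class='Summary-value']"
)

root = ET.fromstring(PAGE.strip())
earnings_date = root.find(XPATH).text
print(earnings_date)
```

In Google Sheets the same idea would presumably read //ul[contains(@class,'Summary-events-stock')]/li[1]/span[contains(@class,'Summary-value')] as the second argument to IMPORTXML. That version is untested against the live page, so treat it as a starting point and confirm the class names with the Inspector.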

This post has been edited by jimlongo: Nov 30 2022, 11:22 PM
Crusader
post Dec 2 2022, 05:39 AM
Post #6





Thank you for all the support: it is greatly appreciated! I was able to update the second half of the formula using XPath.

During my research I learnt that websites change ("update") their pages and use newer technology to prevent "competitors" from importing their data; however, that puts individuals like myself in a quandary.

Jimlongo, thank you for your suggestion. I will work towards incorporating named elements in my formula: it will make my formula more manageable and easier to update.

For the record, the updated second half of the formula is as follows:
CODE
"//html/body/div[2]/div/div[1]/div[3]/div/div[2]/div[1]/div[5]/div[2]/section/div[4]/ul/li[1]/span[2]"
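
The only change from the original path is div[3] becoming div[4], which is consistent with one extra block having been added ahead of the "Events" list. That is an assumption, since the thread does not show the page source. Below is a small Python sketch of why a positional path breaks in that situation while a class-based one keeps working. The toy page and the helper names build_page and lookup are invented for illustration; the Summary-events-stock class is the one mentioned earlier in the thread.

```python
import xml.etree.ElementTree as ET

# Toy "Events" block; the real CNBC markup is assumed, not known.
EVENTS = (
    '<div><ul class="Summary-events-stock"><li>'
    "<span>Earnings Date</span><span>01/18/2023(est)</span>"
    "</li></ul></div>"
)


def build_page(blocks_before: int) -> ET.Element:
    """Return a toy page with `blocks_before` unrelated <div>s ahead of the events block."""
    filler = "<div><p>other content</p></div>" * blocks_before
    return ET.fromstring(
        f"<html><body><section>{filler}{EVENTS}</section></body></html>"
    )


def lookup(page: ET.Element, path: str):
    """Return the text of the first node matching `path`, or None if nothing matches."""
    node = page.find(path)
    return None if node is None else node.text


old_layout = build_page(2)  # events block is the 3rd <div>
new_layout = build_page(3)  # one extra block pushes it to the 4th

OLD_PATH = "./body/section/div[3]/ul/li[1]/span[2]"
NEW_PATH = "./body/section/div[4]/ul/li[1]/span[2]"
BY_CLASS = ".//ul[@class='Summary-events-stock']/li[1]/span[2]"

for name, path in [("old", OLD_PATH), ("new", NEW_PATH), ("class", BY_CLASS)]:
    print(name, lookup(old_layout, path), lookup(new_layout, path))
```

The positional paths each match only one of the two layouts, while the class-based path matches both. That is the practical argument for anchoring on a named element: the next redesign is less likely to require another edit.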