
I have recently started learning Python, and one of my first projects is to get live stock prices from Google Finance using BeautifulSoup. Basically, I am looking up a stock and setting a price alert.

Here is what my code looks like:

import requests
import time
import tkinter
import tkinter.messagebox
from bs4 import BeautifulSoup

def st_Price(symbol):
    baseurl = 'http://google.com/finance/quote/'
    URL = baseurl + symbol + ":NSE?hl=en&gl=in"
    
    page = requests.get(URL)
    
    soup = BeautifulSoup(page.content, 'html.parser')
    
    results = soup.find(class_="YMlKec fxKbKc")

    result = str(results)
    #print(result)

    # take the number between the rupee symbol and the closing tag
    res = result.split("₹")[1].split("<")[0]

    res_flt = float(res.replace(",",""))
      
    return res_flt
        
def main():
    
    sym = input("Enter Stock Symbol : ")
    price = input("Enter desired price : ")
    
    
    x = st_Price(sym)
    
    while x < float(price):
        print(x)
        t1 = time.perf_counter()
        x = st_Price(sym)
        t2 = time.perf_counter()
        print("Internal refresh time is {}".format(t2-t1))
    else:
        print("The Stock {} achieved price greater than {}".format(sym,x))
        root = tkinter.Tk()
        root.geometry("150x150")
        tkinter.messagebox.showinfo(title="Price Alert",message="Stock Price {} greater Than {}".format(x,price))
        root.destroy()
    

if __name__ == "__main__":
    main()

I am looking up the following class in the page HTML:

[screenshot: HTML element for the stock price]

The code works perfectly fine but it takes too much time to fetch the information:

Enter Stock Symbol : INFY

Enter desired price : 1578
1574.0
Internal refresh time is 9.915285099999892
1574.0
Internal refresh time is 7.2284357999997155

I am not very familiar with HTML. By referring to online documentation, I was able to figure out how to scrape the necessary part.

Is there any way to reduce the time it takes to fetch the data?

GinSan
  • Since the bottleneck seems to be the request, I don't think you can do much about it. Maybe you can split the tasks of fetching the HTML and parsing it, so you do the fetching first (perhaps periodically, into a local file or whatever storage you prefer) and parse it from there, decoupling the parsing that way. That might not be the best solution if you need it every other second – Markus Rosjat Jun 27 '21 at 11:59
  • @Markus, I found something here: https://stackoverflow.com/a/62218607/16326666 So I changed one line of my code from `soup = BeautifulSoup(page.content, 'html.parser')` to `soup = BeautifulSoup(page.text, 'html.parser')` and the refresh time dropped to under 2 seconds. Since I do not understand HTML parsing very well, it is hard for me to understand why this works. – GinSan Jun 29 '21 at 12:58
  • Most likely something gets skipped when you ask for .text, maybe all the head stuff with the metadata. But if 2 seconds is OK for your needs, then you're good to go, I guess ;) – Markus Rosjat Jun 29 '21 at 13:05
  • Well, it seems .text formats the content into an "html"-style string, translating tokens like \t or \n into actual spaces and newlines. This in turn seems to speed up the HTML parsing, which makes sense – Markus Rosjat Jun 29 '21 at 13:16
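
To illustrate what the comments above are getting at: `page.content` is the raw response bytes, while `page.text` is the already-decoded string, so when BeautifulSoup is handed bytes it typically has to detect the encoding itself, which can be slow. Below is a minimal timing sketch (using the URL and parser from the question; exact numbers will vary) comparing the two:

import time
import requests
from bs4 import BeautifulSoup

page = requests.get("http://google.com/finance/quote/INFY:NSE?hl=en&gl=in")

# page.content is bytes: BeautifulSoup has to detect the encoding itself
t1 = time.perf_counter()
BeautifulSoup(page.content, 'html.parser')
t2 = time.perf_counter()
print("parsed from page.content in", t2 - t1, "seconds")

# page.text is an already-decoded str: no encoding detection needed
t1 = time.perf_counter()
BeautifulSoup(page.text, 'html.parser')
t2 = time.perf_counter()
print("parsed from page.text in", t2 - t1, "seconds")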

1 Answer


Have a look at the SelectorGadget Chrome extension to grab CSS selectors by clicking on the desired element in your browser.

Also, when using the requests library, the default user-agent is python-requests, so websites can tell that it's a bot or a script sending the request rather than a real user. Check what your user-agent is and pass it in the request headers.
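
As a quick illustrative check, you can print the default user-agent that requests sends (`requests.utils.default_user_agent()` simply returns that default string):

import requests

# the user-agent sent when you don't set one yourself, e.g. "python-requests/2.x.y"
print(requests.utils.default_user_agent())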

To get just the current price, you would need a CSS selector such as `.AHmHk .fxKbKc` via the bs4 `select_one()` method; keep in mind that these class names could change in the future.

from bs4 import BeautifulSoup
import requests, lxml

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.60 Safari/537.36",
}

html = requests.get("https://www.google.com/finance/quote/INFY:NSE", headers=headers, timeout=30)
soup = BeautifulSoup(html.text, "lxml")

current_price = soup.select_one(".AHmHk .fxKbKc").text
print(current_price)

# ₹1,860.50

Code and full example to scrape the current price and the right-panel data:

from bs4 import BeautifulSoup
import requests, lxml, json
from itertools import zip_longest


def scrape_google_finance(ticker: str):
    # https://docs.python-requests.org/en/master/user/quickstart/#custom-headers
    # https://www.whatismybrowser.com/detect/what-is-my-user-agent
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.60 Safari/537.36",
        }

    html = requests.get(f"https://www.google.com/finance/quote/{ticker}", headers=headers, timeout=30)
    soup = BeautifulSoup(html.text, "lxml")
    
    ticker_data = {"right_panel_data": {},
                    "ticker_info": {}}
    
    ticker_data["ticker_info"]["title"] = soup.select_one(".zzDege").text
    ticker_data["ticker_info"]["current_price"] = soup.select_one(".AHmHk .fxKbKc").text
    
    right_panel_keys = soup.select(".gyFHrc .mfs7Fc")
    right_panel_values = soup.select(".gyFHrc .P6K39c")
    
    for key, value in zip_longest(right_panel_keys, right_panel_values):
        key_value = key.text.lower().replace(" ", "_")

        ticker_data["right_panel_data"][key_value] = value.text
    
    return ticker_data
    

data = scrape_google_finance(ticker="INFY:NSE")

# ensure_ascii=False to display Indian Rupee ₹ symbol
print(json.dumps(data, indent=2, ensure_ascii=False))
print(data["right_panel_data"].get("ceo"))

Outputs:

{
  "right_panel_data": {
    "previous_close": "₹1,882.95",
    "day_range": "₹1,857.15 - ₹1,889.60",
    "year_range": "₹1,311.30 - ₹1,953.90",
    "market_cap": "7.89T INR",
    "p/e_ratio": "36.60",
    "dividend_yield": "1.61%",
    "primary_exchange": "NSE",
    "ceo": "Salil Parekh",
    "founded": "Jul 2, 1981",
    "headquarters": "Bengaluru, KarnatakaIndia",
    "website": "infosys.com",
    "employees": "292,067"
  },
  "ticker_info": {
    "title": "Infosys Ltd",
    "current_price": "₹1,860.50"
  }
}
Salil Parekh

If you want to scrape more data with a line-by-line explanation, there's a Scrape Google Finance Ticker Quote Data in Python blog post of mine.

Dmitriy Zub