I am using Selenium with Python to scrape a web page. I'm almost where I want to be, but I ran into a problem that turns out to be bigger than I first thought. The element I am working with is this:

<td class=" ui-datepicker-days-cell-over  ui-datepicker-current-day ui-datepicker-today" 
data-handler="selectDay" data-event="click" data-month="3" data-year="2018">
    <a class="ui-state-default ui-state-highlight ui-state-active" href="#">10
    </a>
</td>

My ultimate goal is to get the 10 between the <a> tags. This is my code so far:

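from selenium import webdriver

option = webdriver.ChromeOptions()
option.add_argument("--incognito")  # Chrome flags start with two hyphens
# Pass the options object, otherwise the incognito flag is never applied
browser = webdriver.Chrome(executable_path=r"chromedriver.exe", options=option)
browser.get(myUrl)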
calendar = browser.find_element_by_xpath(
    '/html/body/main/section/div[2]/div[1]/div[2]/div[3]/div/div[1]/div/div[1]/div[2]')
viewCal = browser.find_element_by_name('choice_set[begin_at]')
viewCal.click()

row = calendar.find_elements_by_tag_name('tr')

column = calendar.find_elements_by_tag_name('td')
numb = column[0].find_element_by_tag_name('a')
numb.text

numb.text returns '' instead of 10.

What am I doing wrong here?

Ratmir Asanov
Chae

5 Answers

Try reading the innerText attribute instead:

numb.get_attribute("innerText")
halfer
Ratmir Asanov

I think your code is not selecting the right WebElements.

I tried the following code with a similar datepicker, and it printed the expected day number.

days = driver.find_elements_by_xpath('//a[@class="ui-state-default"]')
daynumber = days[12].text
print(daynumber)
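
One caveat with the XPath above: `@class="ui-state-default"` is an exact string comparison, so it skips anchors that carry additional classes, such as the highlighted current day in the question (`ui-state-default ui-state-highlight ui-state-active`). A minimal sketch of a token-based match is below. The helper name `class_token_xpath` is hypothetical and not part of Selenium; it only builds the XPath string.

```python
def class_token_xpath(tag, token):
    """Build an XPath matching <tag> elements whose class list contains token.

    Hypothetical helper for illustration; it only returns a string.
    """
    # Pad the normalized class attribute with spaces so that a whole
    # class token is matched, not a substring of a longer class name.
    return (
        "//%s[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]"
        % (tag, token)
    )


xpath = class_token_xpath("a", "ui-state-default")
print(xpath)
```

The resulting string can be passed to `driver.find_elements_by_xpath(xpath)` as in the snippet above, or to `driver.find_elements(By.XPATH, xpath)` on Selenium 4.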
Frank

A good thing to keep in mind:

Both .text and "innerText" only return visible text.

If you want the text of a hidden or invisible element, use "textContent" instead:

get_attribute("textContent")

Source - https://stackoverflow.com/a/43430097/14454151
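
Note that `textContent` returns the raw text of the node, so for markup like the anchor in the question it would likely include the newline and indentation after the `10`. A small sketch of cleaning that up follows. `normalize_text` is a hypothetical helper name, and the sample string is an assumption about what `textContent` would return for that anchor.

```python
def normalize_text(raw):
    """Collapse runs of whitespace and trim both ends.

    Returns "" for None, which get_attribute can return when the
    attribute or property is missing.
    """
    if raw is None:
        return ""
    # str.split() with no arguments splits on any whitespace run
    # and drops leading/trailing whitespace.
    return " ".join(raw.split())


# Assumed raw value for the <a> in the question: "10" followed by
# a newline and the indentation before </a>.
raw_text_content = "10\n    "
day = normalize_text(raw_text_content)
print(day)
```

In a real script the input would come from `element.get_attribute("textContent")`.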

Herker

(Posted answer on behalf of the question author.)

I am not sure why this is so, but I guess I went one level too deep. I skipped the last two steps of my code and finished with column[0].text instead, and that worked! Also, as Ratmir answered, numb.get_attribute("innerText") gives the correct result too.

halfer

Core logic for getting text from a WebElement

  • webElement.text
  • webElement.get_attribute("innerText")
  • webElement.get_attribute("textContent")

Full code:

def getText(curElement):
    """
    Get Selenium element text

    Args:
        curElement (WebElement): selenium web element
    Returns:
        str
    Raises:
    """
    # # for debug
    # elementHtml = curElement.get_attribute("innerHTML")
    # print("elementHtml=%s" % elementHtml)

    elementText = curElement.text  # sometimes returns an empty string

    if not elementText:
        elementText = curElement.get_attribute("innerText")

    if not elementText:
        elementText = curElement.get_attribute("textContent")

    # print("elementText=%s" % elementText)
    return elementText

Call it:

curTitle = getText(h2AElement)
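
The fallback order can also be checked without a browser by passing in a stand-in object. The sketch below restates the fallback in a compact form and uses a hypothetical `FakeElement` class that mimics only the two members the function touches (`text` and `get_attribute`). It is not a Selenium class, and unlike `getText` above, this variant returns an empty string rather than None when nothing is found.

```python
class FakeElement:
    """Stand-in for a WebElement; hypothetical, for illustration only."""

    def __init__(self, text="", attributes=None):
        self.text = text
        self._attributes = attributes or {}

    def get_attribute(self, name):
        # Like Selenium, return None when the attribute is missing.
        return self._attributes.get(name)


def get_text(element):
    """Return the first non-empty value among text, innerText, textContent."""
    value = element.text
    for attribute in ("innerText", "textContent"):
        if value:
            break
        value = element.get_attribute(attribute)
    # Normalize a final None to an empty string.
    return value or ""


visible = FakeElement(text="10")                         # .text works
hidden = FakeElement(attributes={"textContent": "10"})   # only textContent set
empty = FakeElement()                                    # nothing available

print(get_text(visible), get_text(hidden), repr(get_text(empty)))
```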
crifan