0

I want to get all the "a" elements whose href attribute has this form: http(s)://any.example.com, where "any" is a string containing only letters and/or numbers. I'm new to regex and XPath, so I can't get it right. I came up with this regex, but I'm not sure it's 100% correct:

/(http|https)://+[A-Za-z0-9]+\.example+\.+com/

So the XPath would look like this:

document.evaluate( "//a[@href='/(http|https)://+[A-Za-z0-9]+\.google+\.+com/']" , document , null , XPathResult.ORDERED_NODE_SNAPSHOT_TYPE , null );

but it doesn't work.

I would appreciate it if someone could help me.

sideshowbarker
Iulian Onofrei

2 Answers

1

As of today, browsers do not appear to support XPath 2.0, and applying a regex to attributes is only supported from XPath 2.0 onward.

Instead, select the candidate elements with XPath 1.0 (no regex), then iterate over them and filter further with a regex in JavaScript.
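
The two-step approach described above could be sketched roughly as follows. This is only an illustration under some assumptions: the helper names filterByHref and findExampleLinks are made up for this example, the XPath 1.0 contains() call is just a cheap pre-filter, and the pattern assumes an href may optionally continue with a path after the host.

```javascript
// Anchored pattern (an assumption, not from the question): http or https,
// one alphanumeric subdomain label, then .example.com, then end or a path.
const HREF_PATTERN = /^https?:\/\/[A-Za-z0-9]+\.example\.com(?:\/|$)/;

// Hypothetical helper: keep only the nodes whose raw href attribute matches.
function filterByHref(nodes, pattern = HREF_PATTERN) {
  return nodes.filter((node) => pattern.test(node.getAttribute("href") || ""));
}

// Browser-only part: collect candidates with XPath 1.0, then filter in JS.
function findExampleLinks(doc) {
  const snapshot = doc.evaluate(
    "//a[contains(@href, '.example.com')]", // XPath 1.0 only, no regex
    doc,
    null,
    XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
    null
  );
  const candidates = [];
  for (let i = 0; i < snapshot.snapshotLength; i++) {
    candidates.push(snapshot.snapshotItem(i));
  }
  return filterByHref(candidates);
}
```

In a page, this would be called as findExampleLinks(document). Note that getAttribute("href") returns the attribute as written in the markup, which is what @href sees in XPath, whereas the a.href property returns the resolved URL.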

References:

  1. https://stackoverflow.com/a/21405499/211794
  2. https://stackoverflow.com/a/6282877/211794
  3. https://developer.mozilla.org/en-US/docs/Web/API/Document/evaluate#Browser_compatibility
Community
Ashok Koyi
0

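Your regex looks a little off and is overly complex: the + after the slash, after "example" and after \. lets those characters repeat, and (http|https) can be written as https?.
Try this: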

https?://[A-Za-z0-9]+\.example\.com/
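
To see why the stray + quantifiers matter, here is a small check in JavaScript. It is a sketch, not part of either pattern above: the variable names are made up, the slashes are escaped because they sit inside a regex literal, and the ^ and $ anchors in the second pattern are an added assumption so that the whole href has to match.

```javascript
// The question's pattern, written as a JS literal (slashes escaped).
const original = /(http|https):\/\/+[A-Za-z0-9]+\.example+\.+com/;

// Simplified pattern with anchors added (the anchors are an assumption).
const anchored = /^https?:\/\/[A-Za-z0-9]+\.example\.com$/;

// Extra slashes, a stretched "example" and repeated dots all slip through.
const sloppy = "http:////foo.exampleee...com";
console.log(original.test(sloppy)); // true
console.log(anchored.test(sloppy)); // false

// Without anchors, the pattern also matches inside a longer URL.
const embedded = "http://evil.com/?u=http://a.example.com";
console.log(original.test(embedded)); // true
console.log(anchored.test(embedded)); // false

console.log(anchored.test("https://foo1.example.com")); // true
```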
Bohemian