module Nokogiri
Nokogiri parses and searches XML and HTML quickly, with correctly implemented CSS3 selector support and XPath 1.0 support.
Parsing a document returns either a Nokogiri::XML::Document or a Nokogiri::HTML4::Document, depending on the kind of document you parse.
Here is an example:
require 'nokogiri'
require 'open-uri'
# Get a Nokogiri::HTML4::Document for the page we’re interested in...
doc = Nokogiri::HTML4(URI.open('http://www.google.com/search?q=tenderlove'))
# Do funky things with it using Nokogiri::XML::Node methods...
####
# Search for nodes by css
doc.css('h3.r a.l').each do |link|
  puts link.content
end
See also:
- Nokogiri::XML::Searchable#css for more information about CSS searching
- Nokogiri::XML::Searchable#xpath for more information about XPath searching
Constants
- HTML: Alias for Nokogiri::HTML4
- JAR_DEPENDENCIES: generated by the :vendor_jars rake task
- NEKO_VERSION
- VERSION: The version of Nokogiri you are using
- XERCES_VERSION
Public Class Methods
# File lib/nokogiri/html.rb, line 10
Parse HTML. Convenience method for Nokogiri::HTML4::Document.parse
# File lib/nokogiri/html4.rb, line 10
def HTML4(input, url = nil, encoding = nil, options = XML::ParseOptions::DEFAULT_HTML, &block)
  Nokogiri::HTML4::Document.parse(input, url, encoding, options, &block)
end
Parse HTML. Convenience method for Nokogiri::HTML4::Document.parse
# File lib/nokogiri/html5.rb, line 30
def self.HTML5(input, url = nil, encoding = nil, **options, &block)
  Nokogiri::HTML5::Document.parse(input, url, encoding, **options, &block)
end
Since v1.12.0
⚠ HTML5 functionality is not available when running JRuby.
Parse an HTML5 document. Convenience method for Nokogiri::HTML5::Document.parse
# File lib/nokogiri.rb, line 83
def Slop(*args, &block)
  Nokogiri(*args, &block).slop!
end
Parse a document and add the Slop decorator. The Slop decorator implements method_missing such that methods may be used instead of CSS or XPath. For example:
doc = Nokogiri::Slop(<<-eohtml)
<html>
<body>
<p>first</p>
<p>second</p>
</body>
</html>
eohtml
assert_equal('second', doc.html.body.p[1].text)
# File lib/nokogiri/xml.rb, line 7
def XML(thing, url = nil, encoding = nil, options = XML::ParseOptions::DEFAULT_XML, &block)
  Nokogiri::XML::Document.parse(thing, url, encoding, options, &block)
end
Parse XML. Convenience method for Nokogiri::XML::Document.parse
# File lib/nokogiri/xslt.rb, line 13
def XSLT(stylesheet, modules = {})
  XSLT.parse(stylesheet, modules)
end
Create a Nokogiri::XSLT::Stylesheet with stylesheet.
Example:
xslt = Nokogiri::XSLT(File.read(ARGV[0]))
# File lib/nokogiri.rb, line 60
def make(input = nil, opts = {}, &blk)
  if input
    Nokogiri::HTML4.fragment(input).children.first
  else
    Nokogiri(&blk)
  end
end
Create a new Nokogiri::XML::DocumentFragment
# File lib/nokogiri.rb, line 42
def parse(string, url = nil, encoding = nil, options = nil)
  if string.respond_to?(:read) ||
      /^\s*<(?:!DOCTYPE\s+)?html[\s>]/i.match?(string[0, 512])
    # Expect an HTML indicator to appear within the first 512
    # characters of a document. (<?xml ?> + <?xml-stylesheet ?>
    # shouldn't be that long)
    Nokogiri.HTML4(string, url, encoding,
      options || XML::ParseOptions::DEFAULT_HTML)
  else
    Nokogiri.XML(string, url, encoding,
      options || XML::ParseOptions::DEFAULT_XML)
  end.tap do |doc|
    yield doc if block_given?
  end
end
Parse an HTML or XML document. string contains the document.
© 2008–2023 by Mike Dalessio, Aaron Patterson, Yoko Harada, Akinori MUSHA, John Shahid,
Karol Bucek, Sam Ruby, Craig Barnes, Stephen Checkoway, Lars Kanis, Sergio Arbeo,
Timothy Elliott, Nobuyoshi Nakada, Charles Nutter, Patrick Mahoney
Licensed under the MIT License.
https://nokogiri.org/rdoc/Nokogiri.html