Extract the content of an HTML page in PHP

Question

Is there any way to extract the content of an HTML page that starts at <body> and ends at </body> in PHP? If so, could anyone post some sample code?

possible duplicate of [How to parse and process HTML with PHP?](http://stackoverflow.com/questions/3577641/how-to-parse-and-process-html-with-php) — CodeCaster, Jan 16 '12 at 10:12

Cyclonecode · Answer 1 · 2015-02-19T12:18:27.967

You should have a look at the DOMDocument reference.

This example loads an HTML document into a DOMDocument and retrieves the body element:

// Collect libxml parse warnings internally instead of emitting them
libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTMLFile('http://example.com');
libxml_use_internal_errors(false);

// First (and normally only) <body> element
$body = $dom->getElementsByTagName('body')->item(0);

echo $body->textContent; // prints the text content of the body, without markup
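
The example above prints only the text of the body. If you want the markup between <body> and </body>, as the question asks, one option is to serialize each child node of the body with DOMDocument::saveHTML. This is a minimal sketch, assuming the HTML is already in a string; the function name bodyInnerHtml and the sample markup are made up for illustration:

```php
<?php
// Hypothetical helper (not part of the original answer): returns the markup
// inside <body>, or null when no body element is found.
function bodyInnerHtml(string $html): ?string
{
    // Collect parse warnings internally so sloppy markup does not emit notices
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return null;
    }

    $inner = '';
    foreach ($body->childNodes as $child) {
        // saveHTML() with a node argument serializes just that node
        $inner .= $dom->saveHTML($child);
    }
    return $inner;
}

// Sample input, made up for this example
$page = '<html><head><title>t</title></head>'
      . '<body class="x"><p>Hello <b>world</b></p></body></html>';
echo bodyInnerHtml($page);
```

Because the parser normalizes the document, the result is not guaranteed to match the original source byte for byte. loadHTML also assumes ISO-8859-1 unless the page declares its charset, so non-ASCII pages may need extra care.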

You should also check out the following resources:

DOM API Documentation
XPATH language specification

score 1 · Answer 2 · answered Oct 01 '15 at 21:29

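You can also try a non-DOM solution based on string functions:

$html = file_get_contents($url); // $url is the address of the page to fetch
$html = substr($html, stripos($html, '<body>') + 6); // drop everything up to and including <body>
$html = substr($html, 0, strripos($html, '</body>')); // drop the last </body> and everything after it

stripos is the case-insensitive version of strpos, and strripos is the case-insensitive version of strrpos, which finds the last ("rightmost") occurrence. Note that this only matches a bare <body> tag; an opening tag with attributes, such as <body class="home">, will not be found.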
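
Here is a sketch of a variant that also copes with attributes on the opening tag, by searching for <body and then for the next >. The function name bodyBetweenTags is hypothetical. It is still plain string matching, so a > inside an attribute value or a <body inside a comment or script can fool it:

```php
<?php
// Hypothetical helper: returns everything between the opening <body ...> tag
// and the last </body>, or null when either tag is missing.
function bodyBetweenTags(string $html): ?string
{
    // Start of the opening tag, matched case-insensitively
    $open = stripos($html, '<body');
    if ($open === false) {
        return null;
    }

    // End of the opening tag: the first '>' after '<body'
    $start = strpos($html, '>', $open);
    if ($start === false) {
        return null;
    }
    $start++; // first character after the opening tag

    // Last closing tag, matched case-insensitively
    $end = strripos($html, '</body>');
    if ($end === false || $end < $start) {
        return null;
    }

    return substr($html, $start, $end - $start);
}

// Sample input, made up for this example
echo bodyBetweenTags('<html><BODY class="home"><p>Hello</p></Body></html>');
```

Unlike the DOM approach, this returns the source text unchanged, which can be useful when the exact original markup matters.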

Hope that it will help you!

score 1 · Answer 3 · answered Jan 16 '12 at 10:10

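Try PHP Simple HTML DOM Parser:

include 'simple_html_dom.php'; // the library file; adjust the path as needed
$html = file_get_html('http://www.example.com/');
$body = $html->find('body', 0); // index 0 returns the first match instead of an array
echo $body->innertext; // the markup inside <body>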

Naveed
