
I am trying to convert a simple HTML string with basic formatting, like this:

<b>my html string</b><br/>
second line of my html string... etc

to XHTML first, then load it into a DOMDocument and convert it to XSL-FO using the html2fo.xsl transformation stylesheet.

The trouble is that every special character in the HTML string is entity-encoded, and when I try to load it into a DOMDocument I get this error:

DOMDocument::loadXML() [<a href='domdocument.loadxml'>domdocument.loadxml</a>]: Entity 'eacute' not defined in Entity, line: 7

I currently use the tidy library to convert the HTML to XHTML, and then a PHP XSLT processor to produce the final XSL-FO file. The trouble is that the error occurs even when I pass the LIBXML_NOENT option.

private static $tidyConfig      = array (
    'force-output'      => true,
    'clean'             => false,
    'output-xhtml'      => true,
    'show-body-only'    => false,
    'wrap'              => 0,
    'doctype'           => 'omit'
);

$xp         = new XSLTProcessor();
$xmlDoc     = new Mv_Dom_Document();
$dirXslt        = $GLOBALS['G_config']['XSLT_STYLESHEETS'];
$aXsltSS        = GestionFichiers::getContenuRep($dirXslt, array(), null);
$tidyConfig     = (!is_null($tidyConfig)) ? $tidyConfig : Mv_Html_Utils::$tidyConfig;
$tidy       = new tidy();
$tidy->parseString($html, $tidyConfig);

// convert the string to XHTML
$tidy->cleanRepair();

// load it into a DOMDocument
$xmlDoc->loadXML($tidy->value, LIBXML_NOENT);
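
To narrow the problem down, here is a minimal sketch that should reproduce the error without tidy or the XSLT step. The sample strings and variable names are made up for illustration and are not part of the real code. It assumes the cause is simply that XML predefines only five entities (`&lt;`, `&gt;`, `&amp;`, `&quot;`, `&apos;`), so a named HTML entity such as `&eacute;` is undeclared unless a DTD defines it, whereas a numeric character reference needs no declaration.

```php
<?php
// Minimal reproduction, independent of tidy and of the XSLT step.
// The sample strings below are hypothetical, chosen only for illustration.
libxml_use_internal_errors(true); // collect libxml errors instead of emitting warnings

$named   = '<p>caf&eacute;</p>'; // named HTML entity, not declared anywhere
$numeric = '<p>caf&#233;</p>';   // numeric character reference for the same character

$doc = new DOMDocument();

// Named entity: no DTD declares it, so the parser rejects it,
// with or without LIBXML_NOENT.
$okNamed    = $doc->loadXML($named, LIBXML_NOENT);
$errors     = libxml_get_errors();
$firstError = $errors ? trim($errors[0]->message) : '';
libxml_clear_errors();

// Numeric reference: needs no entity declaration.
$okNumeric = $doc->loadXML($numeric);
$text      = $okNumeric ? $doc->documentElement->textContent : null;

var_dump($okNamed, $firstError, $okNumeric, $text);
```

If that assumption holds, the first call fails with the same "Entity 'eacute' not defined" message and the second one loads, which would suggest the named entities in the tidy output are the issue rather than the flags passed to `loadXML()`.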
hakre
kitensei
  • possible duplicate of [DOMDocument appendXML with special characters](http://stackoverflow.com/questions/4645738/domdocument-appendxml-with-special-characters) – Gordon Jan 12 '11 at 10:38
  • possible duplicate of [php: using DomDocument whenever I try to write UTF-8 it writes the hexadecimal notation of it.](http://stackoverflow.com/questions/3575109/php-using-domdocument-whenever-i-try-to-write-utf-8-it-writes-the-hexadecimal-no/3575326) – Gordon Jan 12 '11 at 10:40
  • Never got the `LIBXML_NOENT` constant working that way, I don't find any information in the manual that would suggest this usage and you also do not even explain what you assume it does and why you use it. To fight errors? Out of the blue? Like fighting windmills? Use `loadHTML` instead if you guess around with that error. – hakre May 03 '13 at 12:15
  • possible duplicate of [PHP DOMDocument error Entity 'nbsp' not defined](http://stackoverflow.com/questions/9760208/php-domdocument-error-entity-nbsp-not-defined) – hakre May 03 '13 at 12:17
  • possible duplicate of [inserting ô into mysql database is part of Rhône results in Rh](http://stackoverflow.com/questions/6975451/inserting-%c3%b4-into-mysql-database-is-part-of-rh%c3%b4ne-results-in-rh) – Paul Sweatte May 25 '15 at 17:16

0 Answers