I am trying to convert a simple HTML string with basic formatting, like this:
<b>my html string</b><br/>
second line of my html string... etc
to XHTML first, then load it into a DOMDocument and convert it to XSL-FO using the html2fo.xsl transformation stylesheet.
The trouble is that every special character in the HTML string is entity-encoded, and when I try to load it into a DOMDocument I get this error:
DOMDocument::loadXML(): Entity 'eacute' not defined in Entity, line: 7
I currently use the tidy library to convert the HTML to XHTML, and then a PHP XSLT processor to produce the final XSL-FO file. The trouble is that the error occurs even when I pass the LIBXML_NOENT option.
private static $tidyConfig = array(
    'force-output'   => true,
    'clean'          => false,
    'output-xhtml'   => true,
    'show-body-only' => false,
    'wrap'           => 0,
    'doctype'        => 'omit'
);
$xp = new XSLTProcessor();
$xmlDoc = new Mv_Dom_Document();
$dirXslt = $GLOBALS['G_config']['XSLT_STYLESHEETS'];
$aXsltSS = GestionFichiers::getContenuRep($dirXslt, array(), null);
$tidyConfig = (!is_null($tidyConfig)) ? $tidyConfig : Mv_Html_Utils::$tidyConfig;
$tidy = new tidy();
$tidy->parseString($html, $tidyConfig);
// convert the string to XHTML
$tidy->cleanRepair();
// load it into a DOMDocument
$xmlDoc->loadXML($tidy->value, LIBXML_NOENT);
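For context on why LIBXML_NOENT may not help here: that option asks libxml to substitute entities, but an entity such as `&eacute;` still has to be declared somewhere, and with `'doctype' => 'omit'` the document carries no DTD that declares it. One possible workaround, sketched below under assumptions, is to replace named HTML entities with their literal UTF-8 characters before calling `loadXML()`. The helper name `namedEntitiesToUtf8` and the sample string are hypothetical, and the sketch assumes the XHTML string is UTF-8 encoded.

```php
<?php
// Hypothetical helper (not part of the original code): replaces named HTML
// entities with literal UTF-8 characters so an XML parser without a DTD
// can read the string. Assumes the input is UTF-8.
function namedEntitiesToUtf8(string $xhtml): string
{
    // The five entities predefined by XML must stay escaped.
    $xmlPredefined = array('amp', 'lt', 'gt', 'quot', 'apos');

    return preg_replace_callback(
        '/&([a-zA-Z][a-zA-Z0-9]*);/',
        function (array $m) use ($xmlPredefined) {
            if (in_array($m[1], $xmlPredefined, true)) {
                return $m[0];
            }
            // Returns the input unchanged when the entity name is unknown.
            return html_entity_decode($m[0], ENT_QUOTES | ENT_HTML5, 'UTF-8');
        },
        $xhtml
    );
}

// Sample input standing in for the tidy output.
$sample = '<p>caf&eacute; &amp; th&eacute;</p>';

$doc = new DOMDocument();
$loaded = $doc->loadXML(namedEntitiesToUtf8($sample));
```

In the code above, this would mean passing `namedEntitiesToUtf8($tidy->value)` to `loadXML()` instead of `$tidy->value`. Tidy also documents a `numeric-entities` option that may achieve a similar result at the conversion step; that route is not tested here.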