I am using the Jsoup library.

After executing the following code:

Document doc = new Document(language);

File input = new File("filePath" + "filename.html");
PrintWriter writer = new PrintWriter(input, "UTF-8");

String contentType = "<%@ page contentType=\"text/html; charset=UTF-8\" %>";
doc.appendText(contentType);

writer.write(doc.toString());
writer.flush();
writer.close();

In the output HTML file I get the following line of text:

&lt;%@ page contentType=&quot;text/html; charset=UTF-8&quot; %&gt;

instead of

<%@ page contentType="text/html; charset=UTF-8" %>

What could be the problem?

Dan
It's not quite clear what you want that code to do. Could you include the rest of the code as well? – JonasCz Apr 24 '15 at 17:51

2 Answers

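Those are HTML character entities. appendText adds your string as a text node, so Jsoup escapes the special characters on output to prevent the browser from treating them as HTML tags. It's not a problem: the text will render correctly when you open the page in a browser.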
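
To illustrate the escaping described above, here is a minimal sketch in plain Java. The class EscapeSketch and the helper escapeHtml are hypothetical names made up for this example. This is not Jsoup's implementation, only a hand-written stand-in that maps the characters that were escaped in the question's output.

```java
class EscapeSketch {
    // Hypothetical helper (not a Jsoup API): replaces the characters that
    // have special meaning in HTML with their character entities.
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String directive = "<%@ page contentType=\"text/html; charset=UTF-8\" %>";
        // prints: &lt;%@ page contentType=&quot;text/html; charset=UTF-8&quot; %&gt;
        System.out.println(escapeHtml(directive));
    }
}
```

A browser turns these entities back into the literal characters when it displays the page, which is why the text looks right on screen even though the file contains the escaped form.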

Aswin

Some problems here:

Document doc = new Document(language);

Don't do this. The Document constructor expects a base URI, not a language. Use Jsoup.parse(...) instead.

<%@ page contentType="text/html; charset=UTF-8" %>

This is a JSP directive, not HTML, so it will probably not be parsed correctly.

Now, for your problem. You should use something like the following, making sure the charset passed to the parser matches the one used to encode the bytes:

Document document = Jsoup.parse(new ByteArrayInputStream(myHtmlString.getBytes(StandardCharsets.UTF_8)), "UTF-8", baseUrl);

Check the Jsoup documentation for Document.OutputSettings, which you may need.
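
Since the JSP directive is not HTML, one possible workaround is to keep it out of the Jsoup document and write it to the file yourself, ahead of the serialized markup. The sketch below uses only the standard library. JspPageWriter and writeJspPage are hypothetical names, and the html string is a stand-in for whatever your Jsoup document produces (for example via doc.outerHtml()).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class JspPageWriter {
    // Hypothetical helper: writes the raw directive first, then the HTML,
    // so the directive never passes through an HTML serializer.
    static void writeJspPage(Path target, String directive, String html) throws IOException {
        String page = directive + "\n" + html;
        Files.write(target, page.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        String directive = "<%@ page contentType=\"text/html; charset=UTF-8\" %>";
        // Assumption: stand-in for the markup Jsoup would produce, e.g. doc.outerHtml().
        String html = "<html><head></head><body></body></html>";
        Path target = Files.createTempFile("page", ".jsp");
        writeJspPage(target, directive, html);
        System.out.println(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
    }
}
```

Because the directive is written as a plain string, its angle brackets and quotes reach the file unescaped.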

JonasCz