I'm stripping the HTML tags from a Java string, and that works fine with the Jsoup parse method below. The only problem is that calling .text() also removes the line breaks ("\n"). I want to keep those but still have the method return a String. Any ideas?

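 import org.jsoup.Jsoup;
 // Parses the HTML and returns only its text; text() normalizes whitespace, so "\n" is lost.
 private static String stripHTML(String html) {
     return Jsoup.parse(html).text();
 }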
c12

1 Answer

In HTML, newlines are no different from spaces (or runs of spaces or tabs), so whatever newlines you pull out won't carry any semantic meaning. <p> and <br />, on the other hand, do mark real breaks in the text.
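
If the breaks you care about are the ones marked by <br> and <p>, one possible way to keep them is to tag those elements with a placeholder before calling text(), then swap the placeholder for "\n" afterwards. The following is only a rough sketch of that idea, not the asker's code or an official Jsoup recipe. The class name HtmlStripper, the method name stripHtmlKeepBreaks, and the marker string are made up for illustration, and it assumes the marker never occurs in the real content.

```java
import java.util.regex.Pattern;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

final class HtmlStripper {

    // Hypothetical placeholder; assumed never to appear in the actual text.
    private static final String BREAK_MARKER = "__LINE_BREAK__";

    // Matches the marker together with any whitespace text() left around it.
    private static final Pattern MARKER_WITH_SPACES =
            Pattern.compile("\\s*" + Pattern.quote(BREAK_MARKER) + "\\s*");

    static String stripHtmlKeepBreaks(String html) {
        Document doc = Jsoup.parse(html);

        // Insert the marker as plain text after every <br> and before every <p>.
        doc.select("br").after(BREAK_MARKER);
        doc.select("p").before(BREAK_MARKER);

        // text() still collapses whitespace, but the marker survives as ordinary text.
        String text = doc.text();

        // Turn each marker into a real newline and drop leading/trailing breaks.
        return MARKER_WITH_SPACES.matcher(text).replaceAll("\n").trim();
    }

    private HtmlStripper() {
    }
}
```

With this sketch, stripHtmlKeepBreaks("Hello<br>World") returns "Hello" and "World" separated by a single "\n". Raw "\n" characters in the HTML source are still treated as ordinary whitespace, for the reason given above.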

David Ehrmann
  • While this is true, see http://stackoverflow.com/a/12580364/14731 or http://stackoverflow.com/q/5640334/14731 if you want to preserve newlines. – Gili Aug 28 '15 at 15:20