
I am working with Google documents that contain hundreds of empty paragraphs. I want to remove these blank lines automatically.

In LibreOffice Writer you can use the Find & Replace tool to replace ^$ with nothing, but that didn't work in Google Docs.

My search for ^$ or ^\s*$ returned 0 results, even though there should be 3 matches.

How can I remove the blank paragraphs with Google Apps Script?

I already tried body.findText("^$"), but that returns null:

function removeBlankParagraphs(doc) {
  var body = doc.getBody();
  // returns null, even though the document contains empty paragraphs
  var result = body.findText("^$");
}
Rens Jaspers

2 Answers


I think Docs requires the body to end with a paragraph, so the final one cannot be removed, but otherwise this seems to work.

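function myFunction() {
  var body = DocumentApp.getActiveDocument().getBody();
  var paras = body.getParagraphs();

  // Stop before the final paragraph: the body has to keep its last one.
  for (var i = 0; i < paras.length - 1; i++) {
    // Note: a paragraph that holds only an image also reports empty text.
    if (paras[i].getText() === "") {
      paras[i].removeFromParent();
    }
  }
}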
Tom Woodward
  • There is one issue: the script removes all images from the document, because it treats them as empty paragraphs. Here is a workaround: `function myFunction() { var body = DocumentApp.getActiveDocument().getBody(); var paras = body.getParagraphs(); for (var i = 0; i < paras.length; i++) { if (paras[i].getText() === "") { if (paras[i].findElement(DocumentApp.ElementType.INLINE_IMAGE, null) === null) { paras[i].removeFromParent(); } } } }` – apmouse Jun 21 '17 at 04:53
  • @apmouse, your workaround seems relevant enough to be moved into its own answer... – Giuseppe Apr 28 '18 at 06:15

Adding to Tom's answer and apmouse's comment, here's a revised solution that: 1) prevents removing paragraphs consisting of images or horizontal rules; 2) also removes paragraphs that only contain whitespace.

function removeEmptyParagraphs() {
  var pars = DocumentApp.getActiveDocument().getBody().getParagraphs();
  // for each paragraph in the active document...
  pars.forEach(function(e) {
    // does the paragraph contain an image or a horizontal rule?
    // (you may want to add other element types to this check)
    var no_img = e.findElement(DocumentApp.ElementType.INLINE_IMAGE) === null;
    var no_rul = e.findElement(DocumentApp.ElementType.HORIZONTAL_RULE) === null;
    // proceed if it only has text
    if (no_img && no_rul) {
      // clean up paragraphs that only contain whitespace
      e.replaceText("^\\s+$", "");
      // remove blank paragraphs
      if (e.getText() === "") {
        e.removeFromParent();
      }
    }
  });
}
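
As a rough sketch of the same idea, the blank check can be pulled into a small pure helper and the loop can stop before the final paragraph. The names isBlankText and removeBlankParagraphsSafely are made up for illustration, and skipping the last entry rests on the assumption that Docs refuses to remove the last paragraph of the body; this has not been tested against documents with tables or other nested content.

```javascript
// Hypothetical helper: true when the text is empty or whitespace only.
function isBlankText(text) {
  return /^\s*$/.test(text);
}

// Hypothetical variant of removeEmptyParagraphs; a sketch, not a drop-in fix.
function removeBlankParagraphsSafely() {
  var body = DocumentApp.getActiveDocument().getBody();
  var pars = body.getParagraphs();
  // Start at the second-to-last paragraph and walk backwards.
  // Assumption: the final paragraph of the body cannot be removed.
  for (var i = pars.length - 2; i >= 0; i--) {
    var p = pars[i];
    var hasImage = p.findElement(DocumentApp.ElementType.INLINE_IMAGE) !== null;
    var hasRule = p.findElement(DocumentApp.ElementType.HORIZONTAL_RULE) !== null;
    // Only text-only paragraphs whose text is blank are removed.
    if (!hasImage && !hasRule && isBlankText(p.getText())) {
      p.removeFromParent();
    }
  }
}
```

Unlike the replaceText call above, the helper only inspects the text, so paragraphs that are kept are left unmodified.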
Giuseppe
  • Works, but it also deletes the blank lines used as spacing on non-blank pages, so you need to add that whitespace back after running the script. For example, in a CV there is normally whitespace between sections. – Brian Var Apr 27 '19 at 18:13
  • If you are making a structured document, section space should be controlled by the definition of Section Header. – Sherwood Botsford Feb 13 '20 at 15:34
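
Regarding the comment about lost spacing between sections: one possible compromise is to collapse each run of blank paragraphs down to a single one instead of removing them all. The sketch below is only an illustration; extraBlankIndices and collapseBlankParagraphs are hypothetical names, and it assumes, as above, that the final paragraph of the body has to stay.

```javascript
// Hypothetical helper: given paragraph texts in document order, return the
// indices of blank paragraphs that directly follow another blank paragraph.
function extraBlankIndices(texts) {
  var indices = [];
  var previousBlank = false;
  texts.forEach(function (text, i) {
    var blank = /^\s*$/.test(text);
    if (blank && previousBlank) {
      indices.push(i);
    }
    previousBlank = blank;
  });
  return indices;
}

// Hypothetical Apps Script wrapper around the helper.
function collapseBlankParagraphs() {
  var pars = DocumentApp.getActiveDocument().getBody().getParagraphs();
  var texts = pars.map(function (p) {
    var hasImage = p.findElement(DocumentApp.ElementType.INLINE_IMAGE) !== null;
    var hasRule = p.findElement(DocumentApp.ElementType.HORIZONTAL_RULE) !== null;
    // Use a placeholder so image and rule paragraphs never count as blank.
    return (hasImage || hasRule) ? "[non-text]" : p.getText();
  });
  extraBlankIndices(texts).forEach(function (i) {
    // Assumption: the last paragraph of the body cannot be removed.
    if (i < pars.length - 1) {
      pars[i].removeFromParent();
    }
  });
}
```

With this approach a CV that uses one blank line between sections would keep that line, while longer runs of empty paragraphs shrink to one.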