When reading about discontinuous Galerkin methods, one often encounters the argument that these methods achieve higher-order accuracy while maintaining a compact stencil (a cell communicates only with its direct neighbors), and that this is beneficial for parallel computations.
I can understand why a wider stencil would be bad for parallelization with domain decomposition: it would require more than one layer of overlap and thereby increase the communication cost. But how big is this penalty in practice?
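To make the question concrete, here is a hypothetical back-of-the-envelope sketch (not taken from any answer in the thread): for a cubic subdomain of `n × n × n` cells, a stencil requiring `g` ghost layers must exchange roughly `(n + 2g)^3 - n^3` cells per step, so one can compare a one-layer halo against a two-layer one.

```python
def halo_cells(n: int, g: int) -> int:
    """Ghost cells surrounding an n^3 cubic subdomain with g halo layers.

    This counts every cell in the padded (n + 2g)^3 block that lies
    outside the owned n^3 interior -- a crude proxy for the data volume
    exchanged with neighbors each time step.
    """
    return (n + 2 * g) ** 3 - n ** 3

# Compare communication volume for 1 vs. 2 ghost layers at a few
# subdomain sizes (purely illustrative numbers, not measured data).
for n in (16, 32, 64):
    one = halo_cells(n, 1)
    two = halo_cells(n, 2)
    print(f"n={n:3d}: 1 layer -> {one:7d} cells, "
          f"2 layers -> {two:7d} cells ({two / one:.2f}x more data)")
```

Note that this only counts data volume; in practice the penalty also depends on message latency, the number of neighbors contacted, and how well communication overlaps with computation, which is exactly what the question is asking about.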
As you can see, the answer to that question is fairly contested, even within the answer thread.
– Jesse Chan Mar 23 '15 at 17:34