
I have occasionally encountered a very mysterious problem that really bugs me and is also kind of dangerous if one creates a large document. However, although I have tried very hard, I'm not able to create a MWE. [But see updates below!]

The general problem is that sometimes the end of a line simply disappears in my PDF output. I'll show one example here for reference, hoping that someone has an idea what the cause might be. This is an excerpt from the second page of a chapter of the book I'm working on and it looks like this if I only compile this chapter with \includeonly:

[Image: with text]

However, if I comment out the \includeonly line and change nothing else, it suddenly looks like this:

[Image: without text]

If I only make a minimal change, like changing one word or the spacing, the problem disappears.

My first suspect was the microtype package, but if I use it with the draft option, the problem persists. The font is from the libertinus package, but I've had this problem with other fonts as well. FWIW, I'm using pdfTeX from MiKTeX on a Windows machine. The whole document consists of more than 100 files with lots of packages and tweaks. As I said, I've desperately tried to create a MWE, but to no avail.

Any ideas?

Added because of question in comments: FWIW, the PDF viewer is not the problem. I usually use SumatraPDF, but with Adobe Acrobat or PDF Annotator, the text is also invisible.

Update (2022-08-17): If some expert wants to dig into this, information on a way to reproduce the problem can be found at http://weitz.de/files/problem_old.zip. (Look at the README file and the comments below.)

Update (2022-09-07): I just came across exactly the same problem again and this time I made sure to create a MWE with which you should be able to reproduce the problem with the latest MiKTeX version (as of today). It is at http://weitz.de/files/problem.zip.

Additional information. In my source file I have Erst im 20.\ Jahrhundert and the problem of the vanishing text goes away if I remove the backslash and make it Erst im 20. Jahrhundert.

Update (2022-09-09): As some people seem to be offended by the files being encoded as Latin-1, I have now changed them to UTF-8. The problem is still there and the MWE is still at http://weitz.de/files/problem.zip.

FWIW, I can reproduce the problem with MiKTeX as described, but not with TeX Live (pdfTeX, Version 3.14159265-2.6-1.40.20) on Ubuntu/WSL.

Update (2024-04-06): For further information see here and here.

Frunobulax
  • As if two bytes are being interpreted as %? Like some stray byte got copied in, maybe. Does re-typing that line help? – Cicada Aug 17 '22 at 15:56
  • Definitely a mysterious one. I doubt it's exactly what Cicada suggests, since it looks like the space for the missing text is still being assigned; it's more as if the text is there in the output but not visible, so isn't being commented out. It might be good to rule out the possibility that the fault is on the side of the PDF software. Have you tried it in multiple viewers? Are you able to make the PDF available, or even just this page, for others to examine? – frabjous Aug 17 '22 at 16:03
  • @Cicada Retyping doesn't help. And a stray byte wouldn't explain why it works with \includeonly, would it? – Frunobulax Aug 17 '22 at 16:11
  • Yes, like invisible font - 5c00 (+next half glyph) goes to 尀, which won't be in a roman font. - I was thinking something like a ZWNJ got in. PDF itself will help. – Cicada Aug 17 '22 at 16:12
  • Can you select the part that is invisible in the PDF? If yes, can you copy and paste it? If yes, what did you copy? – Jasper Habicht Aug 17 '22 at 16:15
  • @frabjous It's not the PDF software, I've added that to the question. I have put the offending page here, but I can't provide the whole PDF because it's a book project that's supposed to be published next year. – Frunobulax Aug 17 '22 at 16:17
  • @JasperHabicht No, if I move the mouse over the empty part, the cursor changes and I can't select anything. Looks like the PDF viewer really thinks there is nothing (as opposed to invisible text). – Frunobulax Aug 17 '22 at 16:18
  • This reminds me of https://tex.stackexchange.com/q/422089/107497. We weren't able to solve that one either, but some of the discussion may give some ideas. – Teepeemm Aug 17 '22 at 16:25
  • There is a large shift to the left in the pdf: [(Erst)]TJ -32626.8 0 Td [(im)-250(20.) which means that your line is outside the media box. – Ulrike Fischer Aug 17 '22 at 16:30
  • @UlrikeFischer Thanks, but I'm afraid I don't know what that means. Does it mean some of my settings are wrong? Maybe the page geometry? – Frunobulax Aug 17 '22 at 16:39
  • No, it looks like an engine bug. But without an example it will be impossible to track down. – Ulrike Fischer Aug 17 '22 at 16:45
  • which pdftex version are you using? – Robert Aug 17 '22 at 17:44
  • tl;dr we know almost-certainly that it's an engine bug, but there's no way we can help without an example file. Everyone who can produce the bug doesn't want to share the text, and the bug is very rare to begin with. ■ It's clearly exactly the same problem as the linked question though. ■ one option is to privately contact Ulrike or someone else in the LaTeX3 team, if you trust them and they have free time. – user202729 Aug 17 '22 at 18:22
  • Side note, when you create a large document you only need to proofread one last time before publishing, no need to proofread on every edit... – user202729 Aug 17 '22 at 18:22
  • @user202729 Yes, that's technically correct, but it's not how it really works. I have already published four books and the thing is that you will almost certainly find some typo or some minor issue you'd like to fix once everything is "ready" and checked. This kind of bug means that you'll have to go through all 300 pages or more again and it's not that easy to spot half a missing line. FWIW, I've spent the last hour trying to boil this down to a small example and it seems I'll succeed. – Frunobulax Aug 17 '22 at 18:36
  • At least if some information is known about the bug (very large translation in the generated PDF) it might be possible to write some script to automatically determine if the bug happens in the generated PDF. // Alternatively maybe compiling the compiler with integer overflow checking helps, but compiling an engine from source is not easy. – user202729 Aug 17 '22 at 19:38
  • Damnit! I spent hours to generate a minimal example and when I finally had it I updated MiKTeX just to be sure and now the bug is no longer there. (The last update was a few months ago.) The problem is I'm not convinced the bug was fixed in the meantime. When trying to generate a MWE, the bug was sensitive to pretty much everything that seemed totally unrelated - change a word here, remove a library there, whatever. The update might have caused a minuscule change somewhere without removing the underlying cause. But, whatever, I can't reproduce it anymore. – Frunobulax Aug 17 '22 at 19:56
  • If someone is interested in a test case to reproduce the bug with what was in MiKTeX at the beginning of 2022, let me know. – Frunobulax Aug 17 '22 at 19:56
  • Well, add it to your question. If the bug reappears it could be helpful to have a general idea of the context. – Ulrike Fischer Aug 17 '22 at 19:58
  • @UlrikeFischer If it were a matter of typing a few lines, I would have already done that. I managed to reduce it from more than 100 files to about half a dozen most of which are essentially empty now but need to be in there in order for the bug to happen. I can put a ZIP archive somewhere, but I can't just edit the question. – Frunobulax Aug 17 '22 at 20:07
  • I'm not sure, since I'm not a MiKTeX user and never tried this, but does the update process create logs of what was changed from which version to which other version? With that log you could create a point in time of the MiKTeX repositories and together with your ZIP-file people who have the time might be able to further track it down. – Skillmon Aug 17 '22 at 20:28
  • @Skillmon That was a good idea and there is actually such a log file. I have now put everything into one (pretty small) ZIP file and updated the question with the URL. – Frunobulax Aug 17 '22 at 20:50
  • Testing your 2022-09-07 example with TL22 on Linux and several PDF viewers. The rendered output looks normal to me. – AlexG Sep 07 '22 at 11:08
  • @AlexG Hmm. You specifically looked at the line starting with "Erst"? – Frunobulax Sep 07 '22 at 11:11
  • @AlexG Well, it certainly is a very strange bug that only surfaces with a specific combination of libs and inputs. That's why I couldn't reproduce it after the August MiKTeX update. But for me, it is now reproducible given the setup described in my MWE. – Frunobulax Sep 07 '22 at 11:14
  • First line from the log, if that matters: This is pdfTeX, Version 3.141592653-2.6-1.40.24 (TeX Live 2022) (preloaded format=pdflatex 2022.9.4), and TL packages as of today. – AlexG Sep 07 '22 at 11:14
  • I thought the luainputenc package is only for use with lualatex not pdflatex? Your sample document is using it. – Herb Schulz Sep 07 '22 at 12:10
  • If it helps: compiled OK with TL2020/Windows lualatex and xelatex, but when I tried to align everything to utf-8, got � for all the accented characters and then ¿½, and lots of "String contains an invalid utf-8 sequence."s. – Cicada Sep 07 '22 at 12:21
  • @HerbSchulz I read somewhere that luainputenc can be used for both and it obviously works. But I think that's orthogonal to the problem at hand. – Frunobulax Sep 07 '22 at 14:42
  • From the docs: "luainputenc automatically loads inputenc if called with an old engine." – Frunobulax Sep 07 '22 at 14:45
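
Following up on the suggestion in the comments that a script could flag the very large translation seen in the generated PDF: here is a minimal Python sketch, not a tested tool. It assumes the page content streams are already available as uncompressed text (for instance by compiling with \pdfcompresslevel=0), and it only looks at Td/TD operators like the one quoted above. The function name find_large_shifts and the limit of 2000 units are hypothetical choices made for illustration.

```python
import re

# A number as it appears in a PDF content stream, e.g. 12, -32626.8, .5
_NUM = r"[-+]?(?:\d+\.?\d*|\.\d+)"
# Two numeric operands followed by the Td or TD text-positioning operator.
_TD = re.compile(rf"({_NUM})\s+({_NUM})\s+T[dD](?![A-Za-z*])")


def find_large_shifts(content, limit=2000.0):
    """Return (tx, ty) for every Td/TD whose offset exceeds `limit` in magnitude.

    `content` is the text of an uncompressed content stream; `limit` is an
    assumed threshold, chosen to be larger than any ordinary page dimension.
    """
    hits = []
    for m in _TD.finditer(content):
        tx, ty = float(m.group(1)), float(m.group(2))
        if abs(tx) > limit or abs(ty) > limit:
            hits.append((tx, ty))
    return hits


if __name__ == "__main__":
    # The fragment quoted in the comments above.
    sample = "[(Erst)]TJ -32626.8 0 Td [(im)-250(20.)]TJ"
    print(find_large_shifts(sample))
```

Run over every page's content stream after each compile, a non-empty result would point to a line worth inspecting by eye. It would not catch text displaced by other operators (such as Tm), so an empty result proves nothing.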
