5

xurl is a great solution.

Here is a quote from the preamble:

Package xurl loads package url by default and defines possible url breaks for all alphanumerical characters and = / . : * - ~ ' "

How do I make = / . : * - ~ ' " and _ (underscore) the preferred characters at which to break the following URL?

https://www.weibull.com/pubs/2011_RAMS_planning_a_reliability_growth_program_utilizing_historical_data.pdf

It should break inside words only as a last resort.

Minimal code showing the issue:

\documentclass{minimal}
\usepackage{xurl}
\begin{document}
Here is the url \url{https://www.weibull.com/pubs/2011_RAMS_planning_a_reliability_growth_program_utilizing_historical_data.pdf}.

\end{document}

skvery
  • 4
    I don't understand your question. xurl is set up to line-break a URL string at all characters, including _ (underscore) characters. Are you maybe asking how to line-break the URL only at underscore characters? – Mico Jun 10 '20 at 20:53
  • 2
    Did you actually use \url to typeset the URL in question? As Mico says, xurl allows line breaks in \url after all characters, so in particular after _. If that's not what you are seeing, something is up. (All of this URL line breaking relies on TeX knowing it is a URL. So the \url{...} macro is essential.) Note that even with the standard settings of url, a line break is allowed after _, so for this URL you may not even need xurl, url might be enough. – moewe Jun 11 '20 at 03:02
  • @Mico and moewe, see example added. The minimal code example shows that underscores are not used as line-breaks, at least not in Overleaf. – skvery Jun 11 '20 at 11:21
  • 1
    Please clarify: Do you want linebreaks to only occur at a _? – leandriis Jun 11 '20 at 11:39
  • 1
    @skvery - I compiled your program both on Overleaf and on MacTeX2020. In both cases, a line break occurs before "am" in "program". The result looks just fine to me. The likelihood of a line break in this URL string occurring between two letters rather than after an underscore character is very high ex ante, given that only 9 out of the 98 characters of the URL string (not counting the https:// prefix) are underscores. As @moewe and I have been wont to point out to you, the whole point of the xurl package is to let line breaks occur anywhere in the argument of \url. – Mico Jun 11 '20 at 11:42
  • @Mico I expected the break to occur next to an underscore, not in the middle of a word, but I understand now after testing with a dash instead of an underscore. – skvery Jun 11 '20 at 11:57
  • @Mico I also added the preamble from the xurl package and added a sentence to clarify my question. – skvery Jun 11 '20 at 12:03
  • 4
    @skvery - So, basically, you're looking for something completely different from what the xurl package is designed for. You should say so up front. – Mico Jun 11 '20 at 12:12

2 Answers

4

In theory it is possible to allow line breaks after A-Z, a-z, 0-9 in \url in addition to the other possible break points, but with a higher penalty (which means LaTeX will prefer the other break points).

I have my doubts that this will have an effect that is significantly different from just loading xurl, which allows breaks everywhere with the same penalty, though. If a URL takes up a significant portion of the line, chances are there is little space left to shrink and enlarge, which means that TeX has little leeway in deciding where exactly to break the line: Either the URL can be broken pretty much where it meets the margin or not.

Anyway, here is how biblatex does it:

\documentclass{article}
\usepackage{url}
\usepackage{lipsum}

\mathchardef\UrlNotSoGreatBreakPenalty=800

\usepackage{etoolbox}
\begingroup
\def\do#1{%
  \gappto\UrlSpecials{%
    \do#1{%
      \mathchar`#1
      \penalty\UrlNotSoGreatBreakPenalty}}}
\do\A\do\B\do\C\do\D\do\E\do\F\do\G\do\H\do\I\do\J
\do\K\do\L\do\M\do\N\do\O\do\P\do\Q\do\R\do\S\do\T
\do\U\do\V\do\W\do\X\do\Y\do\Z
\do\a\do\b\do\c\do\d\do\e\do\f\do\g\do\h\do\i\do\j
\do\k\do\l\do\m\do\n\do\o\do\p\do\q\do\r\do\s\do\t
\do\u\do\v\do\w\do\x\do\y\do\z
\do\1\do\2\do\3\do\4\do\5\do\6\do\7\do\8\do\9\do\0
\endgroup

\begin{document}
\lipsum[1]

Text \url{https://www.weibull.com/pubs/2011_RAMS_planning_a_reliability_growth_program_utilizing_historical_data.pdf}.

\lipsum[2]
\end{document}

Broken URL

This assigns a penalty of 800 to the not-so-great breaks after A-Z, a-z, 0-9. With the default settings of url, breaks after punctuation have a penalty of 200 and big breaks after : a penalty of 100.
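To see which penalties are actually in force on a given installation, they can simply be printed. The following is a minimal sketch, assuming the url package provides \UrlBreakPenalty and \UrlBigBreakPenalty as named penalties (as its documentation describes); \UrlNotSoGreatBreakPenalty is the name introduced in the example above.

```latex
\documentclass{article}
\usepackage{url}

% Same definition as in the example above.
\mathchardef\UrlNotSoGreatBreakPenalty=800

\begin{document}
% \number works whether the name is a \mathchardef token
% or a macro that expands to digits.
Big breaks: \number\UrlBigBreakPenalty.

Ordinary breaks: \number\UrlBreakPenalty.

Breaks after letters and digits: \number\UrlNotSoGreatBreakPenalty.
\end{document}
```

The lower the number, the more willing TeX is to break at that kind of position.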


You could allow some stretchable space as suggested in Ulrike Fischer's answer to 'Uneven' breaks in long URLs (\url) (which has already been shamelessly stolen by biblatex: https://github.com/plk/biblatex/issues/850, https://github.com/plk/biblatex/pull/886). This may make it possible to allow your URLs to break in nicer places at the cost of some more whitespace between characters.

\documentclass{article}
\usepackage{url}
\usepackage{lipsum}

\mathchardef\UrlNotSoGreatBreakPenalty=800

\newmuskip\urlalnumskip
\setlength{\urlalnumskip}{0mu plus 2mu}

\usepackage{etoolbox}

\begingroup
\def\do#1{%
  \gappto\UrlSpecials{%
    \do#1{%
      \mathchar`#1
      \mskip\urlalnumskip
      \penalty\UrlNotSoGreatBreakPenalty}}}
\do\A\do\B\do\C\do\D\do\E\do\F\do\G\do\H\do\I\do\J
\do\K\do\L\do\M\do\N\do\O\do\P\do\Q\do\R\do\S\do\T
\do\U\do\V\do\W\do\X\do\Y\do\Z
\do\a\do\b\do\c\do\d\do\e\do\f\do\g\do\h\do\i\do\j
\do\k\do\l\do\m\do\n\do\o\do\p\do\q\do\r\do\s\do\t
\do\u\do\v\do\w\do\x\do\y\do\z
\do\1\do\2\do\3\do\4\do\5\do\6\do\7\do\8\do\9\do\0
\endgroup

\begin{document}
\lipsum[1]

Text \url{https://www.weibull.com/pubs/2011_RAMS_planning_a_reliability_growth_program_utilizing_historical_data.pdf}.

\lipsum[2]
\end{document}

Nicely broken, but spaced URL.

Note how the characters in the first line are spaced out quite a bit to make the nicer break possible.

If you play around with the penalty and the stretchable space you will find that the penalty only has a real effect if the stretchable space can give it enough leeway.
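As a starting point for such experiments, the two knobs can be changed in the preamble of the example above. This is only a sketch: the values below are illustrative assumptions, not recommendations, and the lines must come after \newmuskip\urlalnumskip.

```latex
% Replaces the corresponding two settings in the preamble above.
% Both values are arbitrary choices for experimentation.

% Discourage breaks after letters and digits more strongly.
% (A penalty of 10000 or more would forbid these breaks entirely.)
\mathchardef\UrlNotSoGreatBreakPenalty=2000

% Allow a little more stretch after each letter and digit,
% so that TeX has room to reach a better break point.
\setlength{\urlalnumskip}{0mu plus 3mu}
```

Raising the penalty without also adding stretch will probably change little, for the reason given above.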

moewe
3

This creates a \url command that allows breaks at specified characters, here given as /, ., and _.

\documentclass{article}
\usepackage[pass,showframe]{geometry}
\catcode`_=12 
\newcommand{\url}[1]{%
  \begingroup
  \ttfamily
  \begingroup\lccode`~=`/\lowercase{\endgroup\def~}{/\penalty0 }%
  \begingroup\lccode`~=`.\lowercase{\endgroup\def~}{.\penalty0 }%
  \begingroup\lccode`~=`_\lowercase{\endgroup\def~}{_\penalty0 }%
  \catcode`/=\active\catcode`.=\active\catcode`_=\active
  \scantokens{#1\noexpand}%
  \endgroup
}
\catcode`_=8 
\begin{document}
Here is the url \url{https://www.weibull.com/pubs/2011_RAMS_planning_a_reliability_growth_program_utilizing_historical_data.pdf}.
\end{document}
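The same pattern can probably be extended to further characters. The following sketch adds the hyphen as a fourth break character by repeating the \lccode/\lowercase line and the \catcode assignment for -; the example.org URL is a hypothetical placeholder and not part of the original question.

```latex
\documentclass{article}
\catcode`_=12 % treat _ as an ordinary character while \url is defined
\newcommand{\url}[1]{%
  \begingroup
  \ttfamily
  % Each line defines one active character that prints itself
  % and then permits a line break (penalty 0).
  \begingroup\lccode`~=`/\lowercase{\endgroup\def~}{/\penalty0 }%
  \begingroup\lccode`~=`.\lowercase{\endgroup\def~}{.\penalty0 }%
  \begingroup\lccode`~=`_\lowercase{\endgroup\def~}{_\penalty0 }%
  \begingroup\lccode`~=`-\lowercase{\endgroup\def~}{-\penalty0 }% added: hyphen
  \catcode`/=\active\catcode`.=\active\catcode`_=\active
  \catcode`-=\active % added: hyphen
  % Re-read the argument so the new category codes take effect.
  \scantokens{#1\noexpand}%
  \endgroup
}
\catcode`_=8 % restore the usual meaning of _
\begin{document}
Here is the url \url{https://example.org/some-long-path/with_mixed-separators/and_a-rather-long_file-name.pdf}.
\end{document}
```

Because the argument is read before the category codes change, URLs containing characters such as # or % would still need extra care with this approach.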
