44

What do double slashes often found in URLs mean?

For example:

  • http://www.example.com/A/B//C/

Please note: I'm not referring to the beginning right after http:.

Stephen Ostermiller
aneuryzm

4 Answers

43

As mentioned by @RandomBen, the double slash is most likely the result of an error somewhere.

That the page loads has nothing to do with the browser; it loads because the server ignores the extra slash. The browser doesn't do anything special with extra slashes in the URL. It just sends them along in the request:

GET /A/B//C/D HTTP/1.1
Host: www.example.com
...
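As a rough illustration that nothing on the client side needs to touch the path, here is a minimal Python sketch using the standard library's urllib.parse.urlsplit. It only shows that one common URL parser leaves the empty segment alone; it is not a model of any particular browser, and request_line is a hypothetical helper name introduced for this example.

```python
from urllib.parse import urlsplit

def request_line(url: str) -> str:
    """Build the HTTP/1.1 request line for `url` (hypothetical helper)."""
    parts = urlsplit(url)
    # The path is taken as-is: urlsplit does not collapse "//" inside it.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return f"GET {target} HTTP/1.1"

print(request_line("http://www.example.com/A/B//C/D"))
# prints: GET /A/B//C/D HTTP/1.1
```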

It looks like current versions of Apache and IIS both ignore the extra slashes while resolving the path and return the document that would have been returned had the URL not had extra slashes. However, browsers (I tested IE 8 and Chrome 9) count the empty segment when resolving relative URLs that contain parent path components (..), so such references to resources in the page end up pointing at the wrong place. For example, if a page has:

<link rel="stylesheet" href="../../style.css" type="text/css" />

Upon loading the page /a/b/c/, the browser will request /a/style.css. But if, for whatever reason, /a/b//c/ is requested (and the server ignores the extra slash), the browser will end up requesting /a/b/style.css, which won't exist. Oops, the page looks ugly.

(This obviously won't happen if the URL doesn't have a parent path component (..) or is absolute.)
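The two outcomes above can be reproduced with a small Python sketch of the reference-resolution steps in RFC 3986 section 5.2 (merge the base path with the reference, then remove dot segments). It is written out by hand so each step is visible, it handles only relative-path references against an absolute base path, and the function names are my own rather than any browser's internals.

```python
def remove_dot_segments(path: str) -> str:
    """Remove "." and ".." segments, following RFC 3986 section 5.2.4."""
    output = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]
            if output:
                output.pop()  # drop the previous segment, even an empty one
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/") to the output.
            start = 1 if path.startswith("/") else 0
            end = path.find("/", start)
            if end == -1:
                output.append(path)
                path = ""
            else:
                output.append(path[:end])
                path = path[end:]
    return "".join(output)

def resolve(base_path: str, relative_ref: str) -> str:
    """Resolve a relative-path reference against an absolute base path."""
    # Merge: keep the base up to and including its last "/".
    merged = base_path[: base_path.rfind("/") + 1] + relative_ref
    return remove_dot_segments(merged)

print(resolve("/a/b/c/", "../../style.css"))   # prints: /a/style.css
print(resolve("/a/b//c/", "../../style.css"))  # prints: /a/b/style.css
```

The second call differs only because the empty segment between the two slashes is popped by the first "..", exactly like a named directory would be.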

It is my opinion that Apache and IIS (and probably others) are acting incorrectly as /a/b/c/ and /a/b//c/ technically represent two different resources. According to RFC 2396, every slash is significant:

  path          = [ abs_path | opaque_part ]

  path_segments = segment *( "/" segment )
  segment       = *pchar *( ";" param )
  param         = *pchar

  pchar         = unreserved | escaped |
                  ":" | "@" | "&" | "=" | "+" | "$" | ","

So, /a/b/c/ consists of three segments: "a", "b", and "c"; /a/b//c/ actually consists of four: "a", "b", "" (the empty string), and "c". Whether or not the empty string is a valid filesystem directory is a detail of the server's platform. (And logically, this means the browsers are actually operating correctly when parsing relative URLs with parent path components – in my example, they go up past the "c" directory and the "" directory, leaving us to request style.css from "b".)
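The segment count can be checked with a few lines of Python. This is only a sketch: path_segments is a hypothetical helper, and it deliberately drops the single leading and trailing slash so that the result matches the way the segments are counted above.

```python
def path_segments(path: str) -> list[str]:
    """Split a URL path into segments, keeping empty ones (hypothetical helper)."""
    # Drop exactly one leading and one trailing slash; anything left
    # between two adjacent slashes shows up as an empty string.
    if path.startswith("/"):
        path = path[1:]
    if path.endswith("/"):
        path = path[:-1]
    return path.split("/")

print(path_segments("/a/b/c/"))   # prints: ['a', 'b', 'c']
print(path_segments("/a/b//c/"))  # prints: ['a', 'b', '', 'c']
```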

If you're using Apache with mod_rewrite, there is a pretty simple fix:

# enable mod_rewrite (skip if it is already enabled for this context)
RewriteEngine On
# remove multiple slashes anywhere in url
RewriteCond %{REQUEST_URI} ^(.*)//(.*)$
RewriteRule . %1/%2 [R=301,L]

This will issue an HTTP 301 Moved Permanently redirect so that any double slashes are stripped out of the URL.
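If the redirect target is computed with a quantifier such as /{2,}, runs of three or more slashes can be collapsed in one step instead of one slash per redirect. The following Python sketch shows that computation only; it is not Apache configuration, canonical_url is a hypothetical name, and it assumes that only the path (not the query string) should be normalized.

```python
import re
from urllib.parse import urlsplit

def canonical_url(url: str) -> str:
    """Collapse runs of slashes in the path only (hypothetical helper)."""
    parts = urlsplit(url)
    # /{2,} matches two or more consecutive slashes, so "///" is fixed in one pass.
    clean_path = re.sub(r"/{2,}", "/", parts.path)
    return parts._replace(path=clean_path).geturl()

print(canonical_url("http://www.example.com/a///b//c/?next=//x"))
# prints: http://www.example.com/a/b/c/?next=//x
```

A server-side handler could compare the requested URL with this result and send a single 301 when they differ.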

josh3736
  • 2
    Wouldn't it be better to have your mod_rewrite solution take into account 3, 4, ... slashes too? Something along the lines of /{2,}? (Assuming Apache allows that kind of quantifier, I'm not too familiar with it) – Ward Muylaert Jan 28 '11 at 00:16
  • +1 - Thanks for the extra info. I didn't think of it that way! – Ben Hoffman Jan 28 '11 at 11:18
  • @Ward: Admittedly, I'm not an Apache expert; that rewrite rule was in one of the top Google results for "url double slash". My assumption is that it will operate recursively -- you'll get redirects from /a///b//c/ -> /a//b//c/ -> /a/b//c/ -> /a/b/c/. Of course, that's 4 HTTP roundtrips (which is horribly inefficient), but this is a fix for something that shouldn't be happening anyway. – josh3736 Jan 28 '11 at 13:51
  • 3
    It's not incorrect behavior: a/b and a//b indeed are two distinct URL paths, but nothing forbids the server from returning the same resource for both of them if it wants. I do agree with you, however, that in practice returning a 301 redirect would seem more useful. – Ilmari Karonen Apr 09 '12 at 19:44
  • 4
    @IlmariKaronen: It absolutely is incorrect behavior because (1) this behavior automatically creates an infinite number of potential duplicate references to a single resource (which, if not in violation of the letter of any spec, certainly violates the spirit), and more practically (2) it "breaks" relative-path handling in browsers that do properly count the empty string in a//b as a directory (see the stylesheet example above). – josh3736 Apr 09 '12 at 20:14
  • 2
    ...and anyway, I'd argue that RFC 2396 does forbid a server from returning the same resource by auto-collapsing slashes because the spec says every slash is significant. Automatically ignoring consecutive slashes is in violation of that spec. (It's one thing if someone programmed their server to do that, even if doing so would be silly. However, servers doing this by default is incorrect.) – josh3736 Apr 09 '12 at 20:21
  • @josh3736, you're right that these URLs should be treated as different, but collapsing double slashes into one has nothing to do with the URL spec; it is about how the web server chooses to map the path to the file system. This is not specified and is implementation dependent. Multiple resources (in the distinct URIs sense) may be automatically mapped to the same file in the backend (like symbolic links would). It makes sense to do so when the resources are files or directories. It wouldn't if the request is handled by a script (e.g. /index.php/something//something/). – Bruno Apr 09 '12 at 21:59
  • @josh3736, on "I'd argue that RFC 2396 does forbid a server from returning the same resource": this is actually not true. The RFC specifically allows the same resource to be returned for different URIs. It is part of separating the naming scheme from the resource retrieval. http://x.y/z and http://a.b/c may retrieve the exact same document, even though the URIs are 100% different. Same rule applies to treatment of double slashes. Whether you want it, or configure it that way, is a different question altogether. – Abel Apr 04 '14 at 14:14
  • @Abel: That's something different; of course you can return the same content on different URIs if you choose. I'm saying that automatically collapsing slashes is not spec-compliant because the spec says that every slash is significant. – josh3736 Apr 04 '14 at 17:22
  • @josh3736, sorry I misunderstood. I thought you meant that a server may not auto-collapse slashes in the comment above. But a server can do that (because it is allowed to return the same source from different urls). A client cannot (you explain that nicely in your post). You suggest in your post also that it is better to send a redirect instead, I agree too, it is neater, though not required by any spec. – Abel Apr 05 '14 at 01:24
  • I'm afraid I cannot agree with redirects being a general purpose replacement here. It is very important to note that a redirect will not "just work" if verbs other than GET are being used. In that case, either the double slashes need to be fixed on the client side, or the webserver needs to handle them similarly to what Apache is doing. – GrandOpener Jan 07 '20 at 15:22
37

That is an error in the programmers'/developers' code. If you compare these two URLs:

  • http://www.example.com/A/B/C/
  • http://www.example.com/A/B//C/

They look different but if you were to visit either, both of them would work in most modern browsers.

This is something you want to fix. A double slash could confuse Google's web crawlers and make them think there are two versions of the page.

Simon Hayter
Ben Hoffman
  • 15
    Actually, that the page loads has nothing to do with the browser, but rather that the server ignores the extra slash. This got long, so see the answer I posted. – josh3736 Jan 27 '11 at 21:29
6

The double slash has a meaning when it's used in resource URLs. For example, when it's used in CSS for the URL of a background image:

.classname {
    background : url("//example.com/a/b/c/d.png");
}

Here the leading // makes the URL protocol-relative: the background image is fetched from example.com using the same scheme (http or https) as the page that references it. In other words, the scheme can be left out and the URL written starting with just // when it is used for a resource.

A double slash in the middle of a URL path (e.g. /a//b/c/d.htm), however, doesn't have any such meaning.
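How a leading // picks up the scheme of the referencing page can be seen with urllib.parse.urljoin from the Python standard library. This is a small sketch; the page URLs on www.example.org are made up for the example.

```python
from urllib.parse import urljoin

image_ref = "//example.com/a/b/c/d.png"

# Assumed page URLs: the same stylesheet reference seen from an http and an https page.
print(urljoin("http://www.example.org/page.html", image_ref))
# prints: http://example.com/a/b/c/d.png
print(urljoin("https://www.example.org/page.html", image_ref))
# prints: https://example.com/a/b/c/d.png
```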

Dale Harris
Alan Joseph
  • 2
    Well, this is not the whole truth. The double slash is used when one needs to avoid the mixed content problem: when the site is loaded over http the double slash expands to http, and when the site is loaded over https it expands to https. – andrej May 14 '16 at 17:33
2

As mentioned, some servers are set up to ignore a double slash in the URL path, but Amazon S3 static hosting will not. If you want to handle/ignore them in that case, you can use Redirection Rules in the properties panel.

If you want to ignore a double slash following the domain name then you could use something like this:

<RoutingRules>
  <RoutingRule>
    <Condition>
      <KeyPrefixEquals>/</KeyPrefixEquals>
    </Condition>
    <Redirect>
      <ReplaceKeyPrefixWith/>
    </Redirect>
  </RoutingRule>
</RoutingRules>

You can probably also find and replace them throughout, but that was enough for me.

Simon Hayter
orlade