What do double slashes often found in URLs mean?
For example:
http://www.example.com/A/B//C/
Please note: I'm not referring to the beginning right after http:.
As mentioned by @RandomBen, the double slash is most likely the result of an error somewhere.
The page loads not because of anything the browser does, but because the server ignores the extra slash. The browser doesn't do anything special with extra slashes in the URL; it just sends them along in the request:
GET /A/B//C/D HTTP/1.1
Host: www.example.com
...
It looks like current versions of Apache and IIS both will ignore the extra slashes while resolving the path and return the document that would have been returned had the URL not had extra slashes. However, browsers (I tested IE 8 and Chrome 9) get confused by any relative URLs (containing parent path components) of resources in the page, which produces bad results. For example, if a page has:
<link rel="stylesheet" href="../../style.css" type="text/css" />
Upon loading the page /a/b/c/, the browser will request /a/style.css. But if—for whatever reason—/a/b//c/ is requested (and the server ignores the extra slash), the browser will end up requesting /a/b/style.css, which won't exist. Oops, the page looks ugly.
(This obviously won't happen if the URL doesn't have a parent path component (..) or is absolute.)
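The resolution described above can be reproduced with a small sketch. The function below is a simplified, hand-written take on the merge and dot-segment removal steps of RFC 3986 section 5.2. The name resolve_path is hypothetical, and it assumes an absolute base path and a relative reference with no query or fragment. It is written out by hand because Python's own urljoin appears to discard empty path segments, which would hide the effect.

```python
def resolve_path(base_path, ref):
    """Resolve a relative reference against an absolute base path.

    Simplified sketch of RFC 3986 section 5.2 (merge, then remove dot
    segments). Empty segments are kept, so "//" counts as a real level.
    """
    # Merge: keep the base path up to and including its last "/".
    merged = base_path[: base_path.rfind("/") + 1] + ref
    segments = merged.split("/")[1:]  # drop the "" before the leading "/"

    resolved = []
    for segment in segments:
        if segment == "..":
            if resolved:
                resolved.pop()  # go up one level, even past an empty segment
        elif segment != ".":
            resolved.append(segment)

    # A trailing "." or ".." refers to a directory, so keep the final slash.
    if segments and segments[-1] in (".", ".."):
        resolved.append("")
    return "/" + "/".join(resolved)


print(resolve_path("/a/b/c/", "../../style.css"))   # /a/style.css
print(resolve_path("/a/b//c/", "../../style.css"))  # /a/b/style.css
```

With the extra slash, the second ".." only climbs out of the empty segment, so the stylesheet is looked up one level too deep.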
It is my opinion that Apache and IIS (and probably others) are acting incorrectly as /a/b/c/ and /a/b//c/ technically represent two different resources. According to RFC 2396, every slash is significant:
path = [ abs_path | opaque_part ]
path_segments = segment *( "/" segment )
segment = *pchar *( ";" param )
param = *pchar
pchar = unreserved | escaped |
":" | "@" | "&" | "=" | "+" | "$" | ","
So, /a/b/c/ consists of three segments: "a", "b", and "c"; /a/b//c/ actually consists of four: "a", "b", "" (the empty string), and "c". Whether or not the empty string is a valid filesystem directory is a detail of the server's platform. (And logically, this means the browsers are actually operating correctly when parsing relative URLs with parent path components – in my example, they go up past the "c" directory and the "" directory, leaving us to request style.css from "b".)
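To make the segment count concrete, here is a minimal sketch that splits a path the way the paragraph above counts it. The helper path_segments is hypothetical; it deliberately ignores the empty segment that follows the trailing slash, matching the count given above, and it is not a full RFC 2396 parser.

```python
def path_segments(path):
    # Hypothetical helper: drop the leading and trailing "/" and split on
    # the remaining slashes, keeping empty strings as segments.
    return path.strip("/").split("/")


print(path_segments("/a/b/c/"))   # ['a', 'b', 'c']
print(path_segments("/a/b//c/"))  # ['a', 'b', '', 'c']
```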
If you're using Apache with mod_rewrite, there is a pretty simple fix:
# remove multiple slashes anywhere in url
RewriteCond %{REQUEST_URI} ^(.*)//(.*)$
RewriteRule . %1/%2 [R=301,L]
This will issue an HTTP 301 Moved Permanently redirect so that any double slashes are stripped out of the URL.
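As a rough illustration of what that rule does to a path, the sketch below imitates it with Python's re module, on the assumption that Python's regex behaviour matches Apache's for this pattern. The function names are hypothetical. Each pass removes a single double slash, so a path with several of them takes several redirects; collapse_slashes shows a one-step alternative using a /{2,} pattern.

```python
import re


def strip_one_double_slash(path):
    """Imitate one pass of the rule: ^(.*)//(.*)$ becomes %1/%2."""
    match = re.match(r"^(.*)//(.*)$", path)
    if match is None:
        return path
    return match.group(1) + "/" + match.group(2)


def redirect_chain(path):
    """List every path visited until no double slash is left."""
    chain = [path]
    while True:
        next_path = strip_one_double_slash(chain[-1])
        if next_path == chain[-1]:
            return chain
        chain.append(next_path)


def collapse_slashes(path):
    """Collapse every run of two or more slashes in a single step."""
    return re.sub(r"/{2,}", "/", path)


print(redirect_chain("/a/b//c/"))      # ['/a/b//c/', '/a/b/c/']
print(len(redirect_chain("/a///b//c/")) - 1, "redirects")  # 3 redirects
print(collapse_slashes("/a///b//c/"))  # /a/b/c/
```

The one-step version only shows the idea; translating it into a mod_rewrite rule is left as an untested assumption.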
That is an error in the programmers'/developers' code. If you compare these two URLs:
http://www.example.com/A/B/C/
http://www.example.com/A/B//C/
They look different, but if you were to visit either, both of them would work in most modern browsers.
This is something you want to fix. If you have the double slash, it could confuse Google's web crawlers and make them think there are two versions of the page.
The double slash has a meaning when it's used in resource URLs. For example, when it's used in CSS for the URL of a background image:
.classname {
    background: url("//example.com/a/b/c/d.png");
}
Here the leading // introduces a host name, so the background image is fetched from example.com, which may be a different domain from that of the current page, using the same scheme (http or https) as the page itself. In other words, http:// or https:// can be shortened to just // in resource URLs.
But a double slash in the middle of a URL path (e.g. /a//b/c/d.htm) doesn't have any special meaning.
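A quick way to see how such a reference is resolved is Python's urljoin. In this sketch the page URLs on www.example.org are made up for illustration; only the // reference comes from the CSS above.

```python
from urllib.parse import urljoin

ref = "//example.com/a/b/c/d.png"

# The reference supplies the host; the scheme comes from the page it is on.
print(urljoin("http://www.example.org/page.html", ref))   # http://example.com/a/b/c/d.png
print(urljoin("https://www.example.org/page.html", ref))  # https://example.com/a/b/c/d.png
```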
As mentioned, some servers are set up to ignore a double slash in the URL path, but Amazon S3 static hosting will not. If you want to handle or ignore them in that case, you can use Redirection Rules in the properties panel.
If you want to ignore a double slash following the domain name then you could use something like this:
<RoutingRules>
<RoutingRule>
<Condition>
<KeyPrefixEquals>/</KeyPrefixEquals>
</Condition>
<Redirect>
<ReplaceKeyPrefixWith/>
</Redirect>
</RoutingRule>
</RoutingRules>
You can probably also find and replace them throughout, but that was enough for me.
[...] mod_rewrite solution take into account 3, 4, ... slashes too? Something along the lines of /{2,}? (Assuming Apache allows that kind of quantifier, I'm not too familiar with it) – Ward Muylaert Jan 28 '11 at 00:16
[...] /a///b//c/ -> /a//b//c/ -> /a/b//c/ -> /a/b/c/. Of course, that's 4 HTTP roundtrips (which is horribly inefficient), but this is a fix for something that shouldn't be happening anyway. – josh3736 Jan 28 '11 at 13:51
[...] a/b and a//b indeed are two distinct URL paths, but nothing forbids the server from returning the same resource for both of them if it wants. I do agree with you, however, that in practice returning a 301 redirect would seem more useful. – Ilmari Karonen Apr 09 '12 at 19:44
[...] a//b as a directory (see the stylesheet example above). – josh3736 Apr 09 '12 at 20:14
[...] /index.php/something//something/). – Bruno Apr 09 '12 at 21:59
"I'd argue that RFC 2396 does forbid a server from returning the same resource." This is actually not true. The RFC specifically allows the same resource to be returned for different URIs. It is part of separating the naming scheme from the resource retrieval. http://x.y/z and http://a.b/c may retrieve the exact same document, even though the URIs are 100% different. Same rule applies to treatment of double slashes. Whether you want it, or configure it that way, is a different question altogether. – Abel Apr 04 '14 at 14:14