Hi, on Facebook the home link has a query string on it, like this: facebook.com/?ref=home. When you click the link and navigate to the home page, the query string is automatically removed. However, if I manually type in that link, the query string is NOT removed. Any idea how they did this?
- I am not sure I understand this question. – Mike Caron Nov 24 '10 at 19:23
- I think he wants to know how to detect if the page was called by a click in the browser or by typing in the URL. The title is a bit misleading. – Jan Thomä Nov 24 '10 at 19:28
- Looks like Facebook uses Javascript for this: http://stackoverflow.com/questions/824349/modify-the-url-without-reloading-the-page – Yarin Nov 10 '13 at 15:59
- Unfortunately, because of the way this question contains questions, most answers only seem to be answering how to remove a query string from a URL in PHP, and aren't answering how Facebook detects the behaviour you describe. – Flimm Dec 08 '17 at 08:29
11 Answers
Easiest way in PHP:
$url = preg_replace('/\?.*/', '', $url);
What Facebook does is probably done in JavaScript, along these lines:
if (location.href.match(/\?.*/) && document.referrer) {
    // Reached via a link (a referrer is set): reload without the query string
    location.href = location.href.replace(/\?.*/, '');
}
- @diyism: This yields an E_STRICT error, because you're passing an expression by reference to `array_shift`. Use `current()` instead. – netcoder Feb 24 '12 at 14:59
Here's another party-trick of an answer:
$url = strtok($url, '?');
This is the answer you want if you're ever trying to win at Code Golf. It has...
- the fewest characters
- the fewest lines
- the fewest function calls
- a sensible URL, whether or not there is a query string
- Bonus: If you KNOW there's a query string, you can access it in the next line thusly: `$query_string = strtok('?');` You can then pass it straight to `parse_str`. – haz Dec 05 '17 at 00:49
- This doesn't answer how Facebook is doing what it is doing, as asked in the OP's question. – Flimm Dec 08 '17 at 08:26
- @Flimm it answers the explicit question in the title, which is what everyone who comes to this SO question is looking for. Given the age of this question, it's hard to know if in 2010, FB used Javascript or PHP to handle this; but if they handled the routing in PHP, then they would have used one of these techniques to trim the query string before re-routing. (Although as per Yarin, it's likely they used JS.) – haz Jun 27 '18 at 22:51
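To illustrate the `strtok` answer together with the bonus from the comments, here is a minimal sketch. The sample URL and the variable names `$base`, `$query` and `$params` are made up for the example, and it assumes the query string itself contains no further `?`:

```php
<?php
// Made-up sample URL for illustration.
$url = 'http://example.com/page.php?ref=home&tab=2';

// First call: everything before the first "?".
$base = strtok($url, '?');

// Second call continues on the same string and returns the next token,
// which here is the query string (false if there was none).
$query = strtok('?');

// Parse the query string into an array only if one was found.
$params = [];
if ($query !== false) {
    parse_str($query, $params);
}

echo $base, "\n";
print_r($params);
```

If the URL has no `?` at all, the first call returns the whole URL and the second returns `false`, so `$params` stays empty.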
Use parse_url to check for a well-formed URL and remove the query string:
$link = 'http://facebook.com/page.php?ref=home';
if ($url = parse_url($link)) {
printf('%s://%s%s', $url['scheme'], $url['host'], $url['path']);
}
- Note that `$url['path']` does not include the query string. `parse_url` will also work if you give it a relative URL. – Flimm Dec 08 '17 at 08:21
- This doesn't answer how Facebook is doing what it is doing, as asked in the OP's question. – Flimm Dec 08 '17 at 08:26
- Warning! This will remove the **port part** of the URL if it's set. `http://example.com:8080/thepath?query=value` will be transformed into `http://example.com/thepath` – Gabriel Glenn Apr 26 '19 at 14:17
- Yes, this example will also omit username, password, and fragment components. The PHP docs have a more comprehensive example that covers all the use cases: https://www.php.net/manual/en/function.parse-url.php#106731 – leepowers Apr 26 '19 at 23:11
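As a rough sketch of the point raised in the comments above, the URL can be rebuilt from every `parse_url` component except the query string, so the port, credentials and fragment survive. The function name `stripQueryKeepRest` is hypothetical, and this is one possible way to do it, not the code from the PHP manual note linked above:

```php
<?php
// Hypothetical helper: rebuild a URL from parse_url() parts, leaving out only the query string.
function stripQueryKeepRest(string $link): string
{
    $p = parse_url($link);
    if ($p === false) {
        return $link; // seriously malformed URL: return it unchanged
    }

    // Scheme, or "//" for scheme-relative URLs that still have a host.
    $scheme = isset($p['scheme']) ? $p['scheme'] . '://' : (isset($p['host']) ? '//' : '');

    // Optional "user:pass@" part.
    $auth = $p['user'] ?? '';
    if (isset($p['pass'])) {
        $auth .= ':' . $p['pass'];
    }
    if ($auth !== '') {
        $auth .= '@';
    }

    $host = $p['host'] ?? '';
    $port = isset($p['port']) ? ':' . $p['port'] : '';
    $path = $p['path'] ?? '';
    $fragment = isset($p['fragment']) ? '#' . $p['fragment'] : '';

    return $scheme . $auth . $host . $port . $path . $fragment;
}

echo stripQueryKeepRest('http://example.com:8080/thepath?query=value'), "\n";
```

Keeping the fragment is a design choice here; drop the `$fragment` part if you want it removed along with the query string.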
Without regular expressions or actually parsing the URL with parse_url, tolerant of URLs without a query string as well:
$url = reset((explode('?', $url)));
- Strict Standards: Only variables should be passed by reference. Try instead `$url = reset(explode('?', $url));` – Anthony Hatzopoulos Dec 20 '13 at 19:03
- @AnthonyHatzopoulos Your "try instead" code is an exact copy of the code in the answer. – Chris Baker Mar 17 '14 at 17:12
- @Chris sorry. lol. There are supposed to be extra parens there `$url = reset((explode('?', $url)));` [Example](http://sandbox.onlinephpfunctions.com/code/3ed64996b6456710ba03e6af60d5b8d658eec76c) – Anthony Hatzopoulos Mar 17 '14 at 20:28
- Finally updated answer to include extra parentheses to comply with strict standards due to being used in the wild – highvolt Jun 11 '15 at 19:50
- Parentheses don't fix the pass-by-reference problem. The func. def. of reset is: `function reset(&array);` So you need to pass a reference to an array (i.e. a variable), not an array itself. So this will work: `$url_fragments = explode('?',$url);` `$url = reset($url_fragments);` – haz Feb 27 '17 at 00:45
- This doesn't answer how Facebook is doing what it is doing, as asked in the OP's question. – Flimm Dec 08 '17 at 08:26
- As a single-page application, Facebook intercepts link clicks with JavaScript, virtually routing the user using the browser history API and rendering partial templates/components using AJAX. In that case, the click `referer` info is consumed for analytics purposes but not maintained in the displayed URL. When a user directly navigates to a URL, the entire document is rendered, and rather than performing a redirect via URL rewrite, the URL parameter is maintained and the latency of a redirect blocking the entire page render is avoided given the lack of benefit of removing the query string. – highvolt Oct 29 '18 at 13:56
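I assume they check the HTTP `Referer` header and see if the click originated from Facebook. That way they can decide whether to remove the query string or not. Something like:
// The Referer header holds a full URL (and may be missing), so compare only its host.
$refer = isset($_SERVER["HTTP_REFERER"]) ? $_SERVER["HTTP_REFERER"] : "";
$host = parse_url($refer, PHP_URL_HOST);
if ($host === "facebook.com" || $host === "www.facebook.com") {
    // this request was probably made by clicking a link on Facebook:
    // remove the query string here
}
else {
    // no Facebook referrer: the URL was typed into the browser or came from elsewhere
}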
You can remove the query string by using the method netcoder suggested.
A quick and dirty alternative for PHP 5.4.0 and greater:
$url = explode('?', $url, 2)[0];
- This doesn't answer how Facebook is doing what it is doing, as asked in the OP's question. – Flimm Dec 08 '17 at 08:27
Try this:
$url = strtr('scheme://hostpath', parse_url($url));
- This doesn't answer how Facebook is doing what it is doing, as asked in the OP's question. – Flimm Dec 08 '17 at 08:27
Easier and more efficient, because you don't need regular expressions (this assumes the URL contains a `?`):
$url = substr($url,0,strpos($url, '?'));
Another solution (if you also want to retrieve the query string):
list($url,$querystring) = array_pad(explode('?', $url, 2), 2, null);
- `$url = substr($url,0,strpos($url, '?'));` If the URL doesn't have a query string, this will return an empty string, which will be a disaster. – angry kiwi Mar 21 '11 at 06:36
- Of course it is empty. If you want to be sure without any further validation: `list($url, $queryString) = array_pad(explode('?', $url, 2), 2, null);` – KingCrunch Mar 21 '11 at 08:53
- This doesn't answer how Facebook is doing what it is doing, as asked in the OP's question. – Flimm Dec 08 '17 at 08:27
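For the empty-result problem discussed in the comments above, one possible sketch keeps the `substr`/`strpos` approach but checks whether `strpos` found a `?` first. The function name `cutAtQuestionMark` is hypothetical:

```php
<?php
// Hypothetical helper: cut at the first "?" only when one is present.
function cutAtQuestionMark(string $url): string
{
    $pos = strpos($url, '?');

    // strpos() returns false when there is no "?"; compare strictly,
    // because a "?" at position 0 is also falsy.
    return $pos === false ? $url : substr($url, 0, $pos);
}

echo cutAtQuestionMark('http://example.com/page?ref=home'), "\n";
echo cutAtQuestionMark('http://example.com/page'), "\n";
```

A URL without a query string is returned unchanged instead of becoming an empty string.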
You seem to be asking two questions: how to detect whether a page was visited via Facebook or from an outside location, and how to remove the query string from a URL.
You can parse the referrer to see if the domain is Facebook's.
$referer = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
$host = parse_url($referer, PHP_URL_HOST);
// Matches facebook.com and any subdomain of it
if (is_string($host) && preg_match('/(^|\.)facebook\.com$/i', $host)) {
    // remove query string
}
The safest way to remove the query string is to also parse the url and then rebuild it.
$parts = parse_url('http://www.facebook.com/?ref=home');
$newUrl = $parts['scheme'].'://'.$parts['host'].$parts['path']; // http://www.facebook.com/
- Very Good Observation webbiedave...he did indeed ask two questions. I'm not an expert, but I'm certain that Facebook uses a URL rewriting engine/module. Perhaps for SEO purposes, and perhaps to simplify URLs being served to the browser. That would explain why an internal link ended up removing the query if it came from an internal link. The answer above would explain how they did it from a coding point of view. – Annatar May 04 '12 at 05:18
- @Paul: They use JavaScript to do that. They use [`history.pushState`](https://developer.mozilla.org/en/DOM/Manipulating_the_browser_history/) to modify the URL in the address bar without refreshing the page. – Nathan Aug 03 '12 at 04:28
You can use this:
function removeQueryStringFromURL($url)
{
    // parse_url() returns false for seriously malformed URLs
    if (parse_url($url) !== false)
    {
        // Use the given URL as the base and strip only its query string
        return http_build_url($url, array(), HTTP_URL_STRIP_QUERY);
    }
    return $url;
}
Remember that this code needs the pecl_http extension to work.
- This doesn't answer how Facebook is doing what it is doing, as asked in the OP's question. – Flimm Dec 08 '17 at 08:27
Try this
$url_with_querystring = 'www.mydomian.com/myurl.html?unwantedthngs';
$url_data = parse_url($url_with_querystring);
$url_without_querystring = str_replace('?'.$url_data['query'], '', $url_with_querystring);
- This doesn't answer how Facebook is doing what it is doing, as asked in the OP's question. – Flimm Dec 08 '17 at 08:28