2

I'm seeing this in the iis-logs of two websites that I maintain:

GET /an/existing/page/on/my/site+ForceRecrawl:+0 - 80 - 207.46.195.105 HTTP/1.1 Mozilla/5.0+(compatible;+bingbot/2.0;++http://www.bing.com/bingbot.htm)

I get about one or two of these per day from these IP addresses: 207.46.195.105, 65.52.110.190 and more, all belonging to msnbot-ip.search.msn.com

Perhaps Microsoft has a bug in their crawler? Anyway, searching for "ForceRecrawl: 0" in the major search engines turns up a bunch of random sites. Searching on StackOverflow or here gave no results (to my amazement). Am I the only one seeing this? I first noticed these on the 9th of this month, and I have been seeing them almost daily since...

Another thing that I think is crazy is that the URL http://www.bing.com/bingbot.htm redirects to mail.live.com (Hotmail).

Currently I'm returning 404s, but I'm considering catching these requests, stripping the trailing " ForceRecrawl: 0" and processing them as if they were legitimate URLs.
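A minimal sketch of that stripping idea, written in Python purely as an illustration (my sites run on IIS, so this is not what is deployed). The function name `strip_force_recrawl` and the regular expression are assumptions of mine; the pattern is meant to cover the IIS log spelling (`+ForceRecrawl:+0`), a percent-encoded spelling and the raw spelling with spaces.

```python
import re

# Trailing marker in any of the spellings: space, "+" or "%20" as separator,
# ":" or "%3A" as the colon, followed by a number at the very end of the path.
_FORCE_RECRAWL_SUFFIX = re.compile(
    r"(?:%20|\+|\s)+ForceRecrawl(?::|%3A)(?:%20|\+|\s)*\d+$",
    re.IGNORECASE,
)


def strip_force_recrawl(path: str) -> str:
    """Return the path without a trailing ForceRecrawl marker.

    Paths that do not end with the marker are returned unchanged.
    """
    return _FORCE_RECRAWL_SUFFIX.sub("", path)


if __name__ == "__main__":
    print(strip_force_recrawl("/an/existing/page/on/my/site+ForceRecrawl:+0"))
```

Only the suffix is touched, so a legitimate path that happens to contain a `+` elsewhere is left alone.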

Could anyone shed some light on this? Could it have to do with some configuration in Bing's Webmaster Tools?

danlefree
Louis Somers
  • We are getting a bunch of these and we have not asked for any recrawls. We are using URL normalization, which I wonder might be what is triggering these. Over three-quarters of the 404s in my logs are due to this. It doesn't impress me with Bing one bit, as it has me chasing down issues which should not exist. Is there any help in sight? –  Nov 26 '11 at 20:30
  • We just got one of these as well: GET /xxxxx/yyyy/zzzzz ForceRecrawl: 0 HTTP/1.1 Cache-Control: no-cache Connection: Keep-Alive Pragma: no-cache Accept: */* Accept-Encoding: gzip, deflate User-Agent: Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm) Our website is hosted on IIS, which gives the error: System.Web.HttpException (0x80004005): A potentially dangerous Request.Path value was detected from the client (:). at System.Web.HttpRequest.ValidateInputIfRequiredByConfig() at System.Web.HttpApplication.PipelineStepManager.ValidateHelper(HttpContext context) Hopefully – Michael Ferrante Nov 03 '11 at 00:48
  • I have not configured any URL normalization in Webmaster Tools, so that is not the cause. I'm still getting them, and it is annoying. – Louis Somers Nov 27 '11 at 01:15
  • Well, over a month and no relief in sight. Really rather sad that no one from Bing has even taken the time to answer here, but then again, the answers in their forums are mostly BS. –  Nov 29 '11 at 08:27
  • Hmm, when I click on your bingbot.htm link, it takes me to http://onlinehelp.microsoft.com/en-us/bing/hh204496.aspx, not to mail.live.com. Curious. – Ilmari Karonen Feb 27 '12 at 15:33
  • Looks like they corrected the redirect. It also looks like they fixed the problem. The last ForceRecrawl error was on the 24th of November 2011. Before that date I got a few of them daily. – Louis Somers Feb 29 '12 at 14:52

2 Answers

3

You're not the only one. It seems to stem from Bing Webmaster Tools, which includes an option to force the bot to recrawl specific URLs. However, this seems to be happening without any user request for such forced recrawls.

The bot seems to be appending the instruction %20ForceRecrawl%3A%200 to URLs and then trying to crawl the URL with that suffix attached, which of course produces a 404 error.

We've removed some of these using the block function in BWT, but others keep turning up. It might correct itself; if not, I expect a 301 redirect might be needed.
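A rough sketch of what such a 301 redirect could look like, assuming a stack where the request actually reaches application code. It is written as Python WSGI middleware for illustration only, not as IIS configuration; the name `redirect_force_recrawl` and the pattern are hypothetical. On IIS/ASP.NET the colon in the path may be rejected by request validation before any handler runs, as reported in the comments on this page.

```python
import re
from urllib.parse import quote

# Trailing marker, in decoded or percent-encoded form (an assumed pattern).
_SUFFIX = re.compile(
    r"(?:%20|\+|\s)+ForceRecrawl(?::|%3A)(?:%20|\+|\s)*\d+$",
    re.IGNORECASE,
)


def redirect_force_recrawl(app):
    """Wrap a WSGI app so ForceRecrawl URLs get a 301 to the clean URL."""

    def middleware(environ, start_response):
        path = environ.get("PATH_INFO", "")
        clean = _SUFFIX.sub("", path)
        if clean == path:
            # No marker found: hand the request to the wrapped application.
            return app(environ, start_response)

        # PATH_INFO is already decoded, so re-encode it for the header.
        location = quote(clean) or "/"
        query = environ.get("QUERY_STRING", "")
        if query:
            location += "?" + query
        start_response(
            "301 Moved Permanently",
            [("Location", location), ("Content-Length", "0")],
        )
        return [b""]

    return middleware
```

A permanent redirect, rather than silently serving the page, tells the crawler which URL is the real one, so the malformed variant should not end up indexed.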

knooq
  • Thanks, glad to know I'm not the only one :-) I tried redirecting, but it seems IIS is handling it before it reaches my code, giving the error "A potentially dangerous Request.Path value was detected from the client (:)". Guess I'll leave it up to the Bing development team to fix it. – Louis Somers Nov 02 '11 at 14:43
  • Hi, I don't know if you are still having the same issue. I'm still seeing it with certain URLs. I've mentioned it in their Webmaster Forum, but still no response. – knooq Dec 09 '11 at 14:58
-2

You should block the bots.

Simply disallow them in robots.txt and the problem will never appear again, unless they change the name of the bot or create a new one, as Microsoft did: they used to use msnbot, and now they use bingbot.
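As an illustration, the rules could look like the `ROBOTS_TXT` text below, checked here with Python's standard `urllib.robotparser`. The helper name `is_allowed` is hypothetical. Note that these rules stop the named crawlers from fetching any page on the site, not just the ForceRecrawl URLs.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content blocking both Microsoft crawler names.
ROBOTS_TXT = """\
User-agent: bingbot
Disallow: /

User-agent: msnbot
Disallow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Check whether the rules above let the given crawler fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("bingbot", "msnbot", "Googlebot"):
        print(agent, is_allowed(agent, "/an/existing/page"))
```

Crawlers without a matching `User-agent` group are unaffected, since the file has no `*` group.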

teste