22

I see a lot of 404 errors in the access.log of my web server. I'm getting those errors because crawlers try to open a robots.txt file but can't find one. So I want to place a simple robots.txt file that will prevent the 404 errors from appearing in my log file.

What is a minimum valid robots.txt file that will allow everything on the site to be crawled?


3 Answers

25

As indicated here, create a text file named robots.txt in the top-level directory of your web server. You can leave it empty, or add:

User-agent: *
Disallow:

That is all you need if you want robots to crawl everything. If not, see the link above for more examples.
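
If you want to double-check that this minimal file really permits everything, Python's standard urllib.robotparser module can parse those two lines and answer allow/deny queries; the example.com URLs below are just placeholders for your own site:

from urllib.robotparser import RobotFileParser

# Parse the minimal permissive robots.txt directly (no network access needed)
rp = RobotFileParser()
rp.parse(["User-agent: *", "Disallow:"])

# Every path should be allowed for every crawler
print(rp.can_fetch("*", "https://example.com/any/page"))   # True
print(rp.can_fetch("Googlebot", "https://example.com/"))   # True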

7

The best minimal robots.txt is a completely empty file.

Any other "null" directives, such as an empty Disallow or Allow: *, are not only useless no-ops but also add unneeded complexity.

If you don't want the file to be completely empty - or you want to make it more human-readable - simply add a comment beginning with the # character, such as # blank file allows all. Crawlers ignore lines starting with #.
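
For instance, a comment-only robots.txt could contain nothing more than:

# blank file allows all

A file like this is still treated as allow-all, since a crawler that parses it finds no Disallow rules.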

-1

I would suggest this:

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

It will allow Google to crawl everything but will disallow it from crawling your admin panel, which is an ideal situation for you.