I have a sub-directory that I would like to hide from web crawlers.
One way to do this is to use a robots.txt in the root directory of the server and add a rule that disallows that sub-directory. However, anyone with basic web knowledge can read the robots.txt contents and figure out which directories are disallowed.
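For context, the straightforward root-level rule I mean would look something like this (a sketch, with `/X/` standing in for the real directory name):

```
User-agent: *
Disallow: /X/
```

This works, but since the file is public, the rule itself advertises the directory.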
I thought of a way to avoid this, but I am not sure if it will work.
Let X be the name of the sub-directory that I want to exclude. One way to stop web crawlers from indexing the X directory, while making it harder for someone to identify it from the root robots.txt, is to add a robots.txt inside the X directory itself.
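Concretely, the layout I have in mind would be something like this (paths are illustrative, assuming a typical document root):

```
/var/www/html/robots.txt      <- existing root robots.txt (publicly readable)
/var/www/html/X/robots.txt    <- additional robots.txt inside the X directory
```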
My questions are the following:
- Will the web crawlers find the `robots.txt` in the sub-directory, given that a `robots.txt` already exists in the root directory as well? (See the sketch after this list.)
- If the `robots.txt` is in the `X` sub-directory, should I use relative or absolute paths? That is,

  ```
  User-agent: *
  Disallow: /X/
  ```

  or

  ```
  User-agent: *
  Disallow: /
  ```
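To make the first question concrete, this is how the root file could be checked with a standards-compliant parser (a minimal sketch using Python's built-in `urllib.robotparser`; `example.com` and `/X/` are placeholders for my real host and sub-directory):

```python
import urllib.robotparser

# Parse the ROOT robots.txt. Whether crawlers also read a robots.txt
# placed inside the /X/ sub-directory is exactly what I am asking.
rp = urllib.robotparser.RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()

# True/False: may a generic crawler fetch a page under /X/?
print(rp.can_fetch("*", "https://example.com/X/page.html"))
```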