Robot Text File

What is the Robot Text File?

The robots text file (robots.txt) tells specific search engine spiders, or all of them, not to access folders or pages that you don't want indexed.

Why would you want to do this?

You may have created a personnel page for company employees that you don't want listed. Some webmasters use it to exclude their guest book pages so as to avoid attracting spam. There are many other reasons to use the robots text file.

How do I use it?

You need to upload it to the root of your web site or it will not work. If you don't have access to the root, you will need to use a robots Meta tag in the page itself to disallow access instead. Each entry in the file needs both a user agent and a file or folder to disallow.
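To make the "root of your web site" rule concrete, here is a minimal Python sketch that works out where a spider would look for the file, given any page address. The function name robots_txt_url and the example.com address are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the root-level robots.txt address for the site serving page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; the file always sits at /robots.txt.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page address, used only as an example.
print(robots_txt_url("http://www.example.com/staff/personnel.htm"))
# prints http://www.example.com/robots.txt
```

A robots.txt file uploaded into a subfolder such as /staff/ would not be found at that address, which is why it has no effect there.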

What does it look like?

It's really nothing more than a "Notepad" type .txt file named "robots.txt".
The basic syntax is:
User-agent: spider's name here
Disallow: /filename here
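If you want to see how a named entry behaves, Python's standard urllib.robotparser module can read the rules from a string. This is only a sketch: the spider names ExampleBot and OtherBot and the file private.htm are invented for the example, and real spiders may interpret rules slightly differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one named spider, one disallowed file.
rules = """\
User-agent: ExampleBot
Disallow: /private.htm
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# The named spider is kept out of the listed file.
print(parser.can_fetch("ExampleBot", "/private.htm"))  # False
# A spider that is not named in the file is not affected.
print(parser.can_fetch("OtherBot", "/private.htm"))    # True
```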

If you use:
User-agent: *
The * acts as a wildcard that matches all spiders, so the Disallow lines that follow apply to every one of them. You may want to use this to stop search engines listing unfinished pages.
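As a rough check of the wildcard, the sketch below feeds a wildcard entry to Python's urllib.robotparser and asks about two spiders. The folder /unfinished/ and both spider names are assumptions made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: every spider is kept out of one folder.
rules = """\
User-agent: *
Disallow: /unfinished/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

for spider in ("ExampleBot", "AnotherBot"):
    # Both spiders match the * entry, so both are refused the unfinished page.
    print(spider, parser.can_fetch(spider, "/unfinished/draft.htm"))

# Pages outside the disallowed folder stay open to everyone.
print(parser.can_fetch("ExampleBot", "/index.htm"))
```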

To disallow an entire directory use:
Disallow: /mydirectory/

To disallow an individual file use:
Disallow: /file.htm
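The directory and file forms can be tried out together. The following is a sketch using Python's urllib.robotparser; the spider name ExampleBot and the page names page.htm and other.htm are invented, and the results describe Python's parser rather than any particular search engine.

```python
from urllib.robotparser import RobotFileParser

# The directory and file rules from above, applied to all spiders.
rules = """\
User-agent: *
Disallow: /mydirectory/
Disallow: /file.htm
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Anything inside the disallowed directory is refused.
print(parser.can_fetch("ExampleBot", "/mydirectory/page.htm"))  # False
# The individually listed file is refused.
print(parser.can_fetch("ExampleBot", "/file.htm"))              # False
# A page that is not listed is still allowed.
print(parser.can_fetch("ExampleBot", "/other.htm"))             # True
```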

You have to use a separate line for each disallow. You cannot, for example, use:
Disallow: /file1.htm,file2.htm

Instead, you should use:
User-agent: *
Disallow: /file1.htm
Disallow: /file2.htm
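To see why the comma form fails, the sketch below compares the two versions using Python's urllib.robotparser. The helper name blocked and the spider name ExampleBot are made up for the example, and the result shows how Python's parser reads the rules; it treats the comma-separated text as one long path, so neither file is matched.

```python
from urllib.robotparser import RobotFileParser


def blocked(rules: str, path: str) -> bool:
    """Return True if the given rules keep a spider away from path."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    # ExampleBot is a hypothetical spider name.
    return not parser.can_fetch("ExampleBot", path)


comma_form = "User-agent: *\nDisallow: /file1.htm,file2.htm\n"
separate_lines = "User-agent: *\nDisallow: /file1.htm\nDisallow: /file2.htm\n"

for path in ("/file1.htm", "/file2.htm"):
    # Columns: path, blocked by comma form, blocked by separate lines.
    print(path, blocked(comma_form, path), blocked(separate_lines, path))
```

With the comma form neither file is reported as blocked, while the separate lines block both.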

For a list of spider names, visit http://www.searchengineworld.com/cgi-bin/robotcheck.cgi
Make sure you use the right syntax; if you don't, it will not work. You can check your syntax here: http://www.robotstxt.org/wc/active/html/index.html

For help creating robots text files, there is a program called robogen. There is a free version and an advanced version, which costs $12.99: http://www.rietta.com/robogen/


Publication Date: Tuesday 9th November, 2004
Author: Alan Murray