Topic: robots.txt

Have a default robots.txt file that blocks post.php, profile.php, userlist.php, login.php, etc. etc.
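
For illustration only, here is a rough sketch of what such a default file might contain, checked with Python's standard urllib.robotparser. The rules, the example.com host, and the is_allowed helper are all assumptions made for this sketch, and the paths assume the forum is installed at the web root.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical default rules; the paths assume the forum is installed at the
# web root. A forum under /forums/ would need /forums/post.php and so on.
ROBOTS_TXT = """\
User-agent: *
Disallow: /post.php
Disallow: /profile.php
Disallow: /userlist.php
Disallow: /login.php
"""


def is_allowed(url, robots_txt=ROBOTS_TXT, user_agent="*"):
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder host, not a real forum.
    for page in ("post.php", "profile.php", "viewtopic.php"):
        print(page, is_allowed("http://example.com/" + page))
```

Pages not listed in the rules, such as viewtopic.php, stay fetchable under this sketch.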

2 (edited by guardian34 2007-04-05 01:19)

Re: robots.txt

They already use something like this:

<meta name="ROBOTS" content="NOINDEX, FOLLOW" />

Doesn't robots.txt have to be at the root of a server?

3 (edited by Jérémie 2007-04-05 02:00)

Re: robots.txt

orlandu63 wrote:

Have a default robots.txt file that blocks post.php, profile.php, userlist.php, login.php, etc. etc.

Well, some people might want to have some of those indexed (mainly profile and userlist).

Re: robots.txt

guardian34 wrote:

They already use something like this:

<meta name="ROBOTS" content="NOINDEX, FOLLOW" />

Doesn't robots.txt have to be at the root of a server?

Exactly: the meta tags have the necessary effect without the need for a separate file.
If you do want a page indexed (pages get the NOINDEX tag unless this constant is defined), add

define('PUN_ALLOW_INDEX', 1);

before the call to header.php.

Re: robots.txt

guardian34 wrote:

They already use something like this:

<meta name="ROBOTS" content="NOINDEX, FOLLOW" />

Doesn't robots.txt have to be at the root of a server?

No, it does not. Example:

http://www.sitepoint.com/forums/robots.txt

Re: robots.txt

orlandu63 wrote:
guardian34 wrote:

They already use something like this:

<meta name="ROBOTS" content="NOINDEX, FOLLOW" />

Doesn't robots.txt have to be at the root of a server?

No, it does not. Example:

http://www.sitepoint.com/forums/robots.txt

Yes it does.

http://www.robotstxt.org/wc/exclusion-admin.html

The Robot will simply look for a "/robots.txt" URL on your site, where a site is defined as a HTTP server running on a particular host and port number. For example:

    Site URL                    Corresponding Robots.txt URL
    http://www.w3.org/          http://www.w3.org/robots.txt
    http://www.w3.org:80/       http://www.w3.org:80/robots.txt
    http://www.w3.org:1234/     http://www.w3.org:1234/robots.txt
    http://w3.org/              http://w3.org/robots.txt

Note that there can only be a single "/robots.txt" on a site. Specifically, you should not put "robots.txt" files in user directories, because a robot will never look at them. If you want your users to be able to create their own "robots.txt", you will need to merge them all into a single "/robots.txt". If you don't want to do this your users might want to use the Robots META Tag instead.
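
The rule quoted above can be sketched in a few lines of Python. This is only an illustration of the "scheme, host and port, then /robots.txt" mapping; the function name robots_url is made up for the example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the robots.txt URL a robot would request for page_url.

    Only the scheme and the host:port part are kept; any path, query,
    or fragment is dropped, because the file is looked up at the root.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_url("http://www.w3.org:1234/"))
    print(robots_url("http://www.sitepoint.com/forums/index.php"))
```

Under this mapping, a page inside /forums/ still resolves to the robots.txt at the root of the host, not to one inside the forum directory.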

The fact that certain search engines may or may not look at the file outside of the root does not change the fact that it SHOULD be at the root.
And as I said, the meta tag accomplishes the same thing without the need for an extra file.
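
As a rough sketch of how a crawler might read the meta tag discussed in this thread, the following uses Python's standard html.parser. The class and function names are invented for the example, and real crawlers may well handle the tag differently.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for item in attr_map.get("content", "").split(","):
                if item.strip():
                    self.directives.add(item.strip().lower())


def robots_directives(html):
    """Return the lowercased robots directives found in html."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    page = '<head><meta name="ROBOTS" content="NOINDEX, FOLLOW" /></head>'
    print(sorted(robots_directives(page)))
```

A page that carries the tag reports both noindex and follow, which matches the behaviour described above: links are followed, but the page itself is not indexed.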