
How to prevent search engines from indexing only the main page of the site

To prevent search engines from indexing only the main page, while allowing indexing of all other pages, you can use several approaches, depending on the characteristics of a particular site.

1. Using the robots.txt file

If the main page has its own address (usually index.php, index.html, index.htm, main.html and so on), and opening a link like w-e-b.site/ redirects to it, for example to w-e-b.site/index.htm, then you can use a robots.txt file with content like this:

User-agent: *
Disallow: /index.php
Disallow: /index.html
Disallow: /index.htm
Disallow: /main.html

In practice, sites that expose an explicit file name for the main page are the exception rather than the rule, so let's look at other options.

You can use the following approach:

  1. Deny access to the whole site with the “Disallow” directive.
  2. Then re-open everything except the main page with an “Allow” directive that matches the URL pattern shared by the inner pages.

Sample robots.txt file:

User-agent: *
Allow: /?p=
Disallow: /

Google and Yandex apply the most specific (longest) matching rule in a group, so the order of the directives does not affect them. Placing “Allow” before “Disallow” also keeps the file correct for simpler parsers that stop at the first matching rule. The “Allow” directive permits every URL that begins with “/?p=”, and “Disallow: /” blocks everything else. The result: indexing of the whole site, including the main page, is prohibited, except for pages whose address begins with “/?p=”.

Let's look at the result of checking two URLs:

  • https://suay.ru/ (main page) – indexing is prohibited
  • https://suay.ru/?p=790#6 (article page) – indexing allowed

In the screenshot, number 1 marks the contents of the robots.txt file, number 2 is the URL being checked, and number 3 is the result of the check.
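The two checks above can be reproduced with a minimal Python sketch of longest-match rule resolution. It is only an illustration under stated assumptions: the function name is_allowed and the RULES list are hypothetical, the pattern is written as /?p= (with a leading slash, the form that is compared against the URL path), and only plain prefix matching is modeled, without the * and $ wildcards that real crawlers also support.

```python
def is_allowed(rules, path):
    """Return True if `path` may be crawled under `rules`.

    `rules` is a list of (directive, pattern) pairs for one User-agent group.
    The longest matching pattern wins; on a tie, "allow" wins.
    Only plain prefix matching is modeled (no * or $ wildcards).
    """
    best_len = -1
    allowed = True  # a path that matches no rule is allowed
    for directive, pattern in rules:
        if not pattern or not path.startswith(pattern):
            continue
        is_allow = directive.lower() == "allow"
        if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
            best_len = len(pattern)
            allowed = is_allow
    return allowed


# Hypothetical rule set mirroring the sample robots.txt file.
RULES = [("allow", "/?p="), ("disallow", "/")]

main_page_allowed = is_allowed(RULES, "/")        # main page
article_allowed = is_allowed(RULES, "/?p=790")    # article page
```

The fragment part of a URL (the #6 in the second example) is never sent to the server, so the path that gets checked is /?p=790. Because the longest match decides, reversing the order of the two rules gives the same answers in this sketch.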

2. Using the robots meta tag

If your site consists of separate HTML files, add the robots meta tag to the head section of the main page file:

<meta name="robots" content="noindex,nofollow">
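To confirm that the tag is actually present in the page that gets served, something like the following Python sketch could be used. It relies only on the standard html.parser module; the class name RobotsMetaParser, the helper robots_directives and the sample HTML string are hypothetical and exist just for this illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma-separated and case-insensitive.
        for part in content.split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html):
    """Return the set of robots directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


# Hypothetical main page used only for this example.
SAMPLE = (
    "<html><head>"
    '<meta name="robots" content="noindex,nofollow">'
    "</head><body>Main page</body></html>"
)
found = robots_directives(SAMPLE)
```

A page is closed to indexing when "noindex" is in the returned set; a page without the tag returns an empty set.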

3. With .htaccess and mod_rewrite

Using .htaccess and mod_rewrite, you can block access to a specific file as follows:

RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} Google [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Yandex [NC]
RewriteRule ^(index\.php|index\.html?)$ - [F]

Note that when a visitor opens a link like https://w-e-b.site/ (that is, without a file name), the web server still serves a specific file, for example index.php, index.htm or index.html. Therefore, this method of blocking access (and, accordingly, indexing) works even when the main page of your site opens without a file name in the URL, as is usually the case.
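The logic of these rewrite rules can be approximated in Python to see which requests would receive a 403 response. This is a rough sketch, not a model of Apache itself: the function name is_forbidden is hypothetical, the pattern is an anchored and dot-escaped variant of the rule's pattern, and it assumes the rules sit in the .htaccess file of the site root, where the pattern is matched against the path without a leading slash.

```python
import re

# Mirrors the two RewriteCond lines: either substring, case-insensitive ([NC]).
BLOCKED_AGENTS = re.compile(r"Google|Yandex", re.IGNORECASE)

# Anchored, dot-escaped pattern for the index files in the site root.
INDEX_FILE = re.compile(r"^(index\.php|index\.html?)$")


def is_forbidden(user_agent, relative_path):
    """Return True if the request would be answered with 403 Forbidden.

    `relative_path` is the requested path relative to the site root,
    without a leading slash, e.g. "index.php".
    """
    agent_matches = BLOCKED_AGENTS.search(user_agent) is not None
    path_matches = INDEX_FILE.search(relative_path) is not None
    return agent_matches and path_matches


GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER = "Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0"

bot_on_main = is_forbidden(GOOGLEBOT, "index.php")
bot_on_article = is_forbidden(GOOGLEBOT, "article.html")
human_on_main = is_forbidden(BROWSER, "index.php")
```

Anchoring with ^ and $ keeps the block limited to the index files in the root, so a file such as subindex.html or a page in a subdirectory is not affected in this sketch.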

