Managing site indexing

An approach to indexing website pages, managing website indexing, and combating duplicate content


Our approach to page indexing, expressed as a set of settings, follows the recommendations of search engines and SEO specialists and draws on practical observation of how launched websites are indexed. We continue to accept and implement new, well-reasoned recommendations.

We use a combination of several tools to manage website indexing and combat duplicate content:

  • Disallow rules in the robots.txt file reduce the server load that occurs when search engines crawl a large number of filter pages.

  • The robots tag with the noindex value ensures that unnecessary pages are not indexed and do not end up in the secondary index.

  • Canonical pages (rel canonical) consolidate identical pages into one, for those pages that still need to be indexed.

Below is a detailed description of the indexing settings for each type of page.

Category pages

Category pages are the main landing pages for promotion, so they are open for indexing by default. If necessary, individual category pages can be closed for indexing.

How it is implemented:

  • Category pages are not closed for indexing in robots.txt;

  • By default, category pages do not have the robots tag and thus are open for indexing.

If necessary, you can control the content of the robots tag for each category page. In the page properties, you can enable the nofollow and/or noindex attributes. If you enable those, they will also be set for all products for which this category is the parent. However, this setting will not affect child categories and their products.

Links to these pages are present in the sitemap.xml file (unless the Include to sitemap option is disabled for the page).
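The inheritance rule described above can be sketched in a few lines of Python. This is only an illustration: the `Category` class and the helper names are hypothetical, and the sketch assumes that the robots tag lists only the attributes that were enabled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Category:
    """Hypothetical model of a category page and its robots settings."""
    name: str
    noindex: bool = False
    nofollow: bool = False
    parent: Optional["Category"] = None


def robots_meta(noindex: bool, nofollow: bool) -> str:
    """Return a robots meta tag, or "" when no attribute is enabled."""
    values = [name for name, on in (("noindex", noindex), ("nofollow", nofollow)) if on]
    if not values:
        return ""  # no tag at all: the page stays open for indexing
    return '<meta name="robots" content="{}">'.format(", ".join(values))


def category_robots_meta(category: Category) -> str:
    # Only the category's own flags matter; parent settings are not inherited.
    return robots_meta(category.noindex, category.nofollow)


def product_robots_meta(parent_category: Category) -> str:
    # A product takes the flags of its direct parent category.
    return robots_meta(parent_category.noindex, parent_category.nofollow)
```

For example, closing a "Shoes" category affects its own products, while a child category "Sneakers" and its products stay open unless they are closed separately.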

Filter pages

All filter pages are closed for indexing by default. Since there are a lot of filter combinations and the content of the pages is displayed dynamically, even accessing these pages from several search engines at the same time can create a significant load on the server.

How it is implemented:

  • In robots.txt, all filter combinations that contain more than three filters at the same time are disallowed.

  • All filter pages have the robots tag with the noindex, follow values and a canonical tag that links to the category page without filters.

Filters to index

It is also possible to open some filters or their combinations (up to two filters at a time) for indexing using the Filters to index feature.

When setting up indexed filters, you can specify the category for which they will be opened (you can specify the catalog root) and select 1 or 2 filter properties that will be used to open them for indexing. If two properties are specified, pages with filters by each of these properties separately and by a combination of these two properties will be opened. But filters by two values for one property are always closed for indexing.

To open page indexing, the following settings are used:

  • The pages are not initially closed from indexing in robots.txt;

  • These pages contain the robots tag with the index, follow values and a canonical tag that links to the indexed filter page itself.
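As a rough illustration of how the Filters to index rules combine, the Python sketch below decides whether a filter page is open for indexing and which head tags it would carry. The function names and the data shapes (a mapping of selected properties to values, and a set of 1 or 2 rule properties) are assumptions made for the example; category scoping of the rule is left out for brevity.

```python
from typing import Dict, List, Set


def is_open_for_indexing(selected: Dict[str, List[str]], rule_properties: Set[str]) -> bool:
    """Check a filter selection against one hypothetical Filters to index rule."""
    if not selected or not 1 <= len(rule_properties) <= 2:
        return False
    # Every selected property must be covered by the rule.
    if not set(selected) <= rule_properties:
        return False
    # Two values for one property are always closed for indexing.
    return all(len(values) == 1 for values in selected.values())


def filter_page_head(filter_url: str, category_url: str,
                     selected: Dict[str, List[str]],
                     rule_properties: Set[str]) -> List[str]:
    """Return the robots and canonical tags for a filter page."""
    if is_open_for_indexing(selected, rule_properties):
        return ['<meta name="robots" content="index, follow">',
                f'<link rel="canonical" href="{filter_url}">']
    # Default: closed, with the canonical pointing to the unfiltered category.
    return ['<meta name="robots" content="noindex, follow">',
            f'<link rel="canonical" href="{category_url}">']
```

With a rule for the properties brand and color, a page filtered by one brand, by one color, or by one brand plus one color would be opened, while a page filtered by two brands would stay closed.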

Sorting and product display pages

Sorting pages (contain filter/sort_ in the link) and pages with a different product display format (contain view_type= in the link) are exact duplicates, so they are closed for indexing by every available means and cannot be opened for indexing.

  • Sorting/display pages are closed in robots.txt;

  • By default, these pages have the <meta name="robots" content="noindex, follow"> tag, which disables indexing but still lets search engines follow the links on the page;

  • These pages have a canonical tag that points to the same page without the sort or display format;

  • These pages are not included in the sitemap.xml.
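The checks listed above can be sketched in Python as follows. Only the URL markers filter/sort_ and view_type= come from this article; the function names are hypothetical, and the clean URL is passed in as a parameter because the exact URL structure is not described here.

```python
from typing import List


def is_sort_or_display_page(url: str) -> bool:
    """Detect sorting and display-format pages by their URL markers."""
    return "filter/sort_" in url or "view_type=" in url


def sort_page_head(url: str, clean_url: str) -> List[str]:
    """Return head tags for a sorting/display page, or [] for other pages.

    clean_url is the same page without the sort or display format.
    """
    if not is_sort_or_display_page(url):
        return []
    return ['<meta name="robots" content="noindex, follow">',
            f'<link rel="canonical" href="{clean_url}">']
```

A sitemap generator could use the same `is_sort_or_display_page` check to skip these URLs.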

Pagination pages

  • Pagination pages contain unique content (different products), so all of them should be indexed.

  • Pagination pages are not closed for indexing in robots.txt or using the robots tag.

  • To consolidate all pages into one series, in line with Google's recommendations, we use the <link rel="next"> and <link rel="prev"> tags.

  • For all pagination pages, we do not use the canonical tag with a link to the first page.

  • All pagination pages have a canonical tag with a link to their own pagination page.

  • Only the page=all page has a canonical tag with a link to the first page, because it does not have the <link rel="next"> and <link rel="prev"> tags.

  • Pagination pages are not included in the sitemap.xml.

There are also two alternative settings for pagination pages that contradict Google's requirements but appear in the recommendations of some SEO companies. These options are disabled by default, but can be enabled in the site admin panel in the SEO > Additional SEO settings section:

  • Set the canonical tag with a link to the first page of the pagination.

  • On all pagination pages except the first one, set the robots tag with the noindex, follow values.
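The default pagination behavior and the two optional settings can be sketched together in Python. This is an illustration under stated assumptions: the first page is assumed to live at the bare category URL, later pages are assumed to use a ?page=N parameter, and the sketch assumes the prev/next links are kept when an optional setting is enabled. The function and flag names are hypothetical.

```python
from typing import List


def page_url(base_url: str, page: int) -> str:
    # Assumption: page 1 is the bare URL, later pages use ?page=N.
    return base_url if page == 1 else f"{base_url}?page={page}"


def pagination_head(base_url: str, page: int, total_pages: int,
                    canonical_to_first: bool = False,
                    noindex_after_first: bool = False) -> List[str]:
    """Return head tags for one pagination page."""
    if not 1 <= page <= total_pages:
        raise ValueError("page is out of range")
    tags = []
    if noindex_after_first and page > 1:
        tags.append('<meta name="robots" content="noindex, follow">')
    # Default: each page is canonical to itself.
    canonical = base_url if canonical_to_first else page_url(base_url, page)
    tags.append(f'<link rel="canonical" href="{canonical}">')
    if page > 1:
        tags.append(f'<link rel="prev" href="{page_url(base_url, page - 1)}">')
    if page < total_pages:
        tags.append(f'<link rel="next" href="{page_url(base_url, page + 1)}">')
    return tags


def page_all_head(base_url: str) -> List[str]:
    # The page=all page has no prev/next links, so it points to the first page.
    return [f'<link rel="canonical" href="{base_url}">']
```

Calling `pagination_head(base_url, 2, 3)` with both flags left at `False` reflects the default settings: a self-referencing canonical plus prev and next links.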

Brand pages

Brand pages are indexed using the same logic as category pages:

  • Root links of brand pages are not closed in robots.txt.

  • Filter pages with more than two levels, sorting pages, and pages with a different display format are disallowed in robots.txt.

  • The sorting and different display format pages are closed using the robots tag with the noindex, follow values.

  • Pagination pages are open for indexing according to the same logic as category pagination pages.

Product variants

  • Each product variant has its own URL.

  • At the same time, all these pages contain a canonical link to the page of the main variant.

  • Product pages are not closed for indexing in robots.txt.

  • Pages can be closed for indexing using the robots tag if their parent category is closed.
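A minimal Python sketch of the variant rules, for illustration only: the `Variant` class and `variant_head` function are hypothetical, and the sketch assumes a closed parent category results in a plain noindex value on the product page.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Variant:
    """Hypothetical model of one product variant page."""
    url: str
    is_main: bool = False


def variant_head(variants: List[Variant], parent_category_noindex: bool = False) -> List[str]:
    """Return the head tags shared by every variant page of one product."""
    main = [v for v in variants if v.is_main]
    if len(main) != 1:
        raise ValueError("exactly one main variant is expected")
    tags = []
    if parent_category_noindex:
        # Assumption: a closed parent category adds a noindex robots tag.
        tags.append('<meta name="robots" content="noindex">')
    # Every variant page points to the main variant as canonical.
    tags.append(f'<link rel="canonical" href="{main[0].url}">')
    return tags
```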

Filter presets

  • Pages with filter presets are created specifically for promotion and therefore they are open for indexing by default, without the ability to close their indexing.

  • There is no ban on indexing filter presets in robots.txt.

  • There are no robots tags on the preset pages.

  • Preset pages have a canonical tag that leads to a similar preset page.

  • Links to preset pages can be found in filters and in the sitemap.xml file.

Personal account, checkout

The checkout and personal account pages are closed for indexing in robots.txt and using the robots tag with the noindex, follow values.

Text pages, news pages

  • They are not closed for indexing in robots.txt.

  • By default, they do not contain the robots tag.

  • In the properties of each page, you can configure the value of the robots tag, which allows you to exclude individual pages from indexing.

  • Pages with the noindex value are not included in sitemap.xml.

Product comparison

The product comparison table is displayed on the catalog pages, without generating separate pages with their own links. Therefore, product comparison pages do not exist as such and are not indexed.

Language versions

By default, all language versions available to users are open for indexing. If necessary, you can close each language version for indexing separately.

  • Language versions are not closed for indexing in robots.txt.

  • Links to alternative translations are provided in the head block of each page and in sitemap.xml.

Language versions have the No index option. If it is enabled, the following settings take effect:

  • For all pages of this language version, the robots tag is set with the noindex, nofollow values.

  • Links to this language version are not placed in the head block or in sitemap.xml.

  • In the header, the link to a language version that is closed for indexing is marked with the rel="nofollow" attribute.
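The effect of the No index option can be sketched in Python as shown below. The class and function names are hypothetical, and the use of hreflang alternate links for translations is an assumption, since this article only says that links to alternative translations are placed in the head block.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LanguageVersion:
    """Hypothetical model of one language version of the site."""
    code: str
    url: str
    no_index: bool = False


def alternate_links(versions: List[LanguageVersion]) -> List[str]:
    """Head-block links to translations, skipping versions closed for indexing."""
    return [f'<link rel="alternate" hreflang="{v.code}" href="{v.url}">'
            for v in versions if not v.no_index]


def language_robots_meta(version: LanguageVersion) -> str:
    """Robots tag for every page of a language version ("" when open)."""
    if version.no_index:
        return '<meta name="robots" content="noindex, nofollow">'
    return ""


def switcher_link(version: LanguageVersion) -> str:
    """Header link to a language version; closed versions get rel="nofollow"."""
    rel = ' rel="nofollow"' if version.no_index else ""
    return f'<a href="{version.url}"{rel}>{version.code}</a>'
```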

Search results

All pages with search results have the robots tag set to noindex, follow by default.
