Improve Magento SEO Using The robots.txt File

Magento is a great open-source e-commerce system, but when it comes to search optimisation it needs some help. There are some simple changes you can make that will improve your standing in the search engines.

When working with Magento we found that the best way to improve SEO was using extensions. These often cost money, with some priced at $250 for a single extension. We wish Magento were better equipped for search optimisation without needing extensions.

Search engines use automated bots that visit websites, follow links and report what they find back to the search index. The robots.txt file is used to help these search engine bots (like GoogleBot and BingBot) determine what content they should look at.

Creating a robots.txt file for Magento is important if you want to improve your store's optimisation. By default there is no robots.txt file in the Magento Community or Enterprise distributions, so you need to create it yourself.
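As a quick sketch (assuming your Magento root directory is also the web root for your domain, so the file will be served at yourstore.example/robots.txt), you can create the file from a shell:

```shell
# Run from the Magento root directory (assumed here to be the document
# root, so the file is served at https://yourstore.example/robots.txt).
cat > robots.txt <<'EOF'
## Enable robots.txt rules for all crawlers
User-agent: *

## Example rule - stop crawlers visiting the checkout pages
Disallow: /checkout/
EOF
```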

Please note: The robots.txt file is set up once for each domain. If you have multiple domains or sub-domains for your Magento store then you’ll need to copy the robots.txt file to the other domains.
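Rather than physically copying the file between document roots, one possible approach on Apache servers is a rewrite rule that serves a different robots file per host name. This is a sketch assuming Apache with mod_rewrite enabled; the domains and the robots-*.txt file names are hypothetical examples:

```apache
# .htaccess in the shared web root - serve a per-domain robots file.
# Requires mod_rewrite; domains and file names below are examples only.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^(www\.)?example\.co\.uk$ [NC]
RewriteRule ^robots\.txt$ robots-example.co.uk.txt [L]
RewriteCond %{HTTP_HOST} ^(www\.)?example\.com$ [NC]
RewriteRule ^robots\.txt$ robots-example.com.txt [L]
```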

How Does robots.txt Improve Magento SEO?

  • It will help prevent duplicate content issues that could damage your ranking in search engines
  • Magento creates a lot of pages to sort and filter products. These pages don't need to be indexed by Google, so we use the file to control this
  • It can speed up your website by blocking and reducing the number of server file requests
  • You can help prevent error logs, reports, core files and .svn/.git files from being indexed accidentally

Example Magento robots.txt File

You should never blindly copy and paste example files and use them on your store without reviewing them first. Every Magento store has its own structure, so you may need to change the robots.txt file below to suit your needs.

## Enable robots.txt rules for all crawlers
User-agent: *

## Don't crawl development files and folders
Disallow: /*.cvs
Disallow: /*.svn
Disallow: /*.idea
Disallow: /*.sql
Disallow: /*.tgz

## Don't crawl Magento admin page
Disallow: /admin/

## Don't crawl common Magento folders
Disallow: /404/
Disallow: /app/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /errors/
Disallow: /includes/
Disallow: /magento/
Disallow: /media/*
Disallow: /pkginfo/
Disallow: /report/
Disallow: /scripts/
Disallow: /shell/
Disallow: /skin/
Disallow: /stats/
Disallow: /var/

## Don't crawl common Magento files
Disallow: /api.php
Disallow: /cron.php
Disallow: /cron.sh
Disallow: /error_log
Disallow: /get.php
Disallow: /install.php
Disallow: /LICENSE.html
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /README.txt
Disallow: /RELEASE_NOTES.txt
Disallow: /STATUS.txt

## Don't crawl sub-category pages that are sorted or filtered
Disallow: /*?dir*
Disallow: /*?dir=desc
Disallow: /*?dir=asc
Disallow: /*?limit=all
Disallow: /*?mode*

## Do not crawl links with session IDs
Disallow: /*?SID=

## Don't crawl the checkout and user account pages
Disallow: /checkout/
Disallow: /onestepcheckout/
Disallow: /customer/
Disallow: /customer/account/
Disallow: /customer/account/login/

## Don't crawl search pages and catalogue links
Disallow: /index.php/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
Disallow: /catalog/product/gallery/
Disallow: /catalogsearch/
Disallow: /control/
Disallow: /contacts/
Disallow: /customer/
Disallow: /customize/
Disallow: /newsletter/
Disallow: /poll/
Disallow: /review/
Disallow: /sendfriend/
Disallow: /tag/
Disallow: /wishlist/
Disallow: /checkout/
Disallow: /onestepcheckout/

## Don't crawl common server folders / files
Disallow: /cgi-bin/
Disallow: /cleanup.php
Disallow: /apc.php
Disallow: /memcache.php
Disallow: /phpinfo.php
## Paths that can be safely ignored (no clean URLs)
Disallow: /*?p=*&
Disallow: /*.php$
Disallow: /*?SID=
## Un-comment if you don't want Google and Bing to index your images (Not recommended)
# User-agent: Googlebot-Image
# Disallow: /
# User-agent: msnbot-media
# Disallow: /

There are Magento extensions that can be installed to give you more flexibility over what pages are indexed in the search results, but this is the simplest approach that will help improve your search engine optimisation efforts.

Check Everything Is Working

There are a number of robots.txt file checkers available online, but these three are the ones we commonly use.
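As a quick local sanity check, Python's standard library can also parse a robots.txt file and tell you whether a given URL is allowed to be crawled. This is a minimal sketch using a few of the prefix rules from the example file above; note that urllib.robotparser does not understand the `*` wildcard patterns, so only plain path prefixes are tested here, and the store URLs are hypothetical:

```python
import urllib.robotparser

# A few prefix-based rules from the example file (wildcard rules such as
# "Disallow: /*?SID=" are not understood by urllib.robotparser).
rules = [
    "User-agent: *",
    "Disallow: /checkout/",
    "Disallow: /catalogsearch/",
    "Disallow: /app/",
]

parser = urllib.robotparser.RobotFileParser()
parser.parse(rules)

# A product page should still be crawlable:
print(parser.can_fetch("*", "https://example.com/some-product.html"))  # True
# Checkout and catalogue search pages should be blocked:
print(parser.can_fetch("*", "https://example.com/checkout/cart/"))  # False
print(parser.can_fetch("*", "https://example.com/catalogsearch/result/"))  # False
```

This only checks the plain-prefix rules locally; for the wildcard rules, use one of the online checkers mentioned above, which follow the matching behaviour of the actual crawlers.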

Affiliate Disclosure:

We may link to products and online services provided by third parties. Some of the links that we post on our site are affiliate links, which means that we receive a commission if you purchase the item. We will never recommend a product or service that we have not used ourselves. Our reviews will be honest and we will only recommend something if we have found it useful.

Disclaimer:

Lacey Tech Solutions publish blog articles to help small businesses. We are not liable for any damages if you choose to follow the advice from our blog.
