Magento robots.txt

robots.txt for Magento 1.7.0.2 and later

Example robots.txt for Magento

Set up today on a new shop (Magento 1.7.0.2); a test report will follow later. The settings are far-reaching. Suggestions for changes or additions are welcome!

Should the /media folder be indexable by bots or not (see also the last lines on “Googlebot-Image” and “msnbot-media”)? Should the rules that use the wildcard ‘*’ be kept as they are?
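To get a feel for the wildcard question, here is a minimal Python sketch of how a crawler that supports `*` and a trailing `$` (as Google documents for Googlebot) would match a few of the Disallow patterns from the file below. The helper names `pattern_to_regex` and `is_blocked` and the sample paths are made up for illustration. Allow rules and longest-match precedence are ignored, and other crawlers may interpret wildcards differently.

```python
import re

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any character sequence, a trailing '$'
    anchors the end of the URL, everything else is a literal prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal parts and join them with '.*' for each wildcard
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def is_blocked(path, disallow_patterns):
    """Return True if any Disallow pattern matches the path (Allow rules ignored)."""
    return any(pattern_to_regex(p).match(path) is not None for p in disallow_patterns)

# A few patterns taken from the robots.txt in this post
rules = ["/*?dir*", "/*?limit=all", "/*?SID=", "/*.sql$"]

# Hypothetical sample paths
for path in [
    "/shoes.html?dir=asc",
    "/shoes.html?p=2&dir=asc",
    "/media/catalog/product/a.jpg",
    "/dump.sql",
]:
    print(path, is_blocked(path, rules))
```

Under these assumptions, `/shoes.html?dir=asc` and `/dump.sql` are blocked, while `/shoes.html?p=2&dir=asc` and the /media path are not: `?dir` only matches when `dir` is the first query parameter. Because rules without `$` are prefix matches, the trailing `*` in `/*?dir*` should make no difference.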

## robots.txt for Magento Community and Enterprise
## Adapted from: http://turnkeye.com/blog/optimize-robots-txt-for-magento/
## 16.04.2013 by zenzero.ch | Validator: http://whois.net/robots-txt-validator/
#
## GENERAL SETTINGS
#
## Enable robots.txt rules for all crawlers
User-agent: *
#
## Crawl-delay parameter: number of seconds to wait between successive requests to the same server.
## Set a custom crawl rate if you're experiencing traffic problems with your server.
# Crawl-delay: 30
#
## Magento sitemap: replace the URL with the URL of your Magento sitemap file
Sitemap: http://example.ch/sitemap.xml
#
#
## DEVELOPMENT RELATED SETTINGS
## Do not crawl development files and folders: CVS, svn directories and dump files
# Disallow: /CVS
# Disallow: /*.svn$
# Disallow: /*.idea$
# Disallow: /*.sql$
# Disallow: /*.tgz$
#
## GENERAL MAGENTO SETTINGS
#
## Do not crawl Magento admin page
Disallow: /admin/
#
## Do not crawl common Magento technical folders
Disallow: /app/
Disallow: /downloader/
Disallow: /errors/
Disallow: /includes/
Disallow: /lib/
Disallow: /pkginfo/
Disallow: /shell/
Disallow: /var/
#
## Do not crawl common Magento files
Disallow: /api.php
Disallow: /cron.php
Disallow: /cron.sh
Disallow: /error_log
Disallow: /get.php
Disallow: /install.php
Disallow: /LICENSE.html
Disallow: /LICENSE.txt
Disallow: /LICENSE_AFL.txt
Disallow: /README.txt
Disallow: /RELEASE_NOTES.txt
#
## MAGENTO SEO IMPROVEMENTS
#
## Do not crawl subcategory pages that are sorted or filtered.
Disallow: /*?dir*
Disallow: /*?dir=desc
Disallow: /*?dir=asc
Disallow: /*?limit=all
Disallow: /*?mode*
#
## Do not crawl the second copy of the home page (example.com/index.php/). Uncomment this only if you have activated Magento SEO URLs.
# Disallow: /index.php/
#
## Do not crawl links with session IDs
Disallow: /*?SID=
#
## Do not crawl checkout and user account pages
Disallow: /checkout/
Disallow: /onestepcheckout/
Disallow: /customer/
Disallow: /customer/account/
Disallow: /customer/account/login/
#
## Do not crawl search pages and catalog links that are not SEO-optimized
Disallow: /catalogsearch/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
#
## SERVER SETTINGS
#
## Do not crawl common server technical folders and files
Disallow: /cgi-bin/
Disallow: /cleanup.php
Disallow: /apc.php
Disallow: /memcache.php
Disallow: /phpinfo.php
#
## IMAGE CRAWLERS SETTINGS
#
## Extra: Uncomment if you do not wish Google and Bing to index your images
# User-agent: Googlebot-Image
# Disallow: /
# User-agent: msnbot-media
# Disallow: /
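
On the /media question: a quick way to sanity-check the plain (non-wildcard) rules is Python's standard `urllib.robotparser`. The sketch below feeds it a shortened copy of the rules above, with the Googlebot-Image group uncommented, and asks which paths each crawler may fetch. The sample paths and the `allowed` helper are hypothetical. `urllib.robotparser` does simple prefix matching only, so the wildcard rules are deliberately left out.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the rules, with the image crawler group uncommented
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Disallow: /customer/

User-agent: Googlebot-Image
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def allowed(agent, path):
    """Return True if the given crawler may fetch the path under the rules above."""
    return parser.can_fetch(agent, path)

# Hypothetical sample paths
for agent, path in [
    ("Googlebot", "/media/catalog/product/a.jpg"),
    ("Googlebot-Image", "/media/catalog/product/a.jpg"),
    ("Googlebot", "/customer/account/login/"),
]:
    print(agent, path, allowed(agent, path))
```

In this sketch the regular Googlebot may still fetch the image under /media, while Googlebot-Image may not. It also shows that `/customer/` already covers `/customer/account/login/`, so the two more specific customer rules are redundant.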

By Andy

Dipl. Web Publisher EB-Zürich