Correct robots.txt

The robots.txt file is the main file that tells search engine robots how to process the pages of a site. It specifies the primary site name, the site map (sitemap.xml), and which sections of the site are open or closed to indexing.
The robots.txt file can include the following directives:

  • User-agent - names the robot that the following rules apply to
    • * - all robots
    • Yandex - the main Yandex robot
    • Googlebot - the main Google robot
    • StackRambler - the Rambler robot
    • Aport - the Aport search engine robot
    • Slurp - the Yahoo robot
    • MSNBot - the MSN robot
  • Disallow - prohibits indexing of a part of the site
  • Allow - permits indexing of a part of the site
  • Host - specifies the primary site name
  • Sitemap - specifies the location of the site map (sitemap.xml)
  • Crawl-delay - sets the minimum pause, in seconds, between a robot's successive requests to the site (useful for heavily loaded sites, so that the robot's requests do not make the site unavailable)
  • Clean-param - describes dynamic URL parameters that do not affect the page content
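
As a rough illustration of how User-agent groups, Disallow, Allow and Crawl-delay interact, here is a minimal sketch using Python's standard urllib.robotparser module. The paths /search/ and /admin/ and the delay value are hypothetical, chosen only for the example. The standard parser applies the first matching rule in a group and ignores directives it does not know (such as Host and Clean-param), so real search robots may behave differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: the paths and the delay value are illustrative only.
ROBOTS_TXT = """\
User-agent: Yandex
Disallow: /search/
Crawl-delay: 5

User-agent: *
Disallow: /admin/
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A robot with its own group uses only that group; others fall back to "*".
print(parser.can_fetch("Yandex", "/search/results"))  # False
print(parser.can_fetch("Yandex", "/admin/"))          # True
print(parser.can_fetch("Googlebot", "/admin/"))       # False
print(parser.can_fetch("Googlebot", "/home/"))        # True

# Crawl-delay is reported per robot; None means no value was set.
print(parser.crawl_delay("Yandex"))     # 5
print(parser.crawl_delay("Googlebot"))  # None
```

Note that the Yandex robot in this sketch is allowed into /admin/, because its own group does not mention that section.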

In addition to the directives, robots.txt rules can use special characters:

  • * - any sequence of characters (including an empty one)
  • $ - marks the end of the rule, so the pattern must match up to the end of the URL
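
The effect of the two special characters can be sketched by translating a rule into a regular expression. This is only an approximation under simple assumptions: the helper names rule_to_regex and rule_matches are hypothetical, the sample paths are made up, and real robots may differ in details such as URL encoding.

```python
import re

def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape every literal piece, then put ".*" wherever "*" stood.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def rule_matches(pattern: str, path: str) -> bool:
    # re.match anchors at the start of the path, like a rule prefix does.
    return rule_to_regex(pattern).match(path) is not None

print(rule_matches("/private*", "/private/page.html"))  # True
print(rule_matches("/*.pdf$", "/docs/file.pdf"))        # True
print(rule_matches("/*.pdf$", "/docs/file.pdf?x=1"))    # False
print(rule_matches("/catalog", "/catalog/item"))        # True
```

Without $, a rule behaves as a prefix, which is why /catalog also covers /catalog/item.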

To compile robots.txt, combine the directives and special characters above in the following order:

  • Specify the name of the robot that the list of rules is written for
    (User-agent: * - rules for all robots)
  • List the sections of the site that are prohibited for that robot
    (Disallow: / - prohibit indexing of the entire site)
  • List the sections of the site that are allowed
    (Allow: /home/ - the /home/ section is allowed)
  • Specify the primary name of the site
    (Host: crazysquirrel.ru - the primary site name is crazysquirrel.ru)
  • Specify the absolute URL of the sitemap.xml file
    (Sitemap: http://crazysquirrel.ru/sitemap.xml)
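
The order of the steps above can be sketched as a small Python helper that assembles the file text. The function name build_robots_txt and the /admin/ section are hypothetical and used only for illustration. The host and sitemap values come from the examples in the list.

```python
def build_robots_txt(user_agent, disallow, allow, host, sitemap):
    """Assemble robots.txt text in the order described in the list above."""
    lines = [f"User-agent: {user_agent}"]                # 1. robot name
    lines += [f"Disallow: {path}" for path in disallow]  # 2. prohibited sections
    lines += [f"Allow: {path}" for path in allow]        # 3. allowed sections
    lines.append(f"Host: {host}")                        # 4. primary site name
    lines.append(f"Sitemap: {sitemap}")                  # 5. absolute sitemap URL
    return "\n".join(lines) + "\n"

# "/admin/" is a made-up section used only for this example.
text = build_robots_txt(
    user_agent="*",
    disallow=["/admin/"],
    allow=["/home/"],
    host="crazysquirrel.ru",
    sitemap="http://crazysquirrel.ru/sitemap.xml",
)
print(text)
```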

If the site has no prohibited sections, robots.txt must contain at least 4 lines:

User-Agent: *
Allow: /
Host: crazysquirrel.ru
Sitemap: http://crazysquirrel.ru/sitemap.xml
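
For a quick local sanity check of this four-line file, one option is Python's standard urllib.robotparser. This is a sketch only: the page URL below is made up, site_maps() needs Python 3.8 or later, and the standard parser skips the Host line because it does not recognize that directive.

```python
from urllib.robotparser import RobotFileParser

MINIMAL = """\
User-Agent: *
Allow: /
Host: crazysquirrel.ru
Sitemap: http://crazysquirrel.ru/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(MINIMAL.splitlines())

# Any robot may fetch any page, since the only rule is "Allow: /".
print(parser.can_fetch("Googlebot", "http://crazysquirrel.ru/any/page.html"))  # True

# The Sitemap line is collected separately from the per-robot rules.
print(parser.site_maps())  # ['http://crazysquirrel.ru/sitemap.xml']
```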

You can check robots.txt and how it affects the indexing of the site using Yandex tools.