Here you will find a detailed list of web crawlers and their different user agents.

The Disallow directive

The second line of any directive block is the "Disallow" line. You can have multiple "Disallow" directives specifying which parts of your site the crawler cannot access. An empty "Disallow" line means you are not disallowing anything: a crawler can access all sections of your site. For example, if you want to allow all search engines to crawl your entire site, your block would look like this:
User-agent: *
Allow: /

On the other hand, if you want to block all search engines from crawling your site, your block would look like this:

User-agent: *
Disallow: /

The "Allow" and "Disallow" directives are not case sensitive, so it is up to you whether to capitalize them. However, the values each directive contains are case sensitive: /photo/ is not the same as /Photo/. That said, you will often see "Allow" and "Disallow" written with capital letters, because it makes the file easier for humans to read.
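If you want to sanity-check how these rules behave before publishing them, Python's standard-library urllib.robotparser module can evaluate a robots.txt file locally. The following is a minimal sketch: the rules and the example.com URLs are made-up placeholders, and it relies on the stdlib parser matching path prefixes case-sensitively, just as described above.

    from urllib import robotparser

    # Hypothetical rules for illustration: block /photo/ for every crawler.
    rules = """\
    User-agent: *
    Disallow: /photo/
    """

    parser = robotparser.RobotFileParser()
    parser.parse(rules.splitlines())

    # Path values are case sensitive: /photo/ is blocked, /Photo/ is not.
    print(parser.can_fetch("*", "https://example.com/photo/cat.jpg"))  # False
    print(parser.can_fetch("*", "https://example.com/Photo/cat.jpg"))  # True

    # "Disallow: /" blocks the whole site; an empty "Disallow:" blocks nothing.
    block_all = robotparser.RobotFileParser()
    block_all.parse("User-agent: *\nDisallow: /".splitlines())
    print(block_all.can_fetch("*", "https://example.com/any-page"))  # False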

The Allow directive

The "Allow" directive lets search engines crawl a specific subdirectory or page, even inside an otherwise disallowed directory. For example, if you want to block Googlebot from accessing everything in your blog except one post, your directive might look like this:

User-agent: Googlebot
Disallow: /blog
Allow: /blog/example-post

Note: not all search engines recognize this directive, but Google and Bing do take it into account; the sketch after this section shows the precedence in action.

The Sitemap directive

The "Sitemap" directive tells search engines, including Bing, Yandex, and Google, where to find your XML sitemap. Sitemaps typically list the pages you want search engines to crawl and index.
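To see the "Allow" precedence and the "Sitemap" directive together, here is a small sketch using the same standard-library module. The file contents and URLs are invented for illustration. One caveat worth flagging: urllib.robotparser applies the first matching rule rather than Google's most-specific-match rule, so the Allow line is placed before the Disallow line here to get the intended result.

    from urllib import robotparser

    # Hypothetical robots.txt combining the directives discussed above.
    # The Allow line comes first because the stdlib parser uses the first
    # matching rule; Google itself prefers the most specific (longest) match.
    rules = """\
    User-agent: Googlebot
    Allow: /blog/example-post
    Disallow: /blog

    Sitemap: https://example.com/sitemap.xml
    """

    parser = robotparser.RobotFileParser()
    parser.parse(rules.splitlines())

    # /blog is blocked, but the allowed post remains crawlable.
    print(parser.can_fetch("Googlebot", "https://example.com/blog/other-post"))    # False
    print(parser.can_fetch("Googlebot", "https://example.com/blog/example-post"))  # True

    # site_maps() (Python 3.8+) returns the Sitemap URLs found in the file.
    print(parser.site_maps())  # ['https://example.com/sitemap.xml']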