{"id":124203,"date":"2022-03-28T17:11:52","date_gmt":"2022-03-29T00:11:52","guid":{"rendered":"https:\/\/www.bruceclay.com\/?p=124203"},"modified":"2023-08-07T12:44:12","modified_gmt":"2023-08-07T19:44:12","slug":"robots-txt-guide","status":"publish","type":"post","link":"https:\/\/www.bruceclay.com\/blog\/robots-txt-guide\/","title":{"rendered":"What Is robots.txt? A Beginner\u2019s Guide to Nailing It with Examples"},"content":{"rendered":"<p><a href=\"https:\/\/www.bruceclay.com\/wp-content\/uploads\/2022\/03\/robot-800.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-124205 size-full\" src=\"https:\/\/www.bruceclay.com\/wp-content\/uploads\/2022\/03\/robot-800.jpg\" alt=\"Wooden robot figure stands on a patch of grass.\" width=\"800\" height=\"420\" srcset=\"https:\/\/www.bruceclay.com\/wp-content\/uploads\/2022\/03\/robot-800.jpg 800w, https:\/\/www.bruceclay.com\/wp-content\/uploads\/2022\/03\/robot-800-300x158.jpg 300w, https:\/\/www.bruceclay.com\/wp-content\/uploads\/2022\/03\/robot-800-768x403.jpg 768w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/><\/a><br \/>\nAh, robots.txt \u2014 one teeny tiny file with big implications. This is one technical SEO element you don\u2019t want to get wrong, folks.<\/p>\n<p>In this article, I will explain why every website needs a robots.txt and how to create one (without causing problems for SEO). I\u2019ll answer common FAQs and include examples of how to execute it properly for your website. I\u2019ll also give you a downloadable guide that covers all the details.<\/p>\n<p>Contents:<\/p>\n<ul>\n<li><a href=\"#what-is-robots-txt\"><strong>What is robots.txt?<\/strong><\/a><\/li>\n<li><a href=\"#why-is-robots-important\"><strong>Why is robots.txt important?<\/strong><\/a><\/li>\n<li><a href=\"#but-is-robots-txt-necessary\"><strong>But, is robots.txt necessary?<\/strong><\/a><\/li>\n<li><a href=\"#what-problems-can-occur-with-robots-txt\"><strong>What problems can occur with robots.txt?<\/strong><\/a><\/li>\n<li><a href=\"#how-does-robots-txt-work\"><strong>How does robots.txt work?<\/strong><\/a><\/li>\n<li><a href=\"#tips-for-creating-robots-txt-without-errors\"><strong>Tips for creating a robots.txt without errors<\/strong><\/a><\/li>\n<li><a href=\"#the-robots-txt-tester\"><strong>The robots.txt Tester<\/strong><\/a><\/li>\n<li><a href=\"#robots-exclusion-protocol-guide\"><strong>Robots Exclusion Protocol Guide (free download)<\/strong><\/a><\/li>\n<li><a href=\"#faq-how-can-I-optimize-websites-performance-effective-robots-txt-file\"><strong>FAQ: How can I optimize my website&#8217;s performance with an effective robots.txt file?<\/strong><\/a><\/li>\n<\/ul>\n<h2><a id=\"what-is-robots-txt\"><\/a>What Is robots.txt?<\/h2>\n<p>Robots.txt is a text file that website publishers create and save at the root of their website. Its purpose is to tell automated web crawlers, such as search engine bots which pages not to crawl on the website. This is also known as robots exclusion protocol.<\/p>\n<p>Robots.txt does not guarantee that excluded URLs won\u2019t be indexed for search. That\u2019s because search engine spiders can still find out those pages exist via other webpages that are linking to them. Or, the pages may still be indexed from the past (more on that later).<\/p>\n<p>Robots.txt also does not absolutely guarantee a bot won\u2019t crawl an excluded page, since this is a voluntary system. It would be rare for major search engine bots not to adhere to your directives. But others that are bad web robots, like spambots, malware, and spyware, often do not follow orders.<\/p>\n<p>Remember, the robots.txt file is publicly accessible. You can just add \/robots.txt to the end of a domain URL to see its robots.txt file (like ours <a href=\"https:\/\/www.bruceclay.com\/robots.txt\">here<\/a>). So do not include any files or folders that may include business-critical information. And do not rely on the robots.txt file to protect private or sensitive data from search engines.<\/p>\n<p>OK, with those caveats out of the way, let\u2019s go on\u2026<\/p>\n<h2><a id=\"why-is-robots-important\"><\/a>Why Is robots.txt Important?<\/h2>\n<p>Search engine bots have the directive to crawl and index webpages. With a robots.txt file, you can selectively exclude pages, directories, or the entire site from being crawled.<\/p>\n<p>This can be handy in many different situations. Here are some situations you\u2019ll want to use your robots.txt:<\/p>\n<ul>\n<li>To block certain pages or files that should not be crawled\/indexed (such as unimportant or similar pages)<\/li>\n<li>To stop crawling certain parts of the website while you\u2019re updating them<\/li>\n<li>To tell the search engines the location of your sitemap<\/li>\n<li>To tell the search engines to ignore certain files on the site, like videos, audio files, images, PDFs, etc., and not have them show up in the search results<\/li>\n<li>To help ensure your server is not overwhelmed with requests*<\/li>\n<\/ul>\n<p>*Using robots.txt to block off unnecessary crawling is one way to reduce the strain on your server and help bots more efficiently find your good content. Google provides a handy chart <a href=\"https:\/\/developers.google.com\/search\/docs\/advanced\/robots\/intro\" target=\"_blank\" rel=\"noopener\">here<\/a>. Also, Bing supports the crawl-delay directive, which can help to prevent too many requests and avoid overwhelming the server.<\/p>\n<p>Of course, there are many applications of robots.txt, and I\u2019ll outline more of them in this article.<\/p>\n<h2><a id=\"but-is-robots-txt-necessary\"><\/a>But, Is robots.txt Necessary?<\/h2>\n<p>Every website should have a robots.txt file, even if it is blank. When search engine bots come to your website, the first thing they look for is a robots.txt file.<\/p>\n<p>If none exists, then the spiders are served a 404 (not found) error. Although Google says that Googlebot can go on and crawl the site even if there\u2019s no robots.txt file, we believe that it is better to have the first file that a bot requests load rather than produce a 404 error.<\/p>\n<h2><a id=\"what-problems-can-occur-with-robots-txt\"><\/a>What Problems Can Occur with robots.txt?<\/h2>\n<p>This simple little file can cause problems for SEO if you\u2019re not careful. Here are a couple of situations to watch out for.<\/p>\n<h3>1. Blocking your whole site by accident<\/h3>\n<p>This gotcha happens more often than you\u2019d think. Developers can use robots.txt to hide a new or redesigned section of the site while they\u2019re developing it, but then forget to <em>unblock<\/em> it after launch. If it\u2019s an existing site, this mistake can cause search engine rankings to suddenly tank.<\/p>\n<p>It\u2019s handy to be able to turn off crawling while you\u2019re preparing a new site or site section for launch. Just remember to change that command in your robots.txt when the site goes live.<\/p>\n<h3>2. Excluding pages that are already indexed<\/h3>\n<p>Blocking in robots.txt pages that are indexed causes them to be stuck in Google\u2019s index.<\/p>\n<p>If you exclude pages that are already in the search engine\u2019s index, they\u2019ll stay there. In order to actually remove them from the index, you should set a meta robots &#8220;noindex&#8221; tag on the pages themselves and let Google crawl and process that. Once the pages are dropped from the index, then block them in robots.txt to prevent Google from requesting them in the future.<\/p>\n<h2><a id=\"how-does-robots-txt-work\"><\/a>How Does robots.txt Work?<\/h2>\n<p>To create a robots.txt file, you can use a simple application like Notepad or TextEdit. Save it with the filename <strong>robots.txt<\/strong> and upload it to the root of your website as www.domain.com\/robots.txt \u2014\u2014 this is where spiders will look for it.<\/p>\n<p>A simple robots.txt file would look something like this:<\/p>\n<p><span style=\"font-family: 'courier new', courier, monospace\">User-agent: *<\/span><br \/>\n<span style=\"font-family: 'courier new', courier, monospace\">Disallow: \/directory-name\/<\/span><\/p>\n<p>Google gives a good explanation of what the different lines in a group mean within the robots.txt file in its <a href=\"https:\/\/developers.google.com\/search\/docs\/advanced\/robots\/create-robots-txt\" target=\"_blank\" rel=\"noopener\">help file on creating robots.txt<\/a>:<\/p>\n<blockquote><p>Each group consists of multiple rules or directives (instructions), one directive per line.<\/p>\n<p>A group gives the following information:<\/p>\n<ul>\n<li>Who the group applies to (the user agent)<\/li>\n<li>Which directories or files that agent can access<\/li>\n<li>Which directories or files that agent cannot access<\/li>\n<\/ul>\n<\/blockquote>\n<p>I\u2019ll explain more about the different directives in a robots.txt file next.<\/p>\n<h3>Robots.txt Directives<\/h3>\n<p>Common syntax used within robots.txt includes the following:<\/p>\n<h4>User-agent<\/h4>\n<p>User-agent refers to the bot in which you are giving the commands (for example, Googlebot or Bingbot). You can have multiple directives for different user agents. But when you use the * character (as shown in the previous section), that is a catch-all that means all user agents. You can see a list of user agents <a href=\"https:\/\/developers.whatismybrowser.com\/useragents\/explore\/\" target=\"_blank\" rel=\"noopener\">here<\/a>.<\/p>\n<h4>Disallow<\/h4>\n<p>The Disallow rule specifies the folder, file or even an entire directory to exclude from Web robots access. Examples include the following:<\/p>\n<p>Allow robots to spider the entire website:<\/p>\n<p><span style=\"font-family: 'courier new', courier, monospace\">User-agent: *<\/span><br \/>\n<span style=\"font-family: 'courier new', courier, monospace\">Disallow:<\/span><\/p>\n<p>Disallow all robots from the entire website:<\/p>\n<p><span style=\"font-family: 'courier new', courier, monospace\">User-agent: *<\/span><br \/>\n<span style=\"font-family: 'courier new', courier, monospace\">Disallow: \/<\/span><\/p>\n<p>Disallow all robots from \u201c\/myfolder\/\u201d and all subdirectories of \u201cmyfolder\u201d:<\/p>\n<p><span style=\"font-family: 'courier new', courier, monospace\">User-agent: *<\/span><br \/>\n<span style=\"font-family: 'courier new', courier, monospace\">Disallow: \/myfolder\/<\/span><\/p>\n<p>Disallow all robots from accessing any file beginning with \u201cmyfile.html\u201d:<\/p>\n<p><span style=\"font-family: 'courier new', courier, monospace\">User-agent: *<\/span><br \/>\n<span style=\"font-family: 'courier new', courier, monospace\">Disallow: \/myfile.html<\/span><\/p>\n<p>Disallow Googlebot from accessing files and folders beginning with \u201cmy\u201d:<\/p>\n<p><span style=\"font-family: 'courier new', courier, monospace\">User-agent: googlebot<\/span><br \/>\n<span style=\"font-family: 'courier new', courier, monospace\">Disallow: \/my<\/span><\/p>\n<h4>Allow<\/h4>\n<p>This command is only applicable to Googlebot and tells it that it can access a subdirectory folder or webpage even when its parent directory or webpage is disallowed.<\/p>\n<p>Take the following example: Disallow all robots from the \/scripts\/folder except page.php:<\/p>\n<p><span style=\"font-family: 'courier new', courier, monospace\">Disallow: \/scripts\/<\/span><br \/>\n<span style=\"font-family: 'courier new', courier, monospace\">Allow: \/scripts\/page.php<\/span><\/p>\n<h4>Crawl-delay<\/h4>\n<p>This tells bots how long to wait to crawl a webpage. Websites might use this to preserve server bandwidth. Googlebot does not recognize this command, and Google asks that you <a href=\"https:\/\/support.google.com\/webmasters\/answer\/48620\" target=\"_blank\" rel=\"noopener\">change the crawl rate via Search Console<\/a>. Avoid Crawl-delay if possible or use it with care as it can significantly impact the timely and effective crawling of a website.<\/p>\n<h4>Sitemap<\/h4>\n<p>Tell search engine bots where to find your XML sitemap in your robots.txt file. Example:<\/p>\n<p><span style=\"font-family: 'courier new', courier, monospace\">User-agent: *<\/span><br \/>\n<span style=\"font-family: 'courier new', courier, monospace\">Disallow: \/directory-name\/<\/span><br \/>\n<span style=\"font-family: 'courier new', courier, monospace\">Sitemap: https:\/\/www.domain.com\/sitemap.xml<\/span><\/p>\n<p>To learn more about creating XML sitemaps, see this: <strong><a href=\"https:\/\/www.bruceclay.com\/blog\/what-is-xml-sitemap\/\">What Is an XML Sitemap and How do I Make One?<\/a><\/strong><\/p>\n<h3>Wildcard Characters<\/h3>\n<p>There are two characters that can help direct robots on how to handle specific URL types:<\/p>\n<p><strong>The * character.<\/strong> As mentioned earlier, it can apply directives to multiple robots with one set of rules. The other use is to match a sequence of characters in a URL to disallow those URLs.<\/p>\n<p>For example, the following rule would disallow Googlebot from accessing any URL containing \u201cpage\u201d:<\/p>\n<p><span style=\"font-family: 'courier new', courier, monospace\">User-agent: googlebot<\/span><br \/>\n<span style=\"font-family: 'courier new', courier, monospace\">Disallow: \/*page<\/span><\/p>\n<p><strong>The $ character.<\/strong> The $ tells robots to match any sequence at the end of a URL. For example, you might want to block the crawling of all PDFs on the website:<\/p>\n<p><span style=\"font-family: 'courier new', courier, monospace\">User-agent: *<\/span><br \/>\n<span style=\"font-family: 'courier new', courier, monospace\">Disallow: \/*.pdf$<\/span><\/p>\n<p>Note that you can combine $ and * wildcard characters, and they can be combined for allow and disallow directives.<\/p>\n<p>For example, Disallow all asp files:<\/p>\n<p><span style=\"font-family: 'courier new', courier, monospace\">User-agent: *<\/span><br \/>\n<span style=\"font-family: 'courier new', courier, monospace\">Disallow: \/*asp$<\/span><\/p>\n<ul>\n<li>This will not exclude files with query strings or folders due to the $ which designates the end<\/li>\n<li>Excluded due to the wildcard preceding asp \u2013 \/pretty-wasp<\/li>\n<li>Excluded due to the wildcard preceding asp \u2013 \/login.asp<\/li>\n<li>Not excluded due to the $ and the URL including a query string (?forgotten-password=1) \u2013 \/login.asp?forgotten-password=1<\/li>\n<\/ul>\n<h3>Not Crawling vs. Not Indexing<\/h3>\n<p>If you do not want Google to index a page, there are other remedies for that other than the robots.txt file. As Google points out <a href=\"https:\/\/developers.google.com\/search\/docs\/advanced\/robots\/robots-faq#h02\" target=\"_blank\" rel=\"noopener\">here<\/a>:<\/p>\n<blockquote><p>Which method should I use to block crawlers?<\/p>\n<ul>\n<li>robots.txt: Use it if crawling of your content is causing issues on your server. For example, you may want to disallow crawling of infinite calendar scripts. You should not use the robots.txt to block private content (use server-side authentication instead), or <a href=\"https:\/\/developers.google.com\/search\/docs\/advanced\/crawling\/consolidate-duplicate-urls\" target=\"_blank\" rel=\"noopener\">handle canonicalization<\/a>. To make sure that a URL is not indexed, use the robots meta tag or X-Robots-Tag HTTP header instead.<\/li>\n<li>robots meta tag: Use it if you need to control how an individual HTML page is shown in search results (or to make sure that it&#8217;s not shown).<\/li>\n<li>X-Robots-Tag HTTP header: Use it if you need to control how non-HTML content is shown in search results (or to make sure that it&#8217;s not shown).<\/li>\n<\/ul>\n<\/blockquote>\n<p>And here is more guidance from Google:<\/p>\n<blockquote><p>Blocking Google from crawling a page is likely to remove the page from Google&#8217;s index.<br \/>\nHowever, robots.txt Disallow does not guarantee that a page will not appear in results: Google may still decide, based on external information such as incoming links, that it is relevant. If you wish to explicitly block a page from being indexed, you should instead use the noindex robots meta tag or X-Robots-Tag HTTP header. In this case, you should not disallow the page in robots.txt, because the page must be crawled in order for the tag to be seen and obeyed.<\/p><\/blockquote>\n<h2><a id=\"tips-for-creating-robots-txt-without-errors\"><\/a>Tips for Creating a robots.txt without Errors<\/h2>\n<p>Here are some tips to keep in mind as you create your robots.txt file:<\/p>\n<ul>\n<li>Commands are case-sensitive. You need a capital \u201cD\u201d in <span style=\"font-family: 'courier new', courier, monospace\">Disallow<\/span>, for example.<\/li>\n<li>Always include a space after the colon in the command.<\/li>\n<li>When excluding an entire directory, put a forward slash before and after the directory name, like so: <span style=\"font-family: 'courier new', courier, monospace\">\/directory-name\/<\/span><\/li>\n<li>All files not specifically excluded will be included for bots to crawl.<\/li>\n<\/ul>\n<h2><a id=\"the-robots-txt-tester\"><\/a>The robots.txt Tester<\/h2>\n<p>Always test your robots.txt file. It is more common that you might think for website publishers to get this wrong, which can destroy your SEO strategy (like if you disallow the crawling of important pages or the entire website).<\/p>\n<p>Use Google\u2019s robots.txt Tester tool. You can find information about that <a href=\"https:\/\/support.google.com\/webmasters\/answer\/6062598\" target=\"_blank\" rel=\"noopener\">here<\/a>.<\/p>\n<h2><a id=\"robots-exclusion-protocol-guide\"><\/a>Robots Exclusion Protocol Guide<\/h2>\n<p>If you need a deeper dive than this article, download our <strong><a href=\"https:\/\/www.bruceclay.com\/wp-content\/uploads\/2022\/03\/bruceclayinc-robots-exclusion-guide-2022.pdf\">Robots Exclusion Protocol Guide<\/a><\/strong>. It\u2019s a free PDF that you can save and print for reference to give you lots of specifics on how to build your robots.txt.<\/p>\n<h2>Closing Thoughts<\/h2>\n<p>The robots.txt file is a seemingly simple file, but it allows website publishers to give complex directives on how they want bots to crawl a website. Getting this file right is critical, as it could obliterate your SEO program if done wrong.<\/p>\n<p>Because there are so many nuances on how to use robots.txt, be sure to read <a href=\"https:\/\/developers.google.com\/search\/docs\/advanced\/robots\/intro\" target=\"_blank\" rel=\"noopener\">Google\u2019s introduction to robots.txt<\/a>.<\/p>\n<p><em><strong>Do you have indexing problems or other issues that need technical SEO expertise? If you\u2019d like a free consultation and services quote, <a href=\"https:\/\/www.bruceclay.com\/quoteform\/\">contact us today<\/a>.<\/strong><\/em><\/p>\n<h3><strong><a id=\"faq-how-can-I-optimize-websites-performance-effective-robots-txt-file\"><\/a>FAQ: How can I optimize my website&#8217;s performance with an effective robots.txt file?<\/strong><\/h3>\n<p>Ensuring your website&#8217;s optimal performance is paramount to success. A key aspect often overlooked is the strategic use of a robots.txt file. This unassuming text document wields the power to significantly impact your site&#8217;s search engine optimization (SEO) and overall performance.<\/p>\n<p>At its core, a robots.txt file is a gatekeeper for search engine bots, guiding them on which parts of your website to crawl and index. By skillfully crafting this file, you can strategically control how search engines interact with your content. This optimization technique is vital for preventing unnecessary strain on your server, ensuring that valuable resources are allocated efficiently.<\/p>\n<p>One essential application of robots.txt optimization is the ability to exclude specific pages or directories from being crawled. This is particularly useful for hiding unimportant or redundant pages, preventing search engines from wasting resources on irrelevant content. For instance, you can avoid video or audio files from being crawled, preserving your server&#8217;s bandwidth for more critical components.<\/p>\n<p>Updating your website can be delicate, often requiring temporary withdrawal of specific pages. By utilizing robots.txt optimization, you can gracefully handle this situation without affecting SEO rankings. Temporarily blocking crawling on pages undergoing updates ensures that search engines won&#8217;t index incomplete or inconsistent content, maintaining your site&#8217;s credibility.<\/p>\n<p>Moreover, robots.txt optimization empowers you to guide search engines toward your sitemap&#8217;s location. This simple step helps search engine bots navigate your site&#8217;s structure efficiently, ensuring no valuable content is overlooked. Strategically placing your sitemap in robots.txt enhances the discoverability of your most important pages.<\/p>\n<p>While the benefits of robots.txt optimization are substantial, it&#8217;s crucial to proceed cautiously. Improper configuration can inadvertently block important pages, leading to declining search engine rankings. Therefore, seeking the guidance of SEO experts or referring to reputable resources, such as Google&#8217;s guidelines, is highly recommended before implementing changes.<\/p>\n<p>A practical robots.txt file is a powerful tool in your SEO arsenal. By optimizing this seemingly unassuming element, you can exert control over how search engines interact with your website, ultimately enhancing performance, resource allocation, and overall user experience.<\/p>\n<p><strong>Step-by-Step Procedure for robots.txt Optimization:<\/strong><\/p>\n<ol>\n<li>Understand the role of robots.txt in SEO and website performance.<\/li>\n<li>Select any pages or directories from which you would like to exclude crawling.<\/li>\n<li>Create a robots.txt file using any plain-text editor like Notepad or TextEdit.<\/li>\n<li>Specify user-agent directives to target search engine bots (e.g., User-agent: Googlebot).<\/li>\n<li>Utilize the Disallow directive to block access to pages or directories you want to exclude (e.g., Disallow: \/videos\/).<\/li>\n<li>Implement the Allow directive for specific pages within blocked directories (e.g., Allow: \/videos\/index.html).<\/li>\n<li>Use the Crawl-delay directive to control the rate at which bots crawl your site, if necessary.<\/li>\n<li>Include the Sitemap directive to guide search engines to your XML sitemap (e.g., Sitemap: https:\/\/www.domain.com\/sitemap.xml).<\/li>\n<li>Test your robots.txt file using Google&#8217;s robots.txt Tester tool to identify any issues or errors.<\/li>\n<li>Upload the robots.txt file to the root directory of your website via FTP or your content management system (CMS).<\/li>\n<li>Monitor your website&#8217;s performance and search engine rankings after implementing robots.txt optimization.<\/li>\n<li>Regularly update and refine your robots.txt file as your website&#8217;s structure and content evolve.<\/li>\n<li>Consult SEO experts or reputable resources for guidance on best practices and advanced optimization techniques.<\/li>\n<li>Review and analyze your website&#8217;s crawl and index statistics to ensure effective robots.txt optimization.<\/li>\n<li>Adjust directives as needed based on changes in your website&#8217;s content and goals.<\/li>\n<li>Avoid blocking critical pages that are essential for search engine visibility and user experience.<\/li>\n<li>Continuously stay informed about updates and changes to search engine algorithms that may impact robots.txt optimization.<\/li>\n<li>Prioritize user experience and ensure that any exclusions align with your website&#8217;s content strategy.<\/li>\n<li>Regularly audit and maintain your robots.txt file to ensure ongoing optimization and performance.<\/li>\n<li>Keep abreast of emerging trends and best practices in SEO and robots.txt optimization for sustained success.<\/li>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"<p>The one technical SEO element you don\u2019t want to get wrong is robots.txt. So here&#8217;s a handy guide that explains why every website needs it and how to create one.<\/p>\n","protected":false},"author":35,"featured_media":124205,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"content-type":"","footnotes":""},"categories":[4],"tags":[1589,1590,1232,1579,1588],"acf":[],"yoast_head":"<title>BruceClay - What Is robots.txt? A Beginner\u2019s Guide with Examples<\/title>\n<meta name=\"description\" content=\"Find out why every website needs a robots.txt and how to create one with our guide and examples.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.bruceclay.com\/blog\/robots-txt-guide\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"What Is robots.txt? A Beginner\u2019s Guide with Examples\" \/>\n<meta property=\"og:description\" content=\"Find out why every website needs a robots.txt and how to create one with our guide and examples.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.bruceclay.com\/blog\/robots-txt-guide\/\" \/>\n<meta property=\"og:site_name\" content=\"Bruce Clay, Inc.\" \/>\n<meta property=\"article:published_time\" content=\"2022-03-29T00:11:52+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-08-07T19:44:12+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.bruceclay.com\/wp-content\/uploads\/2022\/03\/robot-1200-630.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"630\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Bruce Clay\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:image\" content=\"https:\/\/www.bruceclay.com\/wp-content\/uploads\/2022\/03\/robot-1200-675.jpg\" \/>\n<meta name=\"twitter:creator\" content=\"@bruceclayinc\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Bruce Clay\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"12 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.bruceclay.com\/blog\/robots-txt-guide\/\",\"url\":\"https:\/\/www.bruceclay.com\/blog\/robots-txt-guide\/\",\"name\":\"What Is robots.txt? A Beginner\u2019s Guide with Examples\",\"isPartOf\":{\"@id\":\"https:\/\/www.bruceclay.com\/#website\"},\"datePublished\":\"2022-03-29T00:11:52+00:00\",\"dateModified\":\"2023-08-07T19:44:12+00:00\",\"author\":{\"@id\":\"https:\/\/www.bruceclay.com\/#\/schema\/person\/efe2253985cb608dfc329890aec6387b\"},\"description\":\"Find out why every website needs a robots.txt and how to create one with our guide and examples.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.bruceclay.com\/blog\/robots-txt-guide\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.bruceclay.com\/blog\/robots-txt-guide\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.bruceclay.com\/blog\/robots-txt-guide\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.bruceclay.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"What Is robots.txt? A Beginner\u2019s Guide to Nailing It with Examples\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.bruceclay.com\/#website\",\"url\":\"https:\/\/www.bruceclay.com\/\",\"name\":\"Bruce Clay, Inc.\",\"description\":\"SEO and Internet Marketing\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.bruceclay.com\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.bruceclay.com\/#\/schema\/person\/efe2253985cb608dfc329890aec6387b\",\"name\":\"Bruce Clay\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.bruceclay.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/6ee3958a961521adf7b1d2bfe928dcb5?s=96&d=retro&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/6ee3958a961521adf7b1d2bfe928dcb5?s=96&d=retro&r=g\",\"caption\":\"Bruce Clay\"},\"description\":\"Bruce Clay is founder and president of Bruce Clay Inc., a global digital marketing firm providing search engine optimization, pay-per-click, social media marketing, SEO-friendly web architecture, and SEO tools and education. Connect with him on LinkedIn or through the BruceClay.com website.\",\"sameAs\":[\"https:\/\/www.linkedin.com\/in\/bruce-clay\/\",\"https:\/\/twitter.com\/bruceclayinc\"],\"url\":\"https:\/\/www.bruceclay.com\/blog\/author\/bclay\/\"}]}<\/script>","yoast_head_json":{"title":"BruceClay - What Is robots.txt? A Beginner\u2019s Guide with Examples","description":"Find out why every website needs a robots.txt and how to create one with our guide and examples.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.bruceclay.com\/blog\/robots-txt-guide\/","og_locale":"en_US","og_type":"article","og_title":"What Is robots.txt? A Beginner\u2019s Guide with Examples","og_description":"Find out why every website needs a robots.txt and how to create one with our guide and examples.","og_url":"https:\/\/www.bruceclay.com\/blog\/robots-txt-guide\/","og_site_name":"Bruce Clay, Inc.","article_published_time":"2022-03-29T00:11:52+00:00","article_modified_time":"2023-08-07T19:44:12+00:00","og_image":[{"width":1200,"height":630,"url":"https:\/\/www.bruceclay.com\/wp-content\/uploads\/2022\/03\/robot-1200-630.jpg","type":"image\/jpeg"}],"author":"Bruce Clay","twitter_card":"summary_large_image","twitter_image":"https:\/\/www.bruceclay.com\/wp-content\/uploads\/2022\/03\/robot-1200-675.jpg","twitter_creator":"@bruceclayinc","twitter_misc":{"Written by":"Bruce Clay","Est. reading time":"12 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.bruceclay.com\/blog\/robots-txt-guide\/","url":"https:\/\/www.bruceclay.com\/blog\/robots-txt-guide\/","name":"What Is robots.txt? A Beginner\u2019s Guide with Examples","isPartOf":{"@id":"https:\/\/www.bruceclay.com\/#website"},"datePublished":"2022-03-29T00:11:52+00:00","dateModified":"2023-08-07T19:44:12+00:00","author":{"@id":"https:\/\/www.bruceclay.com\/#\/schema\/person\/efe2253985cb608dfc329890aec6387b"},"description":"Find out why every website needs a robots.txt and how to create one with our guide and examples.","breadcrumb":{"@id":"https:\/\/www.bruceclay.com\/blog\/robots-txt-guide\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.bruceclay.com\/blog\/robots-txt-guide\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.bruceclay.com\/blog\/robots-txt-guide\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.bruceclay.com\/"},{"@type":"ListItem","position":2,"name":"What Is robots.txt? A Beginner\u2019s Guide to Nailing It with Examples"}]},{"@type":"WebSite","@id":"https:\/\/www.bruceclay.com\/#website","url":"https:\/\/www.bruceclay.com\/","name":"Bruce Clay, Inc.","description":"SEO and Internet Marketing","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.bruceclay.com\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/www.bruceclay.com\/#\/schema\/person\/efe2253985cb608dfc329890aec6387b","name":"Bruce Clay","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.bruceclay.com\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/6ee3958a961521adf7b1d2bfe928dcb5?s=96&d=retro&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/6ee3958a961521adf7b1d2bfe928dcb5?s=96&d=retro&r=g","caption":"Bruce Clay"},"description":"Bruce Clay is founder and president of Bruce Clay Inc., a global digital marketing firm providing search engine optimization, pay-per-click, social media marketing, SEO-friendly web architecture, and SEO tools and education. Connect with him on LinkedIn or through the BruceClay.com website.","sameAs":["https:\/\/www.linkedin.com\/in\/bruce-clay\/","https:\/\/twitter.com\/bruceclayinc"],"url":"https:\/\/www.bruceclay.com\/blog\/author\/bclay\/"}]}},"_links":{"self":[{"href":"https:\/\/www.bruceclay.com\/wp-json\/wp\/v2\/posts\/124203"}],"collection":[{"href":"https:\/\/www.bruceclay.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.bruceclay.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.bruceclay.com\/wp-json\/wp\/v2\/users\/35"}],"replies":[{"embeddable":true,"href":"https:\/\/www.bruceclay.com\/wp-json\/wp\/v2\/comments?post=124203"}],"version-history":[{"count":7,"href":"https:\/\/www.bruceclay.com\/wp-json\/wp\/v2\/posts\/124203\/revisions"}],"predecessor-version":[{"id":197360,"href":"https:\/\/www.bruceclay.com\/wp-json\/wp\/v2\/posts\/124203\/revisions\/197360"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.bruceclay.com\/wp-json\/wp\/v2\/media\/124205"}],"wp:attachment":[{"href":"https:\/\/www.bruceclay.com\/wp-json\/wp\/v2\/media?parent=124203"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.bruceclay.com\/wp-json\/wp\/v2\/categories?post=124203"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.bruceclay.com\/wp-json\/wp\/v2\/tags?post=124203"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}