{"id":32700,"date":"2014-08-18T15:08:09","date_gmt":"2014-08-18T22:08:09","guid":{"rendered":"http:\/\/www.bruceclay.com\/blog\/?p=32700"},"modified":"2019-07-01T15:07:27","modified_gmt":"2019-07-01T22:07:27","slug":"how-and-when-to-block-content-from-search-engines","status":"publish","type":"post","link":"https:\/\/www.bruceclay.com\/blog\/how-and-when-to-block-content-from-search-engines\/","title":{"rendered":"Nowhere Left to Hide: Blocking Content from Search Engine Spiders"},"content":{"rendered":"

TL;DR<\/h2>\n
    \n
  1. If you\u2019re considering excluding content from search engines, first make sure you\u2019re doing it for the right reasons.<\/li>\n
  2. Don\u2019t make the mistake of assuming you can hide content in a language or format the bots won\u2019t comprehend; that\u2019s a short-sighted strategy. Be up front with them by using the robots.txt file or Meta Robots tag.<\/li>\n
  3. Don\u2019t assume that you\u2019re safe just because you\u2019re using the recommended methods to block content. Understand how blocking content will make your site appear to the bots.<\/li>\n<\/ol>\n

    When and How to Exclude Content from a\u00a0Search Engine Index<\/h2>\n

    A major facet of SEO<\/a> is convincing search engines that your website is reputable and provides real value to searchers. And for search engines to determine the value and relevance of your content, they have to put themselves in the shoes of a user.<\/p>\n

    Now, the software that looks at your site has limitations, which SEOs have traditionally exploited to keep certain resources hidden from the search engines. The bots continue to develop, however, and are getting steadily more sophisticated in their efforts to see your web page as a human user would in a browser. It\u2019s time to re-examine the content on your site that\u2019s unavailable to search engine bots, as well as the reasons why it\u2019s unavailable. The bots still have limitations, and webmasters have legitimate reasons for blocking or externalizing certain pieces of content. Since the search engines are looking for sites that give quality content to users, let the user experience guide your projects and the rest will fall into place.<\/p>\n

    Why Block Content at All?<\/h2>\n
    Photo by Steven Ferris (CC BY 2.0)<\/a>, modified<\/figcaption><\/figure>\n
      \n
    1. Private content. Getting pages indexed means that they are available to show up in search results, and are therefore visible to the public. If you have private pages (customers\u2019 account information, contact information for individuals, etc.) you want to keep them out of the index. (Some whois-type sites display registrant information in JavaScript to stop scraper bots from stealing personal info.)<\/li>\n
    2. Duplicated content. Whether snippets of text (trademark information, slogans or descriptions) or entire pages (e.g., custom search results within your site), if you have content that shows up on several URLs on your site, search engine spiders might see that as low-quality. You can use one of the available options to block those pages (or individual resources on a page) from being indexed. You can keep them visible to users but blocked from search results, which won\u2019t hurt your rankings for the content you do want showing up in search.<\/li>\n
    3. Content from other sources. Content such as ads, which is generated by third-party sources and duplicated in several places throughout the web, isn\u2019t part of a page\u2019s primary content. If that ad content is duplicated many times throughout the web, a webmaster may want to keep it from being viewed as part of the page.<\/li>\n<\/ol>\n

      That Takes Care of Why, How About How?<\/h2>\n

      I\u2019m so glad you asked. One method that\u2019s been used to keep content out of the index is to load the content from a blocked external source using a language that bots can\u2019t parse or execute; it\u2019s like when you spell out words to another adult because you don\u2019t want the toddler in the room to know what you\u2019re talking about. The problem is, the toddler in this situation is getting smarter. For a long time, if you wanted to hide something from the search engines, you could use JavaScript to load that content, meaning users get it, bots don\u2019t.<\/p>\n

      But Google is not being at all coy about their desire to parse JavaScript with their bots<\/a>. And they\u2019re beginning to do it; the Fetch as Google tool in Webmaster Tools allows you to see individual pages as Google\u2019s bots see them.<\/p>\n

      If you\u2019re using JavaScript to block content on your site, you should check some pages in this tool; chances are, Google sees it.<\/p>\n

      Keep in mind, however, that just because Google can render content in JavaScript doesn\u2019t mean that content is being cached. The \u201cFetch and Render\u201d tool shows you what the bot can see; to find out what is being indexed you should still check the cached version of the page.<\/p>\n

      There are plenty of other methods for externalizing content that people discuss: iframes, AJAX, jQuery. But as far back as 2012, experiments were showing that Google could crawl links placed in iframes; so there goes that technique. In fact, the days of speaking a language that bots couldn\u2019t understand are nearing an end.<\/p>\n

      But what if you politely ask the bots to avoid looking at certain things? Blocking or disallowing elements in your robots.txt<\/a>\u00a0or a Meta Robots tag<\/a>\u00a0is\u00a0the only certain way (short of password-protecting server directories<\/a>) of keeping elements or pages from being indexed.<\/p>\n
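
      <p>As a rough illustration of the Meta Robots side of this, the sketch below uses the html.parser module from the Python standard library to list the directives a page declares. The PAGE markup, the MetaRobotsParser class and the robots_directives helper are hypothetical names invented for this example. Keep in mind that a crawler can only obey a noindex tag on a page it is allowed to fetch.<\/p>\n

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    '''Collects the directives declared in meta robots tags.'''

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != 'meta':
            return
        # Attribute values can be None (a bare attribute), so default to ''.
        attr_map = {name: (value or '') for name, value in attrs}
        if attr_map.get('name', '').lower() != 'robots':
            return
        # The content attribute is a comma-separated list of directives.
        for part in attr_map.get('content', '').split(','):
            directive = part.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html):
    '''Returns the set of meta robots directives found in an HTML string.'''
    parser = MetaRobotsParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page markup, made up for illustration only.
PAGE = '''
<html><head>
<meta name='robots' content='noindex, follow'>
</head><body>Private account page</body></html>
'''

directives = robots_directives(PAGE)
print(sorted(directives))
print('noindex' in directives)
```

      <p>Running a check like this across the pages you mean to keep private is one way to confirm that the tag is actually present in the markup the bots receive.<\/p>\n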

      John Mueller recently commented that content generated with AJAX\/JSON feeds would be \u201cinvisible to [Google] if you disallowed crawling of your JavaScript.\u201d He goes on to clarify that simply blocking CSS or JavaScript will not necessarily hurt your ranking: \u201cThere\u2019s definitely no simple \u2018CSS or JavaScript is disallowed from crawling, therefore the quality algorithms view the site negatively\u2019 relationship.\u201d So the best way to keep content out of the index is simply to ask the search engines not to index it. That request can cover individual URLs, directories, or external files.<\/p>\n
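
      <p>To make those three cases concrete, here is a minimal sketch that uses the urllib.robotparser module from the Python standard library to test URLs against a set of robots.txt rules. The rules, the example.com URLs and the helper names are assumptions made up for illustration, not taken from any real site. Note that a Disallow rule asks bots not to crawl a URL, which is not quite the same as a guarantee that the URL never shows up in an index.<\/p>\n

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, invented for this example:
# one directory, one individual URL and one external script file.
ROBOTS_TXT = '''
User-agent: *
Disallow: /account/
Disallow: /search-results.html
Disallow: /assets/ads.js
'''


def build_parser(robots_txt):
    '''Parses robots.txt text without making any network request.'''
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_crawlable(parser, url, user_agent='Googlebot'):
    '''Returns True if the rules allow user_agent to fetch url.'''
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

urls = [
    'https://example.com/account/settings',     # inside a blocked directory
    'https://example.com/search-results.html',  # a blocked individual URL
    'https://example.com/assets/ads.js',        # a blocked external file
    'https://example.com/blog/post',            # not mentioned, so allowed
]

for url in urls:
    print(url, is_crawlable(parser, url))
```

      <p>The same helper can be pointed at your CSS and JavaScript URLs to see whether a rule is blocking more than you intended.<\/p>\n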

      This, then, brings us back to the beginning: why. Before deciding to block any of your content, make sure you know why you\u2019re doing it, as well as the risks. First of all, blocking your CSS or JavaScript files (especially ones that contribute substantially to your site\u2019s layout) is risky; it can, among other things, prevent search engines from seeing whether your pages are optimized for mobile<\/a>. Not only that, but after the rollout of Panda 4.0, some sites that got hit hard were able to rebound by unblocking their CSS and JavaScript<\/a>, which suggests that Google\u2019s algorithm may have penalized them for blocking these elements from bots.<\/p>\n

      One more risk that you run when blocking content: search engine spiders may not be able to see what is being blocked, but they know that something<\/em> is being blocked, so they may be forced to make assumptions about what that content is. They know that ads, for instance, are often hidden in iframes or even CSS; so if you have too much blocked content near the top of a page, you run the risk of getting hit by the \u201cTop Heavy\u201d Page Layout Algorithm<\/a>. Any webmasters reading this who are considering using iframes should strongly consider consulting with a reputable SEO first. (Insert shameless BCI promo here<\/a>.)<\/p>\n","protected":false},"excerpt":{"rendered":"

      A major facet of SEO is convincing search engines that your website is reputable and provides real value to searchers. And for search engines to determine the value and relevance of your content, they have to put themselves in the shoes of a user.<\/p>\n

      Now, the software that looks at your site has limitations, which SEOs have traditionally exploited to keep certain resources hidden from the search engines. The bots continue to develop, however, and are getting steadily more sophisticated in their efforts to see your web page as a human user would in a browser. It\u2019s time to re-examine the content on your site that\u2019s unavailable to search engine bots, as well as the reasons why it\u2019s unavailable. The bots still have limitations, and webmasters have legitimate reasons for blocking or externalizing certain pieces of content. Since the search engines are looking for sites that give quality content to users, let the user experience guide your projects and the rest will fall into place.<\/p>\n

      Read why you might want to block content from search engine bots and the SEO recommended way to do so in Nowhere Left to Hide: Blocking Content from Search Engine Spiders<\/a>.<\/p>\n","protected":false},"author":168,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"content-type":"","footnotes":""},"categories":[4],"tags":[18],"acf":[],"yoast_head":"BruceClay - When and How to Block Content from Search Engine Spiders<\/title>\n<meta name=\"description\" content=\"There are legitimate reasons for excluding content on your site from a search engine's index. Understand the recommended ways to block content for SEO.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.bruceclay.com\/blog\/how-and-when-to-block-content-from-search-engines\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"When and How to Block Content from Search Engine Spiders\" \/>\n<meta property=\"og:description\" content=\"There are legitimate reasons for excluding content on your site from a search engine's index. 
Understand the recommended ways to block content for SEO.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.bruceclay.com\/blog\/how-and-when-to-block-content-from-search-engines\/\" \/>\n<meta property=\"og:site_name\" content=\"Bruce Clay, Inc.\" \/>\n<meta property=\"article:author\" content=\"https:\/\/www.facebook.com\/john.alexander.79219754\" \/>\n<meta property=\"article:published_time\" content=\"2014-08-18T22:08:09+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2019-07-01T22:07:27+00:00\" \/>\n<meta name=\"author\" content=\"John Alexander\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:image\" content=\"https:\/\/www.bruceclay.com\/blog\/wp-content\/uploads\/2014\/08\/search-engine-spider-green-light.jpg\" \/>\n<meta name=\"twitter:creator\" content=\"@CallMeLouzander\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"John Alexander\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.bruceclay.com\/blog\/how-and-when-to-block-content-from-search-engines\/\",\"url\":\"https:\/\/www.bruceclay.com\/blog\/how-and-when-to-block-content-from-search-engines\/\",\"name\":\"When and How to Block Content from Search Engine Spiders\",\"isPartOf\":{\"@id\":\"https:\/\/www.bruceclay.com\/#website\"},\"datePublished\":\"2014-08-18T22:08:09+00:00\",\"dateModified\":\"2019-07-01T22:07:27+00:00\",\"author\":{\"@id\":\"https:\/\/www.bruceclay.com\/#\/schema\/person\/7b419fa96e904661b0398614d27afd4b\"},\"description\":\"There are legitimate reasons for excluding content on your site from a search engine's index. 
Understand the recommended ways to block content for SEO.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.bruceclay.com\/blog\/how-and-when-to-block-content-from-search-engines\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.bruceclay.com\/blog\/how-and-when-to-block-content-from-search-engines\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.bruceclay.com\/blog\/how-and-when-to-block-content-from-search-engines\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.bruceclay.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Nowhere Left to Hide: Blocking Content from Search Engine Spiders\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.bruceclay.com\/#website\",\"url\":\"https:\/\/www.bruceclay.com\/\",\"name\":\"Bruce Clay, Inc.\",\"description\":\"SEO and Internet Marketing\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.bruceclay.com\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.bruceclay.com\/#\/schema\/person\/7b419fa96e904661b0398614d27afd4b\",\"name\":\"John Alexander\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.bruceclay.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/1576ee8ac37379619c1e7eb916c58e8d?s=96&d=retro&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/1576ee8ac37379619c1e7eb916c58e8d?s=96&d=retro&r=g\",\"caption\":\"John Alexander\"},\"description\":\"John is a former SEO analyst at Bruce Clay, Inc. His love of good writing led to a B.A. in philosophy and literature; his passion for education informs his own writing. 
John sees SEO as a way of bringing technical skills and solid marketing wisdom together to build a better, content-rich internet.\",\"sameAs\":[\"https:\/\/www.facebook.com\/john.alexander.79219754\",\"https:\/\/twitter.com\/CallMeLouzander\"],\"url\":\"https:\/\/www.bruceclay.com\/blog\/author\/jalexander\/\"}]}<\/script>","yoast_head_json":{"title":"BruceClay - When and How to Block Content from Search Engine Spiders","description":"There are legitimate reasons for excluding content on your site from a search engine's index. Understand the recommended ways to block content for SEO.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.bruceclay.com\/blog\/how-and-when-to-block-content-from-search-engines\/","og_locale":"en_US","og_type":"article","og_title":"When and How to Block Content from Search Engine Spiders","og_description":"There are legitimate reasons for excluding content on your site from a search engine's index. Understand the recommended ways to block content for SEO.","og_url":"https:\/\/www.bruceclay.com\/blog\/how-and-when-to-block-content-from-search-engines\/","og_site_name":"Bruce Clay, Inc.","article_author":"https:\/\/www.facebook.com\/john.alexander.79219754","article_published_time":"2014-08-18T22:08:09+00:00","article_modified_time":"2019-07-01T22:07:27+00:00","author":"John Alexander","twitter_card":"summary_large_image","twitter_image":"https:\/\/www.bruceclay.com\/blog\/wp-content\/uploads\/2014\/08\/search-engine-spider-green-light.jpg","twitter_creator":"@CallMeLouzander","twitter_misc":{"Written by":"John Alexander","Est. 
reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.bruceclay.com\/blog\/how-and-when-to-block-content-from-search-engines\/","url":"https:\/\/www.bruceclay.com\/blog\/how-and-when-to-block-content-from-search-engines\/","name":"When and How to Block Content from Search Engine Spiders","isPartOf":{"@id":"https:\/\/www.bruceclay.com\/#website"},"datePublished":"2014-08-18T22:08:09+00:00","dateModified":"2019-07-01T22:07:27+00:00","author":{"@id":"https:\/\/www.bruceclay.com\/#\/schema\/person\/7b419fa96e904661b0398614d27afd4b"},"description":"There are legitimate reasons for excluding content on your site from a search engine's index. Understand the recommended ways to block content for SEO.","breadcrumb":{"@id":"https:\/\/www.bruceclay.com\/blog\/how-and-when-to-block-content-from-search-engines\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.bruceclay.com\/blog\/how-and-when-to-block-content-from-search-engines\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.bruceclay.com\/blog\/how-and-when-to-block-content-from-search-engines\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.bruceclay.com\/"},{"@type":"ListItem","position":2,"name":"Nowhere Left to Hide: Blocking Content from Search Engine Spiders"}]},{"@type":"WebSite","@id":"https:\/\/www.bruceclay.com\/#website","url":"https:\/\/www.bruceclay.com\/","name":"Bruce Clay, Inc.","description":"SEO and Internet Marketing","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.bruceclay.com\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/www.bruceclay.com\/#\/schema\/person\/7b419fa96e904661b0398614d27afd4b","name":"John 
Alexander","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.bruceclay.com\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/1576ee8ac37379619c1e7eb916c58e8d?s=96&d=retro&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/1576ee8ac37379619c1e7eb916c58e8d?s=96&d=retro&r=g","caption":"John Alexander"},"description":"John is a former SEO analyst at Bruce Clay, Inc. His love of good writing led to a B.A. in philosophy and literature; his passion for education informs his own writing. John sees SEO as a way of bringing technical skills and solid marketing wisdom together to build a better, content-rich internet.","sameAs":["https:\/\/www.facebook.com\/john.alexander.79219754","https:\/\/twitter.com\/CallMeLouzander"],"url":"https:\/\/www.bruceclay.com\/blog\/author\/jalexander\/"}]}},"_links":{"self":[{"href":"https:\/\/www.bruceclay.com\/wp-json\/wp\/v2\/posts\/32700"}],"collection":[{"href":"https:\/\/www.bruceclay.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.bruceclay.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.bruceclay.com\/wp-json\/wp\/v2\/users\/168"}],"replies":[{"embeddable":true,"href":"https:\/\/www.bruceclay.com\/wp-json\/wp\/v2\/comments?post=32700"}],"version-history":[{"count":10,"href":"https:\/\/www.bruceclay.com\/wp-json\/wp\/v2\/posts\/32700\/revisions"}],"predecessor-version":[{"id":68463,"href":"https:\/\/www.bruceclay.com\/wp-json\/wp\/v2\/posts\/32700\/revisions\/68463"}],"wp:attachment":[{"href":"https:\/\/www.bruceclay.com\/wp-json\/wp\/v2\/media?parent=32700"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.bruceclay.com\/wp-json\/wp\/v2\/categories?post=32700"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.bruceclay.com\/wp-json\/wp\/v2\/tags?post=32700"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}