Is Google about to Unplug Its Penguin?
Editor’s note, Oct. 6, 2016: Google spokesperson Gary Illyes made it official. The prediction posed in this post by Robert Ramirez came to pass with the September 2016 Penguin 4.0 update.
Inorganic links are not a negative ranking signal, but instead are ignored in Google’s ranking calculation. Therefore, this article’s theory currently stands as fact.
View the conversation where Illyes clarifies “demote” vs. “devalue”.
TL;DR – A theory: The next Google Penguin update will kill link spam outright by eliminating the signals associated with inorganic backlinks. Google will selectively pass link equity based on the topical relevance of linked sites, made possible by semantic analysis. Google will reward organic links and perhaps even mentions from authoritative sites in any niche. As a side effect, link-based negative SEO and Penguin “penalization” will be eliminated.
Is the End of Link Spam Upon Us?
Google’s Gary Illyes has recently gone on record regarding Google’s next Penguin update. What he’s saying has many in the SEO industry taking note:
- The Penguin update will launch before the end of 2015. Since it’s been more than a year since the last update, this would be a welcome release. (Editor’s note on Dec. 7, 2015: Since publication, a Google spokesperson said, “With the holidays upon us, it looks like the penguins won’t march until next year.”)
- The next Penguin will be a “real-time” version of the algorithm.
Many anticipate that once Penguin is rolled into the standard ranking algorithm, ranking decreases and increases will be doled out in near real-time as Google considers negative and positive backlink signals. Presumably, this would include a more immediate impact from disavow file submissions — a tool that has been the topic of much debate in the SEO industry.
But what if Google’s plan is to actually change the way Penguin works altogether? What if we lived in a world where inorganic backlinks didn’t penalize a site, but were instead simply ignored by Google’s algorithm and offered no value? What if the next iteration of Penguin, the one that is set to run as part of the algorithm, is actually Google’s opportunity to kill the Penguin algorithm altogether and change the way they consider links by leveraging their knowledge of authority and semantic relationships on the web?
We at Bruce Clay, Inc. have arrived at this theory after much discussion, supposition and, like any good SEO company, reverse engineering. Let’s start with the main problems that the Penguin penalty was designed to address, leading to our hypothesis on how a newly designed algorithm would deal with them more effectively.
Working Backwards: The Problems with Penguin
Of all of the algorithmic changes geared at addressing webspam, the Penguin penalty has been the most problematic for webmasters and Google alike.
It’s been problematic for webmasters because of how difficult it is to get out from under. If some webmasters had known just how difficult it would be to recover from Penguin penalties starting in April of 2012, they might have decided to scrap their sites and start from scratch. Unlike manual webspam penalties, where (we’re told) a Google employee reviews link pruning and disavow file work, algorithmic actions lift only when Google refreshes the algorithm, so a site cannot recover between refreshes. Refreshes have happened only four times since the original Penguin penalty was released, making opportunities for contrition few and far between.
Penguin has been problematic for Google because, at the end of the day, Penguin penalizations and the effects they have on businesses both large and small have been a PR nightmare for the search engine. Many would argue that Google couldn’t care less about negative sentiment among the digital marketing (specifically SEO) community, but the ire toward Google doesn’t stop there; many major mainstream publications like The Wall Street Journal, Forbes and CNBC have featured articles that highlight Penguin penalization and its negative effect on small businesses.
Dealing with Link Spam & Negative SEO Problems
Because of how effective link building was before 2012 (and, to a degree, has been since), Google has been dealing with a huge link spam problem. Let’s be clear about this: Google created this monster when it rewarded inorganic links in the first place. For quite some time, link building worked like a charm. If I can borrow a quote from my boss, Bruce Clay: “The old way of thinking was he who dies with the most links wins.”
This tactic was so effective that it literally changed the face of the Internet. Blog spam, comment spam, scraper sites – none of them would exist if Google’s algorithm didn’t, for quite some time, reward the acquisition of links (regardless of source) with higher rankings.
And then there’s negative SEO — the problem that Google has gone on record as saying is not a problem, while there have been many documented examples that indicate otherwise. Google even released the disavow tool, designed in part to address the negative SEO problem they deny exists.
The Penguin algorithm, intended to address Google’s original link spam issues, has fallen well short of solving the problem of link spam; when you add in the PR headache that Penguin has become, you could argue that Penguin has been an abject failure, ultimately causing more problems than it has solved. All things considered, Google is highly motivated to rethink how they handle link signals. Put simply, they need to build a better mousetrap – and the launch of a “new Penguin” is an opportunity to do just that.
A Solution: Penguin Reimagined
Given these problems, what is the collection of PhDs in Mountain View, CA, to do? What if, rather than policing spammers, they could change the rules and disqualify spammers from the game altogether?
By changing their algorithm to neither penalize nor reward inorganic linking, Google can, in one fell swoop, solve their link problem once and for all. The motivation for spammy link building would be removed because it simply would not work any longer. Negative SEO based on building spammy backlinks to competitors would likewise stop working once inorganic links cease to pass negative trust signals.
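To make the difference between penalizing and ignoring concrete, here is a toy sketch. It is not Google's algorithm: the `Link` fields, the `organic` flag (standing in for whatever classifier would decide that) and the `penalty` value are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Link:
    source: str      # hypothetical linking domain
    equity: float    # hypothetical value the link would pass if trusted
    organic: bool    # hypothetical classifier verdict


def score_demote(links, penalty=1.0):
    """Penalty model: every inorganic link subtracts from the score."""
    return sum(l.equity if l.organic else -penalty for l in links)


def score_devalue(links):
    """Ignore model: inorganic links simply contribute nothing."""
    return sum(l.equity for l in links if l.organic)


earned = [
    Link("industry-journal.example", 3.0, True),
    Link("trade-blog.example", 2.0, True),
]
# Ten spammy links, e.g. pointed at the site by a competitor.
spam = [Link(f"spam-{i}.example", 0.1, False) for i in range(10)]

print(score_demote(earned))          # 5.0
print(score_demote(earned + spam))   # -5.0: the spam drags the site down
print(score_devalue(earned + spam))  # 5.0: the spam has no effect at all
```

Under the ignore model, the spam links neither help the site that bought them nor hurt the site they were aimed at, which is exactly why both spammy link building and link-based negative SEO would lose their payoff.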
What prevents Google from accomplishing this is that it requires the ability to accurately judge which links are relevant for any site or, as the case may be, subject. Developing this ability to judge link relevance is easier said than done, you say – and I agree. But, looking at the most recent changes that Google has made to their algorithm, we see that the groundwork for this type of algorithmic framework may already be in place. In fact, one could infer that Google has been working towards this solution for quite some time now.
The Semantic Web, Hummingbird & Machine Learning
In case you haven’t noticed, Google has made substantial investments to increase their understanding of the semantic relationships between entities on the web.
With the introduction of the Knowledge Graph in May of 2012, the launch of Hummingbird in September of 2013 and the recent confirmation of the RankBrain machine learning algorithm, Google has recently taken quantum leaps forward in their ability to recognize the relationships between objects and their attributes.
Google understands semantic relationships by examining and extracting data from existing web pages and by leveraging insights from the queries that searchers use on their search engine.
Google’s search algorithm has been getting “smarter” for quite some time now, but as far as we know, these advances are not being applied to one of Google’s core ranking signals – external links. We’ve had no reason to suspect that the main tenets of PageRank have changed since they were first introduced by Sergey Brin and Larry Page back in 1998.
Why not now?
What if Google could leverage their semantic understanding of the web to not only identify the relationships between keywords, topics and themes, but also the relationships between the websites that discuss them? Now take things a step further; is it possible that Google could identify whether a link should pass equity (link juice) to its target based on topic relevance and authority?
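As a thought experiment only, the idea of passing equity based on topic relevance and authority could look something like the sketch below. Everything in it is hypothetical: the topic labels, the `authority` number, the Jaccard overlap used as a stand-in for semantic relevance, and the `threshold` cutoff. Nothing here reflects how Google actually measures any of these.

```python
def topical_relevance(source_topics, target_topics):
    """Jaccard overlap between two sets of topic labels (0.0 to 1.0)."""
    a, b = set(source_topics), set(target_topics)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def passed_equity(authority, source_topics, target_topics, threshold=0.2):
    """Equity a link passes: zero if off-topic, else authority scaled by relevance."""
    relevance = topical_relevance(source_topics, target_topics)
    if relevance < threshold:
        return 0.0  # irrelevant link is ignored, not penalized
    return authority * relevance


target = {"seo", "search", "marketing"}

# A relevant, authoritative source shares two of four combined topics.
print(passed_equity(8.0, {"seo", "search", "analytics"}, target))  # 4.0

# An unrelated site passes nothing, no matter how strong it is.
print(passed_equity(8.0, {"casino", "poker"}, target))  # 0.0
```

The point of the sketch is the shape of the rule rather than the numbers: a link's value depends on who is linking and what they are an authority on, not merely on the fact that a link exists.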
Bill Slawski, the SEO industry’s foremost Google patent analyzer, has written countless articles about the semantic web, detailing Google’s process for extracting and associating facts and entities from web pages. It is fascinating (and complicated) analysis with major implications for SEO.
For our purposes, we will simplify things a bit. We know that Google has developed a method for understanding entities and the relationship that they have to specific web pages. An entity, in this case, is “a specifically named person, place, or thing (including ideas and objects) that could be connected to other entities based upon relationships between them.” This sounds an awful lot like the type of algorithmic heavy lifting that would need to be done if Google intended to leverage its knowledge of the authoritativeness of websites in analyzing the value of backlinks based on their relevance and authority to a subject.
Moving Beyond Links
SEOs are hyper-focused on backlinks, and with good reason; correlation studies that analyze ranking factors continue to score quality backlinks as one of Google’s major ranking influences. It was this correlation that started the influx of inorganic linking that landed us in our current state of affairs.
But, what if Google could move beyond building or earning links to a model that also rewarded mentions from authoritative sites in any niche? De-emphasizing links while still rewarding references from pertinent sources would expand the signals that Google relied on to gauge relevance and authority and help move them away from their dependence on links as a ranking factor. It would also, presumably, be harder to “game” as true authorities on any subject would be unlikely to reference brands or sites that weren’t worthy of the mention.
This is an important point. In the current environment, websites have very little motivation to link to outside sources. This has been a problem that Google has never been able to solve. Authorities have never been motivated to link out to potential competitors, and the lack of organic links in niches has led to a climate where the buying and selling of links can seem to be the only viable link acquisition option for some websites. Why limit the passage of link equity to a hyperlink? Isn’t a mention from a true authority just as strong a signal?
There is definitely precedent for this concept. “Co-occurrence” and “co-citation” are terms that SEOs have used for years now, but Google has never confirmed that they are ranking factors. Recently, however, Google began to list unlinked mentions in the “latest links” report in Search Console. John Mueller indicated in a series of tweets that Google does in fact pick up URL mentions from text, but that those mentions do not pass PageRank.
What’s notable here is not only that Google is monitoring text-only domain mentions, but also that they are associating those mentions with the domain that they reference. If Google can connect the dots in this fashion, can they expand beyond URLs that appear as text on a page to entity references, as well? The same references that trigger Google’s Knowledge Graph, perhaps?
In Summary
We’ve built a case based on much supposition and conjecture, but we certainly hope that this is the direction in which Google is taking their algorithm. Whether Google acknowledges it or not, the link spam problem has not yet been resolved. Penguin penalties are punitive in nature and exceedingly difficult to escape from, and the fact of the matter is that penalizing wrongdoers doesn’t address the problem at its source. The motivation to build inorganic backlinks will exist as long as the tactic is perceived to work. Under the current algorithm, we can expect to continue seeing shady SEOs selling snake oil, and unsuspecting businesses finding themselves penalized.
Google’s best option is to remove the negative signals attached to inorganic links and only reward links that they identify as relevant. By doing so, they immediately eviscerate spam link builders, whose only quick, scalable option for building links is placing them on websites that have little to no real value.
By tweaking their algorithm to only reward links that carry expertise, authority and trust in the relevant niche, Google can move closer than ever before to solving their link spam problem.
Editor’s note, Dec. 7, 2015: On Dec. 3, we learned that, despite previous comments by Google suggesting otherwise, the next Penguin update will not happen before the year’s end. We’ve updated this article to reflect this.
33 Replies to “Is Google about to Unplug Its Penguin?”
It pays to be ahead of these things. Contrary to popular belief the sky is only falling if you believe that.
A good read. It has not happened so far, but I was watching a video on Moz and it seems like they were expecting the same thing.
BTW, very informative blog. It’s been two hours and I still can’t get off it.
Although the TLDR is a lofty prediction it is not totally without merit.
A more accurate statement would be it “may kill current link spam strategies outright” not “kill link spam outright”.
The chess game plays on… until link earning becomes the professional tactic of choice of brands that are serious about Search Engine Optimization.
Great article, Robert. I’m waiting for the Penguin real-time update. It’s gonna roll out by the end of the year or maybe early 2016.
Good article, but I don’t think Google can do away with or lessen the value of link building. If that happens, it will be a real challenge for new websites, some of which might be future authorities, to carve out a space for themselves.
Let’s see what happens in the next update!
Not really, but it may come with some upgrade that helps you search properly. Google recently announced that the algorithm update is coming soon, so be ready for it.
Penguin did its intended task; not really sure about Hummingbird. Nothing but garbage sites like Angie’s List and Yelp, aggregate content and poor local results, and now with the 3-pack, local search is even worse. Google needs to fix this before anything else.
Informative article. I think we should wait for the Penguin real-time update, which is going to roll out by the end of the year, and see its impact. Nothing more to say!
Penguin will roll out within a few days or weeks. Your article clarifies most of my doubts about Google Penguin. I really appreciate your efforts on this write-up. Thanks for sharing your views about Google’s updates.
The problem I see: Even if all spam is devalued, spammers would still try. Spam is incredibly cheap and can be done on autopilot, so why not build a million links, or two million, or ten? Throw enough shit at the wall and some will stick. It was the approach many used before Penguin, and I don’t think people would stop trying. If there are no penalties, people will try. Even if just one out of 10k links has a slight effect, it’s just a question of scaling up. Imagine the nightmare for blogs, forums and all the places where people can register and post content and/or a link. It would be worse than the time before 2012.
I think the whole link weighting system needs a massive overhaul, but I wonder if Google will assault sitewide links again with this update. Gaming SERPs isn’t particularly difficult but shouldn’t be the case. It seems like a logical progression for Google; is it going to work?
Hopefully, the days of ranking for a term simply by building dodgy links are almost over!
I sincerely hope the end of link spam is upon us. I see so many wrongdoers building thousands of dodgy links, and they are sadly ranking highly.
Robert, very nice and informative article. Waiting for the Google updates; let’s see what interesting things they bring. Thanks for sharing.
They make it nearly impossible for a new site to gain traction. Backlinking should not be the single largest factor. Whether the site provides value to the web should be. All sites that provide no value, that rank well but are merely there for ad revenue, should rank lower, while sites that provide a real service, not aggregated services, should rank higher. Original content should be king…
Very interesting read Robert.
While I would love to see a Penguin algorithm that is not punitive and simply ignores inorganic links, I would be quite surprised if we are close to that day.
I think that in many cases Google is indeed simply ignoring ultra spammy links. But, I think that no matter how good the algorithm is, SEOs will be finding ways to game the system via links for quite some time. If Penguin simply devalued unnatural links then there would be no deterrent for those who want to keep pushing the limits of Google to see what they can get away with.
I do think that links will (perhaps have) become less important as Google learns how to understand websites and user intent. I’ve seen sites ranking in some local verticals that have almost no links, but have incredible content compared to their SEO’d competition. In those cases I feel like Google is recognizing which content is the best and they’re getting better and better at it.
But I think we’re a long way away from links being removed from ranking calculations. As long as links are used for rankings, SEOs will be trying to manipulate the system. If there is no fear of a potential ranking hit, then there is more incentive to build more sophisticated links that move the needle.
I’m guessing that future iterations of Penguin will get better and better at detecting the type of unnatural link that is currently considered white hat by many but really is a manipulative link. For example, right now I don’t think that Penguin pays much attention to guest posting on relatively decent sites. But, I’ve seen *manual* penalties given for excessive guest posting so we know that this is on Google’s radar. So many large brands are pushing the envelope as far as they can in regards to guest posting. They’re no longer publishing guest posts riddled with keyword anchored links, but they’re pushing out a large number of guest posts each month with the sole intention of manipulating Google.
I believe that Penguin will adapt to find that kind of link. And sites that have used any sort of wide scale tactic in order to create their own links are going to have a risk of being held back to some degree by Penguin until they clean up the mess.
But, I could be wrong. It sure will be interesting to see what the next couple of years look like in regards to Google’s ability to determine what an inorganic link is.
Thank you for your insight Marie. Your work in this space is among the best in the industry, which was why I pinged you when I had published the article. I do think that even in my best-case dream scenario in which Penguin is kind of sent out to pasture, the end result would be a whole new set of tactics that spammers would employ to try and manipulate the new system. It’s why us SEOs can’t have nice things and I’m sure it’s also the reason why we’ve lost any kind of transparency or communication from Google – nothing good comes of them telling SEOs anything about their algorithm.
I guess one of my main points was that I’ve just had enough of Penguin. I think it’s been a failure by pretty much any measure and Google needs to come up with a better way of judging the value and relevancy of websites.
Part of me knows that is much easier said than done, but we can dream, can’t we?
Thanks again for your input.
Very interesting article – a backlink shake up is inevitable – let’s hope they get this one right.
I agree with what Daniel has mentioned above. I would think that Google will use the disavow database to assess domains and occurrences in submitted files, then devalue the outbound links to not pass any link juice. Will be interesting to see what happens over the next few months and which websites will drop.
Thanks for the comment, Nick. It has occurred to me that Google could use disavow data to help “train” its machine learning algorithms. The only problem there is that people disavow good domains all the time. It’s part of the larger problem with the whole process. Link audits aren’t nearly as thorough as they should be and a lot of good links get disavowed in the process. We’ll see what happens; as you said, it should be interesting!
Great write up, Robert.
You’ve hit the nail on the head with your call on semantic relevance linked websites. We’ve done a lot of testing in regards to this in the past 2 years and it’s no surprise which websites pass on more link juice.
Instead of algorithmic refreshes and updates, it would make sense for Google to devalue all links which have no relevance. Complement this concept with the new live Penguin algorithm and this will surely shake up the SEO landscape in a positive way.
My observations show that Google has long known (read: prior to Penguin) whether a link is organic or artificially gained, and thus whether it should help in gaining traffic or not. Even in 2009 they (Google) were able to tell that thousands of, e.g., web directory links must not count (they passed zero value). OK, these days their algos must be at a higher level than in the pre-2010 era, but they were quite good at this for a looong time before Penguin.
I’ve tested it this year by getting rid of thousands of suspicious links and to be honest rankings, without further actions, haven’t changed at all (so the bad links really didn’t count). All those spammy links were created in 2006-2008.
Still, the Penguin penalty can be gotten rid of even without a data refresh with the correct approach – I know lots of SEOs would disagree, but you have to think outside the box to find the solution.
But perhaps I’m wrong and Google will get back to its former approach when spammy links just didn’t count. That would be great in my opinion. I think getting slammed for an activity done 10 years ago is quite weird.
Thanks for the comment Jan! I do agree that to a degree, Google has been ignoring the worst link signals algorithmically for some time now – it’s the penalizations that can come from Penguin and all of those inorganic links that is the real issue. And for every example you can find of sites “recovering” from Penguin penalties (presumably from acquiring new organic links and cleaning up spammy signals) there are 10 stories of sites that have never been able to recover from the algorithmic penalties.
Let’s hope things improve, no matter what Google decides to do, with the next Penguin update.