What is co-citation? What about co-occurrence? The future of link building


Google is developing co-citation, and also co-occurrence, as alternatives to anchor text. Are we facing an imminent 180º turnaround in link building?

There are a lot of SEO trends out there every year, and back in 2015 we floated a couple of ideas that will be important for the near future: link-free brand mentions, and co-citation/co-occurrence. Actually, it could be said that there are not two ideas, but three. The time has come to sink our teeth into these topics, which are a bit technical but which, when explained step by step, are not so difficult to understand.

In the context of the semantic web, search engines such as Yahoo, Bing, and especially Google, have been acquiring greater “intelligence”. To understand what SEO and link building will look like now, and in years to come, you have to understand that Google will continue to go after the techniques we employ to manipulate the SERPs. Maneuvers such as buying and selling links, or exchanging them, do not amuse it 😉

Replacing anchor text with another way of determining which keywords are relevant to a page is probably what Google has been doing for some time now. Why? For the simple reason that, by doing so, the difference between those who run link building campaigns and those who leave it all to organic ranking will diminish. Gulp! 🙂

In order to reduce the relevance of links and anchor text in SEO, Google is implementing co-citation as a way to associate a page with certain keywords (or rather, “topics”) and to measure authority. However, to achieve this, Google needs another algorithmic device, and that is co-occurrence. “Isn’t that the same as co-citation?” No! But it is closely related.

Disclaimer: links with anchor text still work, and very well 🤫

Definition of co-citation and co-occurrence

Some SEOs think that co-citation and co-occurrence are one and the same thing. Others, however, draw distinctions. In our opinion, there are differences.


Co-citation is a term from bibliometrics. It occurs when a document cites two other documents, which suggests that both cited sources are probably related in content.

The most basic scheme is as follows: a document A cites two other documents, B and C. When this happens, there is a possibility that B and C deal with related topics. The probability increases or decreases depending on the distance between the two citations or references within A.

Applied to the SERPs, co-citation would serve to establish a map of sites related by their contents, and determine which pages are relevant to each query in the search results.
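As a rough illustration of the basic scheme (a minimal sketch, and certainly not how Google computes it), co-citation can be counted by looking at every pair of pages cited by the same document. The document names and the `citations` mapping are invented for the example, and the sketch ignores the distance between citations.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: each citing document maps to the pages it cites.
citations = {
    "A": ["B", "C"],
    "D": ["B", "C", "E"],
    "F": ["C", "E"],
}

def co_citation_counts(citations):
    """Count how many documents cite each pair of pages together."""
    counts = Counter()
    for cited in citations.values():
        # sorted(set(...)) drops duplicates and gives every pair a stable order
        for pair in combinations(sorted(set(cited)), 2):
            counts[pair] += 1
    return counts

counts = co_citation_counts(citations)
for pair, n in counts.most_common():
    print(pair, n)
```

Here B and C are cited together by two documents (A and D), so they would be treated as more closely related than B and E, which share only one citing document.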


Co-occurrence is the proximity relationship between two or more terms within a unit of text (sentence, paragraph…).

If the terms E and F co-occur in a sentence, i.e., they appear together in the sentence, they are likely to be semantically related.

Its importance for rankings lies in the fact that it could potentially replace part of the function that the anchor text fulfills today. Thus, Google could use it to deduce the relevant keywords of a website by the words that form the context of the link.
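To make the idea concrete, here is a minimal sketch that counts how often two words appear in the same sentence. The sample text is made up, and the sentence splitting is deliberately naive; a real search engine would rely on far more sophisticated language processing.

```python
import re
from collections import Counter
from itertools import combinations

def sentence_cooccurrence(text):
    """Count, for each pair of words, the sentences in which both appear."""
    counts = Counter()
    # Naive sentence split on ., ! and ? (an assumption made for brevity)
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = sorted(set(re.findall(r"[a-z]+", sentence)))
        for pair in combinations(words, 2):
            counts[pair] += 1
    return counts

# Invented sample text
text = "Cheap flights to Madrid. Book flights and hotels in Madrid. Hotels are cheap."
pairs = sentence_cooccurrence(text)
print(pairs[("flights", "madrid")])
```

In this toy text, "flights" and "madrid" share two sentences, so they would come out as more strongly related than "cheap" and "hotels", which share only one.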

Why is it, then, that for some people co-occurrence and co-citation mean the same thing?

The origin of the discussion

The confusion between co-citation and co-occurrence is probably due to a post with a video by Rand Fishkin that appeared on Moz on November 16, 2012. But, as Jack the Ripper said, let’s take it one piece at a time: let’s first look at what Rand said.

The co-founder of Moz, in his “Whiteboard Friday” series, analyzed the case of three websites that ranked very well without having put any effort into their on-page SEO or their backlinks. Faced with these disconcerting results, Rand ventured a prediction: “Anchor Text is Weakening…And May Be Replaced by Co-Occurrence”. However, in the video Rand speaks of “co-citation”, not “co-occurrence”, a term he changed when writing the post, acknowledging his mistake himself. This undoubtedly contributed to the confusion.

Rand’s thesis is that Google gave good rankings to sites that, despite not following any of the classic SEO techniques, were cited on numerous websites or blogs, either through links without optimized anchor text or without any link at all (only as a mention). The search engine could determine that these sites were relevant for certain searches or keywords because it would have developed the ability to read the important words in the context of the links or mentions, and not in the anchor text. In other words, Google would read the words that accompany the mention of a website (with or without a link) in order to determine the topics and keywords to which it is related.
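The mechanism Rand describes can be sketched in a few lines: collect the words that appear close to each mention of a brand, and see which ones keep recurring. This is only an illustration under our own assumptions. The brand "Acme", the sample pages, the stopword list and the five-word window are all invented, and nothing here reflects Google’s actual implementation.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer)
STOPWORDS = {"the", "a", "an", "and", "of", "for", "in", "is", "to", "with", "we"}

def context_terms(text, mention, window=5):
    """Count the words found within `window` words of each brand mention."""
    tokens = re.findall(r"\w+", text.lower())
    target = mention.lower()
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != target:
            continue
        # Words just before and just after the mention
        nearby = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        counts.update(w for w in nearby if w not in STOPWORDS and w != target)
    return counts

# Hypothetical pages mentioning the brand without any optimized anchor text
pages = [
    "For trail running shoes we always recommend Acme because the grip is great.",
    "Acme makes lightweight running shoes for beginners.",
]

totals = Counter()
for page in pages:
    totals += context_terms(page, "Acme")

print(totals.most_common(3))
```

Across both pages, "running" and "shoes" are the terms that recur around the mention, which is the kind of signal that could associate the brand with that topic even where no anchor text exists.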

Rand’s predictions caused a small stir in the SEO world and, not surprisingly, many specialists in the field began to investigate the subject themselves. I find particularly interesting a post by Joshua Giardino, who disagrees with Rand, claiming that for Google to achieve something like co-occurrence it would need two things: a strong ability to process natural language, and a very advanced ontology (the ability to recognize entities such as people, places and things). Joshua says that while Google is developing all these capabilities, as demonstrated by the performance of Google’s Knowledge Graph, the anomalous results detected by Rand can be explained from the point of view of classical SEO. Giardino does acknowledge, however, that co-occurrence could be part of the Penguin algorithm, which, as you may know, acts on an ad hoc basis.

Signs of co-citation and co-occurrence in Google patents

Whether or not Rand Fishkin’s predictions were hasty, the fact is that on November 27, 2012 (11 days after the Moz post), Google was granted a patent entitled “Document ranking using word relationships”, which points to co-occurrence as another factor influencing rankings.

Image extracted from Google’s patent: co-occurrence consists of word relationships.

In addition, one should take into account the progress that Google has made with its algorithm since the possibility of co-occurrence began to be speculated. The key concept here is Hummingbird. Indeed, Google’s new algorithm, which made its appearance on August 30, 2013, was a major breakthrough in natural language processing that made it possible, among other things, for users to engage with the search engine in question/answer terms.

What about co-citation? Curiously enough, ten days before the launch of Hummingbird, Google published a patent that proposed the grouping of close links to classify them by topic.

Image extracted from Google’s patent: various document models with and without co-citation.

Everything seems to indicate that both co-citation and co-occurrence are in full swing and will play an important role for SEO, and link building in particular.

What could link building look like from now on?

Link profile building may become more complicated, but it will not disappear as a technique for positioning a website. Some of the points of the future manual of good practices for link building could be:

  • Increased number of links with (apparently) casual or unoptimized text. (We already do this).
  • Higher keyword density in the text surrounding a link, rather than in the anchor text.
  • Mentions of brand or domain without link.
  • Links in pairs, combining the website to be positioned with another of great authority in the sector.

We will keep a close eye on the evolution of these positioning factors.

Even so, John Mueller himself has recently acknowledged that anchor texts still help Google understand the page being linked to, and he had previously commented that they are useful for giving more context. Nothing new that we did not already know, but sometimes we SEOs complicate our lives thinking that everything is much more complex than it really is.

