Understanding Relationships such as Entity Associations

When we talk about the relationships between websites, it’s not unusual for us to talk about links between sites and pages. Google pays a lot of attention to such links. They are at the heart of one of its most well-known ranking signals – PageRank. PageRank is more than 15 years old, predating the origin of Google itself in the BackRub search engine.

Google is exploring other signals it could use to rank pages in search results. These include social signals and reputation scores for authors. Google may also look at relationships between words that appear together on pages ranking for the same queries, and at relationships between pages that appear in the same search results and in the same search sessions. A Google paper presented at an October 2013 natural language processing conference, Open-Domain Fine-Grained Class Extraction from Web Search Queries (pdf), provides some interesting hints at a possible Google of the future.

Entity Associations are Part of the Future of SEO

Google wants to build a knowledge base of concepts to better understand things like what different businesses or entities are ‘Known for’. The search engine is also interested in defining entities better through ‘is a’ relationships. Pages for specific entities may show up at the top of search results because they seem to be the pages people are looking for when that entity is in a query. For example, consider the first two results on a search for [Roald Dahl], as seen in the image below:

Search results showing authoritative results for Roald Dahl and then results for books he wrote.

Drawing Connections Between Different Named Entities with Entity Associations

A Google patent application on related entities published earlier this year also explores drawing connections between different named entities. These could be specific people, places, or things. It does this by looking at entity associations with specific websites and understanding “related entities” for those original entities. An entity association is when a specific entity connects with a particular website. This may be because a site is authoritative for that entity. Or because a page from the site is a navigational result for a query that includes that entity.

On a search for “John Wayne,” the official John Wayne website is the top result in Google and the second result is the John Wayne Wikipedia page. Those pages may rank well not because of traditional ranking signals such as PageRank and relevance-based information retrieval scores, but because they are authoritative on the entity “John Wayne” and make great navigational results for those queries.

What is In A Knowledge Panel for An Entity?

While the Roald Dahl search result from the patent application shows books authored by Roald Dahl, the Knowledge Panel result for John Wayne shows movies that he has starred in and shows other people whom searchers also look for when they search for John Wayne, as related entities.

Knowledge Panel at Google for John Wayne and related entities

How similar are the processes for including related entities within a set of search results and including related entities within a knowledge panel in Google Results? This patent application tells us that it looks at search results to try to identify related entities. At the same time, the knowledge panel results also appear to look at query log files to find things that people also search for when they search for an entity that triggers a knowledge panel result. The patent filing is:

Related Entities
Invented by Peter Jin Hong, Pravir K. Gupta, Nathaniel J. Gaylinn, Ramakrishnan Kazhiyur-Mannar, Kavi J. Goel, Omer Bar-or, Jack W. Menzel, Christina R. Dhanaraj, Jared L. Levy, Shashidhar A. Thakur, Grace Chung, and Benson Tsai
US Patent Application 20130238594
Published September 12, 2013
Filed: February 22, 2013

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for identifying entities related to an entity to which a search query is directed. One of the methods includes:

  • Receiving a search query, wherein the search query relates to a first entity of a first entity type, and where entities of a second entity type have a relationship with the first entity;
  • Obtaining search results for the search query;
  • Determining that a count of search results identifying a resource containing a reference to the first entity satisfies a first threshold value;
  • Determining that a count of search results identifying a resource having the second entity type as a relevant entity type satisfies a second threshold value; and
  • Transmitting information identifying one or more entities of the second entity type as part of the response to the search query.
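To make the two threshold checks in the abstract more concrete, here is a minimal Python sketch. It is only an illustration, not the patent's implementation: the SearchResult class, its fields, the function name, and the sample data are all hypothetical, and I am assuming that "satisfies a threshold" means "is at least as large as" it.

```python
from dataclasses import dataclass, field


@dataclass
class SearchResult:
    """One search result; every field here is a hypothetical stand-in."""
    url: str
    # Entities the resource contains a reference to.
    entities_mentioned: set = field(default_factory=set)
    # Entity types considered relevant to the resource, e.g. {"book"}.
    relevant_entity_types: set = field(default_factory=set)


def should_show_related_entities(results, first_entity, second_type,
                                 first_threshold, second_threshold):
    """Return True when both counts described in the abstract meet their thresholds.

    Assumption: "satisfies" is read as "greater than or equal to".
    """
    # Count results whose resource refers to the first entity.
    mentions = sum(1 for r in results if first_entity in r.entities_mentioned)
    # Count results whose resource has the second entity type as a relevant type.
    typed = sum(1 for r in results if second_type in r.relevant_entity_types)
    return mentions >= first_threshold and typed >= second_threshold


# Made-up results for a query about an author (first entity type: person;
# second entity type: book).
results = [
    SearchResult("https://example.com/a", {"Roald Dahl"}, {"book"}),
    SearchResult("https://example.com/b", {"Roald Dahl"}, {"book", "film"}),
    SearchResult("https://example.com/c", {"Roald Dahl"}, set()),
    SearchResult("https://example.com/d", set(), {"film"}),
]
show = should_show_related_entities(results, "Roald Dahl", "book", 3, 2)
```

In this toy data, three results mention the author and two are about books, so related book entities would only be sent back with the response when both thresholds are set that low or lower.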

A Look at the Entity Association Process

Here’s an abbreviated look at the entity associations process described in the patent filing. It uses images from the related entities patent application:

A flowchart from the patent showing the creation of an association between a query and a web page.

Are There Authoritative Resources for an Entity on the Web?

Search results for a query are checked to see whether they include authoritative resources for an entity. If they do, those results are shown for that entity.

Screenshot from the patent showing the identification of related entities for the query.

If the search result titles and snippets contain related entities, those entities may be stored in a related entity database.
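As a rough sketch of that idea, the Python below scans result titles and snippets for names from a list of known entities. The function name, the data, and the plain case-insensitive substring match are all my own assumptions for illustration; the patent filing does not say that entities are recognized this way.

```python
def find_related_entities(results, known_entities, original_entity):
    """Collect known entity names that appear in result titles or snippets.

    `results` is a list of (title, snippet) pairs. Matching is a naive,
    case-insensitive substring test, used here only as a stand-in for
    whatever entity recognition a real system would use.
    """
    found = []
    for title, snippet in results:
        text = f"{title} {snippet}".lower()
        for name in known_entities:
            if name == original_entity:
                continue  # the query's own entity is not "related" to itself
            if name.lower() in text and name not in found:
                found.append(name)
    return found


# Made-up titles and snippets for a search on [Roald Dahl].
results = [
    ("Roald Dahl - Official Site", "Author of Matilda and The BFG."),
    ("Charlie and the Chocolate Factory", "A 1964 novel by Roald Dahl."),
]
known = [
    "Roald Dahl",
    "Matilda",
    "The BFG",
    "Charlie and the Chocolate Factory",
    "James and the Giant Peach",
]
related = find_related_entities(results, known, "Roald Dahl")
```

Here `related` holds the three book titles that appear in the sample titles and snippets, which are the kind of candidates that could then be written to a related entity store.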

Screenshot showing the ordering of related entities and their inclusion in a database.

The patent does tell us that these related entities might be in ranked order, and it provides some of the signals used to order the related entities. (Note that there’s not a link involved at all.)

These scores can be based in part on:

  • Whether people search for a related entity after submitting a query for the first entity.
  • Whether a recognized reference to a related entity co-occurs, in the same previously submitted query, with a recognized reference to the original entity.
  • Whether data indicates that two or more of the related entities of the second entity type are members of a set of entities that has a specified order, in which case the ordering matches that order. (For example, if the entity is a person with children, the children are usually listed in birth order.)
  • Whether data indicates that two or more of the related entities are better known as part of a broader entity, in which case the broader entity replaces them in the ordering of related entities.
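The signals above could be combined in many ways, and the patent filing doesn't give a formula. The Python sketch below is one possible reading, with everything in it assumed for illustration: the function name, the placeholder entity names, the counts, and the choice to simply add the follow-up search count and the co-occurrence count together with equal weight.

```python
def order_related_entities(candidates, followup_counts, cooccur_counts,
                           broader_entity=None, canonical_order=None):
    """Order related entities using the four signals listed above.

    All inputs are hypothetical:
      followup_counts - searches for an entity made after the original query
      cooccur_counts  - prior queries naming both the entity and the original
      broader_entity  - maps an entity to a broader entity it is better known under
      canonical_order - members of a set with a specified order (e.g. birth order)
    """
    broader_entity = broader_entity or {}

    # Replace entities with their broader entity, keeping one copy of each.
    collapsed = []
    for name in candidates:
        name = broader_entity.get(name, name)
        if name not in collapsed:
            collapsed.append(name)

    # Assumed scoring: an unweighted sum of the two query-log signals.
    def score(name):
        return followup_counts.get(name, 0) + cooccur_counts.get(name, 0)

    ranked = sorted(collapsed, key=score, reverse=True)

    # Members of an ordered set keep the slots they earned, but are arranged
    # within those slots in the set's specified order.
    if canonical_order:
        slots = [i for i, n in enumerate(ranked) if n in canonical_order]
        members = sorted((ranked[i] for i in slots), key=canonical_order.index)
        for i, n in zip(slots, members):
            ranked[i] = n
    return ranked


ordered = order_related_entities(
    candidates=["Child B", "Child A", "Spouse", "Song 1", "Song 2"],
    followup_counts={"Spouse": 50, "Child B": 30, "Child A": 20, "Album X": 10},
    cooccur_counts={"Child A": 5, "Spouse": 10},
    broader_entity={"Song 1": "Album X", "Song 2": "Album X"},
    canonical_order=["Child A", "Child B"],
)
```

In this made-up example the two songs collapse into "Album X", and the two children swap places so that they appear in their specified order even though "Child B" has the higher raw score. Note that nothing in the scoring looks at links.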

Entity Associations Take-Aways

When Google decides to associate an entity with a particular query, it may also identify whether related entities show up in those search results in places like titles and snippets. It may include those entities within the search results. Again, this wouldn’t need matching keywords with the original query or a PageRank analysis.

The patent application shows how this would work within search results, but it also seems to apply to knowledge panel results.

As Google’s knowledge base grows, things like Entity Associations and related entities will continue to be a part of it.

Last Updated June 26, 2019