Google's Reasonable Surfer Model Updated

Systems and methods consistent with the principles of the invention may provide a reasonable surfer model that indicates that when a surfer accesses a document with a set of links, the surfer will follow some of the links with higher probability than others. This reasonable surfer model reflects that not all links associated with a document are equally likely to be followed. Examples of unlikely followed links may include “Terms of Service” links, banner advertisements, and links unrelated to the document.

PageRank under the Random Surfer Model

Google’s PageRank algorithm is based on what its inventor called the Random Surfer Model. It ranked pages on the Web by the probability that a person following links at random would end up on a particular page:

The page’s rank can be interpreted as the probability that a surfer will be at the page after following many forwarding links. The constant α in the formula is interpreted as the probability that the web surfer will jump randomly to any web page instead of following a forward link.
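To make the random surfer idea concrete, here is a minimal Python sketch of it. This is an illustration under my own assumptions, not Google's implementation: the function name random_surfer_pagerank, the toy link graph, the value alpha = 0.15, and the way pages with no outgoing links are handled are all hypothetical choices. Following the description above, alpha is the probability of jumping to a random page, and every link on a page is treated as equally likely to be followed.

```python
def random_surfer_pagerank(links, alpha=0.15, iterations=50):
    """Estimate the chance a random surfer is on each page.

    links maps a page to the list of pages it links to.
    alpha is the probability of jumping to a random page
    (0.15 is an assumed value, not one taken from the patent).
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives an equal share of the random jumps.
        new_rank = {page: alpha / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Random surfer: every link is equally likely to be followed.
                share = (1.0 - alpha) * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no links sends the surfer anywhere.
                share = (1.0 - alpha) * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical three-page site.
toy_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
toy_ranks = random_surfer_pagerank(toy_links)
```

In this toy graph, page C is linked from both A and B, so it ends up with a higher rank than B, which is linked only from A.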

The Reasonable Surfer Model Replaced the Random Surfer Model

The Reasonable Surfer Model is an update to the original Random Surfer Model at Google. It estimates how likely a person is to click on a specific link based on features associated with that link. The amount of PageRank a link might pass along is then based upon the probability that someone might click on it.

Those link features can include a wide range of factors. They can include the color, the size, the styles of fonts, the anchor text used in the links, and many other factors. The Reasonable Surfer Model told us that the average visitor to web pages does not click on links at random but is more likely to click upon certain links on pages.

The reasonable surfer model reflects the probability that someone will click on links based upon the features related to them.
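The difference from the random surfer can be sketched in a few lines of Python. This is a hedged illustration, not the method in the patent: the function name reasonable_surfer_pagerank, the weighted_links structure, the example weights, and alpha = 0.15 are all assumptions of mine. The only change from a plain random surfer calculation is that a page's rank is divided among its links in proportion to each link's click weight instead of equally.

```python
def reasonable_surfer_pagerank(weighted_links, alpha=0.15, iterations=50):
    """Rank pages when links are followed with unequal probability.

    weighted_links maps a page to {target: weight}, where weight is a
    relative likelihood that the link is clicked (hypothetical numbers).
    alpha is the assumed probability of jumping to a random page.
    """
    pages = sorted(
        set(weighted_links) | {t for out in weighted_links.values() for t in out}
    )
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        new_rank = {page: alpha / n for page in pages}
        for page in pages:
            out = weighted_links.get(page, {})
            total = sum(out.values())
            if total > 0:
                # Split rank in proportion to each link's click weight.
                for target, weight in out.items():
                    new_rank[target] += (1.0 - alpha) * rank[page] * weight / total
            else:
                # Assumption: a page with no usable links sends the surfer anywhere.
                for other in pages:
                    new_rank[other] += (1.0 - alpha) * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical page: a prominent in-content link and a footer "Terms" link.
example = {
    "home": {"article": 0.9, "terms": 0.1},
    "article": {"home": 1.0},
    "terms": {"home": 1.0},
}
example_ranks = reasonable_surfer_pagerank(example)
```

With these made-up weights, the article page receives far more of the home page's rank than the terms page does, even though both are linked from the same page.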

I wrote about the Reasonable Surfer model in a 2010 post titled “Google’s Reasonable Surfer: How The Value Of A Link May Differ Based Upon Link And Document Features And User Data.”

How This Patent Changed

Patents do sometimes get updated. These updates often appear in the claims within the patents. That has happened to the Reasonable Surfer model.

These changes reflect a change in how the processes described within the patent operate.

It’s the claims section that changes with a continuation patent. That is because patent examiners from the patent office look at the claims and compare them to claims from other patents to ensure that the new claims don’t copy, and so infringe upon, other granted patents.

A continuation patent “continues” the protection given by the original version of the patent. It is given a date of coverage that begins with the original filing date of the original version of the patent.

The continuation Reasonable Surfer model patent is:

Ranking documents based on user behavior and/or feature data
Inventors: Jeffrey A. Dean, Corin Anderson, and Alexis Battle
Assigned to: Google
US Patent 9,305,099
Granted April 5, 2016
Filed: January 10, 2012

Abstract

A system generates a model based on feature data relating to different features of a link from a linking document to a linked document and user behavior data relating to navigational actions associated with the link. The system also assigns a rank to a document based on the model.

The Probability that Someone May Click On Some Links May Change

As I pointed out in my original post about the Reasonable Surfer patent, the model might change the amount of PageRank that flows through a link, depending on the features associated with that link. For example, a link might be in the page’s main content area and use a font and color that make it stand out. It may also use anchor text that makes it likely that someone will click upon it. Such a link could then pass along a fair amount of PageRank.

On the other hand, a link could combine features that make it less likely to be clicked on. Those features include being in the footer of a page and using the same color and font as the rest of the text. It also may use anchor text that doesn’t interest people. Because of those factors, it may not pass along as much PageRank.

So, how have the claims for this patent changed? How has that changed the Reasonable Surfer model?

The New Claims Refer to Anchor Text More Often When Referring to the Reasonable Surfer Model

The new claims refer to anchor text more frequently than the original claims did. They also look at the probability that people might click upon a link. This is the language that stands out to me. This is the first new claim in this new Reasonable Surfer patent:

… a rank for a particular document, generating the rank including determining particular feature data associated with a link to the particular document, the particular feature data identifying one or more attributes of the link, determining a weight indicating a probability of the link being selected, the weight is determined based on the particular feature data and selection data, the selection data identifying user behavior relating to links to other documents …the weight indicating a higher probability of the link being selected when the particular feature data corresponds to feature data associated with the one or more links than when the particular feature data corresponds to feature data associated with the one or more other links…words in anchor text associated with the links, and a quantity of the words in the anchor text
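One way to picture a weight that is “determined based on the particular feature data and selection data” is a simple tally of how often links with a given set of features were clicked. The Python sketch below is only an illustration under my own assumptions. The claim does not say how the weight is computed, and the function name click_probability_by_features, the feature labels, the sample observations, and the smoothing values are all hypothetical.

```python
from collections import defaultdict


def click_probability_by_features(observations, prior_clicks=1.0, prior_views=2.0):
    """Estimate a click probability for each combination of link features.

    observations is an iterable of (features, clicked) pairs, where
    features is a tuple such as (position, anchor_word_count) and
    clicked is True or False. The prior values smooth small samples
    and are an assumption, not something taken from the patent.
    """
    clicks = defaultdict(float)
    views = defaultdict(float)
    for features, clicked in observations:
        views[features] += 1
        if clicked:
            clicks[features] += 1
    return {
        features: (clicks[features] + prior_clicks) / (views[features] + prior_views)
        for features in views
    }


# Hypothetical selection data: (position, number of words in anchor text).
sample = [
    (("main_content", 4), True),
    (("main_content", 4), True),
    (("main_content", 4), True),
    (("main_content", 4), False),
    (("footer", 3), False),
    (("footer", 3), False),
    (("footer", 3), False),
    (("footer", 3), False),
]
weights = click_probability_by_features(sample)
```

A new link could then be given the weight learned for links that share its features, which is one reading of the claim’s reference to user behavior relating to links to other documents.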

The claims in the original version of Ranking documents based on user behavior and/or feature data are different. The newer claims emphasize that the weight passed along by a link involves the probability of someone clicking upon that specific link.