How much might the usability of a web page matter to a search engine? If a search engine could look at an approximation of a page’s layout, it could try to estimate how good a user experience a visit to that page might be, and evaluate the page based on the characteristics it finds there.

A patent application from Yahoo provides a long list of factors that it might look at to determine how usable a web page might be.

So why would a search engine be interested in determining the usability of a web page?

The authors of the document tell us that:

It can be important to make web pages easy and pleasing to use, which can be particularly important for web pages it is desired to monetize.

This may include, for example, advertisement-containing web pages (of a so-called “web portal,” for example), for which an advertiser pays money when a user views the web page and activates a link of the advertisement.

If such web pages are not easy and pleasing to use, the money-making potential of those web pages can be jeopardized. One conventional indication of whether a web page is easy and pleasing to use is called “clutter.”

How well can an algorithm judge how “cluttered” a web page is, compared with a person making the same determination? That’s hard to say for certain. But with so many pages on the web, asking people to review every page for clutter is impractical.

A program that helps site owners make their pages less “cluttered,” and therefore more usable, may make it more likely that those owners can run effective advertisements on their pages.

The Yahoo patent application is:

Quantitative Analysis of Web Page Clutter that Accounts for Subjective Preferences
Invented by Koushik Deepak Narayana, John Nathan Boyd, Paul Sokha Kim
Assigned to Yahoo
US Patent Application 20080040195
Published February 14, 2008
Filed August 11, 2006

Abstract

A method determines a usability measure for a web page. A representation of the web page is processed in view of a usability model.

The usability indication is determined based on the processing step. The representation of the web page may include an indication of at least one of structural and visual elements.

For example, the indication of structural elements may include a document object model of the web page.

The usability model may be a statistical model, such as a linear regression model, that provides an estimate of a statistical relationship between the usability measure and a plurality of characteristics discernible from the representation of the web page.

Creating a Usability Model for Web Pages

A search engine might survey users about how usable a sampling of web pages might be, to create a statistical usability model. That model might be applied to another web page to determine a usability indication for the page.
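As a rough illustration of the kind of statistical model the abstract describes, here is a minimal Python sketch of a linear regression fit by ordinary least squares. Everything specific in it is an assumption made for illustration: the two features (link count and ad count), the survey ratings, and the names `fit_linear_model` and `predict` are invented, and the ratings were constructed to follow an exact linear rule so the fit is easy to check. The patent application does not disclose the features, weights, or survey data Yahoo might actually use.

```python
def fit_linear_model(rows, scores):
    """Fit weights by ordinary least squares via the normal equations
    (X^T X) w = X^T y. Returns [intercept, w1, w2, ...]."""
    X = [[1.0] + [float(v) for v in r] for r in rows]  # leading 1.0 is the intercept term
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)]
         + [sum(x[i] * y for x, y in zip(X, scores))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; cannot fit")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict(weights, features):
    """Apply the fitted model to the characteristics of another page."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], features))


# Hypothetical survey: (total links, total ads) for five sampled pages,
# each with an invented usability rating (higher means more usable).
SURVEY_FEATURES = [(20, 1), (40, 2), (60, 1), (80, 4), (100, 3)]
SURVEY_SCORES = [8.0, 6.0, 6.0, 2.0, 2.0]

weights = fit_linear_model(SURVEY_FEATURES, SURVEY_SCORES)
new_page_score = predict(weights, (30, 2))  # usability indication for an unseen page
```

In this toy data, each extra link or ad lowers the predicted rating, which matches the intuition behind “clutter.” A real model would presumably use many more of the characteristics listed in the patent application, and far more survey responses.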

The structural characteristics of the layout of a page might be approximated as described in the Yahoo patent application that I wrote about in a post from yesterday, The Importance of Page Layout in SEO, to identify structural aspects of a page.

A separate visual look at the page itself may identify some other usability aspects of the page.

The patent application provides the following examples of structural characteristics that might be considered in determining how cluttered a page may be:

  1. Total number of links
  2. Total number of words
  3. Total number of images (non-ad images)
  4. Image area above the fold (non-ad images)
  5. Dimensions of page
  6. Page area (total)
  7. Page length
  8. Total number of tables
  9. Maximum table columns (per table)
  10. Maximum table rows (per table)
  11. Total rows
  12. Total columns
  13. Total cells
  14. Average cell padding (per table)
  15. Average cell spacing (per table)
  16. Dimensions of fold
  17. Fold area
  18. Location of center of fold relative to center of page
  19. Total number of font sizes used for links
  20. Total number of font sizes used for headings
  21. Total number of font sizes used for body text
  22. Total number of font sizes
  23. Presence of “tiny” text
  24. Total number of colors (excluding ads)
  25. Alignment of page elements
  26. Average page luminosity
  27. Fixed vs. relative page width
  28. Page weight (proxy for load time)
  29. Total number of ads
  30. Total ad area
  31. Area of individual ads
  32. Area of largest ad above the fold
  33. Largest ad area
  34. Total area of ads above the fold
  35. Page space allocated to ads
  36. Total number of external ads above the fold
  37. Total number of external ads below the fold
  38. Total number of external ads
  39. Total number of internal ads above the fold
  40. Total number of internal ads below the fold
  41. Total number of internal ads
  42. Number of sponsored link ads above the fold
  43. Number of sponsored link ads below the fold
  44. Total number of sponsored link ads
  45. Number of image ads above the fold
  46. Number of image ads below the fold
  47. Total number of image ads
  48. Number of text ads above the fold
  49. Number of text ads below the fold
  50. Total number of text ads
  51. Position of ads on page
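A few of the simpler structural characteristics above can be read straight from a page’s markup. The following is a minimal sketch using Python’s standard `html.parser` module; the class name `StructureCounter`, the helper `count_structure`, and the sample markup are hypothetical, and the sketch makes no attempt to tell ad images from non-ad images or to work out what falls above the fold, both of which the patent application’s list would require.

```python
import re
from html.parser import HTMLParser


class StructureCounter(HTMLParser):
    """Counts links, images, tables, and words in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = 0
        self.images = 0
        self.tables = 0
        self.words = 0
        self._skip_depth = 0  # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self.links += 1
        elif tag == "img":
            self.images += 1
        elif tag == "table":
            self.tables += 1
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only count text a visitor would see, not script or style content.
        if not self._skip_depth:
            self.words += len(re.findall(r"\w+", data))


def count_structure(html):
    parser = StructureCounter()
    parser.feed(html)
    parser.close()
    return {"links": parser.links, "images": parser.images,
            "tables": parser.tables, "words": parser.words}


SAMPLE = """
<html><head><style>p { color: red; }</style></head>
<body>
<h1>Hello world</h1>
<p>Read <a href="/a">this page</a> and <a href="/b">that one</a>.</p>
<img src="x.png">
<table><tr><td>cell one</td><td>cell two</td></tr></table>
</body></html>
"""

counts = count_structure(SAMPLE)
```

Characteristics such as fold area, ad area, or the position of ads would need a rendered layout rather than raw markup, which is presumably why the patent application works from an approximation of the page layout.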

Visual characteristics could also be examined after a program converts the page to an image representation. Yahoo might then look at things like the following:

  1. Presence of animated/flashing ads
  2. Average ad luminosity
  3. Maximum ad luminosity

Other visual aspects of the page may also be reviewed.
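To make the luminosity measures more concrete, here is a small Python sketch that computes the average and maximum luminosity of a rectangular region of a rendered page. It rests on several assumptions: the patent application does not define “luminosity” in the material covered here, so the sketch substitutes the common Rec. 601 luma weighting of 8-bit RGB values; the page image is represented as a plain list of pixel rows; and the ad’s bounding box is simply given, since we aren’t told how ads would be located. The names `luma` and `region_luminosity` are invented for this example.

```python
def luma(pixel):
    """Approximate brightness of an (r, g, b) pixel with 0-255 channels,
    using Rec. 601 luma weights (an assumed stand-in for 'luminosity')."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def region_luminosity(image, box):
    """Return (average, maximum) luma over a region of the image.

    image: list of rows, each a list of (r, g, b) tuples.
    box:   (left, top, right, bottom), with right and bottom exclusive.
    """
    left, top, right, bottom = box
    values = [luma(image[y][x])
              for y in range(top, bottom)
              for x in range(left, right)]
    return sum(values) / len(values), max(values)


WHITE, BLACK = (255, 255, 255), (0, 0, 0)

# A tiny 4x2 "rendered page": left half white, right half black.
PAGE = [
    [WHITE, WHITE, BLACK, BLACK],
    [WHITE, WHITE, BLACK, BLACK],
]

page_avg, page_max = region_luminosity(PAGE, (0, 0, 4, 2))  # whole page
ad_avg, ad_max = region_luminosity(PAGE, (2, 0, 4, 2))      # hypothetical ad box
```

The same function covers “average page luminosity” from the structural list when the box spans the whole image, and “average ad luminosity” and “maximum ad luminosity” when the box is limited to an ad.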

The results of comparing a web page to the statistical usability model could be passed along to the owner of the page, together with an evaluation of how the page’s usability could be improved.

Conclusion

This patent application reminds me a little of one from Google – How Google Rejects Annoying Advertisements and Pages.

Yahoo appears to be focusing more upon providing a tool to help people determine how cluttered their pages might be, so that they can advertise more effectively.

What I found most interesting about this process is the way it approximates the layout of a page when determining many aspects of how usable the page might be.

We aren’t told how Yahoo might determine whether some links are sponsored link ads or some images are image ads.