Two new patent applications from Microsoft, originally filed June 29, 2006, and published January 3, 2008, describe how information collected offline in many different ways (credit card use, grocery discount membership cards, cell phone usage, interaction with digital television systems, and more) might be used by Microsoft to target advertising to you online and to influence the results that you see when you search.

Inventors: Gary W. Flake, William H. Gates, Eric J. Horvitz, Joshua T. Goodman, Bradly A. Brunell, Susan T. Dumais, Alexander G. Gounares, Trenholme J. Griffin, Xuedong D. Huang, Oliver Hurst-Hiller, Kenneth A. Moss, Kyle G. Peltonen, John C. Platt

Abstract

Architecture for targeted advertising using offline user behavior information. Information relating to offline behavior can be collected from cell phones, geolocation systems, credit card information, restaurants, grocery stores, etc., and this information is aggregated and employed in connection with selecting and displaying targeted advertising to a user when online.

Machine learning and reasoning can be employed to make inferences and dynamically tune advertisement processing. Offline user information can also be employed to enhance context-based searching when the user goes online.

The ranking of search results and content for display can be modified as a function of offline behavior.

A system is provided that facilitates online advertising based on at least offline activity using a profile component for aggregating offline behavior information of a user and generating a related user profile. An advertising component employs the user profile in connection with the delivery of an advertisement to the user when online.

Customer Profile Information

Many offline businesses attempt to capture information about their customers in several ways, such as from the use of personal checks and participation in surveys or customer feedback forms.

Businesses online also attempt to capture information about user activity, for example by using cookies to track what users do, which can help them learn about “the buying habits, goals, intentions, and needs [of] large numbers of users.”

User activity recorded in log files may also be sold by sites to others who want to learn how people interact with the Web and with those sites.

Other systems where information about consumers can be collected might include cellular phone records and digital television systems that let people interact with the systems themselves and with the presentation of products and services.

Microsoft tells us that:

Users spend a significant amount of time offline and by monitoring such offline activity, an in-depth profile of the user can be obtained which can improve the selection and delivery of advertisements.

The disclosed architecture facilitates targeted advertising by employing offline user behavior information.

Information relating to offline behavior is collected from cell phones, geolocation systems, credit card information, restaurants, grocery stores, etc., and this information is aggregated and employed in connection with selecting and displaying targeted advertising to a user when online to increase click-through rate by providing relevant advertisements to the user.

An example of one way this might work:

For example, if the offline behavior indicates the user was watching a college football game, and thereafter, the user was watching television highlights of the game, if the user goes online during or just after such activity, then an inference could be made that the user is interested in seeing more information about the game as well as being receptive to advertisements selling college team memorabilia.
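To make the football example more concrete, here is a minimal sketch in Python of how a profile built from offline events might drive ad selection. Nothing here comes from the patent filings themselves: the `OfflineEvent` and `UserProfile` classes, the `pick_ad` function, the topic labels, and the two-hour recency window are all assumptions made up for illustration. The filings describe machine learning and reasoning, while this sketch only counts recent events.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class OfflineEvent:
    source: str      # hypothetical label, e.g. "television" or "grocery_card"
    topic: str       # hypothetical label, e.g. "college_football"
    when: datetime   # when the offline activity happened


@dataclass
class UserProfile:
    """Aggregates offline events for one user (an illustrative stand-in
    for the 'profile component' described in the filings)."""
    events: list = field(default_factory=list)

    def add(self, event):
        self.events.append(event)

    def recent_interests(self, now, window=timedelta(hours=2)):
        """Count topics for events that fall inside the recency window."""
        return Counter(
            e.topic
            for e in self.events
            if timedelta(0) <= now - e.when <= window
        )


def pick_ad(profile, ads, now):
    """Return the ad whose topic has the most recent events, or None."""
    interests = profile.recent_interests(now)
    candidates = [ad for ad in ads if interests[ad["topic"]] > 0]
    if not candidates:
        return None
    return max(candidates, key=lambda ad: interests[ad["topic"]])


# The user watched the game, then the highlights, then went online.
now = datetime(2008, 1, 5, 18, 0)
profile = UserProfile()
profile.add(OfflineEvent("television", "college_football", now - timedelta(minutes=90)))
profile.add(OfflineEvent("television", "college_football", now - timedelta(minutes=10)))
profile.add(OfflineEvent("grocery_card", "groceries", now - timedelta(days=3)))

ads = [
    {"topic": "groceries", "text": "Weekly grocery specials"},
    {"topic": "college_football", "text": "College team memorabilia"},
]
chosen = pick_ad(profile, ads, now)
print(chosen["text"])
```

The recency window stands in for the "during or just after such activity" condition in the quoted example: the three-day-old grocery purchase falls outside it, so only the football viewing counts toward the ad that gets picked.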

Reranking Search Results Based Upon Offline Activity

In addition to determining what ads someone might see, a system like this might influence search results shown to a searcher:

In another implementation, the processing of searches and ranking of search results and other content to be displayed can be performed as a function of offline behavior. In support thereof, a computer-implemented system is provided that facilitates online searching. A profile component aggregates offline behavior information of a user and generates a related user profile.

A search component employs the user profile in connection with generating and processing of a user search when the user is online. In yet another implementation, offline behavior ranking can also be used to facilitate the creation of personalized online yellow pages.
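As a rough illustration of what ranking "as a function of offline behavior" could look like, the Python sketch below adds a profile-based boost to each result's base score and then re-sorts. The filings do not spell out a formula, so the `rerank` function, the `boost` parameter, the topic weights, and the example URLs are all hypothetical.

```python
def rerank(results, profile_weights, boost=0.5):
    """Sort results by base score plus a profile-based boost.

    results: list of dicts with "url", "score", and "topics" keys.
    profile_weights: topic -> weight in [0, 1], assumed to be derived
        from a user's offline behavior.
    boost: how strongly the profile can shift the ranking (an assumption).
    """
    def adjusted(result):
        # Use the strongest match between the result's topics and the profile.
        affinity = max(
            (profile_weights.get(topic, 0.0) for topic in result["topics"]),
            default=0.0,
        )
        return result["score"] + boost * affinity

    return sorted(results, key=adjusted, reverse=True)


results = [
    {"url": "example.com/pro-scores", "score": 0.80, "topics": ["pro_football"]},
    {"url": "example.com/college-highlights", "score": 0.70, "topics": ["college_football"]},
    {"url": "example.com/recipes", "score": 0.40, "topics": ["cooking"]},
]

# Hypothetical profile for someone who just watched a college football game.
profile_weights = {"college_football": 0.9}

ranked = rerank(results, profile_weights)
print([r["url"] for r in ranked])
```

With an empty profile the original order is unchanged, so in this sketch the offline data only ever reorders results that a search engine would have returned anyway.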

Privacy Implications?

This process involves collecting a lot of data from many different sources, either manually or in an automated fashion. The patent filings describe some of the ways that information could be collected, but not all of them.

Information about offline activities such as cell phone usage, credit card transactions, banking information, and purchases from restaurants and grocery stores could be collected in a user profile that would also include information about a person’s online activities, such as what ads they click upon.

This information could be used to decide which ads to present to people online, or even offline in places like digital programming guides for television services.

Information about someone’s online browsing activity could also be collected to decide things such as what kinds of coupons to give them when they make a purchase at a brick-and-mortar store.

Identification of the person making the purchase might be done by associating the shopper with their credit card or “debit card, vendor loyalty card, discount card, or any input mechanism that provides the association with who is making the purchase.”

Buying a plane ticket to take a trip to Italy? Your credit card usage might result in travel brochures and other information about things to do in Italy being mailed to your home address.

Tying together offline and online information in a manner like the examples described seems pretty invasive.

Types of Data Collected

The patent filings list some broad categories of data, and then venture into the collection of information about some other offline activities that sound somewhat farfetched. Here are the larger categories:

User geolocation data — collected by technologies like a global positioning system,

Personal data — such as “personal financial data, person medical data, personal family data, and other data considered private,”

Purchase transaction data — purchases made online and offline, and

System interaction data — what you watch on TV, your usage of cell phones, computers, and other systems that can be operated offline.

These examples were even more interesting:

Reactions of a user to information — Collected by cameras, microphones, and/or systems that sense biometric information. The look on your face, the sound of your voice, the emotions that you express when buying something or presented with information could be “recorded, processed, and fed back for analysis to affect the type of advertising presented to the user when s/he goes online.”

Reactions to Television Programming — In the future, when you’re watching TV, it appears that TV may also be watching you:

In the context of watching television programming, sensing systems can be employed to capture user reactions to ads and programming. This reaction information as well as the direct user interaction data associated with channel surfing, for example, can be utilized to formulate online advertising for targeting the user.

Conclusion

The patent filings discuss the use of machine learning models that might try to make sense of information gathered from both offline and online activity, to determine a person’s interests and create a profile for that person, both for advertising and for reranking search results.

The profile information, information about the context (a search immediately after watching a football game, for example), and information about online activities such as the bookmarking of pages might influence an individual’s search results.

They also discuss aspects of an advertising system based upon these models that might take advantage of such information, and the brokering of user activity information to advertisers, for both individuals and groups of people.

This is an incredible amount of information about people being tied together. A frightening amount.