What are Named Entities?

Named entities are specific people, places, or things, and they are often what searchers want information about when they enter a query. Google got a lot better at answering questions about named entities after acquiring Metaweb, which had developed a way of identifying named entities in searches that Google appears to have adopted.

Here is an example of how Metaweb handled named entities, based on one of the company’s patent filings:

You may know him by many names or titles – Governor of California, Terminator, Governator, Conan the Barbarian, Kindergarten Cop, Mr. Universe, Mr. Olympia, Arnold Strong, Arnie, The Austrian Oak.

To Metaweb, Arnold Schwarzenegger is referred to as 9202a8c04000641f8000000000006567.

Who is Metaweb?

Metaweb is a company recently acquired by Google. It created a system for indexing named entities that allows you to search for information in a new way. The idea sounds a little like a library’s Dewey Decimal system, but for named entities.

Why is this important, and what are Named Entities?

A named entity is a specific person, place, or thing. For example, named entities can include Barack Obama, the Commonwealth of Virginia, or the Great American Ball Park in Cincinnati. Associating a unique identification number with each named entity can make it easier to index entities and to find information about them when they are referred to by different names, as in the Arnold Schwarzenegger example above. Unique identification numbers can also help with local search by giving specific places, businesses, or landmarks their own identifiers.
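To make the idea of one ID behind many names a little more concrete, here is a minimal sketch in Python. It is my own illustration and not Metaweb's or Google's implementation: the `AliasIndex` class and the `normalize` helper are hypothetical names, and the only real data is the ID and a few of the aliases quoted earlier in this post.

```python
from typing import Dict, Optional

# ID taken from the Arnold Schwarzenegger example earlier in this post.
ARNOLD_ID = "9202a8c04000641f8000000000006567"


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so lookups ignore case and spacing."""
    return " ".join(name.lower().split())


class AliasIndex:
    """Hypothetical index that maps many surface names to one entity ID."""

    def __init__(self) -> None:
        self._by_alias: Dict[str, str] = {}

    def add(self, entity_id: str, *names: str) -> None:
        # Every name points at the same ID (assumes each alias is unambiguous).
        for name in names:
            self._by_alias[normalize(name)] = entity_id

    def resolve(self, name: str) -> Optional[str]:
        # Returns None when the name is not in the index.
        return self._by_alias.get(normalize(name))


index = AliasIndex()
index.add(
    ARNOLD_ID,
    "Arnold Schwarzenegger",
    "Governator",
    "Arnold Strong",
    "The Austrian Oak",
)

# Different names, same entity.
print(index.resolve("governator") == index.resolve("Arnold Schwarzenegger"))
```

The sketch assumes each alias belongs to exactly one entity. A name like "Terminator" could just as easily refer to a film, so a real system would need some way to choose between candidate entities.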

How often do named entities appear in Web searches? A recent paper from Microsoft, Building Taxonomy of Web Search Intents for Name Entity Queries (pdf), tells us that they are pretty common:

According to an internal study of Microsoft, at least 20-30% of queries submitted to Bing search are named entities, and it is reported 71% of queries contain name entities.

Google announced its acquisition of Metaweb in an Official Google Blog post, Deeper understanding with Metaweb. Metaweb also announced the acquisition in its own post, Metaweb joins Google.

Metaweb started a knowledgebase called Freebase, which had volunteer editors and contributors who added entity information. It became one of the significant sources of information behind Google’s Knowledge Graph.

Metaweb has several patent applications at the United States Patent and Trademark Office. They are worth diving into if you want to learn a little about some of the technology behind the company.

I’ve just started looking at them myself, beginning with the one below on “Query Optimization,” where I found the Metaweb ID number for Arnold Schwarzenegger. The patent filing describes how data about a named entity, and other information associated with it, can be collected and stored under an ID number, and how queries can be run against that collected information.
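As a rough illustration of storing facts under an entity ID and querying them, here is a small Python sketch. It does not reproduce the method in the patent filing: the `FactStore` class and the property names (`name`, `alias`, `title`) are assumptions I made up for the example, and the facts come from the Arnold Schwarzenegger example above.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# ID taken from the Arnold Schwarzenegger example earlier in this post.
ARNOLD_ID = "9202a8c04000641f8000000000006567"

Fact = Tuple[str, str, str]  # (entity_id, property, value)


class FactStore:
    """Hypothetical store that keeps (property, value) pairs under an entity ID."""

    def __init__(self) -> None:
        self._facts: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def add(self, entity_id: str, prop: str, value: str) -> None:
        self._facts[entity_id].append((prop, value))

    def query(
        self,
        entity_id: Optional[str] = None,
        prop: Optional[str] = None,
        value: Optional[str] = None,
    ) -> List[Fact]:
        """Return every fact that matches all fields that are not None."""
        results: List[Fact] = []
        for eid, pairs in self._facts.items():
            if entity_id is not None and eid != entity_id:
                continue
            for p, v in pairs:
                if prop is not None and p != prop:
                    continue
                if value is not None and v != value:
                    continue
                results.append((eid, p, v))
        return results


store = FactStore()
store.add(ARNOLD_ID, "name", "Arnold Schwarzenegger")
store.add(ARNOLD_ID, "alias", "Governator")
store.add(ARNOLD_ID, "title", "Governor of California")

# Which entity is known as the "Governator"?
for entity_id, prop, value in store.query(prop="alias", value="Governator"):
    print(entity_id, prop, value)
```

Leaving a field as `None` treats it as a wildcard, so the same `query` method can answer both "what do we know about this ID?" and "which ID has this alias?".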

Here are the patent filings assigned to Metaweb:

Automated online purchasing system
Invented by W. Daniel Hillis and Bran Ferren
US Patent Application 20030195834
Published October 16, 2003
Filed September 18, 2002

Meta-Web
Invented by W. Daniel Hillis and Bran Ferren
US Patent Application 20040210602
Published October 21, 2004
Filed December 15, 2003

Personalized profile for evaluating content
Invented by W. Daniel Hillis and Bran Ferren
US Patent Application 20050131918
Published June 16, 2005
Filed May 24, 2004

Delegated authority evaluation system
Invented by W. Daniel Hillis and Bran Ferren
US Patent Application 20050131722
Published June 16, 2005
Filed May 25, 2004

System and method to facilitate importation of user profile data over a network
Invented by W. Daniel Hillis and Bran Ferren
US Patent Application 20060095780
Published May 4, 2006
Filed October 28, 2004

User Contributed Knowledge Database
Invented by Timothy Sturge, Kurt Bollacker, Robert Cook, John Giannandrea, Nicholas Thompson, and Edwin Taylor
US Patent Application 20090024590
Published January 22, 2009
Filed April 22, 2008

Graph Store
Invented by Scott Meyer, Jutta Degener, Barak Michener, and John Giannandrea
US Patent Application 20100174692
Published July 8, 2010
Filed January 20, 2010

Database Replication
Invented by Scott Meyer, Jutta Degener, Barak Michener, and John Giannandrea
US Patent Application 20100121817
Published May 13, 2010
Filed January 20, 2010

Query Optimization
Invented by Scott Meyer, Jutta Degener, Barak Michener, and John Giannandrea
US Patent Application 20100121839
Published May 13, 2010
Filed January 20, 2010

Knowledge Web
Invented by W. Daniel Hillis and Bran Ferren
Assigned to Metaweb Technologies, Inc.
US Patent 7,502,770
Granted March 10, 2009
Filed April 10, 2002

Metaweb Conclusion

Metaweb operates Freebase, a community-based source of data about different people, places, and things. For a great example of how they collect and display data, see their page on George Washington.

What will Metaweb bring to Google?

That remains to be seen, but Metaweb’s technology might make it easier for Google to associate information with named entities. As the Microsoft paper I mentioned above noted, searches for named entities make up a good percentage of the queries on Bing. Chances are that they are pretty popular on Google as well, so the impact of the Metaweb acquisition could be a large one.

Last Updated June 26, 2019.