What are Named Entities?
Named entities are specific people, places, or things, and they are often what Google looks for when returning information in response to queries. Google got a lot smarter at answering questions about named entities after it acquired Metaweb, which had developed a way of better understanding named entities in searches, an approach that Google appears to have adopted.
Here is an example of how Metaweb handled named entities, as described in one of its patent filings:
You may know him by many names or titles – Governor of California, Terminator, Governator, Conan the Barbarian, Kindergarten Cop, Mr. Universe, Mr. Olympia, Arnold Strong, Arnie, The Austrian Oak.
To Metaweb, Arnold Schwarzenegger is referred to as 9202a8c04000641f8000000000006567.
Who is Metaweb?
Metaweb is a company recently acquired by Google, and it has created a system of indexing named entities that allows you to search for information in a new way. The idea sounds a little like a library’s Dewey Decimal system, but for named entities.
Why is this important, and what are Named Entities?
A named entity is a specific person, place, or thing. For example, named entities can include Barack Obama, the Commonwealth of Virginia, or the Great American Ball Park in Cincinnati. Associating a unique identification number with each named entity can make it easier to index those entities and to find information about them even when they are referred to by different names, as in my example above about Arnold Schwarzenegger. Unique identification numbers can also help with local search when they are assigned to specific places, businesses, or landmarks.
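To make that idea a little more concrete, here is a minimal Python sketch of how many names might resolve to one identifier. It is only an illustration under my own assumptions, not Metaweb's or Google's actual implementation: the ID is the one quoted above, while the `ALIASES` table, `normalize`, `build_alias_index`, and `resolve` are hypothetical names I made up for this example.

```python
from collections import defaultdict

# The ID quoted above for Arnold Schwarzenegger. Everything else in this
# sketch (table layout, function names) is a hypothetical illustration.
ARNOLD_ID = "9202a8c04000641f8000000000006567"

# Hypothetical alias table: one entity ID, many surface names.
ALIASES = {
    ARNOLD_ID: [
        "Arnold Schwarzenegger",
        "Governator",
        "Arnie",
        "The Austrian Oak",
        "Arnold Strong",
    ],
}


def normalize(name: str) -> str:
    """Lowercase the name and collapse runs of whitespace."""
    return " ".join(name.casefold().split())


def build_alias_index(aliases):
    """Invert the table so each normalized name points at a set of IDs."""
    index = defaultdict(set)
    for entity_id, names in aliases.items():
        for name in names:
            index[normalize(name)].add(entity_id)
    return index


def resolve(index, query: str):
    """Return the sorted entity IDs known under this name (possibly none)."""
    return sorted(index.get(normalize(query), set()))


INDEX = build_alias_index(ALIASES)
print(resolve(INDEX, "  the austrian OAK "))
```

Each name maps to a set of IDs rather than a single ID because some names, such as a film title or a job title, could reasonably belong to more than one entity.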
How often do named entities appear in Web searches? A recent paper from Microsoft, Building Taxonomy of Web Search Intents for Name Entity Queries (pdf), tells us that they are pretty common:
According to an internal study of Microsoft, at least 20-30% of queries submitted to Bing search are named entities, and it is reported 71% of queries contain name entities.
Google announced its acquisition of Metaweb in an Official Google Blog post, Deeper understanding with Metaweb. Metaweb also announced the acquisition in its own post, Metaweb joins Google.
Metaweb started a knowledgebase called Freebase, which had volunteer editors and contributors who added entity information. It became one of the significant sources of information behind Google’s Knowledge Graph.
Metaweb has several patent applications at the United States Patent and Trademark Office. They are worth diving into if you want to learn a little about some of the technology behind the company.
I’ve just started looking at them myself, beginning with the one below on “Query Optimization,” where I found the Metaweb ID number of Arnold Schwarzenegger. The patent filing describes how data about a named entity, and information associated with it, can be collected and stored under an ID number, and how queries can be performed against that collected information.
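As a rough illustration of storing information under an entity ID and then querying it, here is a small Python sketch. It is not the method described in the patent filing, just a toy store of (subject, property, value) facts under my own assumptions. The class `TinyEntityStore`, its `add` and `match` methods, and the property names `name`, `alias`, and `title` are all hypothetical; the facts themselves come from the names and titles listed earlier in this post.

```python
from typing import Iterator, List, NamedTuple, Optional


class Fact(NamedTuple):
    subject: str  # entity ID
    prop: str     # hypothetical property name, e.g. "alias"
    value: str


class TinyEntityStore:
    """Toy in-memory store of facts keyed by entity ID (illustration only)."""

    def __init__(self) -> None:
        self._facts: List[Fact] = []

    def add(self, subject: str, prop: str, value: str) -> None:
        self._facts.append(Fact(subject, prop, value))

    def match(
        self,
        subject: Optional[str] = None,
        prop: Optional[str] = None,
        value: Optional[str] = None,
    ) -> Iterator[Fact]:
        """Yield facts matching every field that is not None."""
        wanted = (subject, prop, value)
        for fact in self._facts:
            if all(w is None or w == got for w, got in zip(wanted, fact)):
                yield fact


ARNOLD = "9202a8c04000641f8000000000006567"  # ID quoted in this post

store = TinyEntityStore()
store.add(ARNOLD, "name", "Arnold Schwarzenegger")
store.add(ARNOLD, "alias", "Governator")
store.add(ARNOLD, "alias", "The Austrian Oak")
store.add(ARNOLD, "title", "Governor of California")

# Everything known under one ID and one property.
aliases = [f.value for f in store.match(subject=ARNOLD, prop="alias")]

# The reverse question: which entity IDs carry a given title?
holders = [f.subject for f in store.match(prop="title", value="Governor of California")]

print(aliases)
print(holders)
```

The same `match` call answers both kinds of question, "what do we know about this ID?" and "which IDs have this property?", which is the sort of flexibility a store keyed by entity ID is meant to offer.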
Here are the patent filings assigned to Metaweb:
- Automated online purchasing system. Invented by W. Daniel Hillis and Bran Ferren. US Patent Application 20030195834. Published October 16, 2003. Filed September 18, 2002.
- Meta-Web. Invented by W. Daniel Hillis and Bran Ferren. US Patent Application 20040210602. Published October 21, 2004. Filed December 15, 2003.
- Personalized profile for evaluating content. Invented by W. Daniel Hillis and Bran Ferren. US Patent Application 20050131918. Published June 16, 2005. Filed May 24, 2004.
- Delegated authority evaluation system. Invented by W. Daniel Hillis and Bran Ferren. US Patent Application 20050131722. Published June 16, 2005. Filed May 25, 2004.
- System and method to facilitate importation of user profile data over a network. Invented by W. Daniel Hillis and Bran Ferren. US Patent Application 20060095780. Published May 4, 2006. Filed October 28, 2004.
- User Contributed Knowledge Database. Invented by Timothy Sturge, Kurt Bollacker, Robert Cook, John Giannandrea, Nicholas Thompson, and Edwin Taylor. US Patent Application 20090024590. Published January 22, 2009. Filed April 22, 2008.
- Graph Store. Invented by Scott Meyer, Jutta Degener, Barak Michener, and John Giannandrea. US Patent Application 20100174692. Published July 8, 2010. Filed January 20, 2010.
- Database Replication. Invented by Scott Meyer, Jutta Degener, Barak Michener, and John Giannandrea. US Patent Application 20100121817. Published May 13, 2010. Filed January 20, 2010.
- Query Optimization. Invented by Scott Meyer, Jutta Degener, Barak Michener, and John Giannandrea. US Patent Application 20100121839. Published May 13, 2010. Filed January 20, 2010.
- Knowledge Web. Invented by W. Daniel Hillis and Bran Ferren. Assigned to Metaweb Technologies, Inc. US Patent 7,502,770. Granted March 10, 2009. Filed April 10, 2002.
Metaweb Conclusion
Metaweb operates Freebase, a community-based source of data about different people, places, and things. For a great example of how Freebase collects and displays data, see its page on George Washington.
What will Metaweb bring to Google?
That remains to be seen, but Metaweb’s technology might make it easier for Google to associate information with named entities. As the Microsoft paper I mentioned above noted, searches for named entities make up a good percentage of the searches on Bing. Chances are that searches for named entities are just as popular on Google, so the impact of the Metaweb acquisition could be a large one.
I’ve written a few posts about named entities. These are some that I wanted to share:
- Do You Have a Named Entity Strategy for Marketing Your Web Site?
- How I Came to Love Entities and Start Doing Entity Optimization
- How Google Uses Named Entity Disambiguation for Entities with the Same Names
- How Named Entities Connected to Trending Topics can be used to Address Real Time Search Results
- Not Brands but Entities: The Influence of Named Entities on Google and Yahoo Search Results
- How Knowledge Base Entities can be Used in Searches
- Finding Entity Names in Google’s Knowledge Graph
- Google Gets Smarter with Named Entities: Acquires MetaWeb
- Entity Associations with Websites and Related Entities
- How Google Might Identify Entity Synonyms Using Anchor Text
- Extracting Facts for Entities from Sources such as Wikipedia Titles and Infoboxes
- Extracting Semantic Classes and Corresponding Instances from Web Pages and Query Logs
- How Google May Identify Main Entities
- How Google’s Knowledge Graph Updates Itself by Answering Questions
Last Updated June 26, 2019.