Introducing the Diffbot Knowledge Graph

Meet the largest database of human knowledge ever created: Diffbot Knowledge Graph

Diffbot is pleased to announce the launch of a new product: Diffbot Knowledge Graph.

What is the Knowledge Graph?

Eight years ago, Diffbot revolutionized web data extraction with AI data extractors (AI:X). Now, Diffbot is set to disrupt how businesses interact with data from the web again with the all-new DKG (Diffbot Knowledge Graph).

“What we’ve built is the first Knowledge Graph that organizations can use to access the full breadth of information contained on the Web. Unlocking that data and giving organizations instant access to those deep connections completely changes knowledge-based work as we know it.”

– Mike Tung, founder and CEO of Diffbot.

Unlocking knowledge from the Web

Ever wished there was a search engine that gave you answers to your questions with data, rather than a list of links to URLs?

Curated by AI using our trademark combination of machine learning and computer vision, and built for the enterprise, the DKG unlocks the entire Web as a source of searchable data. The DKG is a graph database of over 10 billion connected entities (people, companies, products, articles, and discussions) covering more than 1 trillion facts!

In contrast to other solutions marketed as Knowledge Graphs, the DKG is:

  • Fully autonomous and curated using Artificial Intelligence, unlike other knowledge graphs which are only partially autonomous and largely curated through manual labor.
  • Built specifically to provide knowledge as the end product, paid for and owned by the customer. No other company makes this available to their customers, as other knowledge graphs have been built to support ad-based search engine business models.
  • Web-wide, regardless of originating language. Diffbot technology can extract, understand, and make searchable any information in French, Chinese, or Cyrillic-script languages just as easily as in English.
  • Constantly rebuilt, from scratch, which is critical to the business value of the DKG. This rebuilding process ensures that DKG data is fresh, accurate, and comprehensive.


A Web-wide, comprehensive, and interconnected knowledge graph has the power to transform how enterprises do business. In our vision of the future, human beings won’t spend time sifting through mountains of data trying to determine what’s true. AI is so much better at doing that.

Right now, 30 percent of a knowledge worker’s job is data gathering. There’s a big opportunity in the market for a horizontal knowledge graph: a database of information about people, businesses, and things. Other knowledge graphs are little more than restructured Wikipedia facts with only the simplest, narrowest connections drawn between them. We knew we could do better. So we’re building the first comprehensive map of human knowledge by analyzing every page on the Internet.

Knowledge is needed for AI

The other reason we’re building the DKG is to enable the next generation of AI to understand the relationships between the entities in the world it represents. True AI needs the ability to make informed decisions based on deep understanding and knowledge of how entities and concepts are linked together.

We’ve already seen some fantastic research from universities and industry built on top of the DKG, including a particularly impressive state-of-the-art Q&A AI.

Evolution from Data to Knowledge

There is a subtle but pivotal difference between data and knowledge. While data helps many businesses, knowledge has the power to be transformative for any business.

Define “Data”:

Facts and statistics collected together for reference or analysis.

Define “Knowledge”:

Facts, information, and skills acquired through experience or education; the theoretical or practical understanding of a subject.

– Oxford Dictionary

The key to the DKG’s value is how it encompasses the whole Web, how it joins data points from many sources into individual entities, and – importantly – how it then connects those entities according to their relationships.

By building a practical, contextual understanding of all data online, the DKG is able to answer complex questions like “How many people with the skill ‘Java’ who used to work at IBM in a junior role now work at Facebook as a senior manager?” by providing you with a number and a list of people who meet the criteria.
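To make that kind of question concrete, here is a minimal sketch in Python of how it might be answered over a tiny, hand-made set of connected entities. This is not the DKG, its data model, or its query language: the `Person` and `Employment` classes, the `find_matches` helper, and every name in the sample data are made up purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Employment:
    """One link between a person and a company (hypothetical shape)."""
    employer: str
    title: str
    is_current: bool


@dataclass
class Person:
    """A person entity with skills and employment links (hypothetical shape)."""
    name: str
    skills: set = field(default_factory=set)
    employments: list = field(default_factory=list)


def find_matches(people):
    """Return people with the Java skill who held a junior role at IBM in
    the past and are currently a senior manager at Facebook."""
    matches = []
    for person in people:
        has_skill = "Java" in person.skills
        was_junior_at_ibm = any(
            e.employer == "IBM" and not e.is_current and "junior" in e.title.lower()
            for e in person.employments
        )
        is_senior_manager_at_facebook = any(
            e.employer == "Facebook" and e.is_current
            and "senior manager" in e.title.lower()
            for e in person.employments
        )
        if has_skill and was_junior_at_ibm and is_senior_manager_at_facebook:
            matches.append(person)
    return matches


# Made-up sample data; none of these people are real.
PEOPLE = [
    Person("Ada Example", {"Java", "Python"}, [
        Employment("IBM", "Junior Developer", False),
        Employment("Facebook", "Senior Manager", True),
    ]),
    Person("Ben Sample", {"Java"}, [
        Employment("IBM", "Junior Analyst", False),
        Employment("Facebook", "Software Engineer", True),
    ]),
    Person("Cy Placeholder", {"Go"}, [
        Employment("IBM", "Junior Developer", False),
        Employment("Facebook", "Senior Manager", True),
    ]),
    Person("Dee Demo", {"Java"}, [
        Employment("IBM", "Junior Developer", True),
    ]),
]

results = find_matches(PEOPLE)
names = [p.name for p in results]
print(len(names), names)  # prints: 1 ['Ada Example']
```

The answer comes back as both a count and a list, which is the shape of result described above. The point of the sketch is that the question can only be answered because skills, past jobs, and current jobs are all attached to the same person entity.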

To access the DKG, Diffbot created a search query language called Diffbot Query Language (DQL). It’s flexible enough to let you perform granular searches to find the one exact piece of information you need out of the trillions, or to gather massive datasets for broad analysis. DQL has all the tools you need to access the world’s largest knowledge source with highly accurate, precise searches.

Ready to Use Now

Now, any business that wants instant access to all of the world’s knowledge can simply sign up for the DKG and turn the entire Web into their personal database for business intelligence across:

  • People: skills, employment history, education, social profiles
  • Companies: rich profiles of companies and their workforces worldwide, from Fortune 500 to SMBs
  • Locations: mapping data, addresses, business types, zoning information
  • Articles: every news article, dateline, byline from anywhere on the Web, in any language
  • Products: pricing, specifications, and reviews for every SKU across major ecommerce engines and individual retailers
  • Discussions: chats, social sharing, and conversations everywhere from article comments to web forums like Reddit
  • Images: billions of images on the web organized using image recognition and metadata collection

Want to learn more about the Diffbot Knowledge Graph?

Knowledge Graph in the Press