An inference engine is a component of an artificial intelligence system that applies logical rules to a knowledge graph (or knowledge base) to surface new facts and relationships. Inference engines can be implemented using induction or deduction. In recent years, inferring relationships between entities with machine learning, machine vision, and natural language processing has greatly increased the scale and value of knowledge graphs and relational databases.
Types of Inference Within Inference Engines
An example of deductive reasoning within an inference engine could be that San Francisco is in California. Therefore any entity (say a person or organization) located in San Francisco must also be located in California. Why does this matter? If you are searching for organizations in California, all organizations located in San Francisco should be included.
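The deduction above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular knowledge graph implements it: the `LOCATED_IN` and `ORGANIZATIONS` tables, the organization names, and both function names are hypothetical.

```python
# Hypothetical containment facts: each place maps to the place that contains it.
LOCATED_IN = {
    "San Francisco": "California",
    "California": "United States",
}

# Hypothetical organizations and their recorded locations.
ORGANIZATIONS = {
    "Acme Robotics": "San Francisco",
    "Example Vineyards": "Napa",
}


def containing_regions(place):
    """Return every region that contains `place`, following the chain upward.

    Assumes the containment facts have no cycles.
    """
    regions = []
    while place in LOCATED_IN:
        place = LOCATED_IN[place]
        regions.append(place)
    return regions


def organizations_in(region):
    """Find organizations located in `region`, directly or by deduction."""
    return sorted(
        name
        for name, place in ORGANIZATIONS.items()
        if place == region or region in containing_regions(place)
    )


print(organizations_in("California"))  # ['Acme Robotics']
```

Note that "Example Vineyards" is not returned: the sketch has no fact stating that Napa is in California, so nothing can be deduced about it.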
An example of inductive reasoning within an inference engine could be that tech companies with more than 100 employees tend to have CTOs, and therefore any company meeting these criteria without a CTO is likely missing a record.
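A rough sketch of this kind of inductive check follows. The company records, the `min_support` cutoff of 0.7, and the function names are all assumptions made for illustration; a real system would learn such patterns from far more data and with a proper statistical model.

```python
# Hypothetical company records; "cto" is None when no CTO is on record.
COMPANIES = [
    {"name": "Alpha", "industry": "tech", "employees": 250, "cto": "A. Jones"},
    {"name": "Beta", "industry": "tech", "employees": 400, "cto": "B. Smith"},
    {"name": "Gamma", "industry": "tech", "employees": 120, "cto": "C. Lee"},
    {"name": "Delta", "industry": "tech", "employees": 900, "cto": None},
    {"name": "Epsilon", "industry": "tech", "employees": 30, "cto": None},
]


def is_large_tech(company):
    """True for tech companies with more than 100 employees."""
    return company["industry"] == "tech" and company["employees"] > 100


def likely_missing_cto(companies, min_support=0.7):
    """Flag large tech companies without a CTO record.

    The pattern "large tech companies have a CTO" is only trusted when at
    least `min_support` of the observed examples follow it.
    """
    group = [c for c in companies if is_large_tech(c)]
    if not group:
        return []
    support = sum(1 for c in group if c["cto"]) / len(group)
    if support < min_support:
        return []  # pattern too weak to draw any conclusion
    return [c["name"] for c in group if not c["cto"]]


print(likely_missing_cto(COMPANIES))  # ['Delta']
```

Unlike a deduction, the result is only a likelihood: "Delta" may genuinely have no CTO, which is why the output is best treated as a flag for review rather than a fact.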
Many machine learning models rely on inductive inferences where rules about output are learned from historical examples. Diffbot’s Knowledge Graph surfaces fields using inference from natural language processing and machine vision inputs. Some of our machine learning-computed fields include estimated revenue for non-public companies. Additionally, we provide a similarity score, also computed with machine learning, for clustering similar organizations.
Historically, inference engines were included as components within expert systems: systems meant to emulate the problem-solving ability of a human expert within a given domain.
Inference Engine Methods
Two methods that many inference engines use to infer new knowledge are backward chaining and forward chaining.
- Backward chaining reasoning methods begin with a list of hypotheses and work backwards to see whether the data, once plugged into rules, supports these hypotheses. In short, backward chaining highlights what facts must be true to support a hypothesis.
- Forward chaining reasoning methods start with available data and utilize rules to infer new data. In short, forward chaining starts with known facts and uses them to create new facts.
Both backward and forward chaining reasoning proceed according to the modus ponens form of deductive reasoning. In other words, if X implies Y and X is true, then Y must be true.
An example of forward chaining would be to take the pseudocode rule:
Rule1: Dog(x) => Mammal(x)
This rule states that all dogs are mammals. Forward chaining would then take every entity known to be a dog and create the fact that it is a mammal (e.g. Airbud is a mammal, Australian Shepherds are mammals, and so forth).
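As a minimal sketch, forward chaining over this one rule could look like the following in Python. Representing facts as `(predicate, entity)` pairs and rules as `(premise, conclusion)` pairs is an assumption made for illustration, as is the `forward_chain` function name.

```python
# Rules map a premise predicate to a conclusion predicate.
RULES = [("Dog", "Mammal")]  # Rule1: Dog(x) => Mammal(x)


def forward_chain(facts, rules):
    """Apply rules to known facts until no new facts can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            # Iterate over a snapshot, since `known` grows inside the loop.
            for predicate, entity in list(known):
                new_fact = (conclusion, entity)
                if predicate == premise and new_fact not in known:
                    known.add(new_fact)
                    changed = True
    return known


facts = {("Dog", "Airbud"), ("Dog", "Australian Shepherd")}
derived = forward_chain(facts, RULES) - facts
print(sorted(derived))
# [('Mammal', 'Airbud'), ('Mammal', 'Australian Shepherd')]
```

The outer loop matters once rules build on each other: with an extra rule such as `("Mammal", "Animal")`, facts created in one pass become the inputs for the next.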
An example of backward chaining can be seen through a scenario in which the inference engine is aided by an interface for a human.
Assuming the same rule as above and the hypothesis to be checked “A Corgi is a mammal,” a backward chaining reasoning method could highlight the assertion needed to prove this hypothesis. This could occur by an interface asking a human “is a Corgi a dog?” When answered yes, the hypothesis “a Corgi is a mammal” would be validated.
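The Corgi scenario can be sketched as a small recursive function. This is an illustrative simplification rather than a full backward chainer: the `ask` callback stands in for the human-facing interface, `scripted_answer` replaces a real prompt so the example runs non-interactively, and the rule set is assumed to contain no cycles.

```python
RULES = [("Dog", "Mammal")]  # Dog(x) => Mammal(x)


def backward_chain(goal, facts, rules, ask=None):
    """Return True if `goal` follows from facts, rules, or answers from `ask`."""
    if goal in facts:
        return True
    predicate, entity = goal
    # Find the premises that would prove the goal through some rule.
    premises = [p for p, conclusion in rules if conclusion == predicate]
    if premises:
        return any(
            backward_chain((p, entity), facts, rules, ask) for p in premises
        )
    # No rule concludes this goal, so it can only be confirmed by asking.
    return ask is not None and bool(ask(goal))


asked = []


def scripted_answer(goal):
    """Stand-in for a human: records each question and confirms Dog(Corgi)."""
    asked.append(goal)
    return goal == ("Dog", "Corgi")


print(backward_chain(("Mammal", "Corgi"), set(), RULES, ask=scripted_answer))
# True
print(asked)
# [('Dog', 'Corgi')]
```

The only question put to the "human" is whether a Corgi is a dog, which is exactly the assertion the hypothesis depends on.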
In many contemporary AI applications, both backward and forward chaining are applied in what is referred to as opportunistic reasoning. As one may expect, opportunistic reasoning applies each method of inference when it is most opportune for expanding the knowledge base.
In its simplest form, the work performed by an inference engine tends to progress through three stages: match rules, select rules, execute rules.
- Match rules is an action in which an inference engine finds all rules triggered by the contents of a knowledge base.
- Select rules is an action that determines the order in which rules should be applied (this will differ for forward or backward chaining, or by other machine learning inputs).
- Execute rules applies rules to existing knowledge through forward or backward chaining.
Once execute rules is completed, match rules is restarted, and the cycle repeats until there are no more opportunities for either forward or backward chaining deductions.
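The match, select, execute cycle can be sketched as a loop. The version below is a simplified assumption-laden illustration: it covers forward chaining only, rules are plain dictionaries with hypothetical `if`, `then`, and `priority` keys, and selection simply prefers higher priority.

```python
RULES = [
    {"if": "Dog", "then": "Mammal", "priority": 1},
    {"if": "Mammal", "then": "Animal", "priority": 1},
]


def match(rules, facts):
    """Match: collect every (rule, entity) pair that would add a new fact."""
    activations = []
    for rule in rules:
        for predicate, entity in facts:
            if predicate == rule["if"] and (rule["then"], entity) not in facts:
                activations.append((rule, entity))
    return activations


def select(activations):
    """Select: order triggered rules, highest priority first, then by entity."""
    return sorted(activations, key=lambda a: (-a[0]["priority"], a[1]))


def execute(activation, facts):
    """Execute: apply one rule to one entity, adding the concluded fact."""
    rule, entity = activation
    facts.add((rule["then"], entity))


def run(rules, facts):
    """Repeat match, select, execute until no rule is triggered."""
    facts = set(facts)
    while True:
        activations = match(rules, facts)
        if not activations:
            return facts
        execute(select(activations)[0], facts)


print(sorted(run(RULES, {("Dog", "Airbud")})))
# [('Animal', 'Airbud'), ('Dog', 'Airbud'), ('Mammal', 'Airbud')]
```

The loop is guaranteed to stop here because every executed rule adds a fact that was not yet known, and only finitely many facts can be formed from the given predicates and entities.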
Pros and Cons of Inferred Fields In Knowledge Bases
There are distinct trade-offs to fields that are inferred rather than “known” within knowledge bases.
Pros
- Typically the only form that machine learning output can take (statistically supported inferences rather than “known” facts)
- Can vastly augment the number of usable fields for applications like market intelligence, news monitoring, or risk scoring
- Can be filtered by confidence thresholds to exclude unlikely inferences
Cons
- Lack of data provenance and explainability
- Reliant on quality of underlying data to infer from
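The confidence threshold point from the list of pros can be illustrated with a short sketch. The `InferredField` record, its attribute names, and the sample values are hypothetical and do not reflect any particular product's data model or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InferredField:
    """A hypothetical inferred value with a confidence between 0.0 and 1.0."""
    entity: str
    field: str
    value: object
    confidence: float


def filter_by_confidence(fields, threshold):
    """Keep only the inferences at or above the confidence threshold."""
    return [f for f in fields if f.confidence >= threshold]


# Made-up sample inferences for illustration only.
inferred = [
    InferredField("Acme Robotics", "estimated_revenue", 12_000_000, 0.91),
    InferredField("Acme Robotics", "industry", "agriculture", 0.22),
    InferredField("Example Vineyards", "estimated_revenue", 3_500_000, 0.64),
]

for item in filter_by_confidence(inferred, threshold=0.6):
    print(item.entity, item.field, item.confidence)
# Acme Robotics estimated_revenue 0.91
# Example Vineyards estimated_revenue 0.64
```

Raising the threshold trades coverage for precision: at 0.9 only the first inference would remain.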