Neurosymbolic AI: The 3rd Wave

The paper (arXiv 2020, also published in Artificial Intelligence Review in 2023) opens by discussing recent high-profile AI debates: the Montréal AI Debate and the AAAI 2020 fireside chat with Kahneman, Hinton, LeCun, and Bengio. A consensus seems to be emerging: for AI to be robust and trustworthy, it must combine learning with reasoning. Kahneman's "System 1 vs. System 2" dual framing of cognition maps well onto deep learning and symbolic reasoning, and AI needs both.

Neurosymbolic AI promises to combine data-driven learning with structured reasoning, and to provide modularity, interpretability, and measurable explanations. The paper moves from philosophical context to representation, then to system design and technical challenges in neurosymbolic AI.


Neurons and Symbols: Context and Current Debate

This section lays out the historic divide between symbolic AI and neural AI. The symbolic approach supports logic, reasoning, and explanation; the neural approach excels at perception and learning from data. Symbolic systems are good at thinking, but not learning. Deep learning is good at learning, but not thinking. Despite great recent progress, deep learning still lacks transparency and remains energy-hungry. Critics like Gary Marcus argue that symbolic manipulation is needed for generalization and commonsense.

The authors here appeal to Valiant's call for a "semantics of knowledge" and say that neural-symbolic computing aims to answer this call. Symbolic logic can be embedded in neural systems, and neural representations can be interpreted in symbolic terms. Logic Tensor Networks (LTNs) are presented as a concrete solution. They embed first-order logic formulas into tensors, and sneak logic into the loss function so that the model learns not just from data, but also from rules. For this, logical formulas are relaxed into differentiable constraints, which are then used during training to guide the model toward satisfying logical relationships while learning from data. I was surprised to see concrete work and software for LTNs on GitHub; there is also a paper explaining the principles.
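To make this concrete, here is a toy sketch (in PyTorch, my own illustration rather than the actual LTN library API) of the idea: a rule like "forall x: Smokes(x) -> Cancer(x)" is relaxed into a differentiable truth degree, and one minus that degree is added to the usual data loss. The predicates, embeddings, and labels are all made up.

```python
# Toy sketch of logic-as-loss, not the real LTN API. Predicates are small networks
# mapping embeddings to truth degrees in [0, 1]; a rule becomes a differentiable term.
import torch
import torch.nn as nn

class Predicate(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 16), nn.ReLU(),
                                 nn.Linear(16, 1), nn.Sigmoid())
    def forward(self, x):
        return self.net(x).squeeze(-1)   # truth degree per individual

Smokes, Cancer = Predicate(8), Predicate(8)

def implies(a, b):
    return 1.0 - a + a * b               # Reichenbach relaxation of a -> b

def forall(truths):
    return truths.mean()                 # soft universal quantifier

x = torch.randn(32, 8)                           # embeddings of 32 individuals (toy data)
labels = torch.randint(0, 2, (32,)).float()      # toy supervision for Smokes

data_loss = nn.functional.binary_cross_entropy(Smokes(x), labels)
rule_sat = forall(implies(Smokes(x), Cancer(x))) # truth of: forall x, Smokes(x) -> Cancer(x)
loss = data_loss + (1.0 - rule_sat)              # learn from data while being pushed to satisfy the rule
loss.backward()
```

The network trades off fitting the labels against satisfying the rule, which is exactly the "learning from data and from rules" flavor the paper describes.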


Distributed and Localist Representation

This section reframes the debate around representation. Neural networks use distributed representations: knowledge is encoded in continuous vectors, over which the concepts are smeared. This works well for learning and optimization. Symbolic systems use localist representations: discrete identifiers for concepts. These are better for reasoning and abstraction.

The challenge is to bridge the two. LTNs do this by grounding symbolic logic in tensor-based representations. Logic formulas are mapped to constraints over continuous embeddings. This enables symbolic querying over learned neural structures, while preserving the strengths of gradient-based learning. LTNs also allow symbolic structure to emerge during learning.
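As a rough illustration of what "grounding" and "querying" look like (again my own toy code with invented names, not the LTN library): constants become learnable embedding vectors, predicates become networks over those vectors, and a query is simply an evaluation of the resulting truth degrees.

```python
# Toy grounding-and-querying sketch: constants are embeddings, predicates are networks,
# and a "symbolic query" evaluates a formula's truth degree over those groundings.
import torch
import torch.nn as nn

dim = 8
constants = {name: nn.Parameter(torch.randn(dim)) for name in ["anna", "bob", "carol"]}

Friends = nn.Sequential(nn.Linear(2 * dim, 16), nn.ReLU(),
                        nn.Linear(16, 1), nn.Sigmoid())

def truth(pred, *args):
    return pred(torch.cat(args, dim=-1)).squeeze(-1)

# Query: to what degree is Friends(anna, bob) true under the current grounding?
print(float(truth(Friends, constants["anna"], constants["bob"])))

# Query: Friends(anna, bob) AND Friends(bob, anna), using the product t-norm for AND.
both = truth(Friends, constants["anna"], constants["bob"]) * \
       truth(Friends, constants["bob"], constants["anna"])
print(float(both))
```

After training against data and rules (as in the earlier sketch), such queries are a way to read symbolic-looking answers back out of the learned vectors.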

There is an interesting contrast here with the Neurosymbolic AI paper we reviewed yesterday. That paper favored Option 2 approaches, which begin with the neural representation and lift it into symbolic form. In other words, it advocates extracting structured symbolic patterns, explanations, or logical chains of reasoning from the output of neural systems. This paper, through its advocacy of LTNs, seems to favor Option 1: embedding symbolic structures into neural vector spaces.


Neurosymbolic Computing Systems: Technical Aspects

Symbolic models use rules: decision trees, logic programs, structured knowledge. They are interpretable but brittle in the face of change and new information. Deep nets, on the other hand, learn vector patterns using gradient descent. They are great with fuzzy data, but poor at generalizing rules. They speak linear algebra, not logic.

The first approach to combining them is to bake logic into the network's structure. The second approach is to encode logic in the loss function, but otherwise keep it separate from the network's architecture. The authors seem to lean toward the second approach for its flexibility, modularity, and scalability. The toy sketch below illustrates the first approach.
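To make the contrast concrete, here is a toy of the first approach, in the spirit of knowledge-based neural networks (my own illustration, not from the paper): the rule "IF a AND b THEN c" is compiled directly into a neuron's weights and bias, so the logic lives in the architecture itself.

```python
# Toy "logic in the structure" example: compile an AND rule into a neuron's weights.
import torch

def and_neuron(n_inputs, w=5.0):
    # Each antecedent gets weight w; the bias is chosen so the neuron fires
    # only when (roughly) all antecedents are close to 1.
    weights = torch.full((n_inputs,), w)
    bias = -w * (n_inputs - 0.5)
    return lambda x: torch.sigmoid(x @ weights + bias)

rule_c = and_neuron(2)   # encodes: IF a AND b THEN c
print(float(rule_c(torch.tensor([1.0, 1.0]))))   # ~0.92: both antecedents hold, c is (nearly) true
print(float(rule_c(torch.tensor([1.0, 0.0]))))   # ~0.08: one antecedent fails, c is (nearly) false
```

Such a network can still be trained afterwards, but adding or changing a rule means changing the architecture, which is part of why the loss-based route is more flexible.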

LTNs, in contrast, fall into the second approach. LTNs represent logical formulas as differentiable constraints, which are added to the loss function during training. The network learns to satisfy the logic, but the logic is not hardwired into its architecture: it guides learning rather than dictating the structure of the model.


Challenges for the Principled Combination of Reasoning and Learning

Combining reasoning and learning introduces new challenges. One is how to handle quantifiers. Symbolic systems handle universal quantifiers (∀) well. Neural networks are better at spotting existential patterns (∃). This asymmetry makes hybrid systems attractive: let each side do what it does best.
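A common trick for making quantifiers differentiable is to replace them with smooth aggregators, roughly in the spirit of what LTNs do (the exact operators below are my own simple choices, not the paper's): ∀ becomes an aggregation that is high only when every instance is true, and ∃ one that is pulled up by the best instances.

```python
# Toy smooth quantifiers over a vector of truth degrees (one entry per individual).
import torch

def forall(truths, p=2):
    # soft "for all": penalized by any instance that violates the formula
    return 1.0 - ((1.0 - truths) ** p).mean() ** (1.0 / p)

def exists(truths, p=2):
    # soft "there exists": rewarded by the instances that satisfy the formula
    return (truths ** p).mean() ** (1.0 / p)

truths = torch.tensor([0.9, 0.95, 0.1])  # truth degrees of some formula for three individuals
print(float(forall(truths)))  # ~0.48: dragged down by the one clear violation
print(float(exists(truths)))  # ~0.76: pulled up by the instances that satisfy it
```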

Restricted Boltzmann Machines (RBMs) are discussed as early examples of hybrid models. They learn probability distributions over visible and hidden variables. With modular design, rules can be extracted from trained RBMs. But as models grow deeper, they lose modularity and interpretability. Autoencoders, GANs, and model-based reinforcement learning may offer ways to address this.
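For readers (like me) who need a refresher, here is a minimal RBM sketch (a toy of my own, not the paper's models or code): an energy function over visible and hidden binary units, trained with one-step contrastive divergence.

```python
# Minimal toy RBM: energy over visible/hidden binary units, CD-1 training.
import torch

class TinyRBM:
    def __init__(self, n_visible, n_hidden):
        self.W = torch.randn(n_visible, n_hidden) * 0.1
        self.b = torch.zeros(n_visible)   # visible bias
        self.c = torch.zeros(n_hidden)    # hidden bias

    def energy(self, v, h):
        # E(v, h) = -v^T W h - b^T v - c^T h ; low energy means a likely configuration
        return -(v @ self.W @ h) - (self.b @ v) - (self.c @ h)

    def cd1_step(self, v0, lr=0.1):
        # One contrastive-divergence (CD-1) update from a data vector v0.
        ph0 = torch.sigmoid(v0 @ self.W + self.c)       # P(h=1 | v0)
        h0 = torch.bernoulli(ph0)                       # sample hidden units
        pv1 = torch.sigmoid(h0 @ self.W.t() + self.b)   # reconstruction P(v=1 | h0)
        ph1 = torch.sigmoid(pv1 @ self.W + self.c)
        # move probability mass toward the data and away from the reconstruction
        self.W += lr * (torch.outer(v0, ph0) - torch.outer(pv1, ph1))
        self.b += lr * (v0 - pv1)
        self.c += lr * (ph0 - ph1)

rbm = TinyRBM(n_visible=6, n_hidden=3)
v = torch.tensor([1.0, 0.0, 1.0, 1.0, 0.0, 0.0])
for _ in range(200):
    rbm.cd1_step(v)
print(float(rbm.energy(v, torch.sigmoid(v @ rbm.W + rbm.c))))  # energy of the data vector after training
```

The bipartite connection between visible and hidden variables is the kind of modular structure that makes rule extraction from shallow RBMs tractable; deeper stacks blur exactly this structure.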
