XGNNs: Model-level Explanation of Graph Neural Networks with RL through Graph Generation
Copyright © 2024 Francesco Danese, Alessio Borgi
The project explores model-level interpretability for Graph Neural Networks (GNNs) using a graph generation approach to surface human-interpretable patterns and motifs that drive class decisions. The MUTAG dataset serves as the primary benchmark. This technique can be used with any GNN.
Link to presentation slides
Overview
Why Interpretability Matters
GNNs are powerful for structured data but can be opaque, especially in sensitive domains (chemistry, medicine, social sciences). XGNN provides model-level explanations, uncovering general patterns the GNN relies on:
- Validate alignment with domain knowledge.
- Build user trust via interpretable decision patterns.
- Identify biases or inconsistencies for improvement.
Architecture

Graph Neural Network (GNN)
Backbone: Graph Convolutional Network (GCN) for graph classification.
- Input: adjacency matrix + node features (MUTAG nodes are one-hot atom types).
- Feature Propagation: aggregate neighbors; ReLU nonlinearity.
- Graph Representation: global pooling to fixed graph embedding.
- Classification: FC layer maps embeddings to class probabilities.
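The pipeline above (propagate, pool, classify) can be sketched in a few lines. This is a minimal numpy-only illustration, not the project's implementation; the layer sizes, weight initialization, and the seven MUTAG atom types are illustrative assumptions.

```python
import numpy as np

def gcn_forward(adj, feats, weights):
    """One graph -> class probabilities via propagate, pool, classify."""
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    deg = a_hat.sum(axis=1)
    a_norm = a_hat / deg[:, None]                 # row-normalized neighbor aggregation
    h = feats
    for w in weights[:-1]:                        # feature propagation layers
        h = np.maximum(a_norm @ h @ w, 0.0)       # aggregate neighbors + ReLU
    g = h.mean(axis=0)                            # global pooling to a fixed embedding
    logits = g @ weights[-1]                      # FC classification layer
    e = np.exp(logits - logits.max())
    return e / e.sum()                            # class probabilities (softmax)

rng = np.random.default_rng(0)
adj = np.array([[0, 1], [1, 0]], dtype=float)     # toy 2-atom graph
feats = np.eye(7)[:2]                             # one-hot atom-type features
weights = [rng.normal(size=(7, 16)), rng.normal(size=(16, 16)),
           rng.normal(size=(16, 16)), rng.normal(size=(16, 2))]
probs = gcn_forward(adj, feats, weights)
```

Three propagation layers mirror the three GCN layers used in the experiments below.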
Graph Generator for Explanations

An RL-based graph generator builds graphs that maximize the GNN's confidence for a target class:
- Start from a single node (e.g., carbon for MUTAG).
- Actions: add edges or new nodes from a candidate set.
- Policy: GCN-based policy predicts actions.
- Reward: combines model feedback and graph validity (e.g., valency).
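One generation step can be sketched as follows: enumerate the legal actions (add an edge between existing nodes, or attach a new node from the candidate set) under a simple valency constraint, then apply one. The valency table and candidate atoms here are illustrative assumptions, and a trained policy, not the first legal action, would pick the move.

```python
import networkx as nx

MAX_VALENCY = {"C": 4, "N": 3, "O": 2, "Cl": 1}   # illustrative chemical limits

def legal_actions(g, candidates=("C", "N", "O", "Cl")):
    """Actions: ("edge", u, v) between nodes with spare valency,
    or ("node", u, atom) attaching a fresh candidate atom to u."""
    free = [n for n in g if g.degree(n) < MAX_VALENCY[g.nodes[n]["atom"]]]
    actions = []
    for u in free:
        for v in free:
            if u < v and not g.has_edge(u, v):
                actions.append(("edge", u, v))
        for atom in candidates:
            actions.append(("node", u, atom))
    return actions

def apply_action(g, action):
    g = g.copy()
    if action[0] == "edge":
        g.add_edge(action[1], action[2])
    else:                                          # grow the graph by one atom
        new = max(g) + 1
        g.add_node(new, atom=action[2])
        g.add_edge(action[1], new)
    return g

g = nx.Graph()
g.add_node(0, atom="C")                            # start from a single carbon
g = apply_action(g, legal_actions(g)[0])
```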
Reinforcement Learning for Graph Generation
Modeled as an MDP:
- States: partially constructed graphs.
- Actions: add nodes/edges under constraints.
- Rewards: valid, interpretable graphs maximizing target class score.
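The reward that combines model feedback with validity can be sketched as below. This is a simplified stand-in for the XGNN-style reward: the target-class probability shifted so that chance-level confidence scores zero, plus a penalty when the graph breaks chemical rules; the weighting `lam` is an assumed hyperparameter.

```python
def reward(p_target, num_classes, valid, lam=1.0):
    """Reward for a partially built graph (sketch, not the exact paper form)."""
    model_feedback = p_target - 1.0 / num_classes   # positive only above chance
    validity = 0.0 if valid else -1.0               # penalize rule violations
    return model_feedback + lam * validity
```

For MUTAG (two classes), a graph scoring 0.9 for the target class earns 0.4 before any validity penalty.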
Features
Interpretability via Graph Generation
Human-intelligible motifs reveal model reasoning:
- Mutagenic graphs: carbon rings, NO2 groups.
- Non-mutagenic: halogens (Cl, Br, F).
Flexibility
Framework adaptable to other datasets (social, proteins, etc.).
Experimental Insights
MUTAG Dataset
- Nodes: atoms; Edges: bonds; Classes: mutagenic vs non-mutagenic.
- GNN attains strong accuracy; three GCN layers capture atom interactions.
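The one-hot node features mentioned above are straightforward to build. MUTAG has seven atom types (C, N, O, F, I, Cl, Br); the ordering used here is an assumption for illustration.

```python
import numpy as np

ATOMS = ["C", "N", "O", "F", "I", "Cl", "Br"]      # MUTAG atom types (assumed order)

def one_hot_atoms(atom_list):
    """Map a list of atom symbols to a (num_nodes, 7) one-hot feature matrix."""
    idx = {a: i for i, a in enumerate(ATOMS)}
    feats = np.zeros((len(atom_list), len(ATOMS)))
    for row, atom in enumerate(atom_list):
        feats[row, idx[atom]] = 1.0
    return feats

x = one_hot_atoms(["C", "C", "N", "O"])            # fragment of a nitro-bearing ring
```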
Generating Explanations
- Generator uncovers motifs the GNN uses:
  - Carbon rings → mutagenicity.
  - Chlorine-focused structures → non-mutagenicity.
Results Overview

Top row: class C=0 graphs with high p_{c0}. Bottom row: class C=1 graphs with high p_{c1}. Structural patterns respect chemical rules and class-specific features.
Generalization Beyond MUTAG
- Modular generator handles different datasets/rules.
- Patterns can inform domain hypotheses and improvements.
Prerequisites
Install dependencies:
```
pip install torch numpy networkx matplotlib
```
