Heterogeneous Graphs: When Nodes and Edges Have Types

TL;DR: A heterogeneous graph has multiple node types and edge types. Standard GNNs use a single message function and aggregation, so they cannot differentiate a "cites" edge from an "is-authored-by" edge. Handling heterogeneity requires type-specific message functions, meta-path decomposition, or relation-aware aggregation.

What Is a Heterogeneous Graph?

A heterogeneous graph (or heterogeneous information network, HIN) is defined as:

G = (V, E, τ, φ)
where τ: V → A maps each node to a node type and φ: E → R maps each edge to an edge type, with |A| > 1 node types and |R| > 1 edge types.
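As a rough illustration of this definition, the sketch below stores a tiny academic graph as a node-to-type mapping (playing the role of τ) and a list of typed edges (playing the role of φ). The node names, the edges, and the helper `type_counts` are all made up for this example and are not taken from any library.

```python
from collections import Counter

# Hypothetical toy graph: tau is a plain dict from node id to node type.
node_type = {
    "p1": "Paper", "p2": "Paper",
    "a1": "Author", "a2": "Author",
    "v1": "Venue",
}

# Each edge is (source, relation, target); the relation is phi(edge).
edges = [
    ("p1", "cites", "p2"),
    ("p1", "written-by", "a1"),
    ("p1", "written-by", "a2"),
    ("p2", "written-by", "a2"),
    ("p1", "published-in", "v1"),
    ("p2", "published-in", "v1"),
]

def type_counts(node_type, edges):
    """Return (|A|, |R|): the number of distinct node types and edge types."""
    num_node_types = len(set(node_type.values()))
    num_edge_types = len({rel for _, rel, _ in edges})
    return num_node_types, num_edge_types

edge_type_counts = Counter(rel for _, rel, _ in edges)

print(type_counts(node_type, edges))  # (3, 3)
print(edge_type_counts["written-by"])  # 3
```

A homogeneous graph would report (1, 1) here, which is the case a standard GNN implicitly assumes.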

Examples:

Academic network:

  • Node types: Paper, Author, Venue
  • Edge types: cites, written-by, published-in, reviews

Recommender system:

  • Node types: User, Item, Category, Brand
  • Edge types: clicks, purchases, belongs-to, manufactured-by

Biomedical knowledge graph:

  • Node types: Gene, Disease, Drug, Protein
  • Edge types: associated-with, treats, inhibits, encodes

Why Standard GNNs Fail on Heterogeneous Graphs

Standard message passing:

h^{(k)}_v = UPDATE( h^{(k-1)}_v, AGG({ h^{(k-1)}_u : u ∈ N(v) }) )

applies the same message function to all neighbours, regardless of the edge type connecting them. This conflates semantically very different relationships:

  • "User A clicked Item B" and "Item B belongs-to Category C" are aggregated identically
  • The model cannot learn that "cites" edges carry different information than "co-authored-by" edges
  • Node type differences are ignored: a Gene node and a Drug node are processed identically
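The conflation can be seen in a few lines. The sketch below uses a type-blind mean as AGG; the feature vectors and node names are invented for illustration. Changing an edge label changes nothing in the output, because the label is never read.

```python
def mean_aggregate(features, neighbours):
    """Type-blind AGG: average neighbour features, ignoring edge types."""
    dim = len(next(iter(features.values())))
    total = [0.0] * dim
    for u, _relation in neighbours:  # the relation label is never used
        for i, value in enumerate(features[u]):
            total[i] += value
    return [t / len(neighbours) for t in total]

# Hypothetical 2-dimensional features for the neighbours of an item node.
features = {"user_a": [1.0, 0.0], "category_c": [0.0, 1.0]}

# Same neighbours; only the label on the user edge differs.
as_observed = [("user_a", "clicks"), ("category_c", "belongs-to")]
relabelled = [("user_a", "purchases"), ("category_c", "belongs-to")]

print(mean_aggregate(features, as_observed))  # [0.5, 0.5]
print(mean_aggregate(features, relabelled))   # [0.5, 0.5]
```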

Solutions Overview

1. Type-specific message functions: learn a separate weight matrix W_r for each relation type r. Messages of type r are W_r h_u. Used in R-GCN.

2. Meta-path decomposition: define semantically meaningful paths through the graph (e.g., Author → Paper → Author = co-authorship). Run separate GNNs along each meta-path. Used in HAN.

3. Relation-aware attention: attend differentially to different relation types when aggregating. Used in HAN, HGT.

4. Type-specific projections: project all node types into a common embedding space with type-specific linear transforms before message passing. Used in HGT (Heterogeneous Graph Transformer).
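To make the first option concrete, here is a minimal sketch of one layer with a separate weight matrix W_r per relation, in the spirit of R-GCN. It is a simplification under stated assumptions: messages are averaged per relation, a self-connection W_self is added, and the basis decomposition used in R-GCN to limit parameter count is left out. The helper names, weights, and toy graph are hypothetical.

```python
from collections import defaultdict

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def relational_layer(h, edges, W_rel, W_self):
    """h'_v = ReLU(W_self h_v + sum over r of mean over u in N_r(v) of W_r h_u)."""
    # Group the incoming neighbours of each node by relation type.
    incoming = defaultdict(lambda: defaultdict(list))
    for u, rel, v in edges:  # messages flow from u to v
        incoming[v][rel].append(u)

    out = {}
    for v, h_v in h.items():
        acc = matvec(W_self, h_v)  # self-connection
        for rel, nbrs in incoming[v].items():
            for u in nbrs:
                msg = matvec(W_rel[rel], h[u])  # relation-specific message W_r h_u
                acc = [a + m / len(nbrs) for a, m in zip(acc, msg)]
        out[v] = [max(0.0, a) for a in acc]  # ReLU
    return out

h = {"p1": [1.0, 0.0], "p2": [0.0, 1.0], "a1": [1.0, 1.0]}
W_self = [[1.0, 0.0], [0.0, 1.0]]
W_rel = {
    "cites": [[2.0, 0.0], [0.0, 2.0]],
    "writes": [[-1.0, 0.0], [0.0, -1.0]],
}

edges = [("p2", "cites", "p1"), ("a1", "writes", "p1")]
# Same neighbours with the relation labels swapped, purely to show the effect.
swapped_edges = [("a1", "cites", "p1"), ("p2", "writes", "p1")]

out = relational_layer(h, edges, W_rel, W_self)
swapped = relational_layer(h, swapped_edges, W_rel, W_self)
print(out["p1"])      # [0.0, 1.0]
print(swapped["p1"])  # [3.0, 1.0]
```

Unlike a type-blind aggregator, the output for p1 changes when the edge labels change, since each neighbour's message passes through the matrix of its own relation.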

Meta-Paths: Semantic Bridges

A meta-path is a sequence of node and edge types defining a composite relationship:

Author -[writes]→ Paper -[written-by]→ Author
= APA (Author-Paper-Author) = co-authorship

Paper -[cites]→ Paper -[published-in]→ Venue -[publishes]→ Paper
= PPVP (Paper-Paper-Venue-Paper) = a multi-hop relation linking a paper to the papers that share a venue with a paper it cites

Meta-paths allow encoding domain knowledge into the graph structure. A model operating on the APA meta-path captures co-authorship patterns; one on the APVPA meta-path (Author → Paper → Venue → Paper → Author) captures authors who publish in the same venue.
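To see what following the APA meta-path means in practice, the sketch below composes two relations on a toy authorship graph. The author and paper names and the helpers `compose` and `follow_metapath` are illustrative only.

```python
from collections import defaultdict

def compose(first, second):
    """Pairs (a, c) such that (a, b) is in `first` and (b, c) is in `second`."""
    by_source = defaultdict(set)
    for b, c in second:
        by_source[b].add(c)
    return {(a, c) for a, b in first for c in by_source.get(b, ())}

def follow_metapath(relations, path):
    """Follow the relation names in `path` in order and drop trivial self-pairs."""
    pairs = relations[path[0]]
    for name in path[1:]:
        pairs = compose(pairs, relations[name])
    return {(a, c) for a, c in pairs if a != c}

# Hypothetical authorship edges: (author, paper).
writes = {("alice", "p1"), ("bob", "p1"), ("bob", "p2"), ("carol", "p2")}
relations = {
    "writes": writes,
    "written-by": {(p, a) for a, p in writes},  # reverse direction
}

apa = follow_metapath(relations, ["writes", "written-by"])
print(sorted(apa))
# [('alice', 'bob'), ('bob', 'alice'), ('bob', 'carol'), ('carol', 'bob')]
```

Alice and Carol never share a paper, so no APA pair connects them even though both are two hops from Bob's papers.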

Meta-paths as graph views: Each meta-path that starts and ends at the same node type defines a new homogeneous graph over that type (all nodes same type, all edges same type), where two nodes are connected if at least one path of the given type links them. Running a standard GNN on each of these views, then combining the results, is one approach to heterogeneous GNN design.
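One way the combining step might look is a softmax-weighted sum over views, loosely modelled on semantic-level attention as used in HAN. The sketch below is a simplified stand-in, not that model's exact formulation: the per-view embeddings are assumed to come from separate GNNs that are not shown, the scoring vector `q` is a hypothetical learned parameter, and the learned projection inside the tanh is omitted.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def combine_views(view_embeddings, q):
    """Fuse per-view node embeddings with one attention weight per view.

    view_embeddings: {view_name: {node: vector}}, all views over the same nodes.
    q: scoring vector with the same length as the embeddings (hypothetical).
    """
    names = sorted(view_embeddings)
    scores = []
    for name in names:
        vectors = view_embeddings[name].values()
        # Score a view by the mean of q . tanh(z_v) over its nodes.
        per_node = [sum(qi * math.tanh(zi) for qi, zi in zip(q, vec)) for vec in vectors]
        scores.append(sum(per_node) / len(per_node))
    weights = softmax(scores)

    nodes = view_embeddings[names[0]].keys()
    fused = {
        v: [sum(w * view_embeddings[n][v][i] for w, n in zip(weights, names))
            for i in range(len(q))]
        for v in nodes
    }
    return dict(zip(names, weights)), fused

# Hypothetical outputs of two per-view GNNs for a single author node.
views = {"APA": {"a1": [1.0, 0.0]}, "APVPA": {"a1": [0.0, 1.0]}}
weights, fused = combine_views(views, q=[2.0, 0.0])
print(weights)
print(fused["a1"])
```

With this choice of `q`, the APA view scores higher and therefore dominates the fused embedding; a scoring vector that treats both views alike would weight them equally.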

Node Projection to Common Space

When node types have different feature dimensions (e.g., Papers have text embeddings, Authors have profile embeddings), we must first project all types to a common dimension d:

h^{(0)}_v = W_{τ(v)} x_v + b_{τ(v)}

A separate linear projection W_{τ(v)} per node type ensures all nodes live in the same embedding space before message passing begins.
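A minimal sketch of this projection step follows, assuming 3-dimensional Paper features, 2-dimensional Author features, and a common dimension d = 2. The weights, biases, and helper names are arbitrary values chosen for illustration; in a real model they would be learned.

```python
def project(x, W, b):
    """Affine map W x + b for a single node."""
    return [sum(w * xi for w, xi in zip(row, x)) + bias for row, bias in zip(W, b)]

def initial_embeddings(features, node_type, params):
    """h0_v = W_{type(v)} x_v + b_{type(v)} for every node v."""
    return {v: project(x, *params[node_type[v]]) for v, x in features.items()}

node_type = {"p1": "Paper", "a1": "Author"}
features = {"p1": [1.0, 2.0, 3.0], "a1": [4.0, 5.0]}  # 3-dim vs 2-dim inputs

# One (W, b) pair per node type; every W has d = 2 rows.
params = {
    "Paper": ([[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]], [0.0, 0.0]),
    "Author": ([[1.0, 0.0], [0.0, 1.0]], [0.5, -0.5]),
}

h0 = initial_embeddings(features, node_type, params)
print(h0)  # {'p1': [1.0, 5.0], 'a1': [4.5, 4.5]}
```

After this step every node has a vector of length d, so Paper and Author embeddings can be aggregated together regardless of their original feature sizes.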

Heterogeneous Graph Benchmarks

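  • OGB-MAG (ogbn-mag in the Open Graph Benchmark, a subset of the Microsoft Academic Graph): 736,389 papers, 1,134,649 authors, 8,740 institutions, and 59,965 fields of study, connected by citation, authorship, affiliation, and topic edges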
  • IMDB (heterogeneous): Movies, Actors, Directors; the task is movie genre classification
  • ACM: Papers, Authors, Subjects; the task is research area classification
  • DBLP: Authors, Papers, Venues, Terms; the task is author classification

Summary

| Approach | How it handles heterogeneity | Example |
| --- | --- | --- |
| Type-specific weights | Separate W_r per relation | R-GCN |
| Meta-path aggregation | Run GNNs on meta-path subgraphs | HAN |
| Relation-aware attention | Attention over relation types | HAN, HGT |
| Type projection | Map all types to common space | HGT |

Heterogeneous GNNs extend the message passing neural network (MPNN) framework to handle the multi-relational, multi-typed structure of real knowledge graphs, recommendation systems, and biomedical networks. In these domains the type structure is often as important as the graph topology.
