sheaf_mpnn

class BaseSheafConv[source]

Bases: MessagePassing

Shared base for all NSD and NSP sheaf convolution layers.

Factors out the parameterization and utilities that are identical across every variant in both model families:

  • W1 / W2 – bilateral stalk transforms: W1 (d × d) multiplies from the left and W2 (f × f) from the right, where d = stalk_dim and f = in_channels.

  • sigma – activation function (Tanh).

  • reset_parameters() – Xavier init for W1, W2, and any map_generator.

  • _apply_stalk_transform(x) – computes W1 @ x @ W2.

  • _apply_norm(...) – abstract; each concrete subclass delegates to the appropriate apply_*_norm function from sheaf_mpnn.utils.
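The bilateral transform W1 @ x @ W2 listed above can be illustrated with a small NumPy sketch. It assumes each node carries a [d, f] stalk matrix and that the same W1 and W2 are shared by all nodes; the sizes and random weights are illustrative, and the library itself operates on PyTorch tensors.

```python
import numpy as np

rng = np.random.default_rng(0)
num_nodes, d, f = 5, 3, 4                    # illustrative sizes only

W1 = rng.standard_normal((d, d))             # left transform, mixes stalk dimensions
W2 = rng.standard_normal((f, f))             # right transform, mixes feature channels
x = rng.standard_normal((num_nodes, d, f))   # one [d, f] stalk matrix per node

# Broadcasting applies the same W1 and W2 to every node: z[n] = W1 @ x[n] @ W2
z = W1 @ x @ W2
```

The output keeps the [num_nodes, d, f] layout, so the transform can be applied before message passing without any reshaping.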

Concrete subclasses must implement:

  • get_map_products(x_feat, edge_index) -> (self_map, cross_map)

  • _apply_norm(self_map, cross_map, edge_index, num_nodes)

  • forward(x_feat, x_stalk, edge_index) -> updated stalk

  • message(...)

Parameters:
  • stalk_dim (int)

  • in_channels (int)

  • hidden_dim (int)

  • context_dim (int | None, default: None)

  • add_self_loops (bool, default: True)

__init__(stalk_dim, in_channels, hidden_dim, context_dim=None, add_self_loops=True)[source]

Initialize internal Module state, shared by both nn.Module and ScriptModule.

Parameters:
  • stalk_dim (int)

  • in_channels (int)

  • hidden_dim (int)

  • context_dim (int | None, default: None)

  • add_self_loops (bool, default: True)

reset_parameters()[source]

Resets all learnable parameters of the module.

message(z_dst, z_src, self_map, cross_map)[source]

Builds per-edge sheaf Laplacian messages.

Parameters:
  • z_dst – Destination-node transformed stalks [E, d, f].

  • z_src – Source-node transformed stalks [E, d, f].

  • self_map – Normalized F_dst^T F_dst per edge [E, d, d].

  • cross_map – Normalized F_dst^T F_src per edge [E, d, d].

Returns:

Per-edge messages [E, d, f].

Return type:

Tensor
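A minimal sketch of what a per-edge sheaf Laplacian message could look like, given the shapes documented above. The exact formula and sign convention used by the library are not stated here; the sketch assumes the standard Laplacian term self_map @ z_dst - cross_map @ z_src, and sheaf_message is a hypothetical name.

```python
import numpy as np

def sheaf_message(z_dst, z_src, self_map, cross_map):
    # Assumed form: (F_dst^T F_dst) z_dst - (F_dst^T F_src) z_src, batched over edges.
    return self_map @ z_dst - cross_map @ z_src

rng = np.random.default_rng(0)
E, d, f = 6, 3, 2
z_dst = rng.standard_normal((E, d, f))
z_src = rng.standard_normal((E, d, f))

# With identity restriction maps the message is the plain difference z_dst - z_src,
# which is the edge term of an ordinary graph Laplacian.
identity = np.broadcast_to(np.eye(d), (E, d, d))
msg = sheaf_message(z_dst, z_src, identity, identity)
```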

class DiagonalNSDConv[source]

Bases: BaseNSDConv

Diagonal NSD convolution layer.

Parameters:
  • stalk_dim (int)

  • in_channels (int)

  • hidden_dim (int)

  • alpha (float, default: 1.0)

  • context_dim (int | None, default: None)

  • add_self_loops (bool, default: True)

__init__(stalk_dim, in_channels, hidden_dim, alpha=1.0, context_dim=None, add_self_loops=True)[source]

Initializes the shared NSD convolution parameters.

Parameters:
  • stalk_dim (int) – Stalk dimension. Each node state handled by the layer has shape [stalk_dim, in_channels].

  • in_channels (int) – Feature dimension inside each stalk channel (f).

  • hidden_dim (int) – Hidden width of the restriction-map generator MLP.

  • alpha (float, default: 1.0) – Initial residual diffusion step size.

  • context_dim (int | None, default: None) – Width of each node context vector x_feat.

  • add_self_loops (bool, default: True) – Whether to add self-loops for degree normalization.
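To make the effect of add_self_loops on degree normalization concrete, here is a hedged NumPy sketch. It assumes a symmetric 1 / sqrt(deg[dst] * deg[src]) scaling per edge; the actual normalization lives in sheaf_mpnn.utils and may differ, and edge_norm is a hypothetical helper.

```python
import numpy as np

def edge_norm(edge_index, num_nodes, add_self_loops=True):
    """Return an assumed per-edge scale 1 / sqrt(deg[dst] * deg[src])."""
    src, dst = edge_index
    deg = np.bincount(dst, minlength=num_nodes).astype(float)
    if add_self_loops:
        deg += 1.0                               # every node counts itself once
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5     # isolated nodes stay at 0
    return inv_sqrt[dst] * inv_sqrt[src]

# Undirected path 0 - 1 - 2 stored as directed pairs; node 3 is isolated.
edge_index = np.array([[0, 1, 1, 2],
                       [1, 0, 2, 1]])
with_loops = edge_norm(edge_index, num_nodes=4, add_self_loops=True)
without_loops = edge_norm(edge_index, num_nodes=4, add_self_loops=False)
```

In this sketch the self-loop adds one to every degree, which shrinks each edge weight and keeps isolated nodes from having a zero degree.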

get_map_products(x_feat, edge_index)[source]

Precompute self_map and cross_map restriction-map products per edge.
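For diagonal restriction maps the two products reduce to elementwise multiplications of the diagonals. The sketch below assumes the generator yields one length-d diagonal per edge endpoint and returns dense [E, d, d] matrices to match the shapes documented for message; whether the layer stores them densely is not stated here, and normalization is omitted.

```python
import numpy as np

def diagonal_map_products(diag_dst, diag_src):
    """diag_dst, diag_src: [E, d] diagonals of F_dst and F_src per edge."""
    eye = np.eye(diag_dst.shape[1])
    self_map = (diag_dst * diag_dst)[:, :, None] * eye    # diag(F_dst^2)
    cross_map = (diag_dst * diag_src)[:, :, None] * eye   # diag(F_dst * F_src)
    return self_map, cross_map

rng = np.random.default_rng(0)
E, d = 5, 3
diag_dst = rng.standard_normal((E, d))
diag_src = rng.standard_normal((E, d))
self_map, cross_map = diagonal_map_products(diag_dst, diag_src)
```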

message(z_dst, z_src, self_map, cross_map)[source]

Builds per-edge sheaf Laplacian messages.

Parameters:
  • z_dst (Tensor) – Destination-node transformed stalks [E, d, f].

  • z_src (Tensor) – Source-node transformed stalks [E, d, f].

  • self_map (Tensor) – Normalized F_dst^T F_dst per edge [E, d, d].

  • cross_map (Tensor) – Normalized F_dst^T F_src per edge [E, d, d].

Returns:

Per-edge messages [E, d, f].

Return type:

Tensor

class GeneralNSDConv[source]

Bases: BaseNSDConv

Generalized NSD convolution layer.

Parameters:
  • stalk_dim (int)

  • in_channels (int)

  • hidden_dim (int)

  • alpha (float, default: 1.0)

  • context_dim (int | None, default: None)

  • add_self_loops (bool, default: True)

  • use_attention (bool, default: False)

__init__(stalk_dim, in_channels, hidden_dim, alpha=1.0, context_dim=None, add_self_loops=True, use_attention=False)[source]

Initializes the shared NSD convolution parameters.

Parameters:
  • stalk_dim (int) – Stalk dimension. Each node state handled by the layer has shape [stalk_dim, in_channels].

  • in_channels (int) – Feature dimension inside each stalk channel (f).

  • hidden_dim (int) – Hidden width of the restriction-map generator MLP.

  • alpha (float, default: 1.0) – Initial residual diffusion step size.

  • context_dim (int | None, default: None) – Width of each node context vector x_feat.

  • add_self_loops (bool, default: True) – Whether to add self-loops for degree normalization.

  • use_attention (bool, default: False)

get_map_products(x_feat, edge_index)[source]

Precompute self_map and cross_map restriction-map products per edge.
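With full d × d restriction maps, the products are batched matrix multiplications. The sketch below assumes unnormalized maps F_dst and F_src of shape [E, d, d] are already available; general_map_products is a hypothetical name and the normalization step is left out.

```python
import numpy as np

def general_map_products(F_dst, F_src):
    """Return (F_dst^T F_dst, F_dst^T F_src) for each edge, both [E, d, d]."""
    F_dst_T = np.swapaxes(F_dst, -1, -2)
    return F_dst_T @ F_dst, F_dst_T @ F_src

rng = np.random.default_rng(0)
E, d = 4, 3
F_dst = rng.standard_normal((E, d, d))
F_src = rng.standard_normal((E, d, d))
self_map, cross_map = general_map_products(F_dst, F_src)
```

Each self_map is symmetric positive semi-definite by construction, while cross_map is in general neither.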

class OrthogonalNSDConv[source]

Bases: BaseNSDConv

Orthogonal NSD convolution layer.

Parameters:
  • stalk_dim (int)

  • in_channels (int)

  • hidden_dim (int)

  • alpha (float, default: 1.0)

  • context_dim (int | None, default: None)

  • add_self_loops (bool, default: True)

  • clamp_val (float, default: 10.0)

  • use_attention (bool, default: False)

  • orth_strategy (Literal['cayley', 'fasth'], default: "cayley")

__init__(stalk_dim, in_channels, hidden_dim, alpha=1.0, context_dim=None, add_self_loops=True, clamp_val=10.0, use_attention=False, orth_strategy='cayley')[source]

Initializes an orthogonal NSD convolution layer.

The map generator outputs parameters for either: (1) entries of a skew-symmetric matrix (cayley), (2) Householder vectors (fasth), or (3) attention-based mappings. All parameterisations produce orthogonal d x d restriction maps.

Parameters:
  • stalk_dim (int) – Stalk dimension and orthogonal restriction-map matrix size.

  • in_channels (int) – Feature dimension inside each stalk channel.

  • hidden_dim (int) – Hidden width of the restriction-map generator MLP.

  • alpha (float, default: 1.0) – Initial learnable diffusion step size.

  • context_dim (int | None, default: None) – Width of x_feat. Defaults to d * in_channels when omitted.

  • add_self_loops (bool, default: True) – If True, self-loops augment the degree used for normalization.

  • clamp_val (float, default: 10.0) – Maximum absolute value for clamping Cayley-transform parameters.

  • use_attention (bool, default: False) – If True, uses the attention-based Cayley initialization from main.

  • orth_strategy (Literal['cayley', 'fasth'], default: "cayley") – Either “cayley” or “fasth”.
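The two orthogonal parameterisations can be sketched in NumPy as follows. This is an illustration under assumptions, not the layer's code: the Cayley variant fills a skew-symmetric matrix from d(d-1)/2 clamped parameters, and the Householder variant multiplies reflections naively rather than using the FastH algorithm. Both function names are hypothetical.

```python
import numpy as np

def cayley(params, d, clamp_val=10.0):
    """Map d*(d-1)/2 free parameters to an orthogonal d x d matrix."""
    params = np.clip(params, -clamp_val, clamp_val)
    A = np.zeros((d, d))
    A[np.triu_indices(d, k=1)] = params
    A = A - A.T                                  # skew-symmetric: A^T = -A
    I = np.eye(d)
    # Q = (I + A)^{-1} (I - A); I + A is invertible for any real skew-symmetric A.
    return np.linalg.solve(I + A, I - A)

def householder_product(vectors):
    """Multiply reflections H_k = I - 2 v_k v_k^T / (v_k^T v_k)."""
    d = vectors.shape[1]
    Q = np.eye(d)
    for v in vectors:
        Q = Q @ (np.eye(d) - 2.0 * np.outer(v, v) / (v @ v))
    return Q

rng = np.random.default_rng(0)
d = 4
Q_cayley = cayley(rng.standard_normal(d * (d - 1) // 2) * 20.0, d)
Q_house = householder_product(rng.standard_normal((d, d)))
```

Clamping bounds the entries of the skew-symmetric matrix, which keeps the linear solve well conditioned when the generator emits large values.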

get_map_products(x_feat, edge_index)[source]

Precompute self_map and cross_map restriction-map products per edge.

message(z_dst, z_src, self_map, cross_map)[source]

Builds per-edge sheaf Laplacian messages.

Parameters:
  • z_dst (Tensor) – Destination-node transformed stalks [E, d, f].

  • z_src (Tensor) – Source-node transformed stalks [E, d, f].

  • self_map (Tensor) – Normalized F_dst^T F_dst per edge [E, d, d].

  • cross_map (Tensor) – Normalized F_dst^T F_src per edge [E, d, d].

Returns:

Per-edge messages [E, d, f].

Return type:

Tensor

class NSDModel[source]

Bases: Module

End-to-end Neural Sheaf Diffusion (NSD) model.

The wrapper lifts raw node features into stalk features, applies a stack of NSD convolution layers, and decodes the flattened stalk representation back to the requested output dimension.

Parameters:
  • in_channels (int)

  • out_channels (int)

  • stalk_dim (int, default: 4)

  • hidden_dim (int, default: 16)

  • num_layers (int, default: 2)

  • variant (NSDVariant, default: NSDVariant.GENERAL)

  • alpha (float, default: 1.0)

  • add_self_loops (bool, default: True)

  • orth_strategy (str, default: "cayley")

  • rank (int, default: 1)

  • input_dropout (float, default: 0.0)

  • dropout (float, default: 0.0)

  • normalize_output (bool, default: True)

  • jknet (bool, default: False)

__init__(in_channels, out_channels, stalk_dim=4, hidden_dim=16, num_layers=2, variant=NSDVariant.GENERAL, alpha=1.0, add_self_loops=True, orth_strategy='cayley', rank=1, input_dropout=0.0, dropout=0.0, normalize_output=True, jknet=False)[source]

Initializes an NSD model for node-level prediction.

Parameters:
  • in_channels (int) – Number of raw input features per node.

  • out_channels (int) – Number of output channels per node (e.g. num classes).

  • stalk_dim (int, default: 4) – Stalk dimension. Each node is represented internally as a matrix with shape [stalk_dim, hidden_dim].

  • hidden_dim (int, default: 16) – Feature dimension inside each stalk channel. The encoded node state has size d * hidden_dim.

  • num_layers (int, default: 2) – Number of NSD convolution layers. Must be positive.

  • variant (NSDVariant, default: NSDVariant.GENERAL) – Restriction-map family. DIAGONAL is cheapest, GENERAL is most expressive, ORTHOGONAL uses orthogonal maps (via Cayley or Householder parameterisation). GENERAL_ATTENTION and ORTHOGONAL_ATTENTION use an attention-based map initialisation.

  • alpha (float, default: 1.0) – Initial learnable diffusion step size per layer.

  • add_self_loops (bool, default: True) – If True, self-loops are added to the graph before computing degree normalization in each layer.

  • orth_strategy (str, default: "cayley") – Orthogonality strategy for the ORTHOGONAL variant: “cayley” or “fasth”.

  • rank (int, default: 1) – Rank of each restriction map for the LOW_RANK variant. Must be positive. Ignored for other variants.

  • input_dropout (float, default: 0.0) – Dropout probability applied to raw input features before encoding.

  • dropout (float, default: 0.0) – Dropout probability applied to stalk features between layers.

  • normalize_output (bool, default: True) – If True, L2-normalise the representation before the decoder (Lv et al., 2021). If jknet is True, each layer’s output is also normalised before concatenation.

  • jknet (bool, default: False) – If True, collect hidden states from every layer and concatenate them before the decoder (Xu et al., 2018). Normalization is controlled by normalize_output. Intended for link prediction.
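The interaction between normalize_output and jknet can be sketched as below. The readout function is a hypothetical stand-in: it assumes each layer yields a flattened [num_nodes, d * hidden_dim] array, and it does not normalise the concatenated vector a second time, since the text above does not say whether the model does.

```python
import numpy as np

def l2_normalize(h, eps=1e-12):
    """Scale each row to unit L2 norm, guarding against zero rows."""
    norm = np.linalg.norm(h, axis=-1, keepdims=True)
    return h / np.maximum(norm, eps)

def readout(layer_outputs, normalize_output=True, jknet=False):
    """layer_outputs: list of [num_nodes, d * hidden_dim] arrays, one per layer."""
    if jknet:
        if normalize_output:
            layer_outputs = [l2_normalize(h) for h in layer_outputs]
        return np.concatenate(layer_outputs, axis=-1)
    h = layer_outputs[-1]                        # only the last layer is decoded
    return l2_normalize(h) if normalize_output else h

rng = np.random.default_rng(0)
layers = [rng.standard_normal((4, 8)) for _ in range(3)]
last_only = readout(layers, normalize_output=True, jknet=False)
jk = readout(layers, normalize_output=True, jknet=True)
```

With jknet enabled the decoder input grows linearly with the number of layers, so the decoder's input width has to account for it.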

reset_parameters()[source]

forward(x, edge_index)[source]

Runs the NSD encoder, diffusion layers, and decoder.

Parameters:
  • x (Tensor) – Raw node features with shape [num_nodes, in_channels].

  • edge_index (Tensor) – Graph connectivity in COO format with shape [2, num_edges].

Returns:

Node outputs with shape [num_nodes, out_channels].

Return type:

Tensor
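The shape flow through forward can be traced with a NumPy sketch. The linear encoder and decoder here are stand-ins (the real model's encoder and decoder architecture are not described on this page), and the diffusion layers are elided because they preserve the stalk shape.

```python
import numpy as np

rng = np.random.default_rng(0)
num_nodes, in_channels, out_channels = 7, 10, 3
stalk_dim, hidden_dim = 4, 16                    # the documented defaults

x = rng.standard_normal((num_nodes, in_channels))
W_enc = rng.standard_normal((in_channels, stalk_dim * hidden_dim))   # stand-in encoder
W_dec = rng.standard_normal((stalk_dim * hidden_dim, out_channels))  # stand-in decoder

h = x @ W_enc                                          # [N, d * hidden_dim]
x_stalk = h.reshape(num_nodes, stalk_dim, hidden_dim)  # [N, d, hidden_dim]
# ... the NSD convolution layers would update x_stalk here, keeping its shape ...
out = x_stalk.reshape(num_nodes, -1) @ W_dec           # [N, out_channels]
```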

class NSDVariant[source]

Bases: Enum

DIAGONAL = 1

GENERAL = 2

ORTHOGONAL = 3

GENERAL_ATTENTION = 4

ORTHOGONAL_ATTENTION = 5

LOW_RANK = 6

property layer_class

property layer_kwargs: dict[str, Any]

build_kwargs(orth_strategy='cayley', rank=1)[source]

Build the full layer keyword-argument dict for this variant.

Parameters:
  • orth_strategy (Literal['cayley', 'fasth'], default: "cayley")

  • rank (int, default: 1)

Return type:

dict[str, Any]
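One way an enum can carry per-variant keyword arguments and merge in the caller's options is sketched below. ToyVariant is a hypothetical stand-in, not NSDVariant: which keys the real layer_kwargs and build_kwargs return is an assumption based on the parameter descriptions for NSDModel.

```python
from enum import Enum
from typing import Any

class ToyVariant(Enum):
    GENERAL = 2
    ORTHOGONAL = 3
    GENERAL_ATTENTION = 4
    LOW_RANK = 6

    @property
    def layer_kwargs(self) -> dict[str, Any]:
        # Fixed, variant-specific arguments (assumed).
        if self is ToyVariant.GENERAL_ATTENTION:
            return {"use_attention": True}
        return {}

    def build_kwargs(self, orth_strategy: str = "cayley", rank: int = 1) -> dict[str, Any]:
        # Start from the fixed arguments, then add only the options this variant uses.
        kwargs = dict(self.layer_kwargs)
        if self is ToyVariant.ORTHOGONAL:
            kwargs["orth_strategy"] = orth_strategy
        if self is ToyVariant.LOW_RANK:
            kwargs["rank"] = rank
        return kwargs

orth_kwargs = ToyVariant.ORTHOGONAL.build_kwargs(orth_strategy="fasth")
low_rank_kwargs = ToyVariant.LOW_RANK.build_kwargs(rank=2)
attention_kwargs = ToyVariant.GENERAL_ATTENTION.build_kwargs()
```

Keeping the option filtering inside the enum means the model constructor can pass orth_strategy and rank unconditionally and let each variant ignore what it does not use.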

NSD layers

DiagonalNSDConv

Diagonal NSD convolution layer.

GeneralNSDConv

Generalized NSD convolution layer.

OrthogonalNSDConv

Orthogonal NSD convolution layer.

NSD model

NSDModel

End-to-end Neural Sheaf Diffusion (NSD) model.

NSDVariant