sheaf_mpnn

class BaseSheafConv[source]

Bases: MessagePassing

Shared base for all NSD and NSP sheaf convolution layers.

Factors out the parameterization and utilities that are identical across every variant in both model families:

W1 / W2 – bilateral stalk transforms (W1 is d x d and acts on the left; W2 is f x f and acts on the right).
sigma – activation function (Tanh).
reset_parameters() – Xavier init for W1, W2, and any map_generator.
_apply_stalk_transform(x) – computes W1 @ x @ W2.
_apply_norm(...) – abstract; each concrete subclass delegates to the appropriate apply_*_norm function from sheaf_mpnn.utils.

Concrete subclasses must implement:

get_map_products(x_feat, edge_index) -> (self_map, cross_map)
_apply_norm(self_map, cross_map, edge_index, num_nodes)
forward(x_feat, x_stalk, edge_index) -> updated stalk
message(...)
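The bilateral transform W1 @ x @ W2 can be illustrated on a single node state of shape [d, f]. The following is a minimal sketch using plain Python lists so that it runs without dependencies; the helper names matmul and apply_stalk_transform are hypothetical, and the layer itself operates on batched tensors.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists (rows of floats)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def apply_stalk_transform(w1, x, w2):
    """Return W1 @ X @ W2 for one node state X of shape [d, f]."""
    return matmul(matmul(w1, x), w2)


# d = 2 stalk rows, f = 3 feature columns.
x = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
w1 = [[0.0, 1.0],
      [1.0, 0.0]]        # d x d: mixes stalk rows (here it swaps them)
w2 = [[2.0, 0.0, 0.0],
      [0.0, 2.0, 0.0],
      [0.0, 0.0, 2.0]]   # f x f: mixes feature columns (here it doubles them)

y = apply_stalk_transform(w1, x, w2)
```

W1 only mixes the d stalk rows and W2 only mixes the f feature columns, so the output keeps the [d, f] shape of the input.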

Parameters:

stalk_dim (int)
in_channels (int)
hidden_dim (int)
context_dim (int | None, default: None)
add_self_loops (bool, default: True)

__init__(stalk_dim, in_channels, hidden_dim, context_dim=None, add_self_loops=True)[source]

Initialize internal Module state, shared by both nn.Module and ScriptModule.

Parameters:

stalk_dim (int)
in_channels (int)
hidden_dim (int)
context_dim (int | None, default: None)
add_self_loops (bool, default: True)

reset_parameters()[source]: Resets all learnable parameters of the module.

message(z_dst, z_src, self_map, cross_map)[source]

Builds per-edge sheaf Laplacian messages.

Parameters:

z_dst – Destination-node transformed stalks [E, d, f].
z_src – Source-node transformed stalks [E, d, f].
self_map – Normalized F_dst^T F_dst per edge [E, d, d].
cross_map – Normalized F_dst^T F_src per edge [E, d, d].

Returns:

Per-edge messages [E, d, f].

Return type:

Tensor
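A per-edge message consistent with the shapes above can be sketched as follows. The formula self_map @ z_dst - cross_map @ z_src is an assumption based on the standard sheaf Laplacian, not a statement of what this method computes internally; edge_message is a hypothetical name, and a single edge is shown rather than a batch of E edges.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists (rows of floats)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def edge_message(z_dst, z_src, self_map, cross_map):
    """Assumed Laplacian form for one edge: self_map @ z_dst - cross_map @ z_src.

    z_dst, z_src: [d, f] transformed stalks; self_map, cross_map: [d, d].
    """
    own = matmul(self_map, z_dst)       # F_dst^T F_dst applied to the destination
    other = matmul(cross_map, z_src)    # F_dst^T F_src applied to the source
    return [[o - c for o, c in zip(row_o, row_c)]
            for row_o, row_c in zip(own, other)]


# With identity maps the message reduces to the plain difference z_dst - z_src.
eye = [[1.0, 0.0], [0.0, 1.0]]
z_dst = [[3.0, 1.0], [2.0, 0.0]]
z_src = [[1.0, 1.0], [0.0, 4.0]]
msg = edge_message(z_dst, z_src, eye, eye)
```

Under this assumed form, two endpoints whose stalks agree after restriction produce a zero message, which is the behaviour expected of a Laplacian.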

class DiagonalNSDConv[source]

Bases: BaseNSDConv

Diagonal NSD convolution layer.

Parameters:

stalk_dim (int)
in_channels (int)
hidden_dim (int)
alpha (float, default: 1.0)
context_dim (int | None, default: None)
add_self_loops (bool, default: True)

__init__(stalk_dim, in_channels, hidden_dim, alpha=1.0, context_dim=None, add_self_loops=True)[source]

Initializes the shared NSD convolution parameters.

Parameters:

stalk_dim (int) – Stalk dimension. Each node state handled by the layer has shape [stalk_dim, in_channels].
in_channels (int) – Feature dimension inside each stalk channel (f).
hidden_dim (int) – Hidden width of the restriction-map generator MLP.
alpha (float, default: 1.0) – Initial residual diffusion step size.
context_dim (int | None, default: None) – Width of each node context vector x_feat.
add_self_loops (bool, default: True) – Whether to add self-loops for degree normalization.

get_map_products(x_feat, edge_index)[source]: Precompute self_map and cross_map restriction-map products per edge.
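For diagonal restriction maps, the products F_dst^T F_dst and F_dst^T F_src are themselves diagonal, so they reduce to elementwise products of the diagonal entries. The sketch below illustrates this for one edge; the function names are hypothetical, normalization is omitted, and the message formula (self term minus cross term) is an assumed sheaf Laplacian form.

```python
def diagonal_map_products(f_dst, f_src):
    """f_dst, f_src: length-d lists holding the diagonal of each restriction map."""
    self_diag = [a * a for a in f_dst]                   # diag(F_dst^T F_dst)
    cross_diag = [a * b for a, b in zip(f_dst, f_src)]   # diag(F_dst^T F_src)
    return self_diag, cross_diag


def diagonal_message(z_dst, z_src, self_diag, cross_diag):
    """Scale each stalk row by its diagonal entry instead of a full matmul."""
    return [[s * zd - c * zs for zd, zs in zip(row_d, row_s)]
            for s, c, row_d, row_s in zip(self_diag, cross_diag, z_dst, z_src)]


self_diag, cross_diag = diagonal_map_products([2.0, 1.0], [0.5, 3.0])
msg = diagonal_message([[1.0, 1.0], [2.0, 0.0]],
                       [[1.0, 0.0], [1.0, 1.0]],
                       self_diag, cross_diag)
```

Only d numbers per map are needed here rather than d * d, which is why the diagonal family is the cheapest of the variants.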

message(z_dst, z_src, self_map, cross_map)[source]

Builds per-edge sheaf Laplacian messages.

Parameters:

z_dst (Tensor) – Destination-node transformed stalks [E, d, f].
z_src (Tensor) – Source-node transformed stalks [E, d, f].
self_map (Tensor) – Normalized F_dst^T F_dst per edge [E, d, d].
cross_map (Tensor) – Normalized F_dst^T F_src per edge [E, d, d].

Returns:

Per-edge messages [E, d, f].

Return type:

Tensor

class GeneralNSDConv[source]

Bases: BaseNSDConv

Generalized NSD convolution layer.

Parameters:

stalk_dim (int)
in_channels (int)
hidden_dim (int)
alpha (float, default: 1.0)
context_dim (int | None, default: None)
add_self_loops (bool, default: True)
use_attention (bool, default: False)

__init__(stalk_dim, in_channels, hidden_dim, alpha=1.0, context_dim=None, add_self_loops=True, use_attention=False)[source]

Initializes the shared NSD convolution parameters.

Parameters:

stalk_dim (int) – Stalk dimension. Each node state handled by the layer has shape [stalk_dim, in_channels].
in_channels (int) – Feature dimension inside each stalk channel (f).
hidden_dim (int) – Hidden width of the restriction-map generator MLP.
alpha (float, default: 1.0) – Initial residual diffusion step size.
context_dim (int | None, default: None) – Width of each node context vector x_feat.
add_self_loops (bool, default: True) – Whether to add self-loops for degree normalization.
use_attention (bool, default: False)

get_map_products(x_feat, edge_index)[source]: Precompute self_map and cross_map restriction-map products per edge.
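The effect of add_self_loops on degree normalization can be illustrated with a GCN-style symmetric edge weight. This is an assumed form chosen for illustration only: the apply_*_norm functions in sheaf_mpnn.utils may normalize differently, and symmetric_edge_weights is a hypothetical helper.

```python
import math


def symmetric_edge_weights(edge_index, num_nodes, add_self_loops=True):
    """Return 1 / sqrt(deg[src] * deg[dst]) per edge (assumed normalization).

    edge_index is COO-style: [sources, destinations].
    """
    src, dst = edge_index
    # A self-loop contributes one extra unit of degree to every node.
    deg = [1.0 if add_self_loops else 0.0] * num_nodes
    for v in dst:
        deg[v] += 1.0
    return [1.0 / math.sqrt(deg[u] * deg[v]) if deg[u] > 0 and deg[v] > 0 else 0.0
            for u, v in zip(src, dst)]


# Undirected path 0 - 1 - 2, stored as four directed edges.
edge_index = [[0, 1, 1, 2], [1, 0, 2, 1]]
with_loops = symmetric_edge_weights(edge_index, num_nodes=3)
without_loops = symmetric_edge_weights(edge_index, num_nodes=3, add_self_loops=False)
```

With self-loops every degree is at least one, so the weights stay finite even for isolated nodes.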

class OrthogonalNSDConv[source]

Bases: BaseNSDConv

Orthogonal NSD convolution layer.

Parameters:

stalk_dim (int)
in_channels (int)
hidden_dim (int)
alpha (float, default: 1.0)
context_dim (int | None, default: None)
add_self_loops (bool, default: True)
clamp_val (float, default: 10.0)
use_attention (bool, default: False)
orth_strategy (Literal['cayley', 'fasth'], default: "cayley")

__init__(stalk_dim, in_channels, hidden_dim, alpha=1.0, context_dim=None, add_self_loops=True, clamp_val=10.0, use_attention=False, orth_strategy='cayley')[source]

Initializes an orthogonal NSD convolution layer.

The map generator outputs parameters for one of three parameterisations: (1) the entries of a skew-symmetric matrix (cayley), (2) Householder vectors (fasth), or (3) attention-based mappings. Each parameterisation produces orthogonal d x d restriction maps, where d is stalk_dim.

Parameters:

stalk_dim (int) – Stalk dimension and orthogonal restriction-map matrix size.
in_channels (int) – Feature dimension inside each stalk channel.
hidden_dim (int) – Hidden width of the restriction-map generator MLP.
alpha (float, default: 1.0) – Initial learnable diffusion step size.
context_dim (int | None, default: None) – Width of x_feat. Defaults to stalk_dim * in_channels when omitted.
add_self_loops (bool, default: True) – If True, self-loops augment the degree used for normalization.
clamp_val (float, default: 10.0) – Maximum absolute value for clamping Cayley-transform parameters.
use_attention (bool, default: False) – If True, uses the attention-based Cayley initialization from main.
orth_strategy (Literal['cayley', 'fasth'], default: "cayley") – Orthogonalization strategy, either “cayley” or “fasth”.
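Both parameterisations can be illustrated in a few lines. The sketch below shows a closed-form Cayley transform for d = 2 with parameter clamping, and a single Householder reflection of the kind that Householder-based strategies multiply together. The function names are hypothetical, and the way the layer batches or composes these maps is not shown here.

```python
def cayley_2x2(a, clamp_val=10.0):
    """Orthogonal 2x2 matrix from one skew-symmetric parameter a.

    With A = [[0, a], [-a, 0]], the Cayley transform (I - A)^-1 (I + A)
    has the closed form [[c, s], [-s, c]] computed below.
    """
    a = max(-clamp_val, min(clamp_val, a))   # clamp the raw parameter
    det = 1.0 + a * a
    c = (1.0 - a * a) / det
    s = 2.0 * a / det
    return [[c, s], [-s, c]]


def householder(v):
    """Reflection H = I - 2 v v^T / (v^T v) for a non-zero vector v."""
    norm_sq = sum(x * x for x in v)
    d = len(v)
    return [[(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / norm_sq
             for j in range(d)]
            for i in range(d)]


def gram(q):
    """Return Q^T Q, which equals the identity when Q is orthogonal."""
    d = len(q)
    return [[sum(q[k][i] * q[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]


q_cayley = cayley_2x2(0.7)
q_house = householder([1.0, 2.0, 2.0])
```

Calling gram on either result returns the identity up to rounding, which is a quick way to check orthogonality of generated maps.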

get_map_products(x_feat, edge_index)[source]: Precompute self_map and cross_map restriction-map products per edge.

message(z_dst, z_src, self_map, cross_map)[source]

Builds per-edge sheaf Laplacian messages.

Parameters:

z_dst (Tensor) – Destination-node transformed stalks [E, d, f].
z_src (Tensor) – Source-node transformed stalks [E, d, f].
self_map (Tensor) – Normalized F_dst^T F_dst per edge [E, d, d].
cross_map (Tensor) – Normalized F_dst^T F_src per edge [E, d, d].

Returns:

Per-edge messages [E, d, f].

Return type:

Tensor

class NSDModel[source]

Bases: Module

End-to-end Neural Sheaf Diffusion (NSD) model.

The wrapper lifts raw node features into stalk features, applies a stack of NSD convolution layers, and decodes the flattened stalk representation back to the requested output dimension.

Parameters:

in_channels (int)
out_channels (int)
stalk_dim (int, default: 4)
hidden_dim (int, default: 16)
num_layers (int, default: 2)
variant (NSDVariant, default: NSDVariant.GENERAL)
alpha (float, default: 1.0)
add_self_loops (bool, default: True)
orth_strategy (str, default: "cayley")
rank (int, default: 1)
input_dropout (float, default: 0.0)
dropout (float, default: 0.0)
normalize_output (bool, default: True)
jknet (bool, default: False)

__init__(in_channels, out_channels, stalk_dim=4, hidden_dim=16, num_layers=2, variant=NSDVariant.GENERAL, alpha=1.0, add_self_loops=True, orth_strategy='cayley', rank=1, input_dropout=0.0, dropout=0.0, normalize_output=True, jknet=False)[source]

Initializes an NSD model for node-level prediction.

Parameters:

in_channels (int) – Number of raw input features per node.
out_channels (int) – Number of output channels per node (e.g. the number of classes).
stalk_dim (int, default: 4) – Stalk dimension. Each node is represented internally as a matrix with shape [stalk_dim, hidden_dim].
hidden_dim (int, default: 16) – Feature dimension inside each stalk channel. The encoded node state has size stalk_dim * hidden_dim.
num_layers (int, default: 2) – Number of NSD convolution layers. Must be positive.
variant (NSDVariant, default: NSDVariant.GENERAL) – Restriction-map family. DIAGONAL is cheapest, GENERAL is most expressive, ORTHOGONAL uses orthogonal maps (via Cayley or Householder parameterisation). GENERAL_ATTENTION and ORTHOGONAL_ATTENTION use an attention-based map initialisation.
alpha (float, default: 1.0) – Initial learnable diffusion step size per layer.
add_self_loops (bool, default: True) – If True, self-loops are added to the graph before computing degree normalization in each layer.
orth_strategy (str, default: "cayley") – Orthogonality strategy for the ORTHOGONAL variant: “cayley” or “fasth”.
rank (int, default: 1) – Rank of each restriction map for the LOW_RANK variant. Must be positive. Ignored for other variants.
input_dropout (float, default: 0.0) – Dropout probability applied to raw input features before encoding.
dropout (float, default: 0.0) – Dropout probability applied to stalk features between layers.
normalize_output (bool, default: True) – If True, L2-normalise the representation before the decoder (Lv et al., 2021). If jknet is True, each layer’s output is also normalised before concatenation.
jknet (bool, default: False) – If True, collect hidden states from every layer and concatenate them before the decoder (Xu et al., 2018). Normalization is controlled by normalize_output. Intended for link prediction.
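The interaction between normalize_output and jknet can be sketched for a single node as follows. This is an illustration of the behaviour described above, not the model's code: readout and l2_normalize are hypothetical helpers, each layer state is assumed to be already flattened to a vector, and whether the encoder output counts as a layer state is left open.

```python
import math


def l2_normalize(vec, eps=1e-12):
    """Scale vec to unit L2 norm; eps guards against division by zero."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


def readout(layer_states, normalize_output=True, jknet=False):
    """Build the decoder input from per-layer flattened stalk vectors."""
    # Without jknet only the final layer state is kept.
    states = layer_states if jknet else layer_states[-1:]
    if normalize_output:
        # Each kept state is normalised on its own, before concatenation.
        states = [l2_normalize(s) for s in states]
    return [x for s in states for x in s]


layer_states = [[3.0, 4.0], [0.0, 2.0]]      # two layers, one node
last_only = readout(layer_states)
jk = readout(layer_states, jknet=True)
```

With jknet enabled the decoder input grows with the number of collected layer states, so the decoder's input width has to account for that.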

reset_parameters()[source]

forward(x, edge_index)[source]

Runs the NSD encoder, diffusion layers, and decoder.

Parameters:

x (Tensor) – Raw node features with shape [num_nodes, in_channels].
edge_index (Tensor) – Graph connectivity in COO format with shape [2, num_edges].

Returns:

Node outputs with shape [num_nodes, out_channels].

Return type:

Tensor

class NSDVariant[source]

Bases: Enum

DIAGONAL = 1

GENERAL = 2

ORTHOGONAL = 3

GENERAL_ATTENTION = 4

ORTHOGONAL_ATTENTION = 5

LOW_RANK = 6

property layer_class

property layer_kwargs: dict[str, Any]

build_kwargs(orth_strategy='cayley', rank=1)[source]

Build the full layer keyword-argument dict for this variant.

Parameters:

orth_strategy (Literal['cayley', 'fasth'], default: "cayley")
rank (int, default: 1)

Return type:

dict[str, Any]
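The split between layer_kwargs and build_kwargs suggests a pattern in which each enum member carries its static flags and build_kwargs merges in the user-supplied options that apply to it. The sketch below uses a hypothetical SketchVariant enum; which keys each real variant receives is an assumption inferred from the constructor signatures on this page.

```python
from enum import Enum
from typing import Any


class SketchVariant(Enum):
    """Hypothetical stand-in for NSDVariant (subset of members)."""

    GENERAL = 2
    ORTHOGONAL = 3
    ORTHOGONAL_ATTENTION = 5
    LOW_RANK = 6

    @property
    def layer_kwargs(self) -> dict[str, Any]:
        # Static flags that depend only on the variant itself.
        if self is SketchVariant.ORTHOGONAL_ATTENTION:
            return {"use_attention": True}
        return {}

    def build_kwargs(self, orth_strategy: str = "cayley", rank: int = 1) -> dict[str, Any]:
        # Start from the static flags, then add only the options this variant uses.
        kwargs = dict(self.layer_kwargs)
        if self in (SketchVariant.ORTHOGONAL, SketchVariant.ORTHOGONAL_ATTENTION):
            kwargs["orth_strategy"] = orth_strategy
        if self is SketchVariant.LOW_RANK:
            kwargs["rank"] = rank
        return kwargs


orth = SketchVariant.ORTHOGONAL_ATTENTION.build_kwargs(orth_strategy="fasth")
low = SketchVariant.LOW_RANK.build_kwargs(rank=2)
plain = SketchVariant.GENERAL.build_kwargs(orth_strategy="fasth", rank=2)
```

Options that do not apply to a variant are dropped rather than forwarded, which matches the note that rank is ignored outside LOW_RANK.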

NSD layers

`DiagonalNSDConv`	Diagonal NSD convolution layer.
`GeneralNSDConv`	Generalized NSD convolution layer.
`OrthogonalNSDConv`	Orthogonal NSD convolution layer.

NSD model

`NSDModel`	End-to-end Neural Sheaf Diffusion (NSD) model.
`NSDVariant`