GNNs for Social Networks: Influence, Communities, and Misinformation

Published: May 23, 2024

TL;DR: Social networks are massive sparse graphs where structure carries as much signal as content. GNNs unify both: node features (posts, profile) and graph structure (followers, retweets) are jointly processed. Key applications: fake news detection (exploit propagation tree structure), community detection (cluster embedding space), influence prediction, and friend recommendation.

Why Graphs for Social Networks?

Social influence is inherently relational:

A user’s political views are correlated with their friends’ views (homophily)
Misinformation spreads along retweet chains — the propagation tree matters
Community structure (echo chambers, polarisation) is a global graph property
Influence of an account cannot be measured by its own features alone

GNNs capture these relational patterns — structure that content-only models (text classification, user attribute prediction) miss.

Task 1: Fake News and Misinformation Detection

The propagation graph approach: when a news article is shared, it creates a propagation tree (root → shares → reshares). Each node is a user; each edge is a retweet.

Key observations:

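Fake news tends to propagate differently from real news. One commonly described pattern: a fast initial burst, a shallow tree (often attributed to bot amplification), then rapid decay
Real news: slower spread, deeper tree, more diverse users
These patterns are dataset-dependent. Some large-scale studies of Twitter cascades report false news spreading deeper than true news, so the direction of each difference should be checked empirically rather than assumed.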
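The structural quantities mentioned above (depth, breadth, size) can be read directly off a propagation tree. The following is a minimal sketch, assuming the cascade arrives as a list of (parent, child) retweet pairs forming a tree; the function name `tree_stats` and the toy user ids are made up for illustration.

```python
from collections import defaultdict, deque

def tree_stats(root, retweet_edges):
    """Return depth, maximum breadth, and size of a propagation tree.

    retweet_edges: list of (parent_user, child_user) pairs, meaning
    child_user reshared the item from parent_user. Assumed to form a tree.
    """
    children = defaultdict(list)
    for parent, child in retweet_edges:
        children[parent].append(child)

    level_sizes = []              # number of users at each hop from the root
    frontier = deque([root])
    while frontier:
        level_sizes.append(len(frontier))
        next_frontier = deque()
        for user in frontier:
            next_frontier.extend(children[user])
        frontier = next_frontier

    return {
        "depth": len(level_sizes) - 1,    # hops from root to deepest reshare
        "max_breadth": max(level_sizes),  # widest single level
        "size": sum(level_sizes),         # users reached, root included
    }

# Hypothetical toy cascades (made-up user ids).
shallow = [("src", f"u{i}") for i in range(6)]   # one burst of direct reshares
deep = [("src", "a"), ("a", "b"), ("b", "c"), ("c", "d"), ("a", "e")]

shallow_stats = tree_stats("src", shallow)
deep_stats = tree_stats("src", deep)
```

Hand-crafted statistics like these are what a content-only model never sees. A GNN over the same tree learns its own structural features instead of relying on a fixed list.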

GNN-FakeNews (the Bi-GCN model of Bian et al., 2020): builds two propagation graphs, top-down (spread direction) and bottom-up (source tracing). It runs a GCN on each, then combines the two embeddings with features of the source post (the claim) for final classification.
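The bidirectional idea can be sketched in a few lines of NumPy. This is an illustrative forward pass only, not the published architecture: the single layer, the row normalisation, mean pooling, random weights, and the use of raw root features as the claim representation are all assumptions made here for brevity.

```python
import numpy as np

def row_normalize(adj):
    """Add self-loops, then row-normalise so each node averages its inputs."""
    a_hat = adj + np.eye(adj.shape[0])
    return a_hat / a_hat.sum(axis=1, keepdims=True)

def gcn_layer(adj, x, w):
    """One graph convolution: aggregate neighbours, transform, apply ReLU."""
    return np.maximum(row_normalize(adj) @ x @ w, 0.0)

def bidirectional_embedding(edges, x, w_td, w_bu, root=0):
    """Embed a propagation tree from its top-down and bottom-up views.

    edges: (parent, child) index pairs; x: one feature row per node.
    """
    n = x.shape[0]
    a_td = np.zeros((n, n))
    for parent, child in edges:
        a_td[child, parent] = 1.0    # top-down: child receives from parent
    a_bu = a_td.T                    # bottom-up: parent receives from children

    h_td = gcn_layer(a_td, x, w_td).mean(axis=0)   # mean-pool over nodes
    h_bu = gcn_layer(a_bu, x, w_bu).mean(axis=0)
    # Assumption: raw root features stand in for the source-claim representation.
    return np.concatenate([h_td, h_bu, x[root]])

# Hypothetical toy tree: node 0 is the source post.
rng = np.random.default_rng(0)
edges = [(0, 1), (0, 2), (1, 3)]
x = rng.normal(size=(4, 5))          # 5 made-up features per node
w_td = rng.normal(size=(5, 8))
w_bu = rng.normal(size=(5, 8))
graph_vec = bidirectional_embedding(edges, x, w_td, w_bu)
```

The resulting vector (8 + 8 + 5 entries here) would then feed a classifier. In a trained model the two weight matrices are learned jointly with that classifier.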

Advantage over content-only methods: two articles with identical content but different propagation patterns → different predictions. Structure provides signal that text alone cannot.

Task 2: Community Detection

Traditional methods: spectral clustering, Louvain algorithm (modularity optimisation). These use only graph structure.
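For reference, the modularity score that Louvain optimises can be computed directly from an edge list and a candidate partition. A minimal sketch, assuming an undirected, unweighted graph without self-loops; the function name and the toy graph are illustrative.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping node -> community label.
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both endpoints in community c
    degree_sum = defaultdict(int)  # total degree of the nodes in community c
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    # Q = sum over communities of (internal edge fraction - expected fraction).
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "all" for node in range(6)}

q_split = modularity(edges, two_groups)   # 5/14, about 0.357
q_single = modularity(edges, one_group)   # 0.0
```

Splitting along the bridge scores higher than lumping everything together, which is the signal modularity-based methods follow. Node features play no role in this score.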

GNN approach: combine node features + graph structure for richer community embeddings.

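SEAL (learning from Subgraphs, Embeddings and Attributes for Link prediction): a link prediction method that classifies the local enclosing subgraph around a candidate node pair. It is not a community detector as such, but it learns community structure implicitly during link prediction training, since communities are sets of nodes that tend to be mutually linked.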

Graph Autoencoders (GAE/VGAE, Kipf & Welling, 2016):

Z = GCN(A, X) (encode)
Â = σ(Z Z^T) (decode: reconstruct adjacency)

Train to reconstruct A from Z. The latent Z captures community structure — nodes in the same community cluster together in latent space. Communities are found by clustering Z.

Why graph autoencoders work for community detection: two nodes in the same community share many common neighbours. The GCN encoder propagates these shared neighbourhood patterns into similar embeddings. Because the decoder must reconstruct A as σ(Z Z^T), nodes that are linked need embeddings with a large inner product, which forces Z to encode the block structure of the adjacency (the community structure). Clustering the resulting Z recovers communities.
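The encode and decode steps can be written out as a NumPy forward pass. This is a sketch under stated assumptions: a two-layer GCN encoder with symmetric normalisation, one-hot node features, and random (untrained) weights. In practice the weights are fitted by gradient descent on the reconstruction loss using an autodiff framework.

```python
import numpy as np

def sym_normalize(adj):
    """D^{-1/2} (A + I) D^{-1/2}, the normalisation used by GCN."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def encode(adj, x, w0, w1):
    """Two-layer GCN encoder: Z = A_norm ReLU(A_norm X W0) W1."""
    a_norm = sym_normalize(adj)
    hidden = np.maximum(a_norm @ x @ w0, 0.0)
    return a_norm @ hidden @ w1

def decode(z):
    """Inner-product decoder: A_hat = sigmoid(Z Z^T)."""
    return 1.0 / (1.0 + np.exp(-(z @ z.T)))

def reconstruction_loss(adj, a_rec, eps=1e-9):
    """Mean binary cross-entropy between observed and reconstructed adjacency."""
    return -np.mean(adj * np.log(a_rec + eps)
                    + (1 - adj) * np.log(1 - a_rec + eps))

# Toy graph: two triangles joined by one bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = np.zeros((6, 6))
for u, v in edges:
    adj[u, v] = adj[v, u] = 1.0

rng = np.random.default_rng(0)
x = np.eye(6)                              # featureless case: one-hot node ids
w0 = rng.normal(scale=0.5, size=(6, 8))    # untrained, for illustration only
w1 = rng.normal(scale=0.5, size=(8, 2))

z = encode(adj, x, w0, w1)                 # latent embeddings, one row per node
a_rec = decode(z)                          # reconstructed edge probabilities
loss = reconstruction_loss(adj, a_rec)
```

After training, running any standard clustering method (for example k-means) on the rows of `z` gives the community assignment.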

Task 3: Influence Estimation and Viral Prediction

Influence maximisation: which K users to seed to maximise information spread? A combinatorial problem (NP-hard).

GNN approach (Chen et al., 2021): train a GNN to predict the expected spread from a seed set. The GNN takes the seed set (as initial node activations) and propagates via the graph, simulating the cascade. Output: expected reach after T steps.

This replaces expensive Monte Carlo simulation (typically on the order of 10,000 cascade simulations per seed set) with a single GNN forward pass, at the cost of the estimate being a learned approximation.
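To make the cost of the Monte Carlo baseline concrete, here is a minimal sketch of the estimator a learned model would replace. It assumes the independent cascade model with a single activation probability `p` on every edge; the post does not name a diffusion model, so that choice, the function names, and the toy graph are assumptions.

```python
import random

def simulate_cascade(graph, seeds, p, rng):
    """One independent-cascade run; returns the number of activated users."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for user in frontier:
            for follower in graph.get(user, []):
                # Each newly active user gets one chance per inactive follower.
                if follower not in active and rng.random() < p:
                    active.add(follower)
                    next_frontier.append(follower)
        frontier = next_frontier
    return len(active)

def expected_spread(graph, seeds, p=0.1, runs=10_000, seed=0):
    """Average cascade size over many independent simulations."""
    rng = random.Random(seed)
    total = sum(simulate_cascade(graph, seeds, p, rng) for _ in range(runs))
    return total / runs

# Hypothetical toy graph: user -> list of followers who may reshare.
graph = {0: [1, 2], 1: [3], 2: [3], 3: []}

spread_certain = expected_spread(graph, [0], p=1.0, runs=10)   # every edge fires
spread_never = expected_spread(graph, [0], p=0.0, runs=10)     # no edge fires
spread_mid = expected_spread(graph, [0], p=0.5, runs=2_000)
```

A greedy seed selection calls `expected_spread` once per candidate seed per round, which is where the simulation cost multiplies and a single forward pass becomes attractive.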

Viral content prediction: given a post's initial shares (the first hour), predict total reach at 24 hours. The pipeline is: propagation subgraph at 1 hour → GNN → reach prediction. The structure of the early spread, not just its volume, is often a strong predictor of final virality.

Task 4: Friend and Follow Recommendation

Link prediction on social graphs: predict (u, v) edge probability = will user u follow user v?

GraphSAGE for link prediction:

Sample neighbourhoods for u and v
Compute embeddings h_u, h_v via GNN
Score = σ(h_u^T h_v) or concat + MLP
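The three steps above can be sketched as follows. This is a single-layer, mean-aggregator variant in the GraphSAGE style, written in NumPy with random (untrained) weights. The output layer is left linear so that inner products can be negative. All names and the toy graph are illustrative assumptions.

```python
import numpy as np

def sage_embed(node, neighbours, feats, w_self, w_neigh, num_samples, rng):
    """Embed one node from its own features and a sampled neighbour mean."""
    nbrs = np.asarray(neighbours[node], dtype=int)
    if len(nbrs) > num_samples:
        # Step 1: sample a fixed-size neighbourhood to bound the cost.
        nbrs = rng.choice(nbrs, size=num_samples, replace=False)
    if len(nbrs) > 0:
        agg = feats[nbrs].mean(axis=0)
    else:
        agg = np.zeros(feats.shape[1])     # isolated node: no neighbour signal
    # Step 2: combine self and aggregated neighbour features.
    return feats[node] @ w_self + agg @ w_neigh

def follow_score(u, v, neighbours, feats, w_self, w_neigh, num_samples, rng):
    """Step 3: edge probability as sigmoid of the embedding inner product."""
    h_u = sage_embed(u, neighbours, feats, w_self, w_neigh, num_samples, rng)
    h_v = sage_embed(v, neighbours, feats, w_self, w_neigh, num_samples, rng)
    return float(1.0 / (1.0 + np.exp(-(h_u @ h_v))))

# Hypothetical toy follow graph and made-up features.
rng = np.random.default_rng(0)
neighbours = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0], 4: []}
feats = rng.normal(size=(5, 6))
w_self = rng.normal(scale=0.3, size=(6, 4))
w_neigh = rng.normal(scale=0.3, size=(6, 4))

h0 = sage_embed(0, neighbours, feats, w_self, w_neigh, num_samples=2, rng=rng)
p_follow = follow_score(1, 3, neighbours, feats, w_self, w_neigh, 2, rng)
p_isolated = follow_score(4, 0, neighbours, feats, w_self, w_neigh, 2, rng)
```

The concat + MLP alternative replaces the inner product with a small network over the concatenated pair of embeddings. Unlike the inner product, it can score the pair (u, v) differently from (v, u), which suits directed follow edges.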

On Twitter/Instagram-scale graphs (hundreds of millions to billions of nodes), neighbourhood sampling (PinSage-style) is necessary.

Challenges Specific to Social Networks

Scale: Facebook has roughly 3B users and on the order of a trillion edges. Full-graph GNN training is infeasible at this size, so minibatch training with neighbourhood sampling is required.

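Heterophily: political and social networks are often partly heterophilic (users follow people with opposite views to monitor them, to debate, or as a side effect of bot-following patterns). Standard message-passing GNNs smooth features over neighbours, which implicitly assumes homophily, so their accuracy can degrade on such graphs unless the architecture accounts for dissimilar neighbours.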
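A quick diagnostic before choosing an architecture is the edge homophily ratio: the fraction of edges whose endpoints share a label. Values near 1 indicate a homophilic graph, values near 0 a heterophilic one. A minimal sketch with made-up users and labels:

```python
def edge_homophily(edges, label):
    """Fraction of edges whose two endpoints carry the same label."""
    same = sum(1 for u, v in edges if label[u] == label[v])
    return same / len(edges)

# Hypothetical follow edges and political-leaning labels.
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("a", "d")]
label = {"a": "left", "b": "left", "c": "right", "d": "right"}

h = edge_homophily(edges, label)   # 2 of 5 edges join same-label users: 0.4
```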

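Temporal dynamics: social graphs evolve rapidly. Static GNNs must be retrained periodically as the graph drifts. Dynamic models in the style of TGN (Temporal Graph Networks), which update node state as interaction events arrive, are often preferable.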

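Adversarial manipulation: spammers and bots create synthetic edges to boost influence. GNNs trained on observed graphs may encode these manipulated patterns. Robust variants aim to limit the damage: GNNGuard down-weights or prunes edges between dissimilar nodes, and Robust GCN (RGCN) represents node embeddings as Gaussian distributions so that perturbations are absorbed in the variance.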

Summary

| Task | Graph structure used | Key model |
|---|---|---|
| Fake news detection | Propagation tree structure | GNN-FakeNews (Bi-GCN) |
| Community detection | Adjacency + features | VGAE, node clustering |
| Influence estimation | Full social graph | GNN cascade simulator |
| Friend recommendation | User-user graph | GraphSAGE, LightGCN |
| Bot detection | Follow/retweet graph | GCN + temporal features |

Social networks demonstrate that GNNs are not just machine learning tools — they are instruments for understanding and intervening in sociotechnical systems. The structural patterns they capture determine how information, influence, and misinformation propagate through society.

References

Bian, T., Xiao, X., Xu, T., Zhao, P., Huang, W., Rong, Y., & Huang, J. (2020). Rumor Detection on Social Media with Bi-Directional Graph Convolutional Networks. AAAI 2020 (Bi-GCN, referred to above as GNN-FakeNews: bidirectional propagation tree GCN for rumour and fake news detection on Twitter).
Kipf, T. N., & Welling, M. (2016). Variational Graph Auto-Encoders. arXiv:1611.07308 (VGAE: variational autoencoder on graphs for unsupervised link prediction, with embeddings usable for community detection).
Hamilton, W. L., Ying, R., & Leskovec, J. (2017). Inductive Representation Learning on Large Graphs. NeurIPS 2017 (GraphSAGE: inductive node embedding by neighbourhood sampling, widely used for social network tasks including friend recommendation).

Alessio Borgi
