
GNNs and Transformers in AI

  • Writer: Samvar Shah
  • Feb 20
  • 1 min read

GNNs & Transformers: 50th Post on this Blog!

In AI, Graph Neural Networks (GNNs) and Transformers are both widely used, but they are designed for different types of data and problems.


GNNs are designed to work with graph-structured data, such as recommendation systems, where users are connected to the products they interact with. GNNs work by passing messages between connected nodes: each node updates its representation (a vector) based on information from its neighbors.
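One round of message passing can be sketched in a few lines of NumPy. This is a minimal illustration, not any particular GNN library; the graph, features, and weights are made up for the example:

```python
import numpy as np

# Toy graph: 4 nodes, connections given by an adjacency matrix.
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)

X = np.random.randn(4, 8)   # each node starts as an 8-dim vector
W = np.random.randn(8, 8)   # a weight matrix (learned in practice, random here)

# One message-passing step: each node averages its neighbors' vectors,
# applies a linear map, then a ReLU nonlinearity.
deg = A.sum(axis=1, keepdims=True)       # number of neighbors per node
H = np.maximum(((A @ X) / deg) @ W, 0.0)

print(H.shape)  # one updated 8-dim vector per node
```

Stacking several such steps lets information flow between nodes that are multiple hops apart.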


Transformers are neural networks designed for sequence data, especially text.

They rely on a mechanism called self-attention, which allows every token (word) in a sequence to interact with every other token. Instead of local neighbor passing (like GNNs), Transformers allow global connections.
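That global, everyone-attends-to-everyone interaction can also be sketched in NumPy. Again this is a simplified illustration (single head, random weights), not a full Transformer layer:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax along the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Toy sequence: 5 tokens, each a 16-dim vector (random here).
X = np.random.randn(5, 16)
Wq, Wk, Wv = (np.random.randn(16, 16) for _ in range(3))

Q, K, V = X @ Wq, X @ Wk, X @ Wv
scores = Q @ K.T / np.sqrt(16)   # every token scored against every other token
weights = softmax(scores)        # each row is a probability distribution
out = weights @ V                # each token becomes a weighted mix of all tokens

print(weights.shape, out.shape)  # (5, 5) attention matrix, (5, 16) outputs
```

Note the 5x5 attention matrix: every token gets a weight for every other token, which is exactly the "global connections" idea.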


However, mathematically they have similar guiding principles:

Both models represent:

  • Nodes (GNNs)

  • Tokens (Transformers)

as vectors in high-dimensional space


Both rely heavily on matrix multiplication.


Both models are trained using:

  • Loss functions

  • Gradients

  • Backpropagation

Both rely on multivariable calculus to compute these gradients.


Both use probability:

  • Softmax to create probability distributions

  • Modeling uncertainty in predictions
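As a quick example of the softmax point, here is how raw scores (logits) become a probability distribution:

```python
import numpy as np

# Softmax turns arbitrary real-valued scores into probabilities.
logits = np.array([2.0, 1.0, 0.1])
probs = np.exp(logits) / np.exp(logits).sum()

print(probs)        # largest logit gets the largest probability
print(probs.sum())  # 1.0 — a valid probability distribution
```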


So both run on the same underlying mathematical machinery. Interestingly, attention in Transformers can be viewed as a fully connected graph where every token connects to every other token. This means a Transformer can be seen as a special type of graph neural network with dynamic edge weights.
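This connection can be made concrete: the attention matrix is a dense, input-dependent adjacency matrix, and applying it is one message-passing step over a complete graph. A stripped-down sketch (no separate query/key/value projections, which real Transformers do have):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

X = np.random.randn(5, 16)            # 5 tokens, viewed as 5 graph nodes

# "Adjacency matrix" computed from the input itself — these are the
# dynamic edge weights: they change for every new sequence.
A = softmax(X @ X.T / np.sqrt(16))

# Attention output = one message-passing step over the complete graph,
# with each node aggregating all others according to its edge weights.
out = A @ X

print(np.allclose(A.sum(axis=1), 1.0))  # each node's edge weights sum to 1
```

Compare this with a GNN: the GNN's adjacency matrix is fixed by the graph, while here it is recomputed from the data at every step.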


Did you ever think of GNNs and Transformers as being so different and yet so similar? What do you think of viewing a Transformer as a GNN?

