
The LLM Architecture Gallery: Mapping the Evolution of Open-Weight Models
A comprehensive technical reference gallery documenting the architectural evolution and specifications of modern open-weight large language models.

BERT-style masked language modeling (MLM) is a single-step text diffusion process, and extending it to multiple unmasking steps turns RoBERTa into a workable text generator.
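The multi-step idea above can be sketched as a loop: at each step, score every masked slot, commit only the most confident prediction, and repeat until no masks remain. A real setup would query a pretrained masked LM such as RoBERTa; here `fill_most_confident` and the `scores` table are hypothetical stand-ins for the model's per-position predictions.

```python
MASK = "<mask>"

def fill_most_confident(tokens, vocab_scores):
    """Commit the single highest-scoring (position, token) pair among masked slots."""
    best = None
    for i, tok in enumerate(tokens):
        if tok != MASK:
            continue
        token, score = max(vocab_scores[i].items(), key=lambda kv: kv[1])
        if best is None or score > best[2]:
            best = (i, token, score)
    if best is not None:
        i, token, _ = best
        tokens = tokens[:i] + [token] + tokens[i + 1:]
    return tokens

def iterative_unmask(tokens, vocab_scores):
    """One MLM pass is one denoising step; looping until no masks
    remain turns the masked LM into a greedy text generator."""
    while MASK in tokens:
        tokens = fill_most_confident(tokens, vocab_scores)
    return tokens

# Hypothetical per-position scores a masked LM might return:
scores = {
    1: {"cat": 0.9, "dog": 0.1},
    3: {"mat": 0.7, "hat": 0.3},
}
print(iterative_unmask(["the", MASK, "on", MASK], scores))
# → ['the', 'cat', 'on', 'mat']
```

In practice each commit would trigger a fresh forward pass, since filling one slot changes the model's predictions for the remaining masks; the toy scores here are fixed for simplicity.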
Models internally compose a “seahorse + emoji” representation, but with no matching token in the vocabulary the unembedding snaps to a nearby real emoji, causing confident errors and occasional feedback loops.

A large-scale, transformer-only, flow-matching approach makes protein folding simpler while staying competitive and practical.

Embedding dimensions grew alongside Transformers and embedding APIs, but new efficiency techniques and infrastructure suggest the future lies in smarter, not just larger, dimensions.
A visual, end-to-end demo of a tiny GPT that turns tokens into embeddings, runs them through transformer blocks, and autoregressively predicts the next token to solve a simple sorting task.
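The autoregressive loop that demo relies on can be sketched without the model itself: the network sees the input plus everything emitted so far and predicts one next token per step, feeding each prediction back in. Here `next_token` is a hypothetical oracle standing in for the trained GPT's argmax output on the sorting task.

```python
def next_token(prompt_tokens, emitted):
    """Stand-in for model(prompt + emitted) -> next token: returns the
    smallest input token not yet emitted, i.e. what a perfectly trained
    sorting model would predict at this step."""
    remaining = list(prompt_tokens)
    for t in emitted:
        remaining.remove(t)
    return min(remaining)

def generate_sorted(prompt_tokens):
    """One forward pass per output token, appending each prediction
    to the context before the next step."""
    emitted = []
    while len(emitted) < len(prompt_tokens):
        emitted.append(next_token(prompt_tokens, emitted))
    return emitted

print(generate_sorted([5, 2, 9, 2]))
# → [2, 2, 5, 9]
```

The point of the sketch is the control flow: generation is just this loop, and the real demo swaps the oracle for a tiny trained transformer.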