
TimesFM: Google's Foundation Model for Time-Series Forecasting
Google Research's TimesFM is a pretrained decoder-only foundation model that brings large-scale transformer efficiency to time-series forecasting.
Architecture and applications of transformer neural networks, including BERT, GPT, and their variants for various machine learning tasks.

MSA is an end-to-end trainable framework that enables LLMs to process 100 million tokens efficiently using sparse attention and latent memory.

A comprehensive technical reference gallery documenting the architectural evolution and specifications of modern open-weight large language models.

BERT-style MLM is a single-step text diffusion process, and extending it to multiple masking steps turns RoBERTa into a workable text generator.
Models compose “seahorse + emoji,” but with no matching token the unembedding snaps to a nearby emoji, causing confident errors and occasional feedback loops.

A large-scale, transformer-only, flow-matching approach makes protein folding simpler while staying competitive and practical.

Embeddings got bigger with Transformers and APIs, but new efficiency techniques and infrastructure mean the future is about smarter—not just larger—dimensions.
A visual, end-to-end demo of a tiny GPT that turns tokens into embeddings, runs them through transformers, and autoregressively predicts the next token to solve a simple sorting task.