Large Concept Models (LCMs) are a novel approach to language modeling that operates in a sentence representation space rather than at the token level. We will begin by discussing how LCMs abstract information hierarchically, processing concepts instead of raw text or speech. We will outline the architectures and training objectives, along with the data preparation and segmentation techniques that enable LCMs to capture complex relationships between concepts. We will then explore three key LCM variants: Base-LCM, Diffusion-LCM, and Quantized-LCM. Finally, we will discuss potential applications of LCMs and the limitations that lead to future research directions.

Additional resources:

LCM paper - https://arxiv.org/abs/2412.08821
SONAR paper - https://arxiv.org/abs/2308.11466
Diffusion Models - https://erdem.pl/2023/11/step-by-step-visual-introduction-to-diffusion-models